Opening Session on Friday (25/03/2011)

See more photos from the day

Opening Session of THATCamp Melbourne; people proposing sessions

Posted in Uncategorized | Leave a comment

Blog posts and reports

THATCamp Melbourne has come to a close, but the knowledge shared and the great work produced shouldn’t stay ‘indoors’. In the spirit of linked open data, please post your learnings, ruminations, reports and session links here so we can build on the THATCamp experience. I have taken the liberty of kicking off with a clarion call of my own:

The Great PROV Data Challenge was an absolute ripper, with all teams producing some brilliant work. We’d love to share your entries with the world, so if anyone out there from The Kelly Gang, The Otway Rangers or The Grave Diggers would like to add a link to their hack in reply to this post, we’d be really grateful!

Posted in #thatcamp, Uncategorized | Leave a comment

Defining Collections in a Digital Research Archive.

Rachel Wilson and I would like to take up a discussion around what happens to our ideas about off-line collections when they take life online. This is something that we are nutting out as part of our work on The Screen Media Research Archive (an ANDS-funded project that aims to establish a national management standard for the digital archiving of higher education screen media research produced in a variety of formats). We would like to look at some of the underlying conceptual challenges for defining ‘collections’ in this context, including: What protocols already exist for defining collections by provenance, materiality or subject, and how do these apply (or not) to a research archive? Are donated collections static by definition? What level of granularity applies to collections within collections? Is there a minimum or maximum number of items required to form a collection? Should collections describe items that are not currently included (that have been de-selected, for example)? What value is there in defining collections in terms of their users rather than the items per se?

Posted in Uncategorized | Leave a comment

Technology and the Humanities – A Theoretical and Practical Meta-Discussion

It is uncontroversial to claim that the non-human is efficacious, capable of effecting change. Not merely the non-human, but the non-living is efficacious in this way. Indeed, we count among its effects. Neither is it controversial that we are changed by the technology we use. What may be controversial though, is ‘how?’

Perhaps we need to begin by expanding the notion of technology itself. Might we, for instance, regard the eukaryotic revolution (when bacteria-like prokaryotes invaded the membranes of other prokaryotes, giving rise to the first eukaryotic cells) as a technological moment, when one entity becomes augmented by its use of, or relationship with, another? On the other hand, making the concept so inclusive may render it redundant.

Perhaps then the focus should be not on making the concept of technology more inclusive in an expansive sense, but rather more inclusive in an intensive sense – expanding the notion of technological efficacy within technology itself, as we already understand it to be, a (mostly) human thing or process. How might we intensify our conception of technology in this way? Firstly, by trying to think the details, the how of the RECIPROCITY between ourselves and the technology we create and employ; and secondly, by trying to think how this technology can PROLIFERATE and CHANGE beyond our actual intentions.

RECIPROCITY can be thought of in a simple ‘man and his tools’ example. The man using a hammer to beat a length of metal into shape is fundamentally different from the man without such an implement. This man is augmented by his implement: materially and mentally. There occur changes in the man himself beyond the changes he intends to effect in the length of metal. The implement itself is augmented in becoming an implement – a length of wood with a steel weight attached, it is now a hammer. The implications of reciprocity are further intensified when we come to the subject of machines as opposed to simple tools, and even further, computers and (the possibility, actuality of?) intelligent machines.

Questions of PROLIFERATION and CHANGE necessarily entail the way technology exists as an idea. How can the ideas of technology in general, as well as the ideas corresponding to particular technologies, change independently of human intention? This is perhaps a more difficult (and thus more contentious) area to consider. We might consider the concept of memes, replicating units of cultural information (including technological information) that spread independently of human intention in the minds of human beings, as well as through human media (books and the internet, to use just two examples). Memes represent an attempt to think about the development of human culture in evolutionary terms. Thus, some memes seem to be fitter than others, better adapted for surviving in a given environment, and so proliferate more successfully than others, thereby altering/augmenting that environment in what might be regarded as a technological fashion. Again, when we come to consider computers, networks and the possibility of intelligent machines, the implications of these ideas become radically intensified. Can we think of technology as operating in this way?

I propose a two-part structure for this session. In the first half, I propose that we discuss the ideas sketched above. Are they useful for thinking about technology? Can we make adjustments, hone and improve them? In the second half, I propose applying the results of the first discussion to the work and research interests of the particular ‘unconferencers’ present: how do these reconceptions of technology impact on the technology they are employing in their own research? And how does this in turn affect the relationship between this technology and the area of the humanities they are engaged in?

Posted in Uncategorized | Leave a comment

Merging space and time – in GoogleEarth and ‘TemporalEarth’

History and archaeology are intimately connected with questions of what happened “where” and “when”, but traditional paper-based representations (e.g. maps and timelines) are quite restrictive in what they can represent.

I’m interested in sharing skills and ideas with others on representing historical processes in space and time, by linking dynamic maps and animations with an interactive timeline. I’ll demonstrate the existing time-based capabilities of GoogleEarth and its authoring language KML, which have some exciting possibilities but also significant limitations. I’ll also introduce my own project, TemporalEarth, which pushes the technical boundaries to help discipline experts author representations on topics in history, archaeology, geology and related disciplines. My initial prototype, SahulTime, provides some compelling ideas for how such methods could be put to use, and will provide a good basis for further discussion.
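For anyone who has not met KML's time support before, here is a minimal sketch of the idea, written in Python. It builds a single placemark carrying a `TimeSpan`, which is the standard KML element that lets GoogleEarth's time slider show or hide a feature. The function name, the site name, the coordinates and the dates are all placeholders made up for illustration; this is not taken from TemporalEarth or SahulTime.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def placemark_with_timespan(name, lon, lat, begin, end):
    """Return a KML document (as text) with one time-limited placemark.

    begin and end are date strings such as "1851" or "1851-08-01".
    """
    # Serialise with KML as the default namespace (no prefix on tags).
    ET.register_namespace("", KML_NS)

    def tag(local_name):
        return f"{{{KML_NS}}}{local_name}"

    kml = ET.Element(tag("kml"))
    document = ET.SubElement(kml, tag("Document"))
    placemark = ET.SubElement(document, tag("Placemark"))
    ET.SubElement(placemark, tag("name")).text = name

    # The TimeSpan tells the viewer when the feature should be visible.
    span = ET.SubElement(placemark, tag("TimeSpan"))
    ET.SubElement(span, tag("begin")).text = begin
    ET.SubElement(span, tag("end")).text = end

    # KML coordinates are written longitude,latitude,altitude.
    point = ET.SubElement(placemark, tag("Point"))
    ET.SubElement(point, tag("coordinates")).text = f"{lon},{lat},0"

    return ET.tostring(kml, encoding="unicode")


# Placeholder values only, not research data.
kml_text = placemark_with_timespan("Example site", 143.85, -37.56, "1851", "1860")
print(kml_text)
```

Saving the printed text as a `.kml` file and opening it in GoogleEarth should bring up the time slider. One limitation becomes obvious quickly: each feature simply switches on and off, so anything that changes gradually has to be broken into many separate time-stamped features.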

Ultimately, I hope to grow the project to become an explorable, comprehensive set of resources relating to Australia’s history on all timescales and spatial scales. So I wonder if anyone would be interested in getting together to research/develop a visualisation covering their area of expertise. Possible examples: archaeological occupation sites, tribal boundaries, trading routes, Australian explorers, convict settlements, pastoral settlement, Aboriginal-European conflict, reserves/mission stations, the Gold Rush, urban development, railways, family histories, locational histories, artwork histories… basically anything that needs to be understood using both space and time.

Posted in Session Proposal | Leave a comment

Load my laptop

I mentioned in my Bootcamp session outline that if people wanted to play along it would be good to have a web server stack installed. This sounds much scarier than it is, and it really gives you the opportunity to start playing with a variety of web technologies in the safety of your own computer. So I was thinking that perhaps it would be good to have a ‘load my laptop’ session where folks can bring along their computers and other folks (hopefully not just me) can help them install web servers, databases, programming languages, editors, version control etc.

Anyone interested?

Posted in Session Proposal | 1 Comment

Can I propose a second session?

I propose we design a digital humanities unit from scratch. We should profile the staff and resources and the funding methods. We might even get cute and come up with a few fictional project descriptions that tell the story of how we came to be involved, how we thought through the designs and implementation and what sort of problems we encountered. The nice thing about doing this as a group design exercise is that the resulting account won’t sound like one person whingeing about the problems they have experienced with a particular institution, but instead may end up being quite a good descriptive document that will help us better define the actual on-the-ground processes of doing digital humanities.

Posted in Uncategorized | 1 Comment

Session – Political Issues Analysis System using iFish

This session will be used to explore the evolving applications of online political communication strategies and tools in order to understand the ‘deliberative’ processes of users. The session will be related to an ongoing project that deals with ‘political issues’ such as health, the economy, and environment and how users discover and make sense of these issues. 

Emerging research suggests that the Internet’s capacity to easily produce information has also led to data overload, undermining its deliberative potential. With the advent of the National Broadband Network, the ‘data deluge’ promises to intensify, increasing the need for political information, in its various guises, to be delivered in much more meaningful ways.

We’ve partially developed one tool, which we call ‘Poli-Fish’, to be applied and extended. It aggregates and rates the policies of registered Australian political parties (from their web sites) and displays them within various political spectrums. Through sliders, the user can explore a broad spectrum of policies along with the associated political parties. The prototype can be seen in action here:

http://disweb.dis.unimelb.edu.au/projects/ifish/poliFish/gFish.swf

The approach we are attempting is both innovative and unique because it combines the theoretical understandings of Politics and Media Studies with the technical proficiency of Humanities Computing, eDemocracy and Information Systems to expose important issues of online political information to critique in ways that were previously unavailable. This all raises various discussion points for the session on the dichotomy between the availability of government and other data sources and effective online deliberative design. The work may open up theoretical and technological pathways towards a more genuinely identifiable (and sustainable) online political engagement.

The Poli-Fish application is an instance of another project called iFish. The iFish project addresses the challenging problems of using online environments to offer advice relating to complex sets of data. The challenge is to maintain a user’s engagement in the system long enough to explore possible outcomes in a system that is not completely deterministic and offers no single right answer. Our particular research interest is in using an engaging, affective interface to attract and maintain a person’s attention, whilst at the same time trying to keep their focus on the task presented. Hence we want to encourage exploration with mind on task. More information on the iFish project can be found here:

http://disweb.dis.unimelb.edu.au/projects/ifish/

 

Posted in Uncategorized | 1 Comment

Proposal to develop PhD Coursework in Digital Methods

I understand that in the medium-term future all students undertaking a PhD through the Arts Faculty will be required to complete coursework in addition to their thesis. Calls have been made for suggestions for suitable courses.

My proposed session aims to discuss options for coursework in digital methods for the Arts and Humanities, and if possible set up collaborations to keep the project moving forward.

In the session we might begin to discuss which e-methods, which applications, and which skills are most appropriate to expose PhD students to, which pedagogical methods are best for this, how a curriculum might be structured, what outcomes we would want for our students, and so on.

My initial thinking on this is to offer two short courses….

1. A Critical Survey of Digital Methods in the Humanities and Social Sciences

The purpose of this short course is to alert students to the range of electronic methods available to scholars for the purposes of document and data capture, collaboration and communication, data analysis, publishing and dissemination, data structure and enhancement, practice-led research, and research strategy and project management. Having completed this survey, students will be in a position to critically assess the value of particular digital methods for their own research work. The course may be offered largely online.

2. Applied Digital Methods in the Humanities and Social Sciences

The purpose of this short course is to enable students to gather practical expertise in the application of particular digital methods. It is anticipated that students will have previously completed “A Critical Survey of Digital Methods in the Humanities and Social Sciences” and that groups of interested students are clustering around particular applications. The course may be offered in intensive mode.

A few examples of the resources and tools we can draw on are here…

Posted in Session Proposal | Leave a comment

Fun with Trove Newspapers

The NLA’s Trove Newspapers database is a magnificent resource for digital history, but it’s currently not very easy to do detailed analysis of content. I’ve been working on a few tools which make this easier and I’d be interested in giving a bit of a how-to session to explain them and the technologies they use and kick off a ‘what next’ discussion.

At the base of my tools is a screen-scraper which, in the absence of an official API, retrieves article information in machine-readable form. I’ve used this to create a harvester, which you can use to dump the results of your search to a CSV file for further analysis. It can also retrieve text and PDF versions of the articles. You can read more about the harvester on my blog.
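To give a feel for the last step of that pipeline, here is a rough Python sketch of dumping harvested article records to CSV. It is not the harvester's actual code: the field names and the sample record are hypothetical, and the scraping itself is left out entirely.

```python
import csv
import io

# Hypothetical column names; the real harvester may use different ones.
FIELDS = ["id", "title", "newspaper", "date", "url"]


def articles_to_csv(articles):
    """Turn a list of article dictionaries into CSV text."""
    buffer = io.StringIO()
    # Missing fields are written as empty cells; unknown keys are ignored.
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    for article in articles:
        writer.writerow(article)
    return buffer.getvalue()


# Made-up record for illustration only.
sample = [
    {"id": "1", "title": "Floods, fires", "newspaper": "Example Gazette", "date": "1901-01-01"},
]
csv_text = articles_to_csv(sample)
print(csv_text)
```

The `csv` module takes care of quoting, so a headline containing a comma still ends up in a single column when the file is opened in a spreadsheet.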

To provide a more quick and dirty picture of search results over time, I’ve created another little harvest tool that gets the number of results for each year and calculates what proportion this is of total articles (in Trove) for that year. I’ve used this to create a few interactive graphs as a demonstration.
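The calculation itself is simple, and can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the tool's own code; the function name and the numbers are invented, and the two dictionaries stand in for whatever the scraper returns.

```python
def yearly_proportions(hits_by_year, totals_by_year):
    """Return {year: share of that year's articles matching the search}.

    hits_by_year maps a year to the number of search results;
    totals_by_year maps a year to the total number of articles for that year.
    Years with no known total are skipped rather than divided by zero.
    """
    proportions = {}
    for year, hits in sorted(hits_by_year.items()):
        total = totals_by_year.get(year, 0)
        if total > 0:
            proportions[year] = hits / total
    return proportions


# Invented numbers for illustration only.
hits = {1900: 5, 1901: 10, 1902: 3}
totals = {1900: 100, 1901: 40}
shares = yearly_proportions(hits, totals)
print(shares)
```

Dividing by the yearly total matters because the number of digitised articles varies a great deal from year to year, so raw counts on their own can suggest trends that are really just changes in coverage.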

I’ve also used the scraper to create my own ‘unofficial’ Newspapers API on Google App Engine. I put this up to encourage people to have a play and think about the possibilities.

Depending on people’s interests I’m happy to walk through the development of these tools as well as describing how to use them.

But I’m also very interested in talking about further possibilities for analysing and visualising the results. I’ve been playing around with VoyeurTools, but I wonder what other ideas people have. What about comparisons with other data sources like Google’s ngrams? What other tools do we need? What other databases should we mine?

And lastly it would be good to talk about how we engage directly with the NLA to build a community of digital researchers and help encourage the development of tools like APIs.

Posted in Session Proposal | 1 Comment

Socializing Scholarly Data – reputation management and other issues for affectual design

Our project, Design and Art of Australia Online http://blogs.unsw.edu.au/daao/, is opening up contributions to the database to a larger project community. Among other things we will be developing communities around different spheres of scholarly research through social media style tools, conventions and practices. We will also be using some classic crowdsourcing techniques for datawashing and sub-editorial style projects. In other words, we will be traversing a whole range of different cohorts with different needs and expectations of the site.

If we want lots of people to input quality data, we need to make the experience of using our site ‘rewarding’ (as well as pleasurable), and we’ve been working hard on thinking about how to do that. We’ve done a lot of work on making the user experience of data entry logical and pleasurable, but now we are turning our heads to the issue of a community moderation system in a mixed-cohort setting (it has to generate value for academic, GLAM and ‘amateur’ research communities) and have been thinking about how to manage social reputation and activity systems (i.e. badges etc.).

We’ve been doing quite a bit of work researching existing scholarly crowdsourcing projects, like the Zooniverse science projects http://www.zooniverse.org/home and of course the NLA’s newspaper projects. We’ve also been looking at badging reward systems like Foursquare to see what productive analogies (if any) could be made that produce any type of value for contributors.

Is anyone interested in having a session on how you manage the affectual and reputational side of data? How to link permissions to reputations? How to give value to the immaterial labour of online scholarly practices? Bluntly put, how to design ‘win-win’ into the front end of a content-hungry database? Practical focus: specifically looking at badging, reputation and permission systems.

Really happy to share or negotiate a session with anyone who has similar problems to solve.

 

Posted in Uncategorized | 1 Comment

ADB Mining

I’d like to propose a session that aims to produce (or failing that, define) a tool that will take in an Australian historical text and locate within this text all the mentions of people named in the Australian Dictionary of Biography (ADB).

Annotation in the Dictionary of Sydney


For each person referenced I would like to create a metadata record for the corresponding ADB entry in my own database and link this to an annotation record pointing to the mention in the text. Clearly it follows that if a given person is mentioned multiple times then multiple annotation records should be created all pointing to a single ADB metadata record.

I’d ideally like to be able to scan the text and have it OCRed into TEI first, but I think that would be a bit ambitious. I wanted at least to mention this as a reminder that ultimately the process should be input agnostic – one should as easily load up a newly created Word document as a scan of a historical manuscript.

I’ve hacked around over the years with offset markup using TEI source files in Heurist. We render TEI documents as XHTML using Cocoon, allow the user to select passages of text then create an annotation record which records the annotated string, the source text, the location within the text and an optional pointer to another record – which could be an ADB entry metadata record. The Dictionary of Sydney project in particular makes extensive use of this data structure with mentions of famous Sydneysiders within text entries pointing – via annotations – to person entity records. As often as not these then reference the ADB entry.
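As a starting point for discussion, the annotation structure described above can be sketched in Python roughly as follows. This is not Heurist's schema: the class names, field names and record ids are hypothetical, and exact string matching stands in for whatever name-matching the real tool would need (variant spellings, initials, titles and so on).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PersonRecord:
    """Stand-in for an ADB entry metadata record (ids are made up)."""
    record_id: str
    name: str


@dataclass(frozen=True)
class Annotation:
    """One mention: the string, its source text, its offsets, and a pointer."""
    annotated_string: str
    source_text_id: str
    start: int  # character offset where the mention begins
    end: int    # character offset just past the mention
    target_record_id: Optional[str] = None


def annotate_mentions(text, source_text_id, people):
    """Create one annotation per exact mention of each person's name."""
    annotations = []
    for person in people:
        start = text.find(person.name)
        while start != -1:
            end = start + len(person.name)
            annotations.append(
                Annotation(person.name, source_text_id, start, end, person.record_id)
            )
            start = text.find(person.name, end)
    return sorted(annotations, key=lambda a: a.start)


people = [PersonRecord("adb-0001", "Henry Parkes"), PersonRecord("adb-0002", "Mary Reibey")]
text = "Henry Parkes spoke first. Later Henry Parkes met Mary Reibey."
mentions = annotate_mentions(text, "text-42", people)
for mention in mentions:
    print(mention)
```

Two mentions of the same person produce two annotation records that share one `target_record_id`, which is the many-to-one relationship between annotations and ADB metadata records that the proposal asks for.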

This tool would greatly simplify the current workflow of the Dictionary of Sydney, and many of the projects I am involved with would also benefit from this process. The Dictionary of Sydney, however, relies on a large team of volunteers and a pretty active editorial team to create annotations “manually”. It would be great to have a semi-automated procedure – possibly based around a bunch of clever XSL transforms and an XML feed from the ADB – that would enable me to connect a given research database with the ADB and indeed beyond (to paraphrase Buzz Lightyear).

Posted in Session Proposal | 2 Comments

Mapping concepts in time and space

I am interested in how to develop a tool that would be able to map the major concepts of a discipline and then to be able to link these concepts across disciplines.  The trick is to be able to map data from discourse – so that individual ideas about concepts, which are rich and varied, can be shown.  I have two (dream) versions of what this might result in. The first is a game of word association that allows someone to pick multiple words and results in a string of words that can be explained in more depth. The second is a more complex reckoning of this that might involve a sort of discourse analysis wherein the literature provides definitions of concepts and links between concepts.

An example might be Archival Science and the concept of the archive. There are many different ways this concept has been defined over the years – and this is closely related to culture and society. In the recent past there have been some very significant works published that talk about the archive – and with which other words such as power, responsibility, construction and so on might be relevant. Additionally, Australian concepts of the archive are quite a bit different from those of our American counterparts. Furthermore, people in other disciplines, even those as close as Library Science or Knowledge Management, define the archive as something else again (or do they?).

I see this concept mapping as a research resource, as well as a communications tool. I like the idea of people who are the current ‘big thinkers’ in the discipline adding their own definitions about concepts. I would also think it would be great to get some cool visualisations out of it.

Posted in Session Proposal | Leave a comment

Session proposal – time aligned text and media

Media provides the basis for much Humanities research. In Linguistics, field recordings are transcribed and then used as the proof for linguistic analysis. Access to media and provision of online versions of media and transcripts has become much easier with the release of HTML5 – no more streaming servers, no more Flash. How can we use these technologies and what are the implications for the way we present the results of our research? I will be illustrating an example we have worked on recently, called EOPAS. Time-codes within media can be called from ordinary HTML pages, so a dictionary can now have citation forms of headwords called from within large media files rather than having several thousand MP3 files on the server. This session will toss around ideas about using such media/text installations.
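One way to picture the headword idea is with the W3C Media Fragments syntax, where `#t=start,end` on a media URL asks the player for just that stretch of the file. The Python sketch below builds such links from a table of time-codes. It is an assumption for illustration, not a description of how EOPAS works; the file name, headwords and timings are invented, and browser support for the fragment should be checked before relying on it.

```python
def format_seconds(value):
    """Format seconds compactly: 12.5 -> '12.5', 14 -> '14'."""
    return f"{value:.3f}".rstrip("0").rstrip(".")


def media_fragment_url(media_url, start_seconds, end_seconds):
    """Return a URL addressing one time range inside a larger media file."""
    if start_seconds < 0 or end_seconds <= start_seconds:
        raise ValueError("need 0 <= start < end")
    return f"{media_url}#t={format_seconds(start_seconds)},{format_seconds(end_seconds)}"


def headword_links(media_url, timecodes):
    """Map each headword to a link into the single shared recording."""
    return {
        headword: media_fragment_url(media_url, start, end)
        for headword, (start, end) in timecodes.items()
    }


# Invented headwords and timings; one long recording serves them all.
timecodes = {"headword-a": (12.5, 14), "headword-b": (305, 306.25)}
links = headword_links("recordings/session1.ogg", timecodes)
print(links)
```

Each link could then be used as the `src` of an HTML5 `<audio>` element beside the dictionary entry, so the server holds one recording plus a table of time-codes instead of thousands of small clips.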

Posted in Uncategorized | Leave a comment