
TRACER tutorial, Göttingen, May 2017!

We’re excited to announce that eTRAP will be giving its next text reuse tutorial as a pre-conference workshop of the Datech International Conference being held in Göttingen, Germany!

The tutorial will run on 30th May at the Historical Library Building (“Vortragsraum”, Papendiek 14, first floor) of the University of Göttingen.

The tutorial builds on eTRAP’s research activities, most of which deploy our TRACER machine. TRACER is a suite of algorithms aimed at investigating text reuse in different corpora, be they prose or poetry, in Italian, Latin, Ancient Greek or medieval German. TRACER provides researchers with statistical information about the texts under investigation, and its integrated reuse visualiser, TRAViz, displays the reuses in a more readable format for further study.

This tutorial is for anyone wishing to independently understand, use and run TRACER on their own data. For the purpose of the tutorial, participants will initially be working on an English data-set provided by eTRAP. Depending on the overall progress, we may also allocate some time for investigating the participants’ own data-sets! For more information about previous editions of this tutorial, visit our Events page.

If you’re interested in exploring text reuse between two or more texts (in the same language) and would like to learn how to do it semi-automatically, then this tutorial is for you! In order to provide everyone with adequate (technical) assistance, the workshop can only accommodate 15 participants. To apply for the tutorial, please send a short CV and a brief motivation letter to contact(at)etrap(dot)eu by 30th April 2017. Those accepted will have to register for the conference at http://ddays.digitisation.eu/registration/

In summary:
WHAT: TRACER tutorial for computational text reuse detection
WHEN: 30th May 2017, 9am-6pm
WHERE: GCDH, Seminar Room 1 (ground floor), Heyne Haus, Papendiek 16, 37073 Göttingen, Germany
WHO: For humanists and computer scientists alike who bring their own laptop
HOW MANY: Maximum of 15 participants
HOW: You may attend by applying to the email address provided and then registering for the conference. Registration for the conference is necessary for attending the workshop. There will be an extra charge of €50 for catering at the workshop and for the conference pack.
LANGUAGE: The workshop will be in English, with assistance in German should it be necessary
OTHER: You will receive very clear instructions on what to bring and prepare before the workshop

We look forward to seeing you in Göttingen!

Transkribus: A User Report

Melina Jander, an eTRAP Research Assistant, has written a short user report on our experience with the Handwritten Text Recognition (HTR) tool Transkribus. We’re currently using Transkribus as part of our pilot project TrAIN (Tracing Authorship In Noise), which aims at defining the noise-threshold that affects computational analyses on HTR’d and OCR’d texts. How much noise do we have to correct? How much can we leave in?

The report describes progress made thus far. A second report, to be published in 2017, will cover Transkribus’ automation process on our data.

You can download the report from our Output page.

2016-11-04 Update: The Transkribus website advertises our user report here.

REFLECTING on the recent Digital Humanities Hackathon on Text Re-Use – “Don’t leave your data problems at home!”

The Hackathon week is over and looking back on it the eTRAP team agrees…it was a hit!
23 participants from 15 different institutions and 8 countries hacked away at research questions on their laptops to achieve the same goal, albeit with different datasets. And the goal was achieved. Our hackers were humanists with a desire to find textual reuses across different works of the same author or across several authors from different times and locations. They brought data in English, German, Latin, Sanskrit, Hebrew and even Arabic and Estonian, spanning many genres – from folkloristic poetry to narratives and letters, from lists of citations to biblical texts. From day one they were led by computer scientist and leader of eTRAP, Marco Büchler, through each of the six steps required by the TRACER tool (1) to perform scans of the texts in search of reuse. By using the command line like pros, hackers preprocessed their data and set the parameters they needed to guarantee the most informative outcome. The week culminated with a tutorial on TRAViz (2), an open source variant graph visualisation tool created and presented by Stefan Jänicke (3), which allows users to create a swish visualisation with the results yielded by the TRACER tool.


GDDH 2015: Conclusions

As the first series of the Göttingen Dialog in Digital Humanities (GDDH) has just come to a close (sob!), it’s time for us to take a few minutes to reflect on its outcome and on the things we’d like to bring to the next series.

GDDH turned out to be a great success! We not only accepted 14 full papers from 11 institutions in 5 countries, but also secured a deal with Digital Humanities Quarterly to publish each contribution in a special issue. The series touched upon numerous fields, joined by the thread that is Digital Humanities: Digital Classics, Topic Modelling, Text Visualisation, Digital Editions, 3D Motion Capture, Social Networks, Television Media, Web History, Digital Collections, Geographic Information Systems and Text Mining… (*catches breath*) WOW! We’re also currently busy evaluating the best paper and presentation – the winner, who will receive a €500 cash prize, will be announced very soon.

GDDH 2015 speakers: dots correspond to affiliations of speakers; dot colour represents gender.
