Digital Breadcrumbs of Brothers Grimm


Find our data-set on GitHub.

This research project concerns the Brothers Grimm, two notable and influential figures in both international literature and the history of Göttingen.1 Primarily known for their large collection of folk tales, the brothers also wrote thousands of letters documenting their personal and professional lives, giving us a unique insight into the development of their textual legacy. This legacy is at the heart of our research, whose focus on text reuse necessarily implies parallel investigations into authorship, style, variant analysis and networks in order to fully appreciate the texts’ complexity and impact on society.

The name of our project refers to the traces of the Brothers Grimm’s tales that we can find in other countries and cultures, many of which are now stored online in archives, catalogues, websites and articles! Starting from the brothers’ household tales, it is our aim to follow these traces towards answering our research questions, just like Hansel and Gretel followed the breadcrumbs towards home!

Research data-set

Described as “a great monument to European literature” (David and David, 1964, p. 180),2 Jacob and Wilhelm Grimm’s masterpiece Kinder- und Hausmärchen has captured adult and child imagination for over 200 years. International cinema, literature and folklore have borrowed and adapted the brothers’ fairy tales in multifarious ways, inspiring themes and characters in numerous cultures and languages.3

Despite being responsible for their mainstream circulation, the brothers were not the minds behind all fairy tales. Indeed, Jacob and Wilhelm themselves collected and adapted their stories from earlier written and oral traditions, some of them dating back as far as the seventh century BC, and made numerous changes to their own collection (ibid., p. 183), producing seven distinct editions between 1812 and 1857.

The same tale often appears in different forms and versions across cultures and time, making it an interesting case study for textual and cross-lingual comparisons. Is it possible to compare the Grimm brothers’ Snow White and the Seven Dwarves to Pushkin’s Tale of the Dead Princess and the Seven Knights? Can we compare the Grimm brothers’ version of Cinderella to Charles Perrault’s Cinderella? In order to do so, it is crucial to find those elements that both tales have in common. Essentially, one must find those measurable primitives that, if present in high numbers and in a similar manner in both texts, make the stories comparable. We identify these primitives as the motifs of a tale. Prince’s Dictionary of Narratology describes motifs as “…minimal thematic unit[s]”,4 which can be, and have been, recorded in the Thompson Motif-index.5 Hans-Jörg Uther, who expanded the Aarne-Thompson classification system (AT number system) in 2004, defined a motif as:

“…a broad definition that enables it to be used as a basis for literary and ethnological research. It is a narrative unit, and as such is subject to a dynamic that determines with which other motifs it can be combined. Thus motifs constitute the basic building blocks of narratives.” (Uther, 2004)
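
To make the idea of "measurable primitives" concrete, the comparison of two tales by their shared motifs can be sketched as a set overlap. The following is only an illustrative sketch, not the method used in the project: the motif labels are hypothetical simplifications invented for this example (they are not Thompson Motif-index entries), and the Jaccard measure is one possible choice among many.

```python
def motif_overlap(tale_a: set[str], tale_b: set[str]) -> float:
    """Jaccard overlap: motifs shared by both tales divided by all motifs in either."""
    if not tale_a and not tale_b:
        return 0.0
    return len(tale_a & tale_b) / len(tale_a | tale_b)


# Hypothetical motif labels, invented for illustration only.
schneewittchen = {
    "jealous_stepmother",
    "magic_mirror",
    "refuge_with_companions",
    "poisoned_apple",
    "glass_coffin",
}
dead_princess = {
    "jealous_stepmother",
    "magic_mirror",
    "refuge_with_companions",
    "poisoned_apple",
    "crystal_coffin",
}
fisherman_and_wife = {"grateful_fish", "escalating_wishes", "loss_of_gains"}

# Two versions of the same tale share most motifs; unrelated tales share none.
print(motif_overlap(schneewittchen, dead_princess))
print(motif_overlap(schneewittchen, fisherman_and_wife))
```

With these invented labels, the two Snow White versions share four of six distinct motifs, while the unrelated tale shares none. A set-based measure ignores the order and manner in which motifs occur, so it captures only the "high number" part of the criterion above.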

Research Questions

Our research rests on two questions:

  1. How can we detect motifs, our text reuse units, across languages and versions?
  2. How does the human mind identify a motif in context? Can machines do the same?

Case Study

To answer these questions, three Grimm tales have been selected for the project:6 Snow White, The Fisherman and his Wife and Puss in Boots.7 For each, we identify the motifs and their parallels in a sample of tale collections. For example, did you know that while in the German Schneewittchen the young and beautiful Snow White finds refuge among the seven dwarves, in the Russian version she stays with seven knights? And that in the Italian La Bella Venezia, the young girl finds protection among thieves? In Albania among forty dragons? In Armenia with a sleeping prince in a house tucked away in the woods?

Our annotations will serve as training data for the tools we will use to automatically detect motifs in other folktale traditions: TRACER,8 the Google Custom Search (online approach)9 and Apache Lucene (offline approach).10
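
As a minimal sketch of how such annotations might be recorded before being handed to any tool, each annotation can be thought of as a small record linking a motif to its variant in a given tradition. The record fields and the motif label heroine_finds_refuge are hypothetical; the variants are the ones mentioned in the Snow White examples above. This is not the input format of TRACER, Google Custom Search or Apache Lucene.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class MotifAnnotation:
    motif: str      # hypothetical motif label
    tradition: str  # language or culture of the tale version
    variant: str    # how the motif is realised in that version


annotations = [
    MotifAnnotation("heroine_finds_refuge", "German", "seven dwarves"),
    MotifAnnotation("heroine_finds_refuge", "Russian", "seven knights"),
    MotifAnnotation("heroine_finds_refuge", "Italian", "thieves"),
    MotifAnnotation("heroine_finds_refuge", "Albanian", "forty dragons"),
    MotifAnnotation("heroine_finds_refuge", "Armenian", "a sleeping prince"),
]


def variants_by_motif(records):
    """Group annotations so each motif maps tradition -> variant."""
    grouped = defaultdict(dict)
    for record in records:
        grouped[record.motif][record.tradition] = record.variant
    return {motif: dict(variants) for motif, variants in grouped.items()}


table = variants_by_motif(annotations)
for tradition, variant in table["heroine_finds_refuge"].items():
    print(f"{tradition}: {variant}")
```

Grouping by motif rather than by tale keeps the cross-lingual parallels side by side, which is the view needed when looking for the same text reuse unit in a new tradition.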

Moreover, in collaboration with Thierry Declerck, the multilingual motifs collected will be transformed into SKOS-XL for inclusion in existing ontological research and resources.
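
To give a rough idea of what such a transformation could look like, the sketch below serialises one motif with labels in two languages as SKOS-XL in Turtle syntax. It is an assumption-laden illustration, not the project's actual data model: the example.org namespace, the motif identifier and the function motif_to_skosxl are all hypothetical, and only the SKOS and SKOS-XL namespace URIs and property names are taken from the W3C vocabularies.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
SKOSXL = "http://www.w3.org/2008/05/skos-xl#"
BASE = "http://example.org/motif/"  # hypothetical namespace, for illustration only


def motif_to_skosxl(motif_id: str, labels: dict[str, str]) -> str:
    """Serialise one motif and its per-language labels as SKOS-XL Turtle."""
    if not labels:
        raise ValueError("at least one label is required")
    lines = [
        f"@prefix skos: <{SKOS}> .",
        f"@prefix skosxl: <{SKOSXL}> .",
        f"@prefix ex: <{BASE}> .",
        "",
    ]
    # One skosxl:Label resource per language, linked from the concept.
    links = " ;\n".join(
        f"    skosxl:prefLabel ex:{motif_id}_label_{lang}" for lang in labels
    )
    lines.append(f"ex:{motif_id} a skos:Concept ;\n{links} .")
    for lang, text in labels.items():
        # Escape backslashes and double quotes for a Turtle string literal.
        escaped = text.replace("\\", "\\\\").replace('"', '\\"')
        lines.append("")
        lines.append(f"ex:{motif_id}_label_{lang} a skosxl:Label ;")
        lines.append(f'    skosxl:literalForm "{escaped}"@{lang} .')
    return "\n".join(lines)


turtle = motif_to_skosxl(
    "refuge_with_companions",
    {"en": "seven dwarves", "de": "sieben Zwerge"},
)
print(turtle)
```

Unlike plain SKOS labels, SKOS-XL treats each label as a resource in its own right, so further statements (for example, which tale collection a label was taken from) could later be attached to it.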

Research value

In a 2016 article, expert folklorist Timothy Tangherlini pointed out that:

“Over the course of the past decade […] the size and scope of digital archives of folklore have exploded, and the magnitude of digital materials available for folkloristic consideration has increased exponentially.” (Tangherlini, 2016, p. 5).

and that:

“We are in the very early days of working computationally with rich folklore resources […].” (Tangherlini, 2016, p. 10).

This project works with these rich and large materials and advances two (1 and 4 below) of the four research areas of Computational Folkloristics that Tangherlini outlined in 2013: (1) collecting and archiving, (2) indexing and classifying, (3) visualization and navigation, and (4) analysis. By manually collecting multilingual tales we can train our software to automatically detect motifs as text reuse units at scale and across languages, thus contributing both humanistic and computational experience to the rapidly evolving field of Computational Folkloristics. Our research does not presume to draw expert conclusions on the intertextual transmission of folklore but, rather, to support folklorists and literary scholars in this type of study by automating the information retrieval process.

Project information

Project duration: October 2015-December 2018.
Team: Emily Franzini, Greta Franzini, Gabriela Rotari, Melina Jander, Marco Büchler, Mahdi Solhdoust, Franziska Pannach.
Consultant: Thierry Declerck.

Project presentations

Updates and existing results of the project have been presented as:

  • A conference presentation at the Digital Humanities in Estonia A° 2015: Conference on translingual and transcultural digital humanities in Tartu, Estonia;
  • An invited talk at the Généalogies et intertextualités numérique: L’atelier du Bodmer Lab in Geneva, Switzerland;
  • An invited talk at the Digital Scholarship Seminar at the National University of Ireland in Galway, Ireland;
  • A poster at the Digital Humanities 2016 Conference in Kraków, Poland;
  • A panel presentation at the Digital Humanities 2016 Conference in Kraków, Poland.

For more information about these presentations and to download all the materials, please visit our Output page.


1. Jacob and Wilhelm were active in Göttingen between 1825 and 1838. Jacob was employed as Professor and Head Librarian, and Wilhelm as Professor. They left Göttingen after refusing to sign a civil oath to the king. In financial difficulty, they moved to nearby Kassel.
2. David, A., David, M. E. (1964) ‘A Literary Approach to the Brothers Grimm’, Journal of the Folklore Institute, 1(3), pp. 180-196 [Online]. Available at: (Accessed: 26 October 2015).
3. In December 2015, e-codices released a digitised copy of the oldest surviving handwritten manuscript of the Kinder- und Hausmärchen. You can view it at: (Accessed: 21 January 2016).

4. Read the full entry at: (Accessed: 21 January 2016).
5. Available at: (Accessed: 21 January 2016).
6. From the 1812 edition of the Kinder- und Hausmärchen.
7. Schneewittchen, Vom Fischer und seiner Frau and Der gestiefelte Kater respectively.
8. See:
9. Available at: (Accessed: 17 June 2016).
10. Available at: (Accessed: 17 June 2016).