Word frequencies in the Digital Collection of Sources in the Works of the School of Salamanca.

The fundamental importance of the School of Salamanca for the early modern discourse about law, politics, religion, and ethics is widespread among of philosophers and legal historians. These early modern texts extend beyond the core authors, and serve to analyze the history of the Salamanca School’s origins and influence, as well as its internal discourse contexts within the context of the future dictionary entries.
Especially on the topic of dictionaries, the idea to test and explore our corpus with modern NLP applications came up in the project a long time ago. We often asked ourselves which lemmas or information we could find with help of a text analysis, and above all how complex this realization would be with our data. Thus, in 2021 we started a natural language processing task (word frequency distribution) by using the Python programming language to explore our corpus and establish groundwork for further text mining. Continue reading “Word frequencies in the Digital Collection of Sources in the Works of the School of Salamanca.”

The School of Salamanca Text Workflow: From the early modern print to TEI-All.

Since its beginning in 2013, the Salamanca Project has been developing a text editing workflow based on methods and practices for sustainable and scalable text processing. Sustainability in text processing encompasses not only reusability of the tools and methods developed and applied, but also long-term documentation and traceability of the development of the text data. This documentation will provide an important starting point for future research work. Moreover, the text preparation must be scalable, since the Digital Source Collection comprises a relatively large mass of texts for a full-text digital edition project: in total, it will involve more than 108,000 printed pages from early modern prints in Latin and Spanish, which must be edited in an efficient and at the same time quality-assured manner.

In the following, I will introduce the sequence of stages that each work in the Digital Source Collection goes through, from locating a suitable digitization template in a public library to metadata and the completion of a full text in TEI All format, enriched with the project’s specifications. Continue reading “The School of Salamanca Text Workflow: From the early modern print to TEI-All.”