The project

In recent years, new technologies have fostered forms of written language whose features bring them closer to speech. Messages typed on social networks and in chats, for example, differ from more traditional written forms in their more informal use of language, the presence of emoticons and emoji, a distinctive use of punctuation and capitalization, and so on. More recently, the spread of voice messages on instant messaging systems such as WhatsApp or Telegram has allowed a return to speech, which in turn finds a place within written exchanges and takes on peculiar characteristics there. This raises the questions: how do we speak when we write? How do we write when we speak? In other words, what are the characteristics of this “new Italian”?

The WhAP! project, conceived in 2020 by a group of students from the Department of Humanities, was created to answer these questions. Its main goal is to build an online resource (the WhAP corpus) that includes both written data (text chats) and spoken data (voice messages), while taking into account several parameters (e.g. age range and gender of the writers). The resource will make it possible to observe these hybrid interactions, in which speech and writing interact and give rise to new forms of conversation.

The corpus, still under construction, currently contains 58 chats (about 22,000 words) and 211 voice messages. These data have already yielded some initial results, for example the frequent presence of languages other than Italian, such as English or Spanish, as well as of various Italian dialects, from Lombard to Sardinian:

(1) 19/04/20, 23:13 – EM01: Easy easy, allora a posto!
(2) 08/05/20, 11:57 – EM01: Ti ricordi come si chiama il ristorante mega bueno dove siamo andate a mangiare con Renata e Franca da te?

To answer the initial questions, it will be necessary both to increase the amount of data available and to make it accessible to anyone who may be interested. The final goal is therefore to make the corpus, adequately anonymized, public and freely searchable, available both to specialists (for linguistic research) and, more generally, to anyone interested in language phenomena.

Objectives that the project realistically intends to achieve with the money raised through this crowdfunding campaign:
– Training of students in the creation and management of the resource (WhAP corpus)
– Implementation of the resource (creation and development of the interface, online publication, subsequent management)
– Organisation of a conference to present the resource
– Publication of a short guide to the use of social networks

Link:
Instagram: @corpus.whap

Project leader:


Goal: 10,000 euros

Details of costs (by macro-items):
– 5,000 €: scholarships (data processing and entry)
– 2,500 €: organisation of events
– 1,500 €: computer technician (interface consulting and data addition)
– 1,000 €: short guide to the use of social networks (publication costs)



Ilaria Fiorentini | Founder
Researcher in Linguistics and lecturer in Sociolinguistics and in Pragmatics and Text Linguistics at the University of Pavia. Her main research topics include language contact, discourse markers, multilingualism in school contexts and online language; she has published several articles on these topics in Italian and international scientific journals. She is the author of the monograph Segnali di contatto. Italiano e ladino nelle valli del Trentino-Alto Adige (Milan, FrancoAngeli, 2017) and co-editor of the volumes La classe plurilingue (Bologna, BUP, 2020) and Building categories in interaction: linguistic resources at work (Amsterdam, John Benjamins, 2021).

Marco Forlano

Marco Forlano holds a degree in Theoretical and Applied Linguistics from the University of Pavia and is currently a PhD student in Linguistic Sciences at the Universities of Bergamo and Pavia. He is mainly interested in sociolinguistics, with particular attention to aspects related to plurilingualism and synchronic language contact.

Nicholas Nese

PhD candidate in Linguistic Sciences at the University of Pavia and the University of Bergamo. His research project investigates the acquisition of new phonological categories in Arabic as a foreign language by Italian-speaking learners. He obtained his master's degree in Linguistics at the University of Pavia with a thesis on socio-phonetic variation in a community of practice. A former student of Collegio Giasone del Maino, he is currently a tutor in General Linguistics. His research interests include sociolinguistics, forensic linguistics and language teaching.

Chiara Zanchi

Researcher in Linguistics at the University of Pavia, where she teaches Glottology (Humanities degree) and the Laboratory of Linguistic Data Analysis (Master's degree in Theoretical and Applied Linguistics and Modern Languages). She obtained her PhD from the University of Pavia and the University of Bergamo in 2018 with a thesis on the comparative morphosyntax of Vedic, Homeric Greek, Church Slavonic and Old Irish. She worked for four years as a research fellow in linguistics at the University of Pavia. She has spent numerous periods of study, research and teaching abroad: in Salzburg (Austria), Jena (Germany), Columbus (OH, USA) and Madrid (Spain). Her main research interests lie in Indo-European linguistics (with a focus on ancient Greek), pragmatics (of ancient and modern languages), cognitive linguistics and the construction of language resources.

WhAP! Group

Giulia Andreoli, Carmela Avagliano, Elisa Badone, Edoardo Bogni, Maddalena Bressler, Silvia Cangiano, Elena Clemente, Elena Didoni, Martina Fenini, Elena Fioraliso, Federica Fiore, Rebecca Gatti, Anna Landoni, Fabio Lentini, Giulia Marino, Barbara Mazzotta, Gloria Morsello, Rachele Oggionni, Francesca Padovan, Ilaria Rizza, Elisabetta Sala, Marta Sandri, Davide Siano, Greta Spagnoletti, Giulia Telari, Benedetta Testa, Francesca Torchio, Luisa Troncone, Alessandro Valeri, Martina Verdelli, Lucia Volpi. Students of the bachelor's degree programmes in Humanities and in Languages and of the master's degree programme in Linguistics at the University of Pavia.


  1. Didoni Elena
  2. Loddo Alessandra
  3. Sciandrone Luigi
  4. Spataro Martina
  5. Giribaldi Elena
  6. Forlano Marco
  7. EVENTO 31 MARZO 2023
  8. Ricca Elena
  9. Elena Fiordaliso
  10. RS
  11. De Paola Camilla Maria
  12. Bogni Edoardo
  13. Sciandrone Luigi
  14. Telari Giulia
  21. Franca Montrucchio
  22. Telari Jenny Maran Pier
  23. Falcone Alessio
  24. Bogni Edoardo
  25. Mazzotta Barbara
  26. Sanacore Daniele
  27. Baldini Alessia
  28. Fontana Martina Paola Maria
  29. Sbordoni Chiara
  30. Rebecca Gatti
  31. Nese Nicholas



This Campaign has ended. No more pledges can be made.