This interview is part of a report on the workshop.
Can you explain what you mean by sentiment detection?
It refers to the emotions we can trace in a text. We can now detect those emotions by training machine learning algorithms. Using this technology, you can follow hate speech, for example. Text mining as a whole is an attempt to give computers some knowledge of written language.
Do you tend to investigate public online data such as social media posts?
We trawl through social media because that is where the most heated debates are happening right now. There we can explore a subject, or a user, and we can actually deduce personalities from social media. Having learned from all the training data available, an algorithm can scan unlabelled text and detect hidden emotions.
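The idea of learning from labelled training data and then scanning unlabelled text can be illustrated with a minimal sketch. It uses a naive Bayes classifier, which is an assumption chosen for brevity and not necessarily the method used in the research described here. The labels, the function names `train` and `predict`, and the four training sentences are all made up for illustration.

```python
import math
from collections import Counter, defaultdict

def train(labelled_texts):
    """Count words per emotion label from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in labelled_texts:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts

def predict(text, word_counts, label_counts):
    """Return the most probable label under naive Bayes with add-one smoothing."""
    vocabulary = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        total_words = sum(word_counts[label].values())
        # Start from the log prior of the label.
        score = math.log(label_counts[label] / total_docs)
        for word in text.lower().split():
            # Add-one smoothing keeps unseen words from zeroing the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy, made-up training data for illustration only.
TRAINING = [
    ("i hate you and your stupid ideas", "anger"),
    ("this makes me furious and angry", "anger"),
    ("what a wonderful happy day", "joy"),
    ("i love this it makes me happy", "joy"),
]

model = train(TRAINING)
print(predict("you make me angry", *model))  # anger
```

A real system would need far more data, finer-grained emotion labels and proper tokenisation; the sketch only shows the train-then-classify pattern.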
Apart from emotion detection, can you use the same technology to produce psychographs of different users?
In that case, you would need the assistance of psychologists. Building a personality profile is something different, because you have to combine the personality traits with the emotions, and that is not always a linear relation. So although it is doable, it is more complicated. You cannot accomplish it as a linguist alone; the work is interdisciplinary. There are currently efforts underway to link personalities to specific writing styles. Apart from the gender distinction, a problem that has been solved, researchers are trying to identify all the classical personality traits (extrovert, introvert, emotional etc.) through writing style and find possible combinations.
What is the purpose of this research?
To build machine intelligence: a smart dialogue system that can respond to emotions. All dialogue is based on emotions, but what we are talking about is a challenge for researchers. If, for example, you want to track someone, say to spot paedophile activity patterns and grooming processes, or hate speech, you can enrich your algorithm with specific labels of emotional behaviour to have a better chance of tracking these people. Positive or negative labelling alone is not enough; that is merely the first level.
So machine learning can let you factor in the context in which a conversation is taking place?
Yes, it can better classify and filter the content. It is a very delicate process.
Are there basic threads in viral content in terms of its linguistic analysis?
What I have seen is that emotion is effective. Not just positive emotions, but negative ones too. Visual stimulus is also strong.
When you are saying that you are observing emotion, how do you detect emotion when it’s not that obvious?
In our field, we have a ‘sentiment lexicon’, which includes not just words but also the various contexts of these words. It is a very complex statistical tool that lets algorithms detect not only the word but also the context. As we say in linguistics, you know a word by the company it keeps. So if I have not only the words, but also the context, then I have a much richer environment to play with.
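To make the point about context concrete, here is a minimal sketch of a lexicon lookup that also looks at the word immediately before each lexicon entry. Everything in it is hypothetical: the tiny `LEXICON`, the `NEGATORS` and `INTENSIFIERS` sets, the scores and the one-word context window are made-up simplifications, far cruder than a real statistical sentiment lexicon.

```python
# Hypothetical mini-lexicon: word -> polarity score (made-up values).
LEXICON = {"good": 1.0, "love": 1.0, "bad": -1.0, "hate": -1.0}
# Hypothetical context words that change the meaning of what follows.
NEGATORS = {"not", "never", "no"}
INTENSIFIERS = {"very": 1.5, "really": 1.5}

def score(text):
    """Sum word polarities, adjusting each by the word immediately before it."""
    tokens = text.lower().split()
    total = 0.0
    for i, token in enumerate(tokens):
        if token not in LEXICON:
            continue
        value = LEXICON[token]
        previous = tokens[i - 1] if i > 0 else ""
        if previous in NEGATORS:
            value = -value  # "not good" flips the polarity
        elif previous in INTENSIFIERS:
            value *= INTENSIFIERS[previous]  # "very bad" is stronger
        total += value
    return total

print(score("this is good"))      # 1.0
print(score("this is not good"))  # -1.0
print(score("this is very bad"))  # -1.5
```

The same word, "good", yields opposite scores depending on the company it keeps, which a word-only lookup would miss.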
Is emotion more important than appealing visuals for virality?
If you have appealing visuals, you create emotions. The dopamine levels rise when you see a video. In our research we mainly consider the text, not the visuals.
Have you worked with other EU institutions?
Yes, I am currently building a start-up with a colleague who is an expert in computational stylometry. We are trying to build ‘authorship fingerprinting’ that can identify unique writing styles. We are also working on text analytics, training algorithms to track emotion in Greek-language text. The ultimate goal is to link the personality to the writing style, but this, as I said, is a complicated process. Under the Horizon 2020 European research programme, we cooperate with a lot of institutions on the participation of young children in digital media and how kids react to online content. We also look at how we can use digital technology in education.