Rather than merely scanning text for politeness, as past computational linguistics methods have, this one actually changes directives or requests that use either impolite or neutral language by restructuring them or adding words to make them better mannered. "Say that more politely", for instance, might become "Could you please say that more politely?"
Apparently politeness is not just about using words or phrases such as "please" and "thank you". It also means making language a bit less direct, so that instead of saying "you should do X", the sentence becomes something like "let us do X."
Goodness knows why lettuce is more polite.
Shrimai Prabhumoye, a doctoral student at CMU's Language Technologies Institute, and her colleagues -- Aman Madaan, Amrith Setlur and Tanmay Parekh -- presented their research Monday at the Association for Computational Linguistics' annual meeting, which is being held virtually this week. They see many potential uses for their work, including emails and chatbots.
At the heart of their experiment is a dataset of 1.39 million sentences analyzed for politeness and labeled with a politeness score. The team then developed a "tag and generate" approach, which identifies sentences that are outright impolite, or could just use a manners boost, and tweaks them with words and phrases Emily Post would approve of.
"Yes, go ahead and remove it" becomes "Yes, we can go ahead and remove it". Adding "we", the researchers explain, creates the sense that the burden of the request is shared by speaker and addressee.
"Not yet -- I'll try this weekend" becomes "Sorry, not yet -- I'll try to make sure this weekend", with the apology politely conveying that the requested action might be something of a burden.
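For the curious, the two-step idea can be illustrated with a toy Python sketch. To be clear, this is not the researchers' system: it is a hand-written, rule-based stand-in, and the `[TAG]` marker, the verb list and the function names `tag`, `generate` and `make_polite` are all made up for illustration. The first step marks where a politeness phrase could go, and the second fills in the marker.

```python
# Toy illustration of a two-step "tag, then generate" rewrite.
# Everything here (marker, rules, names) is hypothetical and hand-written;
# it is not the CMU team's model.

TAG = "[TAG]"

# A tiny, made-up list of verbs that signal a bare imperative.
IMPERATIVE_VERBS = ("say", "send", "tell")


def tag(sentence):
    """Step 1: mark where a politeness phrase could be inserted.

    Returns (tagged_sentence, phrase); phrase is None when no rule fires.
    """
    text = sentence.strip()

    # Rule A: share the burden of the request by adding "we can".
    if text.startswith("Yes, go ahead"):
        return text.replace("Yes, ", f"Yes, {TAG} ", 1), "we can"

    # Rule B: turn a bare imperative into a question.
    first_word = text.split(" ", 1)[0].lower()
    if first_word in IMPERATIVE_VERBS:
        body = text[0].lower() + text[1:].rstrip(".!")
        return f"{TAG} {body}?", "Could you please"

    return text, None


def generate(tagged_sentence, phrase):
    """Step 2: replace the marker with the chosen politeness phrase."""
    if phrase is None:
        return tagged_sentence
    return tagged_sentence.replace(TAG, phrase, 1)


def make_polite(sentence):
    """Run both steps in order."""
    tagged_sentence, phrase = tag(sentence)
    return generate(tagged_sentence, phrase)


if __name__ == "__main__":
    print(make_polite("Say that more politely"))
    print(make_polite("Yes, go ahead and remove it"))
```

Sentences that match neither rule pass through unchanged, which is roughly the point of tagging first: only the spots that could use a manners boost get touched.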
These might seem like super-subtle changes. But as anyone who's ever puzzled over a text message knows, nuance can easily get lost in written communication, leading to misinterpretation.
While politeness plays a crucial role in social and professional interactions, standards of what it looks like vary from culture to culture, so for their work, the team focused on speakers of North American English in a formal setting.
The CMU team's dataset comes from a surprising, though rather appropriate, source: emails exchanged by employees at Enron, the Texas-based energy company at the center of a high-profile accounting fraud scandal that called into question the accounting practices of many corporations. The researchers have released their "politeness transfer" dataset on GitHub so others interested in the topic can build on the work.