Building a system for automatically generating news in different domains out of structured data.

Lensing Media


Lensing Media is a family-run publishing company from Dortmund. "Local networking" connects all the companies of Lensing Media. More than 3,000 employees work for our family-owned company, now in its fourth generation. Together we publish the daily newspapers Ruhr Nachrichten, Dorstener Zeitung, Halterner Zeitung and Münsterland Zeitung as well as market-leading advertising journals. We operate printing plants, mail services and trade journals, and we are increasingly focusing on our digital business.

1. Challenge Introduction

Building a system for automatically generating news in different domains out of structured data.

2. Challenge Details

The work of journalists has changed significantly in recent years as a result of digitalization. Technological progress, the emergence of new narrative forms, the efficiency gains of automated workflows, and the availability of big data have created a new form of journalism known as robot journalism. Within this field, the term NLG (natural language generation) refers to systems that are explicitly programmed to, for example, write news from structured data. Such systems are characterized by the fact that they need no human intervention once they have been programmed. However, developing such a system requires deep know-how in different fields such as machine learning and cloud computing, to mention just a few. We at Lensing Media aim to develop this kind of system to support our newsrooms by automatically writing news based on data from fields such as sports or finance. To develop a mature NLG system that can be successfully brought into production, the following requirements need to be fulfilled:

– Transparency: Understanding of how the texts are generated
– Accuracy: No misleading facts
– Modifiability and Transferability: Transferability to other domains
– Natural flow of speech: The generated texts should be written as naturally as possible

Please note that the system must be developed in Python. In addition, the underlying algorithms must be based on controllable end-to-end models such as GPT-2. Template-based systems must not be used.
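To make the data-to-text setting and the accuracy requirement more concrete, the following minimal Python sketch shows two supporting steps around a generative model. It is an illustration under stated assumptions, not a prescribed solution: the record fields, the `<data>` and `<text>` markers, and both function names are hypothetical. The first function serializes a structured record into a prompt that a fine-tuned end-to-end model could be conditioned on; it only prepares the model input and does not produce the news text itself. The second function flags numbers in a generated text that do not occur in the source record, as one simple check against misleading facts. The model call is intentionally left out.

```python
import re

# Pattern for integers and simple decimals such as "2" or "3.5".
NUMBER = r"\d+(?:\.\d+)?"


def linearize(record: dict) -> str:
    """Serialize a flat record into a prompt string for a generative model.

    The <data> and <text> markers are hypothetical control tokens; a real
    setup would use whatever markers the model was fine-tuned with.
    """
    parts = [f"{key}: {value}" for key, value in record.items()]
    return "<data> " + " | ".join(parts) + " <text>"


def unsupported_numbers(record: dict, generated: str) -> list:
    """Return numbers found in the generated text but not in the record."""
    source_numbers = set()
    for value in record.values():
        source_numbers.update(re.findall(NUMBER, str(value)))
    return [n for n in re.findall(NUMBER, generated) if n not in source_numbers]


# Hypothetical sports record, used only for illustration.
record = {
    "home_team": "Team A",
    "away_team": "Team B",
    "home_goals": 2,
    "away_goals": 1,
}

prompt = linearize(record)
print(prompt)

# A text whose numbers all come from the record passes the check.
print(unsupported_numbers(record, "Team A beat Team B 2 to 1."))

# A text with a number that is not in the record is flagged.
print(unsupported_numbers(record, "Team A beat Team B 3 to 1."))
```

A check of this kind covers only numeric facts. Names, dates, and the relations between values would need additional verification before a text is published.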


Lensing Media
GmbH & Co. KG