Text analytics in industry: Challenges, desiderata and trends

Recent decades have witnessed an unprecedented expansion in the volume of unstructured data in digital textual formats. Companies are now starting to recognize the potential economic value lying untapped in their text data repositories and sources, including external ones, such as social media platforms, and internal ones, such as safety reports and other company-specific document collections. Information extracted from these textual data sources is valuable for a range of enterprise applications and for informed decision making. In this article we provide a systematic review of the current state of the art in the application of text analytics in industry. Our review is structured along three dimensions: the application context, the methods and techniques utilized, and the evaluation procedure. Based on the review, we identify the different challenges and constraints that a real-world, industrial environment imposes on text analytics techniques, as opposed to their deployment in more controlled research environments. In addition, we formulate a set of desiderata that text analytics techniques should satisfy in order to alleviate these challenges and to ensure their successful deployment in industry. Finally, we discuss future trends in text analytics and their potential application in industry.

Publication date: 02/05/2016
Authors: Ashwin Ittoo, Le Minh Nguyen, Antal van den Bosch
Reference: Computers in Industry, Volume 78, May 2016, Pages 96–107
