Text analytics in industry: Challenges, desiderata and trends
Recent decades have witnessed an unprecedented expansion in the volume of unstructured data in digital textual formats. Companies are now starting to recognize the potential economic value lying untapped in their text data repositories and sources, both external, such as social media platforms, and internal, such as safety reports and other company-specific document collections. Information extracted from these textual data sources is valuable for a range of enterprise applications and for informed decision making. In this article, we provide a systematic review of the current state of the art in the application of text analytics in industry. Our review is structured along three dimensions: the application context, the methods and techniques utilized, and the evaluation procedure. Based on the review, we identify the challenges and constraints that a real-world industrial environment imposes on text analytics techniques, as opposed to their deployment in more controlled research environments. In addition, we formulate a set of desiderata that text analytics techniques should satisfy in order to alleviate these challenges and to ensure their successful deployment in industry. Furthermore, we discuss future trends in text analytics and their potential application in industry.