Abstract | This paper investigates the use of a text mining approach for automatic taxonomy generation and text categorisation in the content management system of Alergoclínica, a private clinic of Dermatology and Allergies in São Paulo, Brazil. Text mining has been of interest for many years, but despite the ever-increasing range of text mining applications available, there are neither common standards nor shared evaluation criteria to enable comparison among the different approaches. Different groups address different problems, often using private data sets, so it is virtually impossible to determine the quality, performance and scalability of existing systems. Three text mining tools, selected against specific criteria, were investigated to determine their suitability in this real-world environment. Surprisingly, the study shows that none of the tools is really effective for the task, although each produced some useful output, and none could be recommended for full-scale implementation. |
---|