Patent Mining and Its Applications
CEUR Workshop Proceedings
Due to the large amount of available patent data, it is no longer feasible for industry actors to manually create their own terminology lists and ontologies. Furthermore, domain-specific thesauruses are rarely accessible to the research community. In this paper we present the extraction of hyponymy lexical relations from patent text using lexico-syntactic patterns. We explore these patterns and, since this kind of extraction involves Natural Language Processing, we also compare the extractions made with and without domain adaptation of the extraction pipeline. We also applied our modified extraction method to other text genres in order to demonstrate the method's portability to other text domains. From our study we conclude that the lexico-syntactic patterns are portable to domain-specific text genres such as the patent genre. We observed that general Natural Language Processing tools, when not adapted to the patent genre, reduce the number of correct hyponymy lexical relation extractions and increase the number of incomplete extractions. This was also observed in other domain-specific text genres.
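To illustrate what extraction with a lexico-syntactic pattern can look like, the following is a minimal sketch of one well-known pattern, "NP such as NP, NP or NP". It is not the pipeline used in the paper. It assumes that noun phrases have already been chunked by an upstream tool and marked with a hypothetical `NP_` prefix, with words joined by underscores and tokens separated by single spaces. The names `SUCH_AS`, `clean` and `extract_hyponyms` are illustrative only.

```python
import re

# Assumption: noun phrases are pre-chunked and written as NP_<words_joined_by_underscores>.
NP = r"NP_\w+"

# One lexico-syntactic pattern: "NP such as NP, NP (and|or) NP".
# The first NP is the hypernym; the NPs in the list after "such as" are hyponyms.
SUCH_AS = re.compile(
    rf"(?P<hyper>{NP})(?: ,)? such as "
    rf"(?P<hypos>{NP}(?:(?:(?: ,)? (?:and|or) | , ){NP})*)"
)


def clean(np_token):
    """Turn 'NP_fastening_means' into 'fastening means'."""
    return np_token[len("NP_"):].replace("_", " ")


def extract_hyponyms(chunked_sentence):
    """Return (hyponym, hypernym) pairs found by the 'such as' pattern."""
    pairs = []
    for match in SUCH_AS.finditer(chunked_sentence):
        hypernym = clean(match.group("hyper"))
        for hyponym in re.findall(NP, match.group("hypos")):
            pairs.append((clean(hyponym), hypernym))
    return pairs


sentence = (
    "NP_the_device comprises NP_fastening_means such as "
    "NP_screws , NP_bolts or NP_rivets ."
)
print(extract_hyponyms(sentence))
# [('screws', 'fastening means'), ('bolts', 'fastening means'), ('rivets', 'fastening means')]
```

Because the pattern operates on chunked noun phrases, its output depends directly on the quality of the upstream chunking. If a chunker that is not adapted to the genre splits "fastening means" into two phrases, the sketch would return the incomplete hypernym "means", which is the kind of incomplete extraction the abstract describes.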
Information and Communication Technology
L. Andersson, M. Lupu, J. Palotti, F. Piroi, A. Hanbury, A. Rauber