Information Extraction - Facilitating Smart Data Management
A wealth of information is hidden within unstructured text
located in company documents, e-mails, newspaper articles,
web pages, etc. This information is best exploited in a
structured or relational form, which is more suitable for
searching and integration with relational databases,
and for text mining.
Accordia’s Information Extraction System produces a
structured representation of the information that is
buried in unstructured text documents, both free-text
documents written in natural language and
semi-structured pages.
There are three major components of our Information Extraction System:
- The Named Entity Recognizer (NER), which finds and classifies:
  (1) the names of people, organizations, and geographic locations
  (2) the date and time expressions, percentages, and money amounts
- The Co-Reference Resolution (CoRe) module, which discovers
  identity relations between entities in and across documents.
- The Relation Extraction (RE) module, which finds relations between
  recognized entities.
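
To make the division of labour between the three components concrete, here is a minimal toy sketch in Python. It is not Accordia's implementation: the function names (`recognize_entities`, `resolve_coreferences`, `extract_relations`), the gazetteer, the regular expressions, and the trigger words are all assumptions chosen for illustration, and a production system would use far richer linguistic models.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    text: str   # surface form as it appears in the document
    label: str  # entity class, e.g. PERSON, ORG, DATE
    start: int  # character offset of the mention


MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# Hypothetical patterns for dates, percentages, and money amounts.
PATTERNS = [
    ("DATE", re.compile(r"\b(?:" + MONTHS + r")\s\d{1,2},\s\d{4}\b")),
    ("MONEY", re.compile(r"\$\d+(?:,\d{3})*(?:\.\d+)?")),
    ("PERCENT", re.compile(r"\d+(?:\.\d+)?%")),
]

# Hypothetical trigger words, keyed by the labels of the two arguments.
TRIGGERS = {("PERSON", "ORG"): {"joined"}, ("PERSON", "MONEY"): {"received"}}


def recognize_entities(text, gazetteer):
    """Toy NER: gazetteer lookup for names, regexes for numeric expressions."""
    found, taken = [], []

    def claim(start, end, label):
        # Accept a span only if it does not overlap one already claimed.
        if all(end <= s or start >= e for s, e in taken):
            taken.append((start, end))
            found.append(Entity(text[start:end], label, start))

    # Longest names first, so "Jane Smith" wins over the "Smith" inside it.
    for name in sorted(gazetteer, key=len, reverse=True):
        for m in re.finditer(r"\b" + re.escape(name) + r"\b", text):
            claim(m.start(), m.end(), gazetteer[name])
    for label, pattern in PATTERNS:
        for m in pattern.finditer(text):
            claim(m.start(), m.end(), label)
    return sorted(found, key=lambda e: e.start)


def resolve_coreferences(entities):
    """Toy CoRe: link person mentions that share a last token (a surname)."""
    canonical = {}
    for e in entities:
        if e.label != "PERSON":
            canonical[e.text] = e.text
            continue
        same = [o.text for o in entities
                if o.label == "PERSON"
                and o.text.split()[-1] == e.text.split()[-1]]
        canonical[e.text] = max(same, key=len)  # longest form is canonical
    return canonical


def extract_relations(text, entities, canonical):
    """Toy RE: a trigger word standing alone between two adjacent entities."""
    relations = []
    for left, right in zip(entities, entities[1:]):
        between = text[left.start + len(left.text):right.start].strip()
        if between in TRIGGERS.get((left.label, right.label), set()):
            relations.append(
                (canonical[left.text], between, canonical[right.text]))
    return relations


TEXT = ("Jane Smith joined Acme Corp on March 3, 2020. "
        "Smith received $5,000 and a 10% bonus.")
GAZETTEER = {"Jane Smith": "PERSON", "Smith": "PERSON", "Acme Corp": "ORG"}

entities = recognize_entities(TEXT, GAZETTEER)
canonical = resolve_coreferences(entities)
relations = extract_relations(TEXT, entities, canonical)
print(relations)
```

In this sketch the second relation is attributed to "Jane Smith" rather than to the bare mention "Smith", which is the kind of benefit co-reference resolution brings to relation extraction. The resulting tuples are the sort of structured records that could be loaded into a relational table.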
