Infogistics' Xtractor.

download integrator-level documentation.

download case study in HR domain

Xtractor

Xtractor is an engine that sifts through large volumes of text and creates database records for the objects mentioned in it, such as people, organisations, locations and vehicles. Xtractor reads these documents and extracts their key elements according to a set of rules defined by the user. For example, given CVs, medical reports or police reports, Xtractor can identify names, places, dates, qualifications and specialised terms (such as medical names), and can also determine the relations between these entities.

Infogistics' extraction technology matured at the Message Understanding Conferences (MUC) sponsored by the US Defense Advanced Research Projects Agency (DARPA), where the Edinburgh system significantly outperformed other systems in its class. One of the first industrial applications of Infogistics' advanced text-extraction technology was in the Human Resources domain. Xtractor was trained to read and understand candidates' resumes, identifying personal information, major skills, employment history and qualifications. Since then, Xtractor has been applied across a variety of domains, ranging from extracting content from news to extracting product specifications from various on-line sources.

Xtractor applies the rules of natural language and mathematical methods to identify, normalise and unify information about objects. For example, if somebody is referred to as 'John D. Smith', 'John', 'J. Smith', or even 'he', Xtractor can work out that these all refer to the same person. Most importantly, Xtractor identifies the key relationships, or links, between the objects that it finds. For example, from the sentence 'White is a managing director of Cyber Corp.' Xtractor will identify that there is a person with the surname 'White' and a company called 'Cyber Corp.', and it will create an 'employment' link between these two objects. It will also record the particular details of the employment: 'managing director'.
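To make the 'employment' link example concrete, here is a minimal Python sketch of how such a link could be pulled out of a sentence. It is not Xtractor's method or API: the pattern `EMPLOYMENT_PATTERN`, the `EmploymentLink` record and the `extract_employment` function are hypothetical names invented for illustration, and a single regular expression stands in for the linguistic rules and mathematical methods the engine actually applies.

```python
import re
from dataclasses import dataclass

# Hypothetical rule: "<Surname> is a/an/the <role> of <Company>".
# A real extraction engine would use far richer linguistic analysis.
EMPLOYMENT_PATTERN = re.compile(
    r"(?P<person>[A-Z][a-z]+) is (?:a|an|the) "
    r"(?P<role>[a-z ]+?) of "
    r"(?P<company>[A-Z][\w ]*?(?:Corp\.|Ltd\.|Inc\.))"
)


@dataclass(frozen=True)
class EmploymentLink:
    """An 'employment' link between a person and a company (illustrative)."""
    person: str
    company: str
    role: str


def extract_employment(sentence: str) -> list[EmploymentLink]:
    """Return every employment link the toy pattern finds in the sentence."""
    return [
        EmploymentLink(m.group("person"), m.group("company"), m.group("role"))
        for m in EMPLOYMENT_PATTERN.finditer(sentence)
    ]


links = extract_employment("White is a managing director of Cyber Corp.")
for link in links:
    print(link.person, "->", link.company, "as", link.role)
```

The record keeps the two objects and the detail of the link separate, which mirrors the description above: a person, a company, and the role that connects them.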

Xtractor can be linked to a document Viewer in which the objects Xtractor finds are highlighted, and information aggregated from many different places in a document can be seen at a glance. For instance, at the beginning of a document Xtractor identifies that John Smith is 42 years old; later on it adds that he works at Good Co. Ltd.; and at the end of the document it identifies that he drives a red Nissan. Xtractor makes sure that all this information is kept under the single record 'John Smith', suitable for further processing by the computer or for display to humans. Further processing may include populating a database, triggering an event, applying a workflow, or requesting further information or processing.
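The aggregation of scattered facts under one record can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `merge_mentions` function, the `(name, attribute, value)` mention format and the hand-written `aliases` table are hypothetical, and the table simply assumes that the step of deciding which names refer to the same person has already been done.

```python
from collections import defaultdict


def merge_mentions(mentions, aliases):
    """Group (name, attribute, value) mentions under one canonical record.

    `aliases` maps a surface form such as 'he' to its canonical name; it is
    assumed to come from an earlier unification step (hypothetical here).
    """
    records = defaultdict(dict)
    for name, attribute, value in mentions:
        canonical = aliases.get(name, name)  # unknown names map to themselves
        records[canonical][attribute] = value
    return dict(records)


# Facts found at the beginning, middle and end of one document.
mentions = [
    ("John Smith", "age", 42),
    ("he", "employer", "Good Co. Ltd."),
    ("J. Smith", "vehicle", "red Nissan"),
]
aliases = {"he": "John Smith", "J. Smith": "John Smith"}

records = merge_mentions(mentions, aliases)
print(records["John Smith"])
```

A merged record of this kind is the sort of structure that could then be written to a database row or handed to a viewer for display.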
