The first exercise our team of experts went through was to build the Datamaran Ontology. The Ontology is a dictionary of topics and related key terms that the engine searches for. It covers financial, economic, environmental, social, employment and corporate governance topics. For example, anti-corruption is a topic in our Ontology. To give the most complete overview, the engine also searches for related key terms such as corruption and bribery. The current Ontology covers more than 100 topics, made up of over 6,000 key terms and combinations of their related terms.
Our experts built the Ontology by manually annotating a large number of sources (e.g. sustainability reports, financial reports, SEC filings, corporate websites, regulations and social/online media) and by analyzing which topics appear in financial and sustainability reporting frameworks. The Datamaran engine analyzes both HTML and PDF sources. We ensure the Ontology is mapped against the main reporting frameworks and guidelines, including the Global Reporting Initiative, United Nations Global Compact, International Integrated Reporting Council, and Sustainability Accounting Standards Board.
To search the above-mentioned sources for the Ontology's topics, key terms, and the relationships between terms, the engine uses techniques such as Natural Language Processing (NLP), semantic analysis, and machine learning. The collection of topics continues to grow.
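To make the idea of a topic dictionary concrete, here is a minimal sketch of how a topic could be linked to its key terms and looked up in a passage of text. It is only an illustration: the `ONTOLOGY` structure, the `find_topics` function and the second topic entry are hypothetical, and plain keyword matching stands in for the NLP, semantic analysis and machine learning techniques the engine actually relies on.

```python
import re

# Hypothetical miniature ontology: topic -> related key terms.
# Only the anti-corruption terms come from the text above; the second
# entry is an illustrative placeholder.
ONTOLOGY = {
    "anti-corruption": ["anti-corruption", "corruption", "bribery"],
    "climate change": ["climate change", "greenhouse gas"],
}


def find_topics(text, ontology=ONTOLOGY):
    """Return {topic: sorted key terms found in text} for matching topics."""
    hits = {}
    for topic, terms in ontology.items():
        found = [
            term
            for term in terms
            # Whole-word, case-insensitive match of the key term.
            if re.search(r"\b" + re.escape(term) + r"\b", text, re.IGNORECASE)
        ]
        if found:
            hits[topic] = sorted(found)
    return hits


sample = "The company enforces a strict policy against bribery and corruption."
print(find_topics(sample))
```

In this sketch a passage counts toward a topic as soon as any of its key terms appears, which is why the sample sentence is attributed to anti-corruption even though the topic name itself never occurs in it.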
Our team of industry and legal experts developed the Ontology in collaboration with our data scientists and Technical Advisory Committee. At each step of the process, experts in the field manually review the results of the automated text analysis against manually annotated reports (“golden records”).
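The review itself is manual, but a comparison against golden records can also be expressed as numbers. The sketch below assumes, purely for illustration, that the topics detected in one report and the topics annotated in its golden record are both available as sets, and scores them with precision and recall. The function name and the example topics are hypothetical, and the text does not state which metrics, if any, are used.

```python
def precision_recall(predicted, golden):
    """Compare detected topics with a golden record for one report.

    precision: share of detected topics that the annotators also marked.
    recall: share of annotated topics that the engine detected.
    """
    predicted, golden = set(predicted), set(golden)
    true_positives = len(predicted & golden)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(golden) if golden else 0.0
    return precision, recall


# Hypothetical example: one topic agrees, one is spurious, one is missed.
detected = {"anti-corruption", "climate change"}
annotated = {"anti-corruption", "human rights"}
print(precision_recall(detected, annotated))
```

Low precision would point to key terms that are too broad, while low recall would point to related terms missing from the Ontology.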