FIRST - large scale inFormation extraction and Integration infRastructure for SupporTing financial decision making
The FIRST project addresses the extreme challenges of dealing in real time with vast and constantly growing amounts of heterogeneous data and information in financial markets. The financial industry is a prime example of a critical, information-bound domain. End users of information in particular, such as financial analysts, investment managers, market regulators, financial advisors, and individual investors, rely on their ability to quickly identify and interpret relevant information. Using this information, these user groups try to identify dynamically evolving and potentially risk-bearing situations (e.g. shocks and crashes). Relevant and correct information helps these users stay ahead of the market when making investment decisions, detecting market manipulation, or simply advising a client. The problem is that the sheer volume of available information makes it nearly impossible for a user to concentrate on what is essential.
Therefore, FIRST develops and provides an Information and Communication Technology (ICT) infrastructure that will:
- Collect and process massive amounts of heterogeneous, structured and unstructured data, such as textual data, widely scattered web information from blogs and bulletin boards, or historical data from economic databases
- Integrate this data into a financial knowledge base for further analysis
- Exploit this data by developing and employing a range of highly scalable online event detection and prediction models, visualization models, and decision-support models that will deliver pertinent information to the decision maker.
In this way, FIRST makes new before-the-fact information available, so that evolving conditions can be addressed earlier and more effectively in advanced financial decision making.
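The collect, integrate, and exploit steps listed above can be illustrated with a minimal sketch. It is not the FIRST implementation: the `Document` fields, the in-memory knowledge base, the sample data, and the simple z-score rule used for event detection are all assumptions chosen for illustration only.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass(frozen=True)
class Document:
    """One collected item; the fields are hypothetical, not the FIRST schema."""
    source: str   # e.g. "blog", "newswire", "bulletin board"
    day: int      # day index of publication
    entity: str   # company or instrument the item mentions
    text: str


def collect():
    """Stand-in for the collection step: returns a small made-up sample."""
    docs = []
    for day in range(5):
        docs.append(Document("newswire", day, "BETA", "routine update"))
        docs.append(Document("blog", day, "ACME", "routine mention"))
    # A burst of extra mentions of ACME on the last day.
    docs.extend(Document("bulletin board", 4, "ACME", "rumour") for _ in range(5))
    return docs


def integrate(documents):
    """Toy knowledge base: entity -> day -> number of mentions."""
    kb = defaultdict(lambda: defaultdict(int))
    for doc in documents:
        kb[doc.entity][doc.day] += 1
    return kb


def detect_events(kb, last_day, threshold=2.0):
    """Flag entities whose mention count on last_day exceeds the mean of the
    earlier days by more than `threshold` standard deviations."""
    flagged = []
    for entity, per_day in kb.items():
        history = [per_day.get(d, 0) for d in range(last_day)]
        if len(history) < 2:
            continue  # not enough history to estimate a baseline
        baseline = mean(history)
        # Floor the spread at 1.0 so a perfectly flat history does not
        # turn every small increase into an event.
        spread = max(pstdev(history), 1.0)
        if per_day.get(last_day, 0) > baseline + threshold * spread:
            flagged.append(entity)
    return flagged


if __name__ == "__main__":
    knowledge_base = integrate(collect())
    print(detect_events(knowledge_base, last_day=4))
```

In a real deployment each function would be replaced by a scalable component (crawlers and feed readers, a semantic knowledge base, and learned detection and prediction models); the sketch only shows how the three stages hand data to one another.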
The FIRST project aims to provide a large-scale information extraction and integration infrastructure that supports end users without ICT skills in accessing financial information on demand and carrying out financial market analyses. The innovations in FIRST are:
- information extraction from unreliable semi-structured sources on a massive scale and in near real-time,
- automatic reuse of existing ontologies, large-scale ontology learning, and
- advanced decision models making use of high-level semantic features.
Methods and advancements
FIRST implements a systematic strategy for scaling its methods and software infrastructure to process massive amounts of information in real time, in three steps over the course of the project:
- Functional prototype
- Scaling for non-time-critical processing of massive historical data, and
- Scaling for massive live streams of structured feeds, textual news wire feeds, and semi-structured Web information in real time