ANR Exploration of Historical Big Data (HBDEX) Project

Principal investigator : Pierre-Cyrille Hautcoeur ; funded by a grant from the Agence nationale de la recherche (ANR)

This interdisciplinary project aims first to develop a new technology for automatically extracting financial data from historical sources and entering them into the DFIH database. The technology is being developed through close cooperation between two IT labs, IRISA and LITIS, and the DFIH team at PSE. It will be tested on the over-the-counter (OTC) market known as the Coulisse, for the period from 1871 to 1961 (about 235,000 pages, 30 million lines). The ultimate aim is to create a technology that can be adapted to new sources at low cost; this will be the target of the European project EURHISFIRM, which will build on the HBDEX experience. The second aim of the project is to develop new mathematical methods for understanding the formation and evolution of prices on financial markets. This work will be conducted jointly by CAMS and PSE.
The project is planned to last from 2017 to 2021.


PSE : Pierre-Cyrille Hautcoeur, Angelo Riva, Raphael Hekimian, Elisa Grandi
CAMS : Jean-Pierre Nadal, Annick Vignes
LITIS : Sébastien Adam, Thierry Paquet, Clément Chatelain, Pierrick Tranouez, Stéphane Nicolas, Pierre Héroux
IRISA : Bertrand Coüasnon, Aurélie Lemaître, Ivan Leplumev, Yann Ricquebourg

Summary :

Major research trends in Big Data involve innovative methods for producing, processing and analysing data across the whole data value chain, as well as original solutions for extracting new knowledge. However, “born digital” Big Data lacks the historical depth required to understand the current dynamics of society. Building on a major technological innovation in ICT, HBDEX aims to make a major contribution to our understanding of how financial markets function and of historical events. The financial crisis of 2008 once again highlighted the weakness of the empirical foundations of explanatory models. The Paris financial market was long organized around two co-existing markets: the centralized and regulated Paris Bourse, and the Coulisse, an unregulated bilateral OTC market. It is likely that differences in their organization, and the way these evolved, affected the behavior of the markets and their interaction with the real economy. One of the bottlenecks to our understanding of financial markets is the scarcity of long-term data. Such data are needed to update the stylized facts used in models and, more generally, to test models in different historical and geographical contexts, especially models of structural change. ICT is becoming ever more central to major scientific, economic and social issues, which calls for close collaboration with other disciplines in order to design solutions adapted to their specific needs.
By providing a breakthrough in ICT, the interdisciplinary HBDEX project (computer science, economic history and economics) has three goals : (1) to design innovative technology based on artificial intelligence that supports the production chain of big data from historical tabular sources and overcomes the technological bottleneck impeding the “Big Data Revolution” in the sciences of the past ; (2) to integrate the daily prices of the Coulisse over the 20th century into the Equipex DFIH, and thereby produce an efficient tool for understanding how financial markets function ; (3) to exploit comparatively the data already produced by the Equipex DFIH and those to be produced by HBDEX. The technology developed could lay the foundations for a national platform to spark the Big Data Revolution in the historical social sciences, that is, a scaling-up in the variety and quantity of available data.