EU Contract Hub, an integrated search and analysis tool for public procurement contracts within the European Union

Public procurement is the process by which a public body (such as a ministry or local authority) buys works, goods, or services from companies. It is an essential part of service delivery and accounts for a sizeable portion of Gross Domestic Product (GDP) in European member states. In this work, a data framework is developed to facilitate the analysis of trends in public procurement, with a special focus on healthcare procurement. It encompasses the complete data cycle, from the collection of contract information to its processing into a digital interactive tool: EU Contract Hub. This tool can be used to extract numerical and visual information from public procurement contracts. EU Contract Hub is built on a document-oriented store, which optimizes searches over structured and unstructured data, allows us to quickly create data visualizations and integrate them into interactive dashboards, and enables textual searches within the contracts’ contents. The framework is accompanied by several analyses of data quality, consistency, and interoperability issues, which summarize the context behind these problems and present solutions to mitigate them.
The data presented in our tool are extracted from Tenders Electronic Daily (TED), the online version of the ‘Supplement to the Official Journal’ of the EU, dedicated to European public procurement. We extract the data in two formats: XML and CSV. The XML format contains all the data entered in the form, including descriptions; new contracts are downloaded daily, and only the Contract Award Notices are uploaded to the tool. The CSV format is more concise but includes more precise value estimates. We leverage both formats to extract the most precise information available about the contracts. We aim to collect all contracts in TED from 2018 until today, giving us access to information about more than 1.5 million contracts. Contract data are processed to remove outlier values, add useful information such as Common Procurement Vocabulary (CPV) descriptions, and apply attribute-based rules to characterize aspects such as the Procurement Route. These structured and enhanced data are then loaded into a final, easy-to-use index that answers a series of predefined questions. We also automatically translate the titles and descriptions of these contracts into English to make the information accessible to a larger audience, given that it is mostly written in the native language of the procuring country. The EU Contract Hub service is hosted on a web domain and provides open access to query its data.
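
The processing steps described above (outlier removal, CPV enrichment, and rule-based characterization of the Procurement Route) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the pipeline used by EU Contract Hub: the field names (`value_eur`, `cpv_code`, `procedure_type`, `framework_agreement`), the median-based outlier rule, the route labels, and the three-entry CPV table are all hypothetical.

```python
from statistics import median

# Illustrative subset of CPV codes (check digit omitted); a real lookup
# would load the full Common Procurement Vocabulary table.
CPV_DESCRIPTIONS = {
    "33600000": "Pharmaceutical products",
    "33100000": "Medical equipments",
    "45000000": "Construction work",
}


def remove_value_outliers(contracts, factor=100.0):
    """Drop contracts whose value lies more than `factor` times above or
    below the median value. The rule itself is an assumption; the source
    does not say how outliers are detected."""
    values = [c["value_eur"] for c in contracts if c.get("value_eur")]
    if not values:
        return list(contracts)
    mid = median(values)
    low, high = mid / factor, mid * factor
    # Contracts without a value are kept; only implausible values are dropped.
    return [
        c for c in contracts
        if c.get("value_eur") is None or low <= c["value_eur"] <= high
    ]


def procurement_route(contract):
    """Hypothetical attribute-based rule for the Procurement Route."""
    if contract.get("framework_agreement"):
        return "Framework agreement"
    if contract.get("procedure_type") == "open":
        return "Open procedure"
    return "Other"


def enrich(contract):
    """Return a copy of the contract with derived fields added."""
    enriched = dict(contract)
    enriched["cpv_description"] = CPV_DESCRIPTIONS.get(
        contract.get("cpv_code"), "Unknown"
    )
    enriched["procurement_route"] = procurement_route(contract)
    return enriched


def process(contracts):
    """Filter outliers first, then enrich the remaining contracts."""
    return [enrich(c) for c in remove_value_outliers(contracts)]
```

Calling `process` on a list of contract dictionaries returns the filtered, enriched records, which could then be loaded into a document-oriented index of the kind the abstract describes.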
