Data-driven decision-making is transforming the public sector. Armed with this knowledge, Greece's National Infrastructures for Research and Technology (GRNET S.A.) and the University of Macedonia (UoM) have embarked on a ground-breaking pilot project with the Big Data Test Infrastructure (BDTI) as the backbone. Their mission is to leverage the BDTI to convert the Greek National Registry of Administrative Public Services (MITOS) into Linked Open Data, thereby enhancing transparency, efficiency, and accessibility of public services. This article delves into the journey of the MitosLOD project, highlighting its innovative approach and the significant role of the BDTI environment.
The vision
GRNET S.A., a leading public sector technology research institute in Greece, and UoM, renowned for its expertise in eGovernment research, have joined forces with a shared vision. They aim to transform MITOS, which provides structured descriptions of over 3,000 public services, into Linked Open Data. This transformation aligns MITOS with European Union standard models like the Core Public Service Vocabulary Application Profile (CPSV-AP) and the Core Criterion and Core Evidence Vocabulary (CCCEV).
💡 The Core Public Service Vocabulary Application Profile is a data model designed to standardise the description of public services. It provides a common way to detail the characteristics, processes, and requirements of public services, ensuring clarity and consistency across different governmental and organisational systems.
💡 The Core Criterion and Core Evidence Vocabulary is a simplified, reusable and extensible data model for describing the principles and means a private entity must fulfil to become eligible or be qualified to perform public services or participate in public procurement. A Criterion is a rule or principle used to judge, evaluate or test something. An Evidence is how a Criterion may be proven.
The objective
The primary objective of this pilot project is to create a dynamic, queryable endpoint for public service data, facilitating easy access and retrieval via open standards and semantic technologies like SPARQL (the standard query language and protocol for Linked Open Data on the web or for RDF triplestores).
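To make the idea of a queryable endpoint concrete, here is a minimal sketch of how a client might prepare a SPARQL query for public services described with CPSV-AP. The endpoint URL is a placeholder, since the article does not give the real address, and the assumption that service names are exposed through dct:title follows the general CPSV-AP convention rather than anything confirmed for MitosLOD. The sketch only builds the HTTP request and does not send it.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint URL; the real MitosLOD address is not given in the article.
ENDPOINT = "https://example.org/sparql"

# Assumption: services are typed as cpsv:PublicService and named with dct:title.
QUERY = """
PREFIX cpsv: <http://purl.org/vocab/cpsv#>
PREFIX dct: <http://purl.org/dc/terms/>
SELECT ?service ?title
WHERE {
  ?service a cpsv:PublicService ;
           dct:title ?title .
}
LIMIT 10
"""


def build_request(endpoint: str, query: str) -> Request:
    """Build a SPARQL Protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_request(ENDPOINT, QUERY)
```

Passing the resulting request to urllib.request.urlopen would send it to a live endpoint. Any HTTP client can do the same, which is the practical benefit of exposing the data through an open standard.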
What is Linked Open Data?
Linked Open Data (LOD) represents a significant advancement in sharing and utilising information. It involves transforming structured open data into interlinked, machine-readable formats, enabling disparate datasets to be interoperable and queried in ways that were previously difficult to achieve. This approach not only enhances data discoverability and accessibility but also fosters innovation by allowing developers, researchers, and policymakers to create new applications and insights by combining different datasets and knowledge graphs.
The concept of Linked Open Data was introduced by Tim Berners-Lee, the inventor of the World Wide Web, who also proposed the 5-star deployment scheme for open data. The scheme rates how openly and usably data is published on the web: the fourth star is awarded for using open W3C standards such as RDF and SPARQL to identify and query data, and the fifth and highest star for linking that data to other data.
The MitosLOD project exemplifies the benefits of Linked Open Data adoption, showcasing how it can improve the delivery and accessibility of public services.
Discovering the BDTI framework
The BDTI framework, a free cloud-based environment equipped with open-source tools for every stage of the data journey, emerged as the perfect solution for MitosLOD. Introduced to the team during the EGOV 2023 international conference, BDTI promised an infrastructure that minimises the need for extensive setup and configuration, enabling rapid development and testing.
Efthimios Tambouris, Professor in the Department of Applied Informatics at the University of Macedonia, emphasises the significance of BDTI: "BDTI provided us with a robust platform to experiment with technologies like Apache Airflow and Virtuoso, essential for our data transformation pipeline."
Leveraging BDTI: How MitosLOD will work
The MitosLOD pilot is based on a pipeline built from several services offered by the BDTI framework:
- Data gathering: Python scripts call the MITOS API, which returns the data in JSON format.
- Processing and transformation: The retrieved data are processed and stored as CSV and RDF triples (RDF is a standard model for data interchange on the web), adhering to the CPSV-AP mapping.
- Publication: Apache Airflow is set up to publish the transformed linked data using Virtuoso.
The steps are orchestrated with Apache Airflow, which first calls the microservices responsible for data retrieval and then moves on to the subsequent steps.
This streamlined process allows the transformation of MITOS data into Linked Open Data and provides access using a Virtuoso public endpoint for testing purposes.
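The three stages can be illustrated with a minimal, self-contained sketch. It is not the project's code: the JSON field names (id, title), the base URI, and the sample record are hypothetical, the API call is replaced by a canned response, and plain function calls stand in for the Airflow orchestration and the Virtuoso load. The output format used here is N-Triples, one simple serialisation of RDF.

```python
import json

# Hypothetical base URI and JSON field names; the real MITOS API schema
# and URI scheme are not described in the article.
BASE_URI = "https://example.org/mitos/service/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
CPSV_PUBLIC_SERVICE = "http://purl.org/vocab/cpsv#PublicService"
DCT_TITLE = "http://purl.org/dc/terms/title"

# Canned response standing in for a real call to the MITOS API.
SAMPLE_RESPONSE = '[{"id": "101", "title": "Passport application"}]'


def gather() -> list[dict]:
    """Data gathering: parse the JSON payload returned by the API."""
    return json.loads(SAMPLE_RESPONSE)


def escape_literal(text: str) -> str:
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def transform(records: list[dict]) -> list[str]:
    """Transformation: map each record to CPSV-AP style triples."""
    triples = []
    for record in records:
        subject = f"<{BASE_URI}{record['id']}>"
        triples.append(f"{subject} <{RDF_TYPE}> <{CPSV_PUBLIC_SERVICE}> .")
        title = escape_literal(record["title"])
        triples.append(f'{subject} <{DCT_TITLE}> "{title}" .')
    return triples


def publish(triples: list[str]) -> str:
    """Publication stand-in: serialise the triples as an N-Triples document."""
    return "\n".join(triples) + "\n"


def run_pipeline() -> str:
    """Run the stages in order, as an orchestrator would schedule them."""
    return publish(transform(gather()))


ntriples = run_pipeline()
```

In a real deployment each function would typically become a separate orchestrated task, so that a failed retrieval can be retried without repeating the later steps.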
Underpinning the MitosLOD project is the collaboration between GRNET S.A. and the University of Macedonia. This partnership brings together GRNET's extensive experience in digital transformation and UoM's deep expertise in eGovernment research. Both parties note what makes the collaboration so fruitful:
- Shared expertise leads to more robust and innovative solutions.
- Having teams from two organisations work together brings diverse perspectives that foster creativity and innovation.
Key outcomes so far
- The entire process from MITOS API data retrieval to Linked Data publication is now automated, significantly reducing manual intervention and increasing efficiency.
- The pilot's use case will be published in a conference paper in collaboration with the BDTI team, highlighting the work's academic and practical relevance.
While the pipeline is still being finalised, these achievements underscore the potential of Linked Open Data in improving public service interoperability and delivery.
Lessons learned and recommendations
Reflecting on their journey, the MitosLOD team offers valuable insights for other organisations considering the BDTI framework:
- Define objectives and expected outcomes from the outset to effectively guide the project. As always, be prepared for possible technical or other challenges.
- Foster a collaborative environment that brings together diverse expertise to tackle challenges. In the case of MitosLOD, the university team steps in to handle technical tasks while the GRNET colleagues manage policy and organisational areas, making an excellent team.
- Take full advantage of BDTI's readily available tools and infrastructure to streamline development processes and test new tools and implementations. Capitalise on open standards and open software to create an excellent testbed for experimentation, particularly when combined with a readily available, freely provided cloud infrastructure such as BDTI.
Professor Tambouris advises, "BDTI's flexibility and scalability make it an excellent choice for projects aiming to harness big data tools with minimal setup effort. We strongly recommend it to other public administrations looking to innovate."
Looking ahead
With the foundational infrastructure in place, the MitosLOD team is set to explore developing end-user applications, such as chatbots, to further enhance public engagement with administrative services. These applications will provide citizens with easy access to information about public services, such as passport applications, thereby simplifying their interactions with government agencies.
By leveraging the BDTI framework and the MitosLOD pilot, GRNET S.A. and UoM are pioneering a path towards smarter, more efficient government services. As they continue to refine their processes and develop user-centric applications, their work promises to serve as an inspiring model for public administrations, illustrating the profound impact of data-driven innovation on public service delivery.
We encourage public administrations to explore similar initiatives, utilising frameworks like BDTI to unlock the potential of their data and build public services on top of it. Embrace collaboration, harness innovative tools, and join the movement towards a more transparent, efficient, and citizen-friendly public sector.
Details
- Publication date: 19 June 2024
- Author: Directorate-General for Digital Services