
Delivered through an interesting use case -- analysing multiple open datasets to build a dashboard that explores funding and initiatives for green projects in highly polluted European cities -- the course unfolds over six weeks of sessions.
Session 2 was held on Friday, the 16th of February, and it provided a comprehensive overview of data cleaning and transformation fundamentals, the second point along our data journey.
Building on learnings from session 1 — when participants got to grips with accessing and ‘reading in’ data from multiple sources — the second session dove deeper into what to do with the data once we have it, including:
- Cleaning the data using techniques like identifying relevant data, removing duplicates, and handling missing values
- Transforming the data with a refresher on data types and understanding what’s meant by changing the shape of the data
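For readers who like to see these steps in code, here is a minimal Python sketch of the same ideas. The session itself did not necessarily use Python, and the rows, column names (`city`, `pm25`, `no2`) and helper functions below are made up purely for illustration.

```python
# Hypothetical sample rows; names and values are illustrative, not real data.
raw_rows = [
    {"city": "City A", "pm25": "21.5", "no2": "30"},
    {"city": "City A", "pm25": "21.5", "no2": "30"},   # exact duplicate
    {"city": "City B", "pm25": "", "no2": "25"},       # missing value
    {"city": " City C", "pm25": "19.8", "no2": "28"},  # stray whitespace
]


def to_float(text):
    """Cast a text cell to a float, or return None when the cell is empty."""
    text = text.strip()
    return float(text) if text else None


def clean(rows):
    """Drop exact duplicates, mark missing values as None, and cast types."""
    seen = set()
    cleaned = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key in seen:
            continue  # this exact row was already kept
        seen.add(key)
        cleaned.append({
            "city": row["city"].strip(),
            "pm25": to_float(row["pm25"]),
            "no2": to_float(row["no2"]),
        })
    return cleaned


def to_long(rows, id_field, value_fields):
    """Reshape wide rows (one column per measure) into one row per measure."""
    return [
        {id_field: row[id_field], "measure": name, "value": row[name]}
        for row in rows
        for name in value_fields
    ]


cleaned_rows = clean(raw_rows)
long_rows = to_long(cleaned_rows, "city", ["pm25", "no2"])
```

Here the missing value is kept as `None` rather than dropped or filled in. That is only one possible choice, and the right one depends on the dataset and the question being asked.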
Attendees asked some great questions during the Q&A portion of the session. One participant asked about combining data types to generate a unique identifier. Another asked how to interpret and transform data that includes delimiters such as commas or periods.
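Both questions can be illustrated with a short Python sketch. It is not the answer given in the session, and the function names `make_id` and `parse_number` are hypothetical. It assumes the identifier is built from a text field and a numeric field, and that the delimiter question concerns numbers written with a comma as the decimal separator.

```python
def make_id(city, year):
    """Combine a text field and a numeric field into one string key."""
    return f"{city.strip().lower().replace(' ', '-')}_{int(year)}"


def parse_number(text, decimal=",", thousands="."):
    """Parse a number written with the given decimal and thousands separators."""
    # Remove the thousands separator first, then normalise the decimal mark.
    normalised = text.strip().replace(thousands, "").replace(decimal, ".")
    return float(normalised)


record_id = make_id("City A", 2022)
continental = parse_number("1.234,56")
english = parse_number("1,234.56", decimal=".", thousands=",")
```

The separators are passed in as parameters because the same text, for example `1.234`, means different things under different conventions, so the convention needs to be known from the data source rather than guessed.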
Have you missed any of the sessions so far? Catch up here
In session three on Friday, the 1st of March, we’ll continue onto our next step in the data journey: Data blending and storage.
Understanding when and how to blend data from multiple sources can lead to more comprehensive insights. So, in the next session, participants will gain the knowledge and skills needed to automate data blending and efficiently store datasets.
What to Expect:
- Data Blending Techniques: Discover how to combine and merge datasets seamlessly, identifying opportunities to concatenate files with similar structures.
- PostgreSQL and pgAdmin: Explore how to store data in a relational database and learn the fundamentals of SQL.
- KNIME for Data Blending: Dive deeper into KNIME to automate the process of blending data from various sources, making your analysis more comprehensive.
- Introduction to SQL with pgAdmin
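As a small preview of what blending and storing can look like, here is a hedged Python sketch. The session itself will use KNIME, PostgreSQL and pgAdmin. This example swaps in Python's built-in `sqlite3` module as a stand-in so that it runs without a database server, and the table names, column names and values are invented for illustration.

```python
import sqlite3

# Two hypothetical files with the same structure (illustrative values only).
air_2021 = [("City A", 2021, 22.0), ("City B", 2021, 18.5)]
air_2022 = [("City A", 2022, 21.5), ("City B", 2022, 19.0)]
# A second hypothetical dataset, keyed on city.
funding = [("City A", 3), ("City B", 1)]

# In-memory SQLite database, used here in place of PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE air_quality (city TEXT, year INTEGER, pm25 REAL)")
conn.execute(
    "CREATE TABLE green_projects (city TEXT PRIMARY KEY, projects INTEGER)"
)

# Concatenate: the files share the same columns, so rows go into one table.
conn.executemany("INSERT INTO air_quality VALUES (?, ?, ?)", air_2021 + air_2022)
conn.executemany("INSERT INTO green_projects VALUES (?, ?)", funding)

# Merge: join the two tables on the shared key.
blended = conn.execute(
    """
    SELECT a.city, a.year, a.pm25, g.projects
    FROM air_quality AS a
    JOIN green_projects AS g ON g.city = a.city
    ORDER BY a.city, a.year
    """
).fetchall()
conn.close()
```

The two operations answer different needs: concatenating stacks rows from files with matching columns, while joining brings columns together from datasets that share a key.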
It will be an interesting day, packed to the brim with learnings. We hope to see you there!
Details
- Publication date: 28 February 2024
- Author: Directorate-General for Digital Services