Main repository for the Data Science Across Disciplines module offered at the Centre for Interdisciplinary Methodologies at the University of Warwick
This week explores the notion of structures and how data science can enable the extraction of “hidden” underlying groups – clusters – and hierarchical structures from data. We discuss the different techniques to surface and generate artificial boundaries and how the resulting artefacts can be interpreted. This session then investigates how artificial and abstract spaces can be constructed through different “projection” techniques, and how these spaces help us navigate data that are high-dimensional in nature and apply analytic frameworks to them.
The practical lab explores the use of clustering techniques, compares alternatives, and discusses interpretability issues. We also review how to deal with data sets that consist of several variables.
## Highlights of the lecture
We start our discussions this week with the notion of spaces and how the spaces we construct can help us understand complex phenomena. We then look into how spaces come into play when we explore high-dimensional data sets, and how a number of techniques help us construct artificial spaces – projections – that provide the basis for further operations. We then turn to cluster analysis and discuss a number of techniques that can help us explore “structures” in low- and high-dimensional data sets. Finally, we explore how projection spaces and notions of distance come into play.
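To make the idea of a projection concrete, here is a minimal sketch that projects three-dimensional points onto a single constructed axis. It assumes principal component analysis as the projection technique, which is only one of several options and is not necessarily the one used in the lecture. The function names and the toy `points` data are hypothetical, chosen for illustration, and the leading direction is approximated with plain power iteration so that the example needs only the Python standard library.

```python
import math
import random


def center(rows):
    """Subtract each column's mean so the data is centred on the origin."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centred):
    """Sample covariance matrix of rows that are already centred."""
    n, d = len(centred), len(centred[0])
    return [
        [sum(r[i] * r[j] for r in centred) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def leading_direction(cov, iterations=200, seed=0):
    """Power iteration: approximate the direction of largest variance."""
    rng = random.Random(seed)
    d = len(cov)
    v = [rng.random() + 0.1 for _ in range(d)]  # arbitrary non-zero start
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def project_1d(rows):
    """Return the projection axis and each row's coordinate along it."""
    centred = center(rows)
    direction = leading_direction(covariance(centred))
    coords = [sum(a * b for a, b in zip(r, direction)) for r in centred]
    return direction, coords


# Hypothetical toy data: the first two variables move together,
# the third barely varies.
points = [
    [1.0, 1.1, 0.0],
    [2.0, 1.9, 0.1],
    [3.0, 3.2, 0.0],
    [4.0, 3.9, 0.1],
    [5.0, 5.1, 0.0],
]
direction, coords = project_1d(points)
print("axis:", [round(x, 3) for x in direction])
print("1-D coordinates:", [round(c, 3) for c in coords])
```

The constructed axis weights the first two variables heavily and almost ignores the third, so the one-dimensional coordinates preserve the ordering of the points while discarding a variable that carries little variation. In practice a library routine would replace the hand-written power iteration.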
## Practical Lab Session
In the practical session, we will explore a number of computational routines and tools that support dimension reduction and projection tasks. We will also run cluster analysis routines and explore the functions that support them. We will pay particular attention to the interpretation of these complex tools and will work with multivariate data sets.
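As a small illustration of how a clustering routine relies on a notion of distance, the sketch below implements k-means with Euclidean distance using only the Python standard library. It is a teaching sketch under stated assumptions, not the lab's actual code: k-means is one clustering technique among several, and the function names, the toy `points` data, and the choice of `k=2` are hypothetical.

```python
import math
import random


def euclidean(a, b):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_means(points, k, iterations=50, seed=0):
    """Alternate between assigning points and updating centroids."""
    rng = random.Random(seed)
    # Start from k distinct data points picked at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: euclidean(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical toy data: two visibly separate groups in two dimensions.
points = [
    [1.0, 1.0], [1.2, 0.8], [0.8, 1.1],
    [5.0, 5.0], [5.2, 4.9], [4.8, 5.1],
]
labels, centroids = k_means(points, k=2)
print("labels:", labels)
print("centroids:", [[round(x, 2) for x in c] for c in centroids])
```

The cluster numbers themselves are arbitrary: which group is labelled 0 and which is labelled 1 depends on the random starting centroids. Swapping `euclidean` for another distance function changes which points count as "close", which is one reason the interpretation of clustering output needs care.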
Required reading
Further reading