data-science-across-disciplines

Main repository for the Data Science Across Disciplines module offered at the Centre for Interdisciplinary Methodologies at the University of Warwick


Data Science Across Disciplines

Session-04: STRUCTURES AND SPACES

This week explores the notion of structure and how data science can extract “hidden” underlying groups (clusters) and hierarchical structures from data. We discuss techniques for surfacing and generating artificial boundaries, and how the resulting artefacts can be interpreted. The session then investigates how artificial, abstract spaces can be constructed through “projection” techniques, and how these spaces help us navigate high-dimensional data and apply analytic frameworks to them.
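To make the idea of a hierarchical structure concrete, here is a minimal sketch of agglomerative clustering with single linkage, written in plain Python. It is an illustration only and not the lab material: the function names (`euclidean`, `single_linkage`) and the toy `samples` are assumptions made up for this example. The routine starts with every observation in its own cluster and repeatedly merges the two closest clusters, recording each merge so the resulting hierarchy can be inspected.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two observations."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def single_linkage(points):
    """Merge the two closest clusters until one remains.

    Returns the merge history as (members_a, members_b, distance) tuples,
    where members are indices into `points`.
    """
    clusters = [[i] for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: cluster distance = closest pair of members.
                d = min(euclidean(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        merged = clusters[a] + clusters[b]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (a, b)]
        clusters.append(merged)
    return merges

# Hypothetical toy data: two loose groups along a single variable.
samples = [[0.0], [0.4], [1.0], [5.0], [5.3]]
history = single_linkage(samples)
for left, right, dist in history:
    print(left, "+", right, "at distance", round(dist, 2))
```

Reading the merge history from top to bottom is the textual equivalent of a dendrogram: early merges happen at small distances within a group, and the final merge joins the two groups at a much larger distance. Where to "cut" this hierarchy is an interpretive choice, which is one sense in which the boundaries are artificial.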

The practical lab explores clustering techniques, compares alternatives, and discusses interpretability issues. We also review how to deal with data sets that consist of several variables.

## Highlights of the lecture

We start this week's discussion with the notion of spaces and how the spaces we construct can help us understand complex phenomena. We then look at how spaces come into play when we explore high-dimensional data sets, and how a number of techniques help us construct artificial spaces (projections) that provide the basis for further operations. Next, we turn to cluster analysis and discuss techniques that can help us explore “structures” in low- and high-dimensional data sets. Finally, we explore how projection spaces and notions of distance come into play.
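As a rough illustration of how a projection constructs an artificial space, the sketch below reduces three-variable observations to a single coordinate using the first principal component. Principal component analysis is used here only as one example of a projection technique; the helper names, the power-iteration shortcut, and the toy `points` are all assumptions made for this example rather than anything prescribed by the module.

```python
import math

def center(rows):
    """Subtract the column means so the data are centred on the origin."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]

def covariance(rows):
    """Sample covariance matrix of already-centred rows."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
            for i in range(d)]

def top_component(cov, iterations=200):
    """Leading eigenvector of a symmetric matrix via power iteration.

    Assumes the starting vector is not orthogonal to the leading
    eigenvector, which holds for the toy data below.
    """
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v

def project_1d(rows):
    """Coordinate of each observation along the first principal axis."""
    centred = center(rows)
    axis = top_component(covariance(centred))
    return [sum(r[j] * axis[j] for j in range(len(axis))) for r in centred]

def euclidean(a, b):
    """Straight-line distance in the original space."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical toy data: six observations described by three variables.
points = [
    [1.0, 1.1, 0.9], [1.2, 0.9, 1.0], [0.9, 1.0, 1.1],
    [5.0, 5.2, 4.9], [5.1, 4.8, 5.0], [4.9, 5.0, 5.1],
]
coords = project_1d(points)
print("projected coordinates:", [round(c, 2) for c in coords])
print("original distance 0-3: ", round(euclidean(points[0], points[3]), 2))
print("projected distance 0-3:", round(abs(coords[3] - coords[0]), 2))
```

Comparing the last two printed values shows how the notion of distance carries over: a linear projection of this kind can only shrink distances, so observations that look close in the projected space may still be far apart in the original one. That caveat matters whenever clusters are read off a projection.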

## Practical Lab Session

In the practical session, we will explore a number of computational routines and tools that support dimension reduction and projection tasks. We will also look at functions that can help us run cluster analysis routines. We will pay particular attention to the interpretation of these complex tools and will work with multivariate data sets.
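The kind of cluster analysis routine mentioned above can be sketched with k-means, one common partitioning technique. This is a minimal, standard-library-only sketch under stated assumptions, not the lab's own code: the function names, the deterministic "farthest-first" initialisation, and the toy `data` are all hypothetical choices made so the example is reproducible.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two observations."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def farthest_first(points, k):
    """Deterministic initialisation (an assumption of this sketch):
    start from the first point, then repeatedly add the point that is
    farthest from all centroids chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(euclidean(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids

def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda i: euclidean(p, centroids[i]))
            for p in points]

def update(points, labels, centroids):
    """Move each centroid to the mean of its members (keep it if empty)."""
    new = []
    for i, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == i]
        if members:
            new.append([sum(col) / len(members) for col in zip(*members)])
        else:
            new.append(old)
    return new

def k_means(points, k, max_iter=100):
    """Alternate assignment and update steps until labels stop changing."""
    centroids = farthest_first(points, k)
    labels = assign(points, centroids)
    for _ in range(max_iter):
        centroids = update(points, labels, centroids)
        new_labels = assign(points, centroids)
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids

def within_cluster_ss(points, labels, centroids):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(euclidean(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))

# Hypothetical multivariate toy data: six observations, three variables.
data = [
    [1.0, 1.1, 0.9], [1.2, 0.9, 1.0], [0.9, 1.0, 1.1],
    [5.0, 5.2, 4.9], [5.1, 4.8, 5.0], [4.9, 5.0, 5.1],
]
labels, centroids = k_means(data, k=2)
print("labels:", labels)

# Compare alternatives: how tight are the clusters for different k?
inertia_by_k = {k: within_cluster_ss(data, *k_means(data, k)) for k in (1, 2, 3)}
for k, value in inertia_by_k.items():
    print("k =", k, "within-cluster sum of squares =", round(value, 3))
```

The within-cluster sum of squares gives one way to compare alternatives, but it does not say which number of clusters is "right". The algorithm will always return k groups, whether or not the data contain them, which is why interpretation deserves as much attention as running the routine.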
