What effect does demographic data have on the science behind the algorithm?

Learning Analytics enable universities to use algorithms to support decision making. But what effect does demographic data have on the science behind the algorithm?

Demographic data undoubtedly plays a major part in Higher Education. We understand that some student groups do less well than others and strategies to support these groups should be put in place to improve their likelihood of success. However, in adding demographic filters to Learning Analytics projects, institutions run the risk of demotivating their students and missing valuable opportunities to provide assistance to those in need of academic development or wellbeing support.

Demographics in Higher Education are important as analysis of them can:

  1. Identify disparity in academic achievement
  2. Support targeted provision of inclusive and positive action for those less privileged
  3. Track demographic changes in student populations, aiding institutional planning

Demographics can also feed into a number of key aspects of university life, such as recruitment and resource planning, reporting to the Higher Education Statistics Agency (HESA) and meeting the fair-access obligations of the new regulator, the Office for Students. Planning teams are therefore familiar with demographics as important data points for supporting university activities, and this is unlikely to change.

But with the rise of data-led, machine-learning techniques, it is now possible to use data to enhance our understanding of learners and their learning environments. This has resulted in the rapid emergence within the sector of learning analytics, which use algorithms to support decision making and offer new areas of insight.

Learning analytics provide the ability to measure and codify student activity in order to support progress, attainment and retention at both institutional and individual-student levels. They also allow targeted, real-time interactions, specific to individual students’ needs, to be triggered automatically.

Demographic factors within the analytics model

Naturally, the starting point for many universities is to consider demographic factors as an influence within any analytical model. After all, there are clear attainment differences between many demographic groups. But what do demographics do in this context? Typically, they reinforce what we already know, and they can predict a student outcome: students from a certain postcode do less well, therefore a system predicts underperformance for them.

But here is the rub: if nothing changes along the students’ learning pathway, then the algorithm will be predictive. A factory production line with the same inputs and processes will deliver the same outputs every time. To change a student’s trajectory, you need to do something different, or they need to be encouraged to do something different.

More importantly, demographics do not necessarily change. No one can change their background, where they come from or their ethnicity. So surely we need to focus on those factors that are within each individual student’s gift to manage?

My issue is that organisations embed demographic factors into data algorithms that hold the potential to operate with bias. We must acknowledge that there is no free pass when it comes to an algorithm. There is inherent bias when creating a mathematical model and there will be a weight attributed to certain conditions which then indicates an outcome. Moreover, no algorithm is universal in nature. Some are good for certain conditions; others are not.

We have seen high-profile cases (including Facebook) where algorithms influence social media news feeds by promoting the stories that the algorithm suggests are most relevant to the user. This creates individual echo chambers where a user continually hears and sees the same curated content, so their world view becomes a list of what the system sees as their interests. Worryingly, opposing views are extinguished (or curated out).

This is not a new argument. Gender and racial bias in algorithms are topics of heated debate, but my argument is that latent bias exists in many Learning Analytics approaches. If we give a machine biased data, then the machine will be biased in its outputs.
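The point about biased data producing biased outputs can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the records are synthetic, the postcode groups "A" and "B" are hypothetical, and the "model" is deliberately the simplest possible one (a historical average per group), not the method of any real Learning Analytics product.

```python
from statistics import mean

# Synthetic, illustrative records only (not real student data):
# (postcode_group, weekly_engagement_hours, final_mark)
HISTORY = [
    ("A", 10, 68), ("A", 4, 55), ("A", 8, 65),
    ("B", 10, 60), ("B", 4, 45), ("B", 8, 54),
]

def fit_group_means(records):
    """'Train' a demographic-only model: the mean historical mark per group."""
    marks_by_group = {}
    for group, _hours, mark in records:
        marks_by_group.setdefault(group, []).append(mark)
    return {group: mean(marks) for group, marks in marks_by_group.items()}

def predict(model, group):
    """Predict a mark from the demographic group alone."""
    return model[group]

model = fit_group_means(HISTORY)

# Two hypothetical new students with identical engagement (10 hours a week)
# who differ only in postcode group.
pred_a = predict(model, "A")
pred_b = predict(model, "B")
print(f"Group A prediction: {pred_a:.1f}")
print(f"Group B prediction: {pred_b:.1f}")
```

Because the model only ever sees the group label, the student in group "B" is predicted a lower mark however much they engage: the historical gap in the data is simply handed back as a prediction.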

Consider ranking a student’s potential on the basis of a demographic attribute simply because a particular group has historically not fared as well. Or consider planning support around particular students because of inherent bias within the algorithm.

Selecting a Black, Asian and Minority Ethnic (BAME) student over a white student when deciding whom to provide outreach support to is still bias. A system that represents students fairly and equally needs to be at the heart of any initiative.

For instance, what if a white student had mental health issues yet they were overlooked just because they came from a privileged background? Judging the assignment of resources on a familial history trait is fundamentally wrong. Similarly, what if a BAME student was written off by a tutor because in their view, offering additional support was a pointless endeavour and the system allows them to confirm that view?

If the start-point (demographics) is an issue, does the end-point also need further consideration? Blanket assumptions about what constitutes success also influence set goals in an algorithm. For example, if we only measure success with a grade outcome, this fails to recognise the full value of higher education by simply providing a classification at the end of the process.

Furthermore, focussing on predicting a grade has the unwanted consequence of creating negative perceptions among students. Why should a BAME student be potentially downgraded in their ‘outcome predictions’ because of their heritage? Yet we assign significance to a single, all-encompassing demographic attribute. Demographics do not represent what we are. We are the minutiae of a complex picture of many influences of society, belief, motivations, culture and so on.

Learning Analytics offers Higher Education a means to change: unpacking the learning process in a way that allows institutions to be more rapid and responsive, offering better, more personalised support when it is actually required, and allowing students to derive the very best value from their fees.

Institutions can gain new and significant insights from Learning Analytics if the start and end points are defined with awareness of what they are there to achieve.

The new General Data Protection Regulation (GDPR) means institutions are obligated to share how automated decisions are derived and offer students a means to challenge these. This has huge consequences. Consider an academic tutor explaining to a student why they have been asked to attend an academic review because a system has identified them as ‘at risk’.

Could an academic tutor respond? Moreover, does the institution even understand how the calculation has been reached to be able to explain it? This simple factor alone could limit learning analytics to research and closed-door planning rather than becoming a truly democratic tool that offers students a means to be successful and change.

Universities are investing in ways to maximise the return from their resources and are becoming more reliant on tools to support better decision making. But at what cost? Ignoring the fact that algorithms have consequences could cost more than red faces and a legal case.
