Big Data

As IDEL has progressed, we seem to have moved away from some of the more theoretical and human topics towards a more technical focus. Week 9 was focused on digital badges and blockchaining, and week 10 looks at the concept of ‘big data’ and its implications for education.

Like blockchaining, I’m aware of the concept of big data, but have not got to grips with what it means, or what it can do for us. I think there’s a danger that some of these initiatives can be seen as simply buzzwords or part of the zeitgeist, without any longevity to them. But it’s apparent from digging into both blockchaining and big data that these are unlikely to be fleeting developments and are likely to underpin many technical changes over the next few years.

Unsurprisingly, and in line with the rest of the course so far, the focus of the readings has been to contrast the opportunities big data provides with some of the pitfalls. There’s also been a focus on some of the blind spots in this area.

Starting with the positive aspects, Selwyn (2015) highlights three areas:

  • Increased ability to use data to measure goals, targets, benchmarks, performance indicators etc
  • An ability to harmonise and standardise across borders, whether these be institutional or geographic
  • To provide a basis for an infrastructure for education to be understood and organised

But naturally, there are some challenges. Eynon (2013) puts the spotlight on three aspects for concern:

  • The ethics behind the sourcing, mining, interpretation and ultimately use of the data
  • The scope of the data, what can be measured as a result, and the questions it can (and can’t) help us answer
  • Inequalities linked to the sourcing and accessing of any data.

The area of ethics is one discussed in detail across several of the recommended readings, as it seems to be a grave cause for concern. A wonderful example is given by Williamson (2015) on the use of data provided by Facebook, and the criticism afterward about the permission (or lack thereof) around the use of the data (the defence being that it was already in the ‘public realm’).

To provide an example related to university admissions, at present applications are (largely) based on academic results at an undergraduate level. However, ‘big data’ could provide the ability to forecast degree completion, and perhaps future earning potential, and even link this to social backgrounds and family history. Naturally, on the one hand, this could be empowering – providing institutes with more insight into how to support their students to succeed. The rather dystopian view is that this could prejudice entry requirements, and given some of the metrics being forced upon universities (e.g. ‘satisfaction ratings’) and the subsequent ‘slap on the wrists’ as a result of this, it doesn’t seem impossible that big data will be used in this way at some point.

Selwyn (2015) extrapolates these issues further to explore what they may mean for institutes and their students. He argues that this increase in performance metrics may create an “intensification of managerialism within education”, which suggests a move towards a workplace more typical of a commercial organisation.

This commercial influence is unsurprising, given the history of big data. It seems to me that using the processes and themes of big data in education comes with ‘baggage’, because of its very design. Because of the commercial background too, it’s important to critique the strength and role of the educational or academic voice within technical developments such as this. This harks back to one of IDEL’s earliest topics, about the role of technology in education, and who is at the table when it comes to the discussions and implementation of this. This seems to be another example of where education could be perceived as a recipient of this technology, rather than helping to shape it.

This is of particular concern when you read about Pearson’s developments in Williamson (2015). Some of the criticism of Pearson could be seen as a reaction against their developments, but when Williamson argues that Pearson could be using big data to create new “models of cognitive development and learner progression”, then this would be a major red flag. Pearson’s main responsibility is to their shareholders, not to their students, so it’s important that the progression of technical initiatives like big data is not left to commercial educational companies to drive.

A common issue picked up by Eynon (2013), Selwyn (2015) and Williamson (2015) is that of an institution’s capability, or more specifically that of the personnel within it, to use big data. ‘Use’ in this context is quite a broad term, from sourcing and mining the data, through to combining it with different sources and interpreting it. Williamson argues that there are “several competencies for education data science”, and that there is a significant deficit in the numbers of those equipped with the necessary skills. The skills are a blend of the technical (computational and statistical skills), the educational and an understanding of the ethical and social concerns in this area. As such, Williamson argues that educational data science is very much a field in its own right, rather than an appendage to statistical analysis. Naturally, if this is an area that is significantly under-resourced, then this reduces the impact education can have in shaping big data.

This may also be more difficult to fix than Eynon envisages. The demand for talent – given the nature of the role – is spread across both commercial and educational organisations, meaning commercial companies may be able to outbid educational institutes for their services. It may be one thing to recognise the issue, but fixing it may be increasingly difficult.

I picked up on several themes across the papers that have been discussed earlier in IDEL.

Given the rise of commercial influence in this area (in particular), there seemed to be a ‘call to action’ to the wider educational crowd to become more vocal, and come more centre-stage in these discussions. Selwyn (2015) argues that “the opportunity now exists for educational research to develop nuanced approaches to understanding, and then offering alternatives to, the dominant data conditions that are being established across educational contexts”. This reminded me of Biesta (2013), in his call for teachers to teach, and Bayne (2015), in her call to ensure academia has a role to play in wider technological developments.

Biesta’s references to a neo-liberalistic agenda also pop up in Selwyn – “expanded access to data allows institutions and individuals to operate more efficiently, effectively and equitably”, and Eynon also references themes of efficiency and cost-effectiveness in big data.

Selwyn (2015) also uses the metaphor of water in his discussions around big data. ‘Deluge’, ‘flow’, and ‘flood’ are terms used, and I think this is possibly inevitable. The comparisons between data and water are natural – it is plentiful, can travel at speed (rivers) or not (lakes), comes from many different sources and directions, and requires real skill to manage. It’s also a fundamental of life, and you can argue data is the bedrock of economies now (it’s even been termed as more valuable than oil). The dystopian view is that it can also be a dangerous force of huge power and, like recent devastating floods all over the planet, can pose an immediate danger to us through years of mismanagement.

I thought it was interesting that Selwyn (2015) points out that the sociological approach to data is to assume that there are already some inherent issues with it. This admission of a lack of neutrality is quite refreshing, and makes a lot of sense. It’s a battle that’s difficult to fight – it’s probably a better use of time to acknowledge this and work out how to deal with it than to try to fix it at source. The rhizomatic metaphor is also apparent here, in that Selwyn argues that “this approach is careful to acknowledge that data are profoundly shaping of, as well as shaped by, social interests”.

As a final thought, I liked this quote from Eynon (2013) – “We must not get seduced by Big Data”. I think if you were to replace ‘Big Data’ with ‘technology’, you’ve probably got the core theme of IDEL in a nutshell.