What is Big Data?


Today, almost any interaction made over the internet or through the consumption of goods and services is being tracked, stored, and used in targeted ways. This has led to the notion of big data -- massive amounts of data that reflect the behavior and actions of various populations. Data scientists and data collection platforms are now able to computationally organize petabytes and exabytes of data so that it is easy to analyze and identify patterns that may have otherwise gone undetected. With the complexity surrounding such large, diverse sets of data, displaying the information clearly is crucial to its usefulness. Visual data analysis blends highly advanced computational methods with sophisticated graphics engines to illuminate patterns and structure in even the most complex visual presentations. Information visualization uses infographics, the graphical representation of technical data designed to be quickly and easily understood. In education, data mining is already underway to target at-risk students, personalize learning, and create flexible pathways to success. As education institutions become more adept at working with and interpreting big data, they can make more informed decisions that reflect real learner needs.

INSTRUCTIONS: Enter your responses to the questions below. This is most easily done by moving your cursor to the end of the last item and pressing RETURN to create a new bullet point. Please include URLs whenever you can (full URLs will automatically be turned into hyperlinks; please type them out rather than using the linking tools in the toolbar).

Please "sign" your contributions by marking with the code of 4 tildes (~) in a row so that we can follow up with you if we need additional information or leads to examples. This produces a signature when the page is updated, like this: - gordon gordon Jul 21, 2016

(1) How might this technology be relevant to the educational sector you know best?

Developing new ways of teaching/learning and personalizing learning - an urgent need for today's HE with its large diversity amongst the students (and I'm thinking much further than at-risk students) - can only take place on the basis of big data. I think BD is one of the most important fields, if not the most important field, in the years to come. - ole ole Aug 9, 2016 - lkoster lkoster Sep 30, 2016 - DaveP DaveP Oct 2, 2016

At Curtin we have defined 5 domains for increased use of data science methods (our preferred terminology, since not all the data we want to understand is 'big' and we'd like to not fall into the hype mode). The five domains are 1. Prior to enrollment (e.g. recruitment, understanding our markets, developing longitudinal relationships with future students, adding educational value to informal learning experiences, developing acceleration programs); 2. Who is the learner (e.g. learning potential, psychological preferences, friends and social circles, intentions with study, current life plans, academic progress); 3. What are we offering to learners (e.g. microcredits, access to library materials for success in cohort use of those resources to address curriculum challenges, alternative learning paths); 4. How are we delivering to learners (e.g. adaptively re-deploying blended and online resources, serious games, distributed learning); and 5. Post-university (e.g. employability networks, alumni networks, continuing professional growth, business opportunities, first notice for contracts, etc.). - david.c.gibson david.c.gibson Sep 11, 2016 - ole ole Oct 3, 2016

  • At the University of Nevada Las Vegas we are using software for searching, monitoring, and analyzing machine-generated big data to develop metrics that measure the contribution the Office of Online Education makes to improving retention, progression and completion numbers at the institution. These metrics should help us reach out to departments where instructional design improvements could lead to improved student success. They will help us determine which populations are most successful in online courses and help us evaluate the impact instructional design changes make on student success. - elizabeth.barrie elizabeth.barrie Sep 30, 2016

(2) What themes are missing from the above description that you think are important?

  • There are two flaws in the big data hype. First, all data is big data these days. (Some important data is not now and never will be 'big' - e.g. a learner's parents' level of education is an important factor but is not itself big. What is really interesting nowadays is that all sizes of data can be brought together into models, and those models can be amenable to data science methods of discovering patterns and making inferences. - david.c.gibson david.c.gibson Sep 11, 2016). As the cost of storage drops toward zero and the ability to collect data soars, we collect and save so many more bits of data that anything could be considered big data. The second flaw, I'd say, is that big data is not necessarily more informative. What we are collecting is more points of data, but we don't always have the right models (I think this is right and on target. I would say though that the missing theme is not just visualization, but machine learning, automated inferences, self-organizing clusters, and in general, getting massive help from the machine in exploring the complexity of the data. But the issue is as much one of 'creating an appropriate model' for the data and question at hand. We will not usually find a model on the shelf and apply it to some new question and new data set. - david.c.gibson david.c.gibson Sep 11, 2016) to make sense of the data. For example, we might know every click a student makes in an LMS, but can we translate that into meaning? Do more clicks mean more learning? What if we correlate clicks to performance? Does that mean we should encourage students to use the LMS more? Or just to click more? These are pretty well known issues in the study of big data. But BIG data sounds so exciting that we sometimes forget the delicate relationship between data and understanding. I guess this is just a caution against treating big data as one thing (it's just a way of forecasting how much more data we have at hand) or assuming that more is better.
As I like to say on my campus: Do we need tools to do a better job of identifying at-risk students using statistical methods when we don't have enough advisors to work with the students we already know are struggling? - david.thomas david.thomas Sep 9, 2016 - helga helga Sep 19, 2016 This is the critical point: if we can develop an institutional system for identifying students at risk, do we have an institutional response for dealing with the issues raised? - nwitt nwitt Oct 2, 2016 - ole ole Oct 3, 2016
  • Mo' data mo' problems - the sheer scale of the data collected means it is very difficult to sort the wheat from the chaff and make use of it. I can see how it would make financial sense to use data analysis instead of trained staff, but just because we can does not necessarily mean we should, especially when dealing with the wellbeing of our staff and students. - damian.mcdonald damian.mcdonald Sep 22, 2016

(3) What do you see as the potential impact of this technology on higher education?

See above. - ole ole Aug 9, 2016
  • One of the exciting (perhaps scary to some people) possibilities is to be able to scale globally while personalizing educational experiences. Another is the automation of some of the low-level instructional and tutorial feedback needed by learners in the early stages of acquiring new knowledge and skills. It is possible that if there is going to be an 'Uber' moment that disrupts higher education in a significant way, it will have a big data component or may be underpinned by creative data science methods. If I had to guess, one major disruption might come in the area of assessment and validation of learning, then giving credentials for that learning. - david.c.gibson david.c.gibson Sep 11, 2016 - ole ole Oct 3, 2016


(4) Do you have or know of a project working in this area?

At my department, a project based on big data in the field of examination behavior is getting started these days. I'll keep you posted if interesting details pop up. - ole ole Aug 9, 2016

Curtin University has several projects and is envisioning a whole-of-university approach and capability. - david.c.gibson david.c.gibson Sep 11, 2016
  • As mentioned above, the University of Nevada Las Vegas has begun a project to use data from multiple systems (CRM, SIS, LMS) to develop baseline data on student success in online courses. One use of the data will be to examine the impact of instructional design innovations. As innovative approaches are implemented in key courses, their impact will be monitored to determine which innovations should be spread to other courses. - elizabeth.barrie elizabeth.barrie Sep 30, 2016
  • Plymouth University in the UK has been running a project working with key stakeholders (students, staff, senior leaders and governors) to investigate perceptions and issues around big data and learning analytics. - nwitt nwitt Oct 2, 2016

Please share information about related projects in our Horizon Project Sharing Form.