Well, the course has been running for 3 weeks, and I’ve generally kept up, reading quite a few blog articles and then catching up with the actual course content on a Friday when time allows. Admittedly this is my first MOOC and I can see how a lot of people feel overwhelmed and lost (I’m currently feeling this!).
But anyway, I digress. I’m not 100% sure where this blog post will go, but initially I’m thinking this will form part of a series – otherwise it’ll turn into a full-blown essay, and neither the reader nor I want that.
Enterprise Solution to Aid Analytics
A lot of the talk that I have seen in the discussion forums has been about how to get hold of data sources. I’m fortunate in that, working within the IT department of my University and contributing to the integration of various systems, I tend to have access to large datasets wherever I look. And where I don’t, if we can identify a benefit of having some data, then I generally know who to ask.
But I’d like to take a step back from this point and look at a system architecture design that would really help institutions perform data analytics. I’ve been fortunate enough to play a fundamental part in the development of a Data Exchange System at a previous institution, which has stood me in good stead for designing a system architecture that avoids duplicates and resolves the need to ‘cleanse’ data across multiple systems.
A University Data Exchange System
Whilst the real world is never quite this simple, for the purposes of this article (which won’t explore every eventuality) a University could be seen as having two primary source systems:
- HR System (for staff details)
- Student Record System (for the Student and Curriculum data)
This data would then flow to a number of satellite systems, for example:
- VLE/MLE (Blackboard, Moodle, etc…)
- Library System
- Swipe Card / Attendance Monitoring system.
There are of course plenty of others, but these are possibly the most relevant to all institutions.
Now these systems evidently need to be linked up: firstly to create the initial data in the various satellite systems, and then to periodically update it. One (poor) approach could be to create direct links from each source system to each satellite system, like below:
I say poor for a number of reasons. A primary one is that the next time the Student Record System is replaced, every satellite system needs its link rebuilt from scratch. There is also the issue of duplicate accounts: if no preventive action is in place in the source systems, duplicates quickly end up in all satellite systems, leading to man-hours lost cleaning up duplicate data – let alone the impact on the student experience.
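To put a rough number on that maintenance burden, here’s a small back-of-the-envelope sketch (the function names are mine, purely for illustration). With direct links, the number of integrations grows multiplicatively with the number of systems; with a central hub, each system connects exactly once, so replacing one system only ever touches one link.

```python
def direct_links(sources: int, satellites: int) -> int:
    """Point-to-point: each source system feeds each satellite directly."""
    return sources * satellites


def hub_links(sources: int, satellites: int) -> int:
    """Hub-and-spoke: each system connects once, to the central hub."""
    return sources + satellites


# Two source systems (HR, Student Records) and three satellites
# (VLE, Library, Swipe Card):
print(direct_links(2, 3))  # 6 separate integrations to maintain
print(hub_links(2, 3))     # 5 links, but replacing a system touches only 1
```

With only a handful of systems the totals look similar, but the real saving is in the replacement cost: swap out the Student Record System in the point-to-point design and three links need reworking; in the hub design, just one.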
My preferred solution is to have a central piece of the jigsaw which processes the changes. This central piece can perform a number of functions but primarily:
- It can attempt to detect duplicates and prevent them from entering the satellite systems – flagging these issues at source.
- It can, for example, issue each student a unique identifier such as a GUID, which can be sent to all satellite systems (and this is where LAK comes in – it makes it easier to query an entity across multiple systems).
- It can also hold the logic required to determine whether a change needs to be sent to a satellite system. For example, a staff member changing their address probably doesn’t need to be sent to the VLE; it would, however, need to go to the Library System.
- It can also merge staff and student accounts into one, where a member of staff is also a student. That would save them from having two logins, two ID cards, etc.
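The first three points above can be sketched in a few lines of code. This is a minimal, hypothetical illustration only (the class name, routing rules, and duplicate check are all assumptions, not a real implementation): the hub issues a GUID per person, flags likely duplicates at source, and routes each change only to the satellites whose rules include it.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class ExchangeHub:
    """Hypothetical sketch of a central exchange hub."""
    # Routing rules: which change types each satellite system cares about.
    routing: dict = field(default_factory=lambda: {
        "vle": {"enrolment", "name"},
        "library": {"enrolment", "name", "address"},
    })
    people: dict = field(default_factory=dict)  # GUID -> person record

    def register(self, name: str, dob: str) -> str:
        # Crude duplicate check on name + date of birth – flag at source
        # rather than letting the duplicate flow downstream.
        for guid, rec in self.people.items():
            if rec["name"] == name and rec["dob"] == dob:
                raise ValueError(f"possible duplicate of {guid}")
        guid = str(uuid.uuid4())
        self.people[guid] = {"name": name, "dob": dob}
        return guid

    def route(self, change_type: str) -> list:
        # Only send the change to satellites whose rules include it.
        return [s for s, kinds in self.routing.items() if change_type in kinds]
```

Under these assumed rules, an address change would be routed to the Library System but not the VLE – exactly the filtering described above.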
It also addresses the concern highlighted above, in that only one link needs to be re-worked should a system be replaced. There are further benefits too: the ability to queue up changes while a satellite system is down for essential maintenance (or broken!), along with potentially centrally storing the business logic for the processing rules.
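The queuing idea is simple enough to sketch too – again a hypothetical toy, not a real integration: while a satellite is down, the hub holds its changes in order, and replays them once the system comes back.

```python
from collections import deque


class SatelliteFeed:
    """Toy model of the hub's outbound feed to one satellite system."""

    def __init__(self):
        self.up = True
        self.queue = deque()      # changes held while the satellite is down
        self.delivered = []       # changes the satellite has received

    def send(self, change):
        if self.up:
            self.delivered.append(change)
        else:
            self.queue.append(change)  # hold the change for later

    def back_online(self):
        self.up = True
        while self.queue:              # replay queued changes in order
            self.delivered.append(self.queue.popleft())
```

Nothing is lost during the outage, and the satellite receives the changes in the order they happened – which matters when a later change (say, a second address update) supersedes an earlier one.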
So, I’ll leave it there for now. But with the introduction of a centrally stored unique identifier, we can now start to perform analytics across every system within this architecture. There may of course be links between student performance (or drop-out) and the amount of ‘churn’ through the system, and provided we introduced some kind of logging to this system, we’d be able to run some checks and identify such patterns.
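As a taste of that kind of check, if the hub logged each change against the student’s GUID, a churn count per student falls out almost for free (the log data here is made up purely for illustration):

```python
from collections import Counter

# Hypothetical change log as (GUID, change type) pairs, as might be
# captured by the central hub's logging.
log = [
    ("guid-1", "address"),
    ("guid-1", "enrolment"),
    ("guid-1", "address"),
    ("guid-2", "name"),
]

churn = Counter(guid for guid, _ in log)
print(churn.most_common(1))  # [('guid-1', 3)] – the highest-churn student
```

That list of high-churn GUIDs could then be joined against performance or drop-out data from the Student Record System to look for a correlation.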
The next post in the series may come soon (I’m waiting for a long-running process to complete) and will look at a potential solution to integrate the systems, to make retrieval and analysis of data easier (and lend a hand to other mechanisms that deliver on student expectations).