Building on Alma Analytics

Analytics @ Lancaster University Library
IGeLU 2014
John Krug, Systems and Analytics Manager, Lancaster University Library
Lancaster University, the Library
and Alma
We are in Lancaster, in the North West of the UK.
~12,000 FTE students, ~2,300 FTE staff
The Library has 55 FTE staff; a building refurbishment is in progress
University aims to be 10, 100 – Research, Teaching, Engagement
Global outlook with partnerships in Malaysia, India, Pakistan and
a new Ghana campus
• Alma implemented January 2013 as an early adopter.
• I am the Systems and Analytics Manager, at LUL since 2002 (originally to
implement Aleph) – a systems background, not a library one
• How can library analytics help?
Alma Analytics reporting and dashboards
• Following implementation of Alma, analytics dashboards
rapidly developed for common reporting tasks
• Ongoing work in this area, refining existing reports and developing
new ones
Fun with BLISS
B Floor 9AZ (B)
347 lines of this!
Projects & Challenges
• LDIV – Library Data, Information & Visualisation
• ETL experiments done using PostgreSQL and Python
• Data from Aleph, Alma, Ezproxy, etc.
• Smaller projects:
• e.g. Re-shelving performance – required using Alma Analytics
returns data along with the number of trolleys re-shelved daily.
• Challenges – infrastructure, skills, time
• Lots of new skills/knowledge needed for analytics. For us:
Alma Analytics (OBIEE), Python, Django, Postgres, Tableau, nginx,
OpenResty, Lua, JSON, XML, XSL, statistics, data preparation, ETL, etc.
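The extract step of the PostgreSQL/Python ETL experiments mentioned above might look something like this for EZproxy logs (a minimal sketch: EZproxy log layout is site-configurable, so the Apache-combined-style pattern, the `ezproxy_access` table, and its columns are all invented for illustration; the load step would run the SQL through psycopg2 or similar):

```python
import re

# EZproxy can log in an Apache "combined"-style format; the exact layout
# is site-configurable, so this pattern is an assumption for illustration.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ (?P<user>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<bytes>\S+)'
)

def parse_line(line):
    """Turn one EZproxy log line into a dict, or None if it doesn't match."""
    m = LOG_PATTERN.match(line)
    if not m:
        return None
    row = m.groupdict()
    row["status"] = int(row["status"])
    return row

# The load step would hand these dicts to PostgreSQL, e.g. via
# psycopg2's cur.execute(INSERT_SQL, row). Table name is hypothetical.
INSERT_SQL = (
    "insert into ezproxy_access (host, username, ts, method, url, status) "
    "values (%(host)s, %(user)s, %(timestamp)s, %(method)s, %(url)s, %(status)s)"
)

sample = ('10.0.0.1 - jbloggs [10/Jul/2014:15:44:00 +0100] '
          '"GET http://www.example.com/article HTTP/1.1" 200 5120')
row = parse_line(sample)
```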
Alma analytics data extraction
• Requires using a SOAP API (thankfully a RESTful API is now
available for Analytics)
• SOAP support for Python is not very good; it is much better with
REST. Currently using the suds Python library with a few bug
fixes for compression, ‘&’ encoding, etc.
• A script get_analytics invokes the required report,
manages collection of multiple ‘gets’ if the data is large and
produces a single XML file result.
• Needs porting from SOAP to REST.
• Data extraction from Alma Analytics is straightforward,
especially with REST
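The paging that get_analytics manages could be sketched against the REST Analytics API roughly like this (a sketch only: the `IsFinished`/`ResumptionToken` handling follows the Analytics REST API, but the fetch function is injected so the paging logic is shown without a live server, and the report path used below is a placeholder, not our real one):

```python
import xml.etree.ElementTree as ET

def _find(elem, local_name):
    """Find the first descendant with the given local name, ignoring namespaces."""
    for node in elem.iter():
        if node.tag.split("}")[-1] == local_name:
            return node
    return None

def get_analytics(fetch, report_path, limit=1000):
    """Collect a full Analytics report by following resumption tokens.

    `fetch(params)` must return the raw XML of one GET to the
    /almaws/v1/analytics/reports endpoint (e.g. via urllib or requests);
    it is injected here so the paging logic stays testable offline.
    """
    chunks = []
    params = {"path": report_path, "limit": limit}
    while True:
        root = ET.fromstring(fetch(params))
        chunks.append(root)
        finished = _find(root, "IsFinished")
        token = _find(root, "ResumptionToken")
        if finished is None or finished.text.strip().lower() == "true":
            return chunks
        # Subsequent pages are requested with the token instead of the path.
        params = {"token": token.text, "limit": limit}
```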
Data from other places
Ezproxy logs
Enquiry/exit desk query statistics
Re-shelving performance data
Shibboleth logs, hopefully soon – we are dependent on central IT
Library building usage counts
Library PC usage statistics
JUSP & USTAT aggregate usage data
University faculty and department data
Social networking
New Alma Analytics subject areas, especially uResolver data
Gaps in the electronic resource picture
• Currently we have aggregate data from JUSP and USTAT
• Partial off-campus picture from EZproxy, but web-oriented
rather than resource-oriented
• Really want the data from Shibboleth and uResolver
• Why the demand for such low level data about individuals?
The library and learner analytics
• Learner analytics a growth field
• Driven by a mass of data from VLEs and MOOCs …. and
• Student satisfaction & retention
• Intervention(?) when, e.g.:
low(library borrowing) & low(e-resource access) &
high(rate of near-late or late submissions) & …
• The library can’t do all that, but the university could/can
• Library can provide data
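An intervention rule like the one sketched above could be expressed as a simple predicate over pre-aggregated counts (purely illustrative – the field names and thresholds here are invented, not anyone's real policy):

```python
def flag_for_intervention(student, max_loans=2, max_eresource_hits=5,
                          min_late_rate=0.5):
    """Flag a student when library borrowing and e-resource access are
    both low while the rate of near-late or late submissions is high.

    `student` is a dict of pre-aggregated counts per student; every
    threshold is an invented placeholder.
    """
    return (student["loans"] <= max_loans
            and student["eresource_hits"] <= max_eresource_hits
            and student["late_submission_rate"] >= min_late_rate)

at_risk = flag_for_intervention(
    {"loans": 1, "eresource_hits": 2, "late_submission_rate": 0.8})
```

In practice the library would supply only the borrowing and e-resource counts; the submission data would come from the VLE, which is why the university, not the library, is the natural place to run the rule.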
The library as data provider
• LAMP – Library Analytics & Metrics Project from JISC
• We will be exporting loan and anonymised student data for use by LAMP
• They are experimenting with dashboards and applications
• Prototype application later this year
• Overlap with our own project LDIV
• The Library API
• For use by analytics projects within the university
• Planning office, Student Services and others
The Library API
• Built using openresty, nginx, lua
• RESTful-like API interface
• e.g. Retrieve physical loans for a patron:
GET /ploans/<patron> (XML, or .json for JSON)
XML response (fragment):
<?xml version="1.0" encoding="UTF-8"?>
<call_no>AZKF.S75 (H)</call_no>
<loan_date>2014-07-10 15:44:00</loan_date>
<returned_date>2014-08-15 10:16:00</returned_date>
<call_no_2>B Floor Red Zone</call_no_2>
<due_date>2015-06-19 19:00:00</due_date>

JSON response (fragment):
{
  "rownum": 1,
  "key": "000473908000010-200208151016173",
  "patron": "b3ea5253dd4877c94fa9fac9",
  "loan_date": "2014-07-10 15:44:00",
  "due_date": "2015-06-19 19:00:00",
  "returned_date": "2014-08-15 10:16:00",
  "item_status": "01",
  "num_renewals": 0,
  "material": "BOOK",
  "bor_status": "03",
  "bor_type": "34",
  "call_no": "AZKF.S75 (H)",
  "call_no_2": "B Floor Red Zone",
  "collection": "MAIN",
  "rowid": 3212
}
How does it work?
• Nginx configuration maps a REST URL to a database query
location ~ /ploans/(?<patron>\w+) {
    ## collect and/or set default parameters
    rewrite ^ /ploans_paged/$patron:$start:$nrows.$fmt;
}

location ~ /ploans_paged/(?<patron>\w+):(?<start>\d+):(?<nrows>\d+)\.json {
    postgres_pass database;
    rds_json on;
    postgres_query HEAD GET "
        select * from ploans where patron = $patron
        and row >= $start and row < $start + $nrows";
}
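A consumer of this API could page through a patron's loans along these lines (a sketch only: no real host name is assumed, and the fetch function is injected so the paging logic runs without a live server; the real endpoint returns rds_json output, i.e. a JSON array of row objects):

```python
import json
import urllib.request

def fetch_page(base_url, patron, start, nrows):
    """One GET against the Library API's paged loans endpoint,
    returning the parsed JSON rows. base_url is supplied by the caller."""
    url = f"{base_url}/ploans_paged/{patron}:{start}:{nrows}.json"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def all_loans(fetch, patron, nrows=100):
    """Accumulate every loan row for a patron, nrows at a time.

    `fetch(patron, start, nrows)` returns one page as a list of row
    dicts; paging stops at the first short page.
    """
    loans, start = [], 0
    while True:
        page = fetch(patron, start, nrows)
        loans.extend(page)
        if len(page) < nrows:
            return loans
        start += nrows
```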
Proxy for making Alma Analytics
API requests
• e.g. an Analytics report which produces a patron count
• nginx configuration:
location /aa/patron_count {
    set $b "api-na.hosted.exlibri … lytics/reports";
    set $p "path=%2Fshared%2FLancas … tron_count";
    set $k "apikey=l7xx6c0b1f6188514e388cb361dea3795e73";
    proxy_pass https://$b?$p&$k;
}
• So users of our API can get data
directly from Alma Analytics and
we manage the interface they use
and shield them from any API
changes at Ex Libris.
Re-thinking approaches
• Requirements workshops
• Application development
• Data provider via API interfaces
• RDF/SPARQL capability
• LDIV – Library Data, Information and Visualisation
• Still experimenting
• Imported data from EZproxy logs, GeoIP databases, student
data, Primo logs, and a small amount of Alma data
• Really need Shibboleth and uResolver data
• Tableau as the dashboard to these data sets
Preliminary results
More at!/
• First UK Analytics SIG meeting Oct 14 following EPUG-UKI AGM
• Questions?
