Digital Preservation - Digital Humanities Austria

Österreichische Tage der Digitalen Geisteswissenschaften
save the data - workshop on digital repositories, Dec 2nd 2014
Johannes Spitzbart
Phonogrammarchiv, Austrian Academy of Sciences
Collecting, preserving and documenting primary audiovisual
research sources (unique) and making them available for research
Scholarly research (ethnomusicology, ethnology, linguistics, context
& discourse)
Technical research & development (standardization, e.g. IASA-TC,
preservation, replay & digitization of recordings, patent “Method
for Reconditioning of Data Carriers”) and training
Role in the digital research infrastructure:
Digitization, long-term preservation, data accessibility
Metadata capture, comprehensive documentation, database
access, searchability (online catalogue)
Manually managed system:
File Server – RAID 10 (= mirrored and striped) HDD array, 40 TB (usable capacity); LTO Tape Library
(two backup copies)
Eternal preservation through continuous data migration and, if needed, format migration
Checksums for file integrity checks
Easy file/folder management, in-house maintenance, flexible (independent from
proprietary management software; hosting of any file format), probably easier disaster
additional effort for manual file/folder management and linking to database (human
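The checksum-based integrity check mentioned above can be illustrated with a minimal sketch. The slides do not say which hash algorithm or tooling the archive uses; SHA-256 and the function names below (`sha256_of`, `build_manifest`, `verify_manifest`) are assumptions chosen for illustration only.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing or whose checksum changed."""
    failures = []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            failures.append(rel_path)
    return failures
```

A manifest built at ingest time could be re-verified after every migration to a new storage system or tape generation; any path returned by `verify_manifest` would then be restored from one of the backup copies.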
Network Switch
1-Gigabit Ethernet
Several workstations
Digitized (approx. 50%) tape/disc collection (audio)
Born digital output of supported external & own
projects (audio)
Digitized acquired collections (audio)
Gigabit Ethernet
file servers
Workspace (for temp data)
2 independent, scalable hard drive storage units in
storage area network environment (mirrored hard
drives, RAID 10)
Data backups managed by system administrator
Tape library with LTO (=tape
data storage) drive
MySQL DB, PHP frontend, custom developed (daily backup onto
storage system)
Elaborate structure due to archival documentation needs
(comprehensive metadata capture)
Taxonomies & controlled vocabularies (Hornbostel-Sachs, languages,
ethnic groups, etc.)
Different access levels (visitors read only, admin for taxonomies and
AV playback in browser (MP3, MPEG-2)
English version (work in progress)
Comprehensive documentation
(technical, content descriptive and
contextual) at item level (=recording)
which is a prerequisite for accessibility
and potential use
… slimmed down copy of in-house database on dedicated
“exposed” server
→ Reduced data set & short samples
→ Focus on usability and sophisticated search possibilities
→ Connected to Europeana through Dismarc (weekly updated)
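How a slimmed-down public copy might be derived from the in-house database can be sketched as follows. The real schema is not shown in the slides, so every field name, the `PUBLIC_FIELDS` whitelist and the 30-second sample length are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchiveRecord:
    """Hypothetical in-house record; the actual schema is far more elaborate."""
    signature: str
    title: str
    language: str
    duration_seconds: int
    rights_notes: str       # internal only
    internal_remarks: str   # internal only


# Assumed whitelist of fields that are safe to expose online.
PUBLIC_FIELDS = ("signature", "title", "language", "duration_seconds")
SAMPLE_SECONDS = 30  # assumed maximum length of an online sample


def to_public_record(record: ArchiveRecord) -> dict:
    """Keep only whitelisted fields and cap the playable sample length."""
    public = {name: getattr(record, name) for name in PUBLIC_FIELDS}
    public["sample_seconds"] = min(SAMPLE_SECONDS, record.duration_seconds)
    return public
```

Using a whitelist rather than a blacklist means that a field newly added to the in-house database stays off the exposed server until someone explicitly releases it.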
Open Access?
Legal constraints
(Intellectual property rights)
Ethical issues (sensitive content)
→ full-length online publication not possible to date
INPUT (external, main part):
Supported research projects get ...
◦ Methodological support/advice
◦ Technical support (recording equipment & training)
◦ Preservation of the outcome (data: field recordings, metadata)
◦ Exclusive usage rights for six years
… and provide on their part ...
◦ The original field recordings
◦ Their description (= predefined set of metadata) for proper documentation
Interested users ...
◦ Browse online catalogue, listen to samples, inquire via email
◦ Are provided with access copies via download (small handling fees or fixed
rates for commercial use, e.g. media, exhibitions)
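The six-year exclusive usage period granted to supported projects lends itself to a small date check before access copies are handed to other users. The slides do not say when the period starts; the sketch below assumes it runs from a hypothetical deposit date, and the function names are illustrative.

```python
from datetime import date

EXCLUSIVE_YEARS = 6  # period stated in the slides


def exclusive_until(deposit_date: date, years: int = EXCLUSIVE_YEARS) -> date:
    """Return the first day on which the exclusive period no longer applies."""
    try:
        return deposit_date.replace(year=deposit_date.year + years)
    except ValueError:
        # 29 February deposited, target year is not a leap year: use 28 February.
        return deposit_date.replace(year=deposit_date.year + years, day=28)


def is_exclusive(deposit_date: date, today: date) -> bool:
    """True while the depositing project still holds exclusive usage rights."""
    return today < exclusive_until(deposit_date)
```

Legal and ethical restrictions on a recording would still need to be checked separately; this only covers the time-limited exclusivity.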