
Advanced digital preservation PDF

516 Pages·2011·19.621 MB·English


Advanced Digital Preservation

David Giaretta
STFC and Alliance for Permanent Access
Yetminster, Dorset
United Kingdom
[email protected]

Further Project Information and Open Source Software under:
http://www.casparpreserves.eu
http://developers.casparpreserves.eu
http://www.alliancepermanentaccess.org

ISBN 978-3-642-16808-6
e-ISBN 978-3-642-16809-3
DOI 10.1007/978-3-642-16809-3
Springer Heidelberg Dordrecht London New York
Library of Congress Control Number: 2011921005
ACM Codes H.3, K.4, K.6
© Springer-Verlag Berlin Heidelberg 2011

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law.

The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Cover design: deblik, Berlin
Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)

“How to preserve all kinds of digital objects”
and
“OAIS: what it means and how to use it”
and
“The CASPAR book”
and
“Everything you wanted to know about digital preservation but were afraid to ask”

Preface

There has been a growing recognition of the need to address the fragility of the digital information that is deluging all aspects of our lives, whether in business, scientific, administrative, imaginative or cultural activities.

Society’s growing dependence on the digital for its smooth operation as it becomes an information society provides the real urgency for addressing this issue. This case has been made very well in the large number of books and articles already published on the topic of digital preservation and therefore this case will not be expanded upon in this book.

Since there are many books about digital preservation why is there a need for yet one more?

At the time of writing the books and articles on digital preservation, for the most part, focus on consideration of documents, images and web pages; things which are normally just displayed by software for a human to view or listen to (or perhaps smell, taste or touch). We will refer to these as things which are rendered.

Yet there are clearly many more types of digital objects on which our lives depend and which may need to be preserved, such as databases, scientific data and software itself. These are things which are not simply rendered – they are processed and used in many different ways.

It should become clear to the reader that the tools and techniques used for preserving rendered objects are inadequate for all these other types of digital objects and we need to set our sights higher and wider. This book provides the concepts, techniques and tools which are needed.

Of course it is easy to make claims about digital preservation techniques – and there are many such claims! Therefore it is important that evidence is provided to support any such claims, which we do for our claims by using accelerated lifetime scenarios about the important changes which will challenge us. We use as examples a variety of digital objects from many sources and show tools and techniques by which they may be preserved.

1 Who Should Read This Book and Why?

This book is aimed at those who have problems in preserving digitally encoded information that they need to solve, especially where it goes beyond simply preserving rendered objects. The PARSE.Insight survey [1] suggests that while all researchers have documents and images, about half have non-rendered digital holdings such as raw data, scientific/statistical data, databases and software, therefore this book should be of wide interest.

It should also be essential reading for those who wish to audit their own archives, perhaps in advance of an independent audit, about how well they are doing in the preservation of the digitally encoded information which has been entrusted to them.

Researchers in digital preservation theory and developers of tools and techniques should also find valuable information here. Developers in the area of e-Science (also known as Cyberinfrastructure) may also gain a number of useful insights.

Some of the material in this book may be found to be too technical by some readers. For those readers we suggest that they skim over such material in order to at least be aware of the issues. This will allow them to advise more technical implementers who will certainly need such details.

To further help readers, the book is supported by other resources, including many hours of videos and presentations from the CASPAR project [2], which provides
❍ an elevator pitch for digital preservation,
❍ examples of digital preservation from several repositories,
❍ detailed lectures by the contributors to this book on many of the issues described here and
❍ lectures about, and video captures of, many of the software components.
The open source software and further documentation is also available.

2 Structure of This Book

Part I of the book provides the concepts and theoretical basis that are needed, introducing, as examples along the way, digital objects from many sources. Since much of this book is based on the work of the CASPAR project, the examples will be derived from many disciplines including science, cultural heritage and contemporary performing arts.

The approach we take throughout is one of asking the questions which we believe a reasonably intelligent person may ask, and then providing answers to them. Sometimes, when there are some subtle but important points, we guide the reader towards the appropriate questions. As noted above, this will lead us into a number of technical issues which will not be to the taste of all readers but all topics are necessary for at least some readers.

Part II of the book shows practical examples of preserving a variety of specific objects and gives details of a range of tools and techniques. One obvious question, which an intelligent (but sceptical) reader may ask is “these tools and techniques may do something but why should I believe that they help to preserve things?” After all, the only real way would be to live a long time and check the supposedly preserved objects in the future. However that is not very practical, and perhaps more importantly it does not help one to decide now whether to follow the ways proposed in this book. Choosing the wrong way could have a disastrous effect on what one intends to leave for future generations!

We provide what we believe is strong evidence that what is proposed does actually work for a wide variety of digital objects from many disciplines, through a number of accelerated lifetime scenarios, validated by members of the appropriate communities.

Part III provides answers to the questions about how to ensure that resources devoted to preserve digital objects are not wasted, showing a number of ways in which effort can be shared. In addition this part provides guidance on how to evaluate whether a particular repository (perhaps your own) is doing a good job, and where it might be improved. This part also describes the thinking behind the work carried out to produce the ISO standards on which the international audit and certification process can be based.

Throughout the book we indicate points where experience shows there is a danger of misunderstanding by the symbol [symbol image].

3 Preservation and Curation

This book is about digital preservation but there is another term which is being used, namely digital curation. The UK Digital Curation Centre [3] used to define this in the following way: “Digital curation is maintaining and adding value to a trusted body of digital information for current and future use; specifically, we mean the active management and appraisal of data over the life-cycle of scholarly and scientific materials”. This definition has been changed more recently to “Digital curation involves maintaining, preserving and adding value to digital research data throughout its lifecycle”. Sometimes the phrase “digital curation and preservation” is also used.

We prefer the term preservation in this book since we do not wish to restrict our consideration to “scholarly and scientific materials” nor “research data”, because we wish to ensure we can apply our techniques to all kinds of digital objects including, for example, commercial and legal material. Nor do we wish to restrict ourselves to only a “trusted body of digital information” – since one might wish to preserve falsified data for example as evidence for legal proceedings. Moreover as we will see, our definition of preservation requires that if we are to preserve digitally encoded information we must ensure it remains understandable and usable. In other words preservation is the sine qua non of curation. For example it is possible to manage and publish digitally encoded information without regard to future use; on the other hand if one wishes to ensure future as well as current use, one must understand the requirements for preservation.

4 OAIS Definitions

OAIS [4] plays a central role in this book. Many definitions, and some descriptive text, are taken from the updated OAIS; these are shown as bold italics.

5 Acknowledgements

This book would not have been written without the work carried out by the many members of the CASPAR [2], DCC [3] and PARSE.Insight [1] projects, as well as the members of CCSDS [5] and others who have worked on developing OAIS [3] and the standards for certification of digital repositories [6], all of whom must be thanked for their efforts.

A fuller list of contributors may be found in “Contributors” at the end of the book.

Finally the editor and main author of this book would like to thank his family, in particular his wife Krystina and daughter Zoe, for their support and help in preparing this book for publication.

Contents

1 Introduction
  1.1 What’s So Special About Digital Things?
  1.2 Terminology
  1.3 Summary
2 The Really Foolproof Solution for Digital Preservation

Part I Theory – The Concepts and Techniques Which Are Essential for Preserving Digitally Encoded Information

3 Introduction to OAIS Concepts and Terminology
  3.1 Preserve What, for How Long and for Whom?
  3.2 What “Metadata”, How Much “Metadata”?
  3.3 Recursion – A Pervasive Concept
  3.4 Disincentives Against Digital Preservation
  3.5 Summary
4 Types of Digital Objects
  4.1 Simple vs. Composite
  4.2 Rendered vs. Non-rendered
  4.3 Static vs. Dynamic
  4.4 Active vs. Passive
  4.5 Multiple-Classifications
  4.6 Summary
5 Threats to Digital Preservation and Possible Solutions
  5.1 What Can Be Relied on in the Long-Term?
  5.2 What Others Think About Major Threats to Digital Preservation
  5.3 Summary
6 OAIS in More Depth
  6.1 OAIS Conformance
  6.2 OAIS Mandatory Responsibilities
  6.3 OAIS Information Model
  6.4 OAIS Functional Model
