Rapid technology evolution confronts digital archives (DAs) with great challenges in preserving the contents of digital objects. A standard approach to preservation is to migrate digital objects to new technologies periodically. In such migrations, object representations may change, whereas their contents must not.

In large-scale scenarios, automated quality assurance is a major concern: did a given migration process preserve all relevant object properties? However, automation is often hindered because preservation requirements are expressed informally. In these cases, quality assurance is often hand-crafted, which is time-consuming, expensive, and error-prone.

We introduce a framework designed to support the automation of migration and quality assurance processes in digital archiving. In particular, we express semantic preservation requirements formally; automated routines test migration processes for adherence to them. Theoretical well-foundedness and smooth integration into the internal workflows of DAs have been important design goals.

A customizable, state-based archival context enables workflow integration. We presuppose only minimal system knowledge: objects must be uniquely identifiable. Users can integrate domain-specific object types and functionality. Well-defined state changes capture the effects of migration processes, and our system ensures object immutability based on a formal notion of object contents.

Semantic requirements of the form “When transforming objects o_1,...,o_n, preserve property P” constrain migration processes. This preservation language is based on full first-order logic. Object properties are captured by so-called concepts. Preservation requirements refer to concepts by name, so that implementation details are hidden. This keeps specifications readable and less sensitive to implementation changes.

Our generic notion of preservation relates (1) source and target objects, (2) object histories, and (3) concepts.
A concept is preserved if the target objects are new versions of the source objects and the concept holds equally for the source and target objects. We permit different concept implementations for the source and target objects; that is, we support content migration.

When migrations are executed, our system traces changes to digital objects and reports constraint violations automatically. Object traces derive from (iterated) object transformations. Concept interfaces allow for partial, and thus efficient, tracing. Reports relate concrete source and target objects to the violated requirements, which facilitates adequate and customized reactions.

Owing to their coherent formal underpinning, our methods achieve a high degree of “trustworthiness”. A case study in the field of website migration shows that our methods are applicable and beneficial, and that they scale to a relevant problem size. In the case study, we also demonstrate the formal model construction facilities of our framework. We have developed a general approach to integrating graph-based queries and apply these methods to automated URL construction. Runtime measurements show acceptable performance for our prototype implementation.