Sunday 11 March 2012

Origins of the Finnish database

Hello again!
The topic of our lecture last week was databases and how to build them. A certain gentleman by the name of Henrik Grönroos was pointed out as the key person in the construction of (local) databases. This man, who worked as a librarian at the National Library, had always been very interested in books and started early on categorising and listing estate inventories and auction catalogues, the earliest of which date back to 17th-century Finland. He was interested in discovering the history of the book - who bought books, and what kind of books were they? This ambition eventually led to a very thorough and vast compilation called "Boken i Finland" (1996). Grönroos also published several essays on the special characteristics of books, book collectors and readers.

Grönroos' life's work inspired the establishment of the Henrik database (Henrik-tietokanta). This database makes it possible to trace the connections between the owners of different books and the books' significance for the Finnish cultural landscape, especially during the period of Swedish rule. In this sense we can say that Grönroos was a pioneer in the founding of Finnish databases, and that it is him and his ambitious work we have to thank for the vast databases we use nowadays. The lecture also brought up the ER model (entity-relationship model, more about it on Wikipedia 8D!), which basically means an abstract and conceptual representation of data.
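For the curious, here is a minimal sketch of what an entity-relationship view of book ownership could look like in Python. Everything in it is an assumption for illustration: the entity names (`Owner`, `Book`), the `Ownership` relationship, its fields and the placeholder records are made up and do not reflect the actual structure or contents of the Henrik database.

```python
from dataclasses import dataclass

# Hypothetical entities and relationship; NOT the real Henrik database schema.


@dataclass(frozen=True)
class Owner:
    owner_id: int
    name: str


@dataclass(frozen=True)
class Book:
    book_id: int
    title: str


@dataclass(frozen=True)
class Ownership:
    """Relationship: an owner held a book, as recorded in some source."""
    owner_id: int
    book_id: int
    source: str  # e.g. "estate inventory" or "auction catalogue"
    year: int


def owners_of(book_id, ownerships, owners):
    """Return the names of everyone recorded as owning the given book."""
    by_id = {o.owner_id: o for o in owners}
    return [by_id[r.owner_id].name for r in ownerships if r.book_id == book_id]


# Made-up placeholder records, only there to show how the pieces connect.
owners = [Owner(1, "Owner A"), Owner(2, "Owner B")]
books = [Book(10, "Book X"), Book(11, "Book Y")]
ownerships = [
    Ownership(1, 10, "estate inventory", 1750),
    Ownership(2, 10, "auction catalogue", 1780),
    Ownership(2, 11, "estate inventory", 1790),
]

print(owners_of(10, ownerships, owners))
```

The point is only that owners and books are separate entities, and the relationship between them carries its own information (where and when the ownership was recorded).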

Well, since I suck at coding and I'm generally not interested in building tiresome tables in Excel, I'll leave it to someone more able than me 8D! Now I'm off to write my proseminar paper and study for a retake exam in Art History, both due in a few days! See ya!

Tuesday 6 March 2012

A bit more detailed presentation of long-term digital... preservation

Hello again! As a little extra assignment to make up for my absence from the Digital culture course two weeks ago (I did mention this in my earlier blog post, but oh, well...), I'm supposed to offer you a slightly more detailed presentation of long-term digital preservation.

As required for the assignment, I took a little look at The National Digital Library and its PAS project (in English: long-term preservation). I was greeted by a rather epic welcoming text in English: "National Digital Library is a project which aims to ensure that electronic materials of Finnish culture and science are managed with high standard, are easily accessed and will be securely preserved for a long period of time. Participating archives, libraries and museums work together in saving our national heritage in a digital format and in making it available for all." This means trustworthy storage of digital information for several decades and even centuries. Equipment, software and file formats age and expire, but in spite of this the information should remain comprehensible. The customers of the PAS service are the organisations to which the preservation service is offered. These are primarily organisations within the administrative sector of The Ministry of Education and Culture, which are responsible for preserving the intangible and tangible cultural heritage.

During the last few years, libraries, archives and museums have invested considerably in digitising their material. They also acquire a lot of originally digital material. The current PAS project (2010-2013) is a continuation of an earlier, identical project from 2008-2010. The basic idea of the project is to preserve material and to improve the accessibility and usability of the information reserves shared between libraries, archives and museums.

As of 2011, there were around 687 million objects in digital preservation at the National Digital Library, the majority of them documents, pictures and material from web archives. This is, without doubt, a very prestigious and ambitious project. The project also includes clear directions and guidelines on the different kinds of file formats and their eligibility for storage and qualification for transfer. According to the NDL, there is a slight difference between these two categories: files eligible for storage are in good shape and usable for a long time ahead, whereas material similar to the files qualified for transfer has already been stored in the National Digital Library in larger quantities before. It is strictly forbidden to alter or change the format of the files in the slightest to facilitate the actual transfer, as this can damage the file. Each and every small alteration is a risk to the preservation of the file, and only the newest methods should be used in the actual digitising process. According to the newest survey, the most popular file formats among the digitised material are jpg/jpeg, pdf, tiff, doc/docx and mp3 - not very surprising, I'd say.

The PAS project also wants to offer full service and advertises top-priority attention to areas such as trustworthy storage, consultation and support, and storage planning. The most important kinds of material have been carefully mapped into different categories. Among the files listed on their page as most in need of storage are, e.g., files that are important to preserve because of their authenticity, files that contain dynamic and/or interactive elements, files that cannot be used in their current state, and files that are deemed impossible to convert to another format.

Altogether, 500 million or more objects of this kind have been reported by the different organisations participating in the project. According to the survey, migration (transferring data to newer system environments) as a method of long-term preservation in the PAS project serves the purpose of storing all kinds of file formats, and thus also serves the needs of all the different organisations participating in the project. Migration will therefore be the primary preservation method in the planning of the NDL's PAS solution.

I hope this post offers a reasonably thorough picture of the services the NDL offers through its PAS project. One can only hope it will be more publicly accessible in the future, but as a researcher one surely gains access quite easily, or that's what I expect. Who knows, we'll see in time...

About born digital objects & digital preservation

Hello again! Because of the nasty fever I suffered from earlier, I could not grace the Digital culture course with my presence two weeks ago, so here's an update I have to write about the topic of the lecture I missed out on. The topic of the lecture on February 20th was born digital objects (sounds quite curious, doesn't it?) and digital preservation in general.

Born digital objects are, according to Wikipedia, "materials that originate in a digital form. It is most often used in relation to digital libraries and the issues that go along with said organizations, such as digital preservation and intellectual property. However, as technologies have advanced and spread, the concept of being born-digital has also been discussed in relation to personal consumer-based sectors, with the rise of e-books and evolving digital music. Other terms that might be encountered as synonymous include “natively digital,” “digital-first,” and “digital-exclusive.”" A lot of text, huh?

Well, basically it seems that born digital objects, or born-digital as they are also called, consist of a growing group of materials, ranging from websites and forums to communities and wikis. In short, anything that was or has been created in a digital environment can be considered born digital material. Furthermore, according to Wikipedia: "There exists some inconsistency in defining born-digital materials. Some believe such materials must exist in digital form exclusively; in other words, if it can be transferred into a physical, analog form, it is not truly born-digital. However, others maintain that while these materials will often not have a subsequent physical counterpart, having one does not bar them from being classified as 'born-digital'."

Although most of the digital material online, like the examples listed above, was created digitally from the start, not all online material shares that origin. Material such as online newspapers, photographs, Internet-distributed TV shows and webcomics all have their origins in a time prior to the use of computers, but due to their popularity the material has been converted to a digital format, resulting in separate born-digital creations. This allows each of these materials to reach a bigger audience and attract more interest. Their accessibility and easy-to-use format have made them a popular part of everyday life. E-books are a good example of well-integrated digitisation, and they have also awakened people's interest in born digital materials and digital preservation. This is especially important to remember as a researcher, since in the future we can all count on finding valuable research sources online rather than in physical form.

Digital preservation (again, thank you Wikipedia, you are quite impeccable at times!) is the set of processes, activities and management of digital information over time to ensure its long-term accessibility. The goal of digital preservation is to preserve materials resulting from digital reformatting, and particularly information that is born-digital with no analog counterpart. Because of the relatively short lifecycle of digital information, preservation is an ongoing process. As we can see, digital preservation is essentially what keeps born digital material accessible to us. Preserving the Internet and its vast expanse of material is a real challenge for scientists and researchers alike, especially when the desired outcome is long-term storage for decades ahead. Long-term is defined as "long enough to be concerned with the impacts of changing technologies, including support for new media and data formats, or with a changing user community. Long Term may extend indefinitely". It is important that the storage is error-free and allows stored data to be retrieved without the risk of corrupting the digital storage and the files it contains. Most of this digital material is encoded and needs to be interpreted into usable presentations such as pictures, text, charts, images or sounds. It is also important to plan the actual process of preservation, so that the material is damaged as little as possible by the digitisation itself and by the transfer to storage. After all, much of future research will rely on such digitally preserved material and its unaltered condition.
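One common way to check that a stored file really is still in "unaltered condition" is to record a checksum when the file goes into storage and compare it again later. The sketch below is only an illustration of that general idea, not a description of how the NDL or the PAS service does it; the function names `sha256_of` and `is_unaltered` and the sample file are my own assumptions.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=65536):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unaltered(path, recorded_checksum):
    """Return True if the file still matches the checksum recorded earlier."""
    return sha256_of(path) == recorded_checksum


# Small demonstration with a throwaway file (hypothetical content).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")
    with open(path, "wb") as f:
        f.write(b"preserved content")

    recorded = sha256_of(path)          # checksum noted at storage time
    intact_before = is_unaltered(path, recorded)

    with open(path, "ab") as f:         # simulate a tiny unwanted change
        f.write(b"!")
    intact_after = is_unaltered(path, recorded)

print(intact_before, intact_after)
```

Even a one-byte change makes the checksum differ, which is why this kind of comparison is handy for spotting damage after a transfer.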