CONTENTUS – Technologies for the media library of the future
Germany’s 30,000 libraries, museums and archives contain an incredible wealth of knowledge in the form of millions of books, images, tapes and films. Researchers involved in CONTENTUS are exploring how these cultural assets can be made available to as many people as possible and preserved for future generations.
One of the major challenges facing the knowledge society is to provide easy access to knowledge and cultural assets through multiple media channels. Within the framework of CONTENTUS, ideas and technologies are being developed for an infrastructure allowing cultural institutions and information providers to work toward that goal. With the help of these new technologies, large quantities of data can be automatically processed and semantically linked, whether in the form of texts, images or video and audio recordings.
The work of CONTENTUS is closely coordinated with the German government's "German Digital Library" initiative. A comprehensive description of content using metadata – which are comparable to an index or table of contents for finding digital media – plays an important role in this context. Another challenge is to link the various digital knowledge records on the Internet and to integrate user-generated content. To that end, researchers are developing intelligent algorithms that automate all of the necessary processes and allow users to add their own information to existing digital content. This results in interlinked, next-generation virtual media libraries.
Next-generation multimedia libraries
These multimedia libraries cross-link the collections of traditional libraries, media archives and broadcasting stations to create a new information structure, bringing together providers and users via the Internet. Users are being offered new opportunities to contribute their knowledge to existing multimedia resources. At the same time, the data are linked semantically – based on their substance – to form a new kind of knowledge network. This allows providers to optimize the structure and quality of their multimedia collections.
One example demonstrates the benefits of this next-generation multimedia library: A search for “Lucullus” yields results in a variety of media, including texts, images, films and audio recordings. Thanks to the use of semantic technologies, it also shows content that is logically associated with that topic. Thus the search might show scores or recordings of “The Condemnation of Lucullus” as well as other works by the composer Paul Dessau or the author Bertolt Brecht, for example “The Good Person of Szechwan.” Upon request, it will display information about the great interpreters of Brecht's works and the pieces they have sung, or provide links to music companies that sell Brechtian songs online. While it generally takes a long time for users of traditional archives to find specific works or pieces in a collection, users of the new multimedia library will be able to search rapidly and take advantage of logical results that include multiple media.
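The "Lucullus" search can be pictured as a walk through a small graph of linked works and people. The following Python sketch is only an illustration under stated assumptions: the triples restate the relationships named in the example, while the relation labels, the function names and the hop-based expansion are hypothetical and are not taken from CONTENTUS itself.

```python
from collections import defaultdict

# Hypothetical (subject, relation, object) triples; the relation labels are
# illustrative and do not come from the CONTENTUS project.
TRIPLES = [
    ("The Condemnation of Lucullus", "about", "Lucullus"),
    ("The Condemnation of Lucullus", "composer", "Paul Dessau"),
    ("The Condemnation of Lucullus", "author", "Bertolt Brecht"),
    ("The Good Person of Szechwan", "author", "Bertolt Brecht"),
]


def build_index(triples):
    """Map every node to its neighbours, ignoring edge direction."""
    neighbours = defaultdict(set)
    for subject, _relation, obj in triples:
        neighbours[subject].add(obj)
        neighbours[obj].add(subject)
    return neighbours


def related(term, triples, max_hops=3):
    """Return every node reachable from term within max_hops, with its hop distance."""
    neighbours = build_index(triples)
    distance = {term: 0}
    frontier = [term]
    for hop in range(1, max_hops + 1):
        next_frontier = []
        for node in frontier:
            for other in sorted(neighbours[node]):
                if other not in distance:
                    distance[other] = hop
                    next_frontier.append(other)
        frontier = next_frontier
    del distance[term]  # the search term itself is not a result
    return distance


if __name__ == "__main__":
    for name, hops in sorted(related("Lucullus", TRIPLES).items(), key=lambda kv: kv[1]):
        print(hops, name)
```

In this sketch the opera is one link away from the search term, its composer and author are two links away, and Brecht's other play is three links away, which is the kind of logical association the example describes. A real system would also weigh the type of each relation rather than counting hops alone.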
Advantages for users and providers
With CONTENTUS, users will find it easier to search and navigate through digitized cultural assets. It will be a simple matter for them to share results and insights with other users and providers, as well as to add their own information by inserting links. With the help of technologies for the qualitative processing and semantic linking of multimedia resources, smaller libraries and archives can also be part of the planned information structure and make their multimedia collections available to a wider audience.
Research and development in the service of knowledge
Analog data media are digitally prepared for use in multimedia libraries and archives in a highly automated six-step process. First, high-throughput procedures help to digitize large quantities of analog content quickly and efficiently. Second, the quality of the imported material is measured and optimized with the help of technologies developed specifically for that purpose. Third, text, keyword, voice or speech recognition methods are used to produce metadata describing the relevant content; without this step, it would be very difficult to find digitized books, films or sound recordings in large data collections. The fourth step uses these metadata to link content items as part of a semantic networking process. As illustrated by the “Lucullus” search example, users are given suggestions regarding additional information that is related to the search term, so that they can explore the topic further if they choose. In addition to carrying out a full-text search, users can ask questions like “Which of Thomas Mann’s children were also writers, and what did they write?” In the fifth step, users and experts can add to existing content. The final step completes the process, resulting in cross-linked, multimedia access to information, a milestone in the development of a Web-based knowledge infrastructure.
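The six steps can be pictured as a chain of small processing stages. The Python sketch below is a deliberately simplified illustration, not the CONTENTUS implementation: every class and function name is hypothetical, digitization (step one) is assumed to have already produced plain text, and the quality measure, keyword extraction and linking rule are toy stand-ins for the specialized technologies the project describes.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Item:
    title: str
    text: str  # step 1 is assumed done: this is the already digitized text
    quality: float = 0.0
    keywords: set = field(default_factory=set)
    links: set = field(default_factory=set)
    notes: list = field(default_factory=list)


def measure_quality(item):
    """Step 2 stand-in: share of characters that are letters or whitespace."""
    clean = sum(ch.isalpha() or ch.isspace() for ch in item.text)
    item.quality = clean / len(item.text) if item.text else 0.0


def extract_keywords(item, min_length=5):
    """Step 3 stand-in: treat longer ASCII words as descriptive metadata."""
    words = re.findall(r"[a-z]+", item.text.lower())
    item.keywords = {w for w in words if len(w) >= min_length}


def link_items(items):
    """Step 4 stand-in: link two items when they share at least one keyword."""
    for a in items:
        for b in items:
            if a is not b and a.keywords & b.keywords:
                a.links.add(b.title)


def add_note(item, note):
    """Step 5: a user or expert contributes additional information."""
    item.notes.append(note)


def publish(items):
    """Step 6: expose a cross-linked view of the collection, keyed by title."""
    return {i.title: {"links": sorted(i.links), "notes": list(i.notes)} for i in items}


def run_pipeline(items):
    """Run steps 2 to 4 on every item, then publish the result."""
    for item in items:
        measure_quality(item)
        extract_keywords(item)
    link_items(items)
    return publish(items)
```

A collection would be passed through `run_pipeline` once, after which `add_note` and a further call to `publish` model the ongoing contributions of users and experts.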