
 

Virtual Exhibits


Theory, methods, and tools for
development of virtual exhibits on demand

 

Joan C. Nordbotten

Department of Information Science

University of Bergen

joan@ifi.uib.no,  http://ifi.uib.no/staff/joan

 

 

 

Summary       

Museums world-wide are deploying both virtual exhibits and multimedia collections for use by researchers, educators and the general public. With today’s technology, users searching for thematic information across multiple autonomous sites must perform a series of separate processes: locate a reference list of relevant sites, search each individual site for relevant information, extract relevant data, and construct a local collection for ‘off-line’ development of an integrated presentation. The problem can be summarized as a need for methods and tools to aid users in locating, accessing, and extracting relevant information from multimedia, multi-database systems developed and maintained by autonomous museums.

 

Two principal problems hinder support for locating and accessing multiple data sources. First, there is a lack of agreement on how semantically consistent metadata for describing museum collections should be created. Second, a user-friendly query language and processing system must be developed to support the formulation of search criteria, search in a multidatabase space, and integration and presentation of search results. The current project aims to develop methods and tools to address these problems by integrating and extending existing methods and tools developed separately for metadata, multimedia, and multidatabase management.

 

Project context and problem area

Museums world-wide are deploying both virtual exhibits and electronic collections for use by researchers, educators and the general public[i]. Most virtual exhibits consist of text and 2D images, though inclusion of video clips and audio material is increasing.  As high quality scanning equipment becomes affordable, museums are beginning to undertake the task of making 3D scans of their object collections. These activities lead, invariably, to a set of separate but related electronic data collections (databases) consisting of text, in standard and HTML/XML formats, 2D and 3D images, and video and audio clips. Given that museums house overlapping collections, the result is a large set of Internet accessible multimedia, multi-database systems.

 

Museum exhibits, and their virtual counterparts, are crafted ’by hand’ on a case-by-case basis and include only a small portion of the museum’s actual and electronic collections, respectively.  Development of an exhibit takes months and several man-years of effort. The resulting exhibit is static in the sense that users cannot extend or tailor the content for their own special needs, for example to form a specialized exhibit as an element in an educational context. Supporting user developed exhibits, or specialized user defined collections, requires user access to the underlying data collections from multiple museums.

 

Locating relevant Web exhibits is most frequently done through an Internet search engine using descriptive keywords. The user-supplied keywords are matched to site indexes created from local Web site names and descriptions. It is a common experience that the search result (when not empty) is both large and imprecise, i.e. contains many irrelevant site references. It is up to the user to use the resulting reference list to access sites and extract relevant information. This is a tedious process, giving at best a result limited to information included in Web exhibits, i.e. it does not give access to the underlying museum collections.

 

Individual museums are making their electronic collections of scanned objects and electronic catalog information available via their museum home site[ii]. These collections can typically be browsed in creator (sculptor, artist, author) or chronological sequence (from start to end), searched using a hierarchical set of given search terms, or by filling in a search template[iii]. Unfortunately, there is little or no standardization of collection descriptions (metadata) or search systems for museum collections. Also, there is no external search engine capable of searching multiple museum collections.

 

For example, a student or researcher may wish to locate information on the use of precious metals and gems in royal jewelry in Europe during the Middle Ages. Relevant information is located in national museums, as well as national archives, libraries, and collections maintained for the royal families. Using current Internet search engines, the user could retrieve a list of relevant Web sites that would not necessarily be precise or complete and would likely not include links to underlying museum collections. The user would then need to follow established links to locate relevant data, using the various local search systems.

 

The problem can be summarized as a need for methods and tools to aid users in locating, accessing, and extracting relevant information from multimedia, multi-database systems developed and maintained by autonomous museums.

 

 

Current Status

There are a number of theories, methods, tools, and IT systems that can be extended to provide a solution for accessing autonomous multimedia, multi-database systems.

 

Resource Location & Metadata development

Locating relevant data and information requires that metadata describing the semantic content of each collection be made available in a standardized format that supports semantic integration [Bearman and Trant ’98]. There are numerous metadata standardization activities in progress. Perhaps the best known is the development effort behind Dublin Core (DC), in which there are 13 working groups, including one with museum representatives[iv]. The Dublin Core effort began from requirements of the Digital Library community, and proposals for extension to other cultural application areas have been made, for example in [Bearman et al. ’99]. Other proposals developed for museum collection description include CIDOC [Doerr ’98] and the Warwick Framework [Lagoze ’96]. Proposals for describing multimedia semantic content can be found in [Lu ’99, Marcus ’96, Subrahmanian ’98, and Wu et al. ’00].

 

The DC proposal consists of 15 basic metadata elements for describing aspects of data items or resources. Certain descriptive values, for example for subject and type, are recommended to come from controlled vocabularies[v]. It is well known that the use of controlled vocabularies for developing collection metadata/indexes requires trained users, since the vocabulary seldom matches that of general public users [Hillman ’01]. This problem is also well known in the information retrieval community, where much research has gone into linguistically based methods for selecting descriptive document terms and establishing thesauruses to aid matching between user query terms and the collection index terms [Baeza-Yates ’99, Kowalski ’01]. The problem is also well known in the multidatabase research community, where structural analysis of database schemas has been the dominant approach for synonym resolution [Elmagarmid et al. ’99].
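
The vocabulary mismatch described above can be illustrated with a minimal sketch. The record uses a few Dublin Core element names (title, creator, subject, type); the record values, the thesaurus entries, and the function names `expand_terms` and `matches` are hypothetical and serve only as illustration.

```python
# A catalog record described with a few Dublin Core elements.
# All values are invented for illustration.
record = {
    "title": "Gold ring with garnet",
    "creator": "Unknown",
    "subject": ["finger rings", "goldwork"],  # controlled-vocabulary terms
    "type": "PhysicalObject",
}

# Hypothetical thesaurus: everyday user terms -> controlled index terms.
thesaurus = {
    "ring": {"finger rings"},
    "jewelry": {"finger rings", "necklaces", "brooches"},
    "gold": {"goldwork"},
}


def expand_terms(user_terms, thesaurus):
    """Return the user terms plus any controlled terms they map to."""
    expanded = set()
    for term in user_terms:
        key = term.lower()
        expanded.add(key)
        expanded |= thesaurus.get(key, set())
    return expanded


def matches(record, terms):
    """True if any term equals one of the record's subject terms."""
    return bool(set(record["subject"]) & set(terms))


query = ["jewelry"]
print(matches(record, query))                           # plain keyword match
print(matches(record, expand_terms(query, thesaurus)))  # match via thesaurus
```

The plain keyword match misses the record because the user's word is not an index term; the thesaurus lookup bridges the two vocabularies.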

 

Data access and retrieval

As noted above, Internet search engines retrieve lists of Web site URLs after matching the user’s keyword request with site indexes constructed from the metadata available to the search engine via crawler activity. No actual data is returned to the user, who must then continue the search using established links on the referenced Web sites.

 

Multimedia database management systems, including document retrieval systems, also use a keyword-based search, but return actual resources (multimedia or document objects). These systems can also search for resources similar to a given resource, perhaps one taken from the response to an initial keyword search [Baeza-Yates ’99, Kowalski ’01, Lu ’99, and Wu et al. ’00]. The problem with these approaches is that the search engine is specialized to one data type (text, image, video, audio, or spatial), thus requiring multiple queries, one to each data collection. Current object-relational DB management systems, for example IBM’s DB2, can search for multiple multimedia data types within one database by combining access methods. However, these systems do not address the autonomous multidatabase problem.

 

Multidatabase systems can search and retrieve data from multiple source databases, but only structured databases managed by relational or object-oriented DB management systems.

 

In the multimedia and multidatabase approaches, extensions to SQL3 are proposed and used, perhaps with a form interface. The major problem here is that SQL3 is not a user-friendly language: it requires the user to know the structure of the underlying database to be searched, an impractical requirement in an environment with a large number of multimedia data collections.
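
To illustrate why this schema knowledge is a burden, the following minimal sketch sets up two autonomous collections as in-memory SQLite databases. SQLite merely stands in for the database systems discussed here, and the table names, column names, and sample rows are all hypothetical. The same question (which objects are made of a given material?) needs a differently worded query for each source.

```python
import sqlite3

# Two autonomous collections, each with its own schema (all names hypothetical).
museum_a = sqlite3.connect(":memory:")
museum_a.execute("CREATE TABLE objects (obj_id INTEGER, name TEXT, material TEXT)")
museum_a.executemany(
    "INSERT INTO objects VALUES (?, ?, ?)",
    [(1, "Crown", "gold"), (2, "Drinking horn", "horn")],
)

museum_b = sqlite3.connect(":memory:")
museum_b.execute("CREATE TABLE artifact (id INTEGER, title TEXT, made_of TEXT)")
museum_b.executemany(
    "INSERT INTO artifact VALUES (?, ?, ?)",
    [(10, "Signet ring", "gold"), (11, "Brooch", "silver")],
)

# The user must know each source's table and column names to ask the same question.
per_source_queries = [
    (museum_a, "SELECT name FROM objects WHERE material = ?"),
    (museum_b, "SELECT title FROM artifact WHERE made_of = ?"),
]


def search_all(material):
    """Run the source-specific query on every collection and merge the results."""
    results = []
    for connection, sql in per_source_queries:
        results.extend(row[0] for row in connection.execute(sql, (material,)))
    return results


print(search_all("gold"))
```

Every additional collection adds another hand-written query to `per_source_queries`, which is the burden a schema-independent query language would remove.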

 

Presentation

Query results are generally given as a list of objects/resources. These are commonly unordered, since the user keyword search is usually too limited to indicate a preference for ranking. If the result is to be formed as an exhibit, the user must then do so.

 

 


The Virtual Exhibit project description

Context

Figure 1 shows the main components in a system for construction of virtual exhibits resulting from a database query. The lower section of the figure, labeled Bergen Museum, depicts an anticipated database set partially funded by an NFR grant to establish a database of scanned 3D objects. The system is to be managed by an object-relational database system, Informix/DB2, which supports SQL3 and contains image, video, audio, and document processing extenders. The museum catalog is currently being transcribed to an IT-based catalog using the museum’s standard format.

 

The upper part of Figure 1 shows the main components of the Virtual Exhibit project. A semantic model of the museum catalog is to be designed and used for constructing the metadata to be stored in a semantic schema for the system. The query language SemQL is to be constructed as an extension to SQL3 for use of the semantic schema. A virtual exhibit construction module will be developed to present the results of queries to the system.

 


 

 

 

 

 

 


Figure 1: Virtual Exhibit system components

 

 

Processing a SemQL query will utilize the semantic schema and a domain thesaurus to locate relevant museum objects. The presentation module will construct a set of Web pages, a topic exhibit, for presentation of the query response. The resulting exhibit can be further modified by refining and resubmitting the SemQL query.


Project goals

The primary project goal is to develop methods and tools for creating ’on-demand’ virtual exhibits from multiple multimedia systems. The new methods and tools are to be an integration and extension of appropriate methods developed separately for metadata, multimedia, and multidatabase management.

 

Sub goals and tasks:

  1. Develop a semantic model for the metadata required to describe the semantic content of museum data collections (databases).
  2. Develop a schema structure for storage of the metadata required by task 1.
  3. Establish a method for integrating the metadata model from task 1 with the basic metadata model of Dublin Core, thus supporting interoperability with other systems that use this technology.
  4. Extend the SQL3 query language and processor to support search by semantic content, as defined in the metadata schema developed in task 2, for electronic museum artifacts stored in multiple databases.
  5. Develop an exhibit construction module for presentation of query search results.
  6. Demonstrate the above techniques in a prototype system for an educational application.
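
Task 3 amounts to defining a crosswalk between a local metadata model and Dublin Core. A minimal sketch follows, assuming a hypothetical local catalog format: the local field names, the sample record, and the function `to_dublin_core` are invented, while the target names (title, creator, date, subject, identifier) are Dublin Core elements.

```python
# Hypothetical crosswalk: local catalog field -> Dublin Core element.
CROSSWALK = {
    "object_name": "title",
    "maker": "creator",
    "dating": "date",
    "keywords": "subject",
    "catalog_no": "identifier",
}


def to_dublin_core(local_record):
    """Translate a local record to DC elements; keep unmapped fields separately."""
    dc_record, unmapped = {}, {}
    for field, value in local_record.items():
        if field in CROSSWALK:
            dc_record[CROSSWALK[field]] = value
        else:
            unmapped[field] = value
    return dc_record, unmapped


local = {
    "object_name": "Silver brooch",
    "maker": "Unknown",
    "dating": "c. 1350",
    "catalog_no": "BM-0001",   # invented identifier
    "storage_shelf": "C-12",   # no Dublin Core counterpart in this crosswalk
}
dc, rest = to_dublin_core(local)
print(dc)
print(rest)
```

Returning the unmapped fields separately makes visible which parts of the local model would be lost in an exchange based on Dublin Core alone.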

 

 

 

 

Research methods

The fundamental research method used will be prototype system development for feasibility and resource requirement testing of the envisioned system. Usability testing will then be used to refine the system for practical use by anticipated museum collection users.

 

Three basic system components need to be developed.

1.         A metadata system tailored to museum collection users. Initially, the “media abstraction approach” proposed by Marcus and Subrahmanian (1996) and later extended by Subrahmanian (1998) will be used to develop a semantic model for the museum collection. This media abstraction approach outlines a structure for capturing semantic content in multimedia collections. It has not yet been developed into a working prototype, so that our result can be considered method development. Assuming this is successful, the result will be integrated with the semantic content descriptions as specified in the current Dublin Core standard for description of artifacts.

2.         Thereafter, a semantic query language, SemQL, based on SQL3+ will be developed. This language will combine functionality for multimedia and multidatabase access as outlined in Baeza-Yates (1999) and Elmagarmid (1999), respectively. In addition, SemQL will use the semantic schema structure developed under component 1 above. Currently, there is no known query language that combines multimedia search with multidatabase access.

3.         Finally, a generic exhibit generator will be developed for presentation of SemQL query results.
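
As a rough sketch of component 3, a generic exhibit generator could turn a list of query results into a simple Web page. The result fields (title, description, image_url) and the function `build_exhibit_page` are assumptions made for this example, since the output format of SemQL queries is not yet defined.

```python
from html import escape


def build_exhibit_page(topic, results):
    """Return an HTML page presenting each result as one exhibit entry."""
    entries = []
    for item in results:
        title = escape(item["title"])
        entries.append(
            "<section>"
            f"<h2>{title}</h2>"
            f'<img src="{escape(item["image_url"])}" alt="{title}">'
            f"<p>{escape(item['description'])}</p>"
            "</section>"
        )
    body = "\n".join(entries)
    return (
        "<!DOCTYPE html>\n"
        f"<html><head><title>{escape(topic)}</title></head>\n"
        f"<body><h1>{escape(topic)}</h1>\n{body}\n</body></html>"
    )


# Invented sample result, standing in for the response to a query.
sample = [
    {
        "title": "Crown & sceptre",
        "description": "Gold, set with garnets.",
        "image_url": "crown.jpg",
    }
]
print(build_exhibit_page("Royal jewelry", sample))
```

Because the page is built purely from the result list, refining and resubmitting the query would regenerate the exhibit without manual editing.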

 


Test and Evaluation

Three phases of evaluation are planned:

1.         Prototype testing will be used to demonstrate the feasibility of combining the above elements into a museum resource retrieval system.

2.         Precision & Recall measures, common in document processing systems (Kowalski ‘01), will be used for quality testing of SemQL. The test environment will consist of a carefully constructed set of data collections containing the scanned images and descriptive documents for a set of 3 thematically close types of museum objects.

3.         Usability testing will be done by eliciting thematic descriptions of presentation exhibits, defined by educators interested in the thematic content of the test databases. These will form the source material for the SemQL query and the presentation model testing.
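
The precision and recall measures named in phase 2 can be computed per test query as sketched below. Precision is the share of retrieved objects that are relevant, and recall is the share of relevant objects that were retrieved. The function name and the object identifiers are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Compute precision and recall for one query.

    precision = (retrieved and relevant) / retrieved
    recall    = (retrieved and relevant) / relevant
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical object identifiers for one test query.
retrieved = ["obj1", "obj2", "obj3", "obj4"]
relevant = ["obj2", "obj3", "obj5"]
print(precision_recall(retrieved, relevant))
```

Here two of the four retrieved objects are relevant (precision 0.5) and two of the three relevant objects were found (recall of about 0.67). Computing recall requires knowing the full set of relevant objects, which is why a carefully constructed test collection is needed.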

 

 

Project schedule

The following schedule assumes that 3 students will be working on this project in metadata and schema development.

 

 

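Task                         2002      2003      2004      2005
Metadata model development   X X X X   - - - -   - - - -   - - - -
Schema structure             - - X X   X X - -   - - - -   - - - -
Integration with DC          - - - -   X X X X   - - - -   - - - -
SQL3 extension               - - X X   X X X X   X X - -   - - - -
Exhibit prototype            - - - -   - - X X   X X - -   - - - -
Test with ed. application    - - - -   - - - -   X X X X   - - - -
Publishing                   - - - X   X - - X   X - X X   - - - -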

Project result presentations

It is expected that the project will produce 2 master's degree theses, one each in 2003 and 2004, and 1 doctoral degree thesis in 2004.

 

At least 6 articles, 2 per year, are planned for conferences for:

-         Virtual Museum developers,

-         researchers, developers, and standards committees for metadata for cultural heritage,

-         research communities in multimedia and multidatabase management.

At least 3 papers are planned for research journals addressing the above audiences.

 

The product system should be useful for Bergen Museum's educational development work and will hopefully be of interest to a wider museum community.

 

 

 

 

Resources

Project environment:

The project is being initiated as a collaboration between Bergen Museum and the Department of Information Science at the University of Bergen. A network of contacts is being established with both national and international museums and researchers interested in collaborating on development of interoperable resource collections.

 

Bergen Museum (re. BM section of Figure 1) is currently implementing an IT-based catalog of museum objects using a local catalog standard. Under an NFR project, the museum has acquired multimedia processing equipment with Informix/DB2 OR-DBMS with text and image extenders and SQL3+ with text and image search support, as well as resources for initiating a 3D scanning project of museum objects. The museum is in the process of designing a data collection for scanned 3D objects.

 

Jan Erik Vold, head of Bergen Museum’s Dokumentasjons- og IT-avdeling (Documentation and IT Department)

Muséplass 3

 

The Department of Information Science (informasjonsvitenskap) has long experience working on database projects and has graduate students with a strong background in multimedia and multidatabase management[vi]. The department also has computer equipment that can be used in this project.

 

 

 

Project assistants will be recruited from the Department of Information Science.

 

 

 

Reference persons:

  1. Professor Niels Windfeld Lund, Inst. for dokumentvitenskap, Universitetet i Tromsø
  2. Professor Vera Goebel, Inst. for informatikk, Universitetet i Oslo
  3. Professor Ingeborg Sølvberg, NTNU.
  4. Professor Martha Crosby, Dept. of Information and Computer Science, University of Hawaii at Manoa, Honolulu, HI

 


 

References:

Baeza-Yates, R. and Ribeiro-Neto, B. (1999). Modern Information Retrieval. Addison–Wesley.

Bearman, D. and Trant, J. (1998). Unifying Cultural Memory. Information Landscapes for a Learning Society, 1998. Also presented at the UK Office of Library Networking Conference, July 1998. Also at www.archimuse.com/papers/ukoln98paper/index.html

Bearman, D., Miller, E., Rust, G., Trant, J., and Weibel, S. (1999). A Common Model to Support Interoperable Metadata: Progress report on reconciling metadata requirements from the Dublin Core and INDECS/DOI Communities. D-Lib Magazine, Volume 5, Number 1. Also at http://www.dlib.org/dlib/january99/bearman/01bearman.html

Doerr, M. and Dionissiadou, I. (1998). Data Example of the CIDOC Reference Model. Epitaphios GE34604 Benaki Museum, Athens Greece. Also at http://www.geneva-city.ch:80/musinfo/cidoc/oomodel/epitaphios.htm

Elmagarmid, A., Rusinkiewicz, M., and Sheth, A. (1999). Management of Heterogeneous and Autonomous Database Systems. Morgan Kaufmann.

Hillman, D. (2001). Using Dublin Core. http://dublincore.org/documents/usageguide/

Kowalski, G. (2001). Information Retrieval Systems: Theory and Implementation, 2nd ed. Kluwer International.

Lagoze, C. (1996). The Warwick Framework - A Container Architecture for Diverse Sets of Metadata. D-Lib Magazine, July/August 1996. ISSN 1082-9873. Also at http://www.dlib.org/dlib/july96/lagoze/07lagoze.html

Lu, G. (1999). Multimedia Database Management Systems. Artech House, London.

Marcus, S. and Subrahmanian, V.S. (1996). Towards a Theory of Multimedia Database Systems. In Subrahmanian and Jajodia, eds. Multimedia Database Systems. Springer-Verlag, 1996. Pp. 1-35.

Subrahmanian, V.S. (1998). Principles of Multimedia Database Systems. Morgan Kaufmann.

Wu, J.K., Kankanhalli, M.S., Lim, J., and Hong, D. (2000). Perspectives on Content-Based Multimedia Systems. Kluwer Academic Publ.

 

 

 

Footnotes:



[i]  The Virtual Library of museum pages, VLMP, at http://www.icom.org/vlmp/ provides an extensive list of museum Web sites.

[ii]  The Museums and the Web conference, 2001, held a competition for the best museum Web sites in several categories. A short list of research sites can be found at http://www.archimuse.com/mw2001/best/research.html.

[iii]  A description of the search functions for the Ohio Outdoor Sculpture Inventory collection is maintained at http://www.sculpturecenter.org/oosi/oohints.htm.

[iv]  The home site for Dublin Core is at http://dublincore.org

[v]  The DC Metadata Element Set, V1.1 Description is at http://dublincore.org/documents/1999/07/02/dces .

[vi] A description of the annual graduate course in multidatabase management, held at UiB, is maintained at http://www.ifi.uib.no/staff/joan/mdbm/Mdbm-sem.htm.