
Moving Image Collections:
A Library of Congress-AMIA Collaboration
Table of Contents:
Overview
Organization Directory Database
Cataloging Utility
Ingest/Mapping Facility
Union Catalog Data Registry
Element Tables
Indexing Tables
Open Archives Initiative
Search & Retrieval
Dynamic web pages
Export Utility
Education, Outreach & Research
Overview:
A critical area of interest for the Association of Moving Image Archivists (AMIA) and the Library of Congress is the identification and preservation of moving image organizations worldwide. AMIA determined that a significant new direction for the organization could be the development of a collaborative catalog of moving image materials: the Moving Image Collections portal (MIC). The goal behind the AMIA MIC is to provide a window to the world's moving image collections for researchers, exhibitors, and the general public, while also allowing preservationists to collaborate in describing and maintaining these unique cultural resources and thus avoid costly duplication of effort. Moving images are unique in their use of multiple information streams (audio, visual and textual) to provide a compelling and immersive educational experience. Yet moving images have remained isolated from the mainstream as an information resource; they are rarely cited in research papers, for example, or consulted as primary reference sources. Another important objective of MIC is therefore to bring a flexible but standardized metadata architecture to these diverse resources, integrating moving images into the information mainstream, on the understanding that society values most highly what it understands and uses.
The MIC has been designed with several innovative components:
1. A directory of moving image organizations worldwide that collects information in areas such as size and formats of collection, organization roles and audiences served, collection preservation status and issues, and collection genres. In addition, the directory will support free-text information fields and graphics to generate a dynamic web page for each organization. The tight integration of the directory with the union catalog to build a dynamic information space is one of the key innovations of the AMIA MIC design.
2. Cataloging support to enable any participating organization to create records in at least two standardized descriptive cataloging formats (MARC21 and Dublin Core) for ingest into the AMIA MIC. This cataloging facility will serve as a significant outreach tool to the many smaller organizations that lack a web-based catalog. In Phase I, database implementations in low-cost or open-source database management systems, such as MySQL, Microsoft Access and FileMaker Pro, will be provided to create a standards-based record for ingest and export.
3. A union catalog that incorporates both open-source Z39.50 capabilities and support for the Open Archives Initiative (OAI). The catalog is built on a metadata record structure that provides several innovative features:
a. mapping to core data elements in a registry both to aid in record ingest and to provide consistent, interpretable search results,
b. support for the organization's own data element labels and data element display order,
c. extensible format-independent metadata design that accommodates searching, export and display in MARC21, Dublin Core, MPEG-7, the organization's own format and additional metadata record formats adopted by the MIC over time, such as MODS.
4. A flexible portal design that integrates directory information with the union catalog to provide dynamic portal generation based on user-selected criteria (e.g. format, geographic location, collection genre types, audience served, organization roles) as well as more durable portals developed and maintained by AMIA or by participating organizations (e.g. digital video portals, feature film organizations, subject-specific portals, etc.).
5. A web presence for every participating organization. A significant number of smaller AMIA organizations lack a web-based catalog or any web presence, beyond a home page. The directory and the metadata architecture of the union catalog are designed to generate dynamic home pages as well as dynamic search and display screens to provide an immediate web presence and web-based catalog for any participating organization.
6. An outreach and education space for each portal that provides features such as:
a. step-by-step instructions for installing and implementing supported databases and cataloging in supported formats
b. a clearinghouse with links to information on cataloging and preserving moving image materials
c. links to training and conference opportunities, scholarships and grants
d. reference and information sharing via email discussion lists and chat relay
e. a mentoring program for providing archival expertise to small and nontraditional organizations, such as corporate organizations with small amounts of moving images in their collections
f. an online "match-making" service to help organizations, corporations and members of the general public who wish to donate moving image materials identify an appropriate library or organization.
Alpha implementer organizations will provide metadata records for all moving image materials or a focused subset of their moving image materials to the MIC. They will submit a directory record for their organization and will actively participate in all functionalities of the MIC architecture (dynamic web pages, search portals, etc.) to test the MIC architecture and to contribute to its design, evaluation and revision. Alpha implementers will represent a range of moving image organizations with materials in film, video and digital formats. Small and large collections representing different genres and target audiences will be selected, with a heavy initial focus on the sciences, which are underrepresented on the World Wide Web and which will be featured in the "Science Goes to the Movies" portal. Alpha implementers will be actively involved in the revision, testing and evaluation of the MIC core registry elements, the metadata display and export formats and all initial functionalities of the MIC.
Components of the MIC:
1. Organization Directory Database
The organization directory database will be completed by all participating organizations. Initial participation in the MIC will be through completion of the organization directory database input form, which will result in a searchable directory of moving image organizations. The input form will include, at a minimum, the following information:
§ OrgID. The OrgID is a code, which should have a standardized format. The Library of Congress organization code standard is under active consideration for this data element. The OrgID will be linked within the data registry database to all elements with the data element labels in use at that organization, as well as to the information ("values") within the elements, as parsed into separate data element tables. This will identify moving image assets as belonging to an organization, will allow searches to be limited to an organization and will play a key role in the creation of the organization's dynamic web presence, as listed below. It will also enable any participating organization to participate in other union catalog initiatives by exposing their data for mining by a union catalog implementation, based on OrgID.
§ Directory information. Contact and location information, including address, phone number, URL, email address, contact person(s), etc.
§ Organization type. A controlled list of approximately 10 possible types (e.g. corporate, academic, public, K12 library, production house, stock footage house, etc.). All types that apply would be selected since many organizations fall within multiple types, e.g. an academic organization that also licenses footage.
§ Organization role. This should be a controlled list of options in which all that apply are selected, including research, education, footage licensing, corporate organization, etc.
§ Hours and days of service.
§ Audience served. This can also be a controlled option list, including organization or corporation members only, researchers, etc. that also allows a free-text description, such as "non-profit organizations providing medical assistance to developing countries."
§ Conditions for use. This could be a list, or a free-text field, in which the organization describes conditions for use of the collection. (e.g. onsite use by University of X faculty and students only).
§ Collection size.
§ Collection and genre strengths. This should be a controlled list that can include medical, theological, feature film, broadcast news, etc. and can be as detailed (e.g. Asian avant garde) or broad (silent films) as desired. Each organization would check all that apply.
§ Formats collected. Physical format, e.g., film, video, digital.
§ OAI participation flag. Yes/no flag will indicate whether the MIC should make records from the organization available for harvesting for OAI initiatives.
§ Preservation activities. A free-text description of preservation activities of the organization. Intended to facilitate information sharing among organizations.
§ Preservation contact information. Contact information including title of contact, email, phone number, etc.
§ Cataloging activities. A combination controlled list and free-text description of the metadata standards and technologies used. Intended to facilitate information sharing among organizations.
§ Cataloging contact information. Contact information including title of contact, email, phone number, etc.
§ Database contact. Contact for administration of the MIC database, to schedule dataloads, evaluate data maps, etc.
§ Programming, collections and research support activities. A free-text description highlighting programming, collections and research support offered to organization users. This field is intended for display on the dynamic organization web page, for introducing the organization to potential users.
§ Public service contact. Contact information including title of contact, email, phone number, etc.
§ Homepage URL. URL for the organization's own homepage
§ Searchpage URL. URL for the organization's web-enabled searchable catalog.
§ Z39.50 database flag. Yes/no flag will indicate whether the organization's web database is Z39.50 capable.
§ Date of last database update. This will indicate the currency of the record displayed as a search result.
The above list is not prescriptive. Additional information can also be included as desired by the community.
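As a rough illustration of how a subset of the fields above might be held in one directory record, the following Python sketch models a record with multi-select controlled lists and yes/no flags. It is only a sketch under stated assumptions: the class name, attribute names, the `name` attribute, the sample OrgID and the contents of `ORG_TYPES` are hypothetical and are not part of the MIC specification.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical controlled list; the actual list of roughly ten types
# would be defined by the MIC community.
ORG_TYPES = {
    "corporate", "academic", "public", "K12 library",
    "production house", "stock footage house",
}


@dataclass
class DirectoryRecord:
    """One organization's directory entry (illustrative subset of fields)."""
    org_id: str                                            # OrgID
    name: str                                              # assumed display name
    org_types: List[str] = field(default_factory=list)     # all types that apply
    formats: List[str] = field(default_factory=list)       # film, video, digital
    genres: List[str] = field(default_factory=list)        # collection and genre strengths
    oai_participation: bool = False                        # OAI participation flag
    z3950_capable: bool = False                            # Z39.50 database flag
    homepage_url: Optional[str] = None
    searchpage_url: Optional[str] = None
    last_update: Optional[str] = None                      # date of last database update

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the record is acceptable."""
        problems = []
        if not self.org_id:
            problems.append("OrgID is required")
        for org_type in self.org_types:
            if org_type not in ORG_TYPES:
                problems.append(f"unknown organization type: {org_type}")
        return problems


record = DirectoryRecord(
    org_id="ORG-EXAMPLE",  # placeholder, not a real organization code
    name="Example University Film Collection",
    org_types=["academic", "stock footage house"],
    formats=["film", "video"],
    oai_participation=True,
)
print(record.validate())
```

Keeping the multi-select fields as lists reflects the "check all that apply" behavior described for organization type, role and genre.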
The organization directory database will serve a multitude of purposes. To begin with, it provides a directory of moving image organizations and can be completed very early in Phase I of the MIC by any organization, regardless of whether the organization's data is loaded in the MIC. This directory can be used to dynamically generate "domain maps" that show at a glance key features of the moving image organization domain. Java and XSLT can generate dynamic graphical representations of the community, as the following illustration demonstrates.
One of the most important uses of the organization directory record will be to generate websites for participating organizations. An organization will be able to provide a logo and an image of the organization and select from several design templates to create a "home page" and search screen. The home page will be dynamically populated with information from the organization's directory database record and will either provide a search screen on the home page or link to a second page with a search screen. Searches from this search page will be directed to the organization's metadata records only and will use the data element labels supplied by the organization, in the order in which the data elements appear in the organization's original record. This will provide a web front end and a web presence for any participating organization, so that it can advertise its collection to its targeted user base. Organizations with an existing home page but without a web-enabled database can link their existing home page to a MIC URL that generates a dynamic search page for the organization, thus providing transparent access to their collections from their own home page.
Another important use of the organization directory database is to provide a means for the end user to pre-select organizations to search against, based on role, subject areas, formats held or access availability. End users can select organizations from a drop-down menu or from "domain at a glance" pages, or can search the organization database to find organizations to search against, based on pre-defined criteria. The organization directory database will also allow concatenation of databases by format, genre, etc. to develop dynamic portals for more customized searching and information.
The Z39.50 flag will be used to generate a search button that, when clicked, launches a Z39.50 search against the organization's home database to retrieve the most current record and, eventually, holdings information.
The OAI flag will be used to indicate that the organization's holdings are available for data mining to support other consortial initiatives (for example, a CIMI union catalog), via the MIC's Open Archives Initiative (OAI) protocol implementation.
Information from the organization directory database will be used to add information and links for display with records retrieved for moving image materials owned by the organization through the global search facility, including: the URL for the organization's own home page; the date of the last database update; directory information (address, contact person, etc.); and restrictions on service or use. Some of this value-added information will be supplied for all organizations, as determined by the MIC Organizations Directory Working Group. Other information will be optional, as selected by the organization, which could be determined by setting a flag (e.g. checking a box, "display with search results").
2. Cataloging Utility
The cataloging utility can be a front-end input form that allows a participating organization to create a record directly in a core format (MARC21, Dublin Core) and then feeds that record into the mapping utility and also to the export utility for export back to the home organization's database. The full cataloging utility will be developed in a future phase of MIC. For Phase I, MIC will provide a downloadable MARC21 Core database, perhaps through a collaboration with IMAP. The fields in this database should exactly match the front-end cataloging input form planned for Phase II of MIC development, so that a participating organization can download and implement its own record database on a local computer platform in a format suitable for ingest into the MIC union catalog.
In collaboration with the Video Development Initiative and Rutgers University Libraries, a downloadable database in an open-source or low-cost RDBMS will be provided that supports both Dublin Core and MPEG-7, for use by current and potential MIC participants.
The export utility will support record export in field- and comma-delimited formats, to provide easy export of data from the MIC to the organization's home database. This would essentially ensure that any organization with an adequate computer and Internet access could develop and maintain a MARC21 or Dublin Core standards-based online catalog.
The cataloging utility record template developed in Phase II may have a matchpoint recordID as well as an auto-assigned recordID. These separate concepts are important. The participating organization using MIC as a cataloging utility would first search the MIC union catalog for a closely matching record. If a matching record, or closely matching record were found, entering the recordID in the matchpoint recordID field would populate all the data elements in the template with the information from the matchpoint record. However, a new recordID would be auto-assigned.
This is important because the MIC union catalog would retain a unique record for each moving image asset owned by an organization, regardless of whether that asset is an exact duplicate of an asset owned by another organization and already available in the MIC.
Also, an organization might select a record for an asset that is close, but not identical, to the moving image asset being cataloged. In this case, the organization would edit all the fields that were unlike, so the records would no longer describe duplicate assets. Given that moving images lack the precision match points that print materials with control numbers such as ISBN and ISSN have, it seems best, at least until the MIC is very large, that organizations manually search the database and select a matchpoint recordID rather than that the MIC utility matches against selected criteria to automatically supply a record.
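The distinction between the matchpoint recordID and the auto-assigned recordID can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the MIC implementation: the in-memory `catalog` dictionary, the `create_record` function, the field names and the "ORG-A"/"ORG-B" identifiers are all hypothetical.

```python
import itertools
from typing import Dict, Optional

# Hypothetical in-memory stand-in for the MIC union catalog.
catalog: Dict[int, dict] = {}
_next_record_id = itertools.count(1)


def create_record(org_id: str, fields: dict,
                  matchpoint_id: Optional[int] = None) -> dict:
    """Create a new record, optionally pre-populated from a matchpoint record.

    The template is filled from the matchpoint record's fields, the
    cataloger's own edits are applied on top, and a new recordID is
    always auto-assigned. Raises KeyError if the matchpoint recordID
    does not exist in the catalog.
    """
    populated = {}
    if matchpoint_id is not None:
        populated.update(catalog[matchpoint_id]["fields"])  # copy, never share
    populated.update(fields)  # edits for fields that differ
    record_id = next(_next_record_id)
    catalog[record_id] = {
        "record_id": record_id,
        "org_id": org_id,
        "matchpoint_id": matchpoint_id,
        "fields": populated,
    }
    return catalog[record_id]


original = create_record("ORG-A", {"title": "Example Film", "format": "film"})
derived = create_record("ORG-B", {"format": "video"},
                        matchpoint_id=original["record_id"])
print(derived["record_id"], derived["matchpoint_id"], derived["fields"])
```

Because the matchpoint record's fields are copied rather than referenced, each organization keeps its own unique record even when two records describe the same asset.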
If an organization is modifying its own record, the organization will be able to update the record, and the utility will replace the existing record rather than create a new one. This function will use an open-source authentication and authorization mechanism, Shibboleth, to ensure that only the organization that originally created the record or a MIC administrator is authorized to edit and overwrite records. Shibboleth will be implemented in collaboration with the Internet2 Video on Demand Middleware Working Group.
The matchpoint recordID could be used after Phase II to evaluate the feasibility of deduping records and adding organizationID identifiers to representative records, which would be retrieved in a search, as opposed to individual records. An SQL query would generate a report by title to identify possible duplicate records in the MIC union catalog. True "duplicate records" would be populated with the matchpoint recordID of the selected "representative record." The individual records would of course be retained, for search and retrieval on the organization's individual search page and for export to the home organization as part of the cataloging facility. The matchpoint recordID would serve as a unique identifier for deduping purposes, similar to the use of OCLC number and ISBN for deduping in integrated library system implementations. It would also be used to link holdings identification for individual organizations to a representative record.
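A report of possible duplicates by title, as described above, might look like the following sketch. It uses Python's built-in sqlite3 module purely for illustration (the RDBMS for MIC had not been chosen at the time of writing), and the `records` table, its columns, the sample rows and the simple case-insensitive title match are assumptions rather than the MIC schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records ("
    " record_id INTEGER PRIMARY KEY,"
    " org_id TEXT,"
    " title TEXT,"
    " matchpoint_id INTEGER)"
)
conn.executemany(
    "INSERT INTO records (record_id, org_id, title) VALUES (?, ?, ?)",
    [
        (1, "ORG-A", "Gone with the Wind"),
        (2, "ORG-B", "gone with the wind "),
        (3, "ORG-C", "Vom Winde Verweht"),
    ],
)


def possible_duplicates(conn):
    """Map each normalized title held in more than one record to its recordIDs."""
    report = {}
    titles = conn.execute(
        "SELECT LOWER(TRIM(title)) FROM records"
        " GROUP BY LOWER(TRIM(title)) HAVING COUNT(*) > 1"
        " ORDER BY LOWER(TRIM(title))"
    ).fetchall()
    for (title,) in titles:
        rows = conn.execute(
            "SELECT record_id FROM records"
            " WHERE LOWER(TRIM(title)) = ? ORDER BY record_id",
            (title,),
        ).fetchall()
        report[title] = [row[0] for row in rows]
    return report


def mark_duplicates(conn, representative_id, duplicate_ids):
    """Populate confirmed duplicates with the representative record's ID."""
    conn.executemany(
        "UPDATE records SET matchpoint_id = ? WHERE record_id = ?",
        [(representative_id, dup) for dup in duplicate_ids],
    )


print(possible_duplicates(conn))
```

The report only proposes candidates; a person would still confirm true duplicates before `mark_duplicates` is called, and the individual records are never deleted.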
If a deduping facility were implemented in Phase III development, instructions for creating new records would require the cataloger, or the system, to remove the matchpoint recordID after record creation if the resulting record was not an exact duplicate. An example would be a cataloger using a record for Gone with the Wind to create the catalog record for the German version, Vom Winde Verweht. The cataloger would want the efficiency of populating credit information, plot abstract, etc. from the English record, but many fields, including title, would obviously be different. MIC could also usefully explore collaboration with existing bibliographic utilities such as OCLC (via the OCLC Connexion utility) or the Research Libraries Group's RLIN bibliographic utility.
3. Ingest/Mapping Facility
The Ingest/Mapping facility will be a suite of mediated and automatic utilities to register data elements, store data values keyed to data elements, and create index tables for fielded queries. An input form and online contextual guide will assist organizations with mapping their data to the registry. The input form should include the map and also, for each data element, whether a controlled vocabulary, a formatting principle or free text is used for the data element values. Maximum and minimum field lengths should also be provided, if at all possible. The mapping/ingest facility will examine data based on parameters in the input form, map data elements and values against the organization-specific registry map, and store organization-specific data values in data element tables. Indexable data values will be added to specified indexes either on the fly by the search engine or by a batch program run according to a set schedule (a "cron job").
Records that fail to load will be written to an error file without halting or terminating the loading process. The following reports, at a minimum, should be created for review by the MIC technical committee and the submitting organization: total records loaded, records that failed to load by bibID, and record fields discarded, by field label name.
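The loading behavior described above (failed records are written to an error file, the load continues, and three reports are produced) can be sketched as follows. This is an illustrative outline only: the `ingest` function, the `field_map` structure, the sample labels and the rule that a record must have a bibID and a mapped title are assumptions made for the example.

```python
import io
from collections import Counter


def ingest(records, field_map, error_file):
    """Load records, logging failures without stopping the load.

    field_map maps an organization's field labels to registry elements;
    labels absent from the map are discarded and counted.
    Returns (report, loaded_records).
    """
    loaded = []
    failed_bib_ids = []
    discarded = Counter()
    for rec in records:
        bib_id = rec.get("bibID")
        try:
            if not bib_id:
                raise ValueError("missing bibID")
            mapped = {}
            for label, value in rec.items():
                if label == "bibID":
                    continue
                if label in field_map:
                    mapped[field_map[label]] = value
                else:
                    discarded[label] += 1  # counted even if the record later fails
            if "title" not in mapped:  # assumed minimum requirement
                raise ValueError("no title after mapping")
            loaded.append({"bibID": bib_id, **mapped})
        except ValueError as exc:
            failed_bib_ids.append(bib_id)
            error_file.write(f"{bib_id}\t{exc}\n")  # log and keep going
    report = {
        "total_loaded": len(loaded),
        "failed_bib_ids": failed_bib_ids,
        "discarded_fields": dict(discarded),
    }
    return report, loaded


sample = [
    {"bibID": "b1", "Film Title": "Example Film", "Shelf": "12"},
    {"bibID": "b2", "Shelf": "13"},
    {"Film Title": "Untracked Film"},
]
errors = io.StringIO()  # stands in for the error file
report, loaded = ingest(sample, {"Film Title": "title"}, errors)
print(report)
```

Catching the failure inside the loop is what keeps one bad record from halting the batch.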
It is recommended that, at least in the beginning, record structure maps should be completed by the organization and sent to the MIC database manager and Semantic Registry Subcommittee for review. The organization should be contacted by the MIC database manager and interviewed before the data ingest process begins. The record structure map and a printout of the displayed organization record should be reviewed together in the interview. The review process will look for data anomalies and will help to ensure semantic integrity between the semantic registry element and the organization's equivalent data element. It will also ensure that the organization is satisfied with the display and export formats for its records.
The initial ingest should be a small, focused sample of not more than 1,000 records. The ingest sample should be parsed, added to the database and queried in "global" and "native organization" mode. The MIC database manager (or alternatively a Semantic Registry Subcommittee representative) and the organization representative will evaluate both display formats and agree that the organization's data is accurately and adequately represented before the ingest process continues. An organization record in the organization directory database should be required for participation in the MIC union catalog. The organization should also decide what value-added information should be pulled from the organization directory record for display on a results screen or full title screen when a bibliographic record from that organization is retrieved.
4. Union Catalog Data Registry
The Union Catalog Data Registry has been discussed in some detail above. Developing an XML schema within the Resource Description Framework (RDF) will allow MIC to establish a MIC namespace for future data mining in a distributed registry environment. RDF is both a transport mechanism and an envelope for data elements to enable interoperability. Multiple registries (e.g. MIC Core, MARC21 core, Dublin Core, MPEG-7 core) within the same metadata record can be referenced using namespaces and wrapper elements to enable automated reference to online registries to obtain semantic meaning, data value formatting, etc. Field labels in record displays can be hyperlinked to definitions in the semantic registry, to assist end users in interpreting record information.
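To make the idea of several registries coexisting in one record more concrete, the sketch below builds a small RDF/XML record that mixes a Dublin Core element with an element from a MIC namespace. The RDF and Dublin Core namespace URIs are the published ones; the MIC namespace URI, the `mic:orgID` element, the record URL and the `build_record` function are hypothetical placeholders, since the actual MIC namespace is not defined here.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"
MIC_NS = "http://example.org/mic/core#"  # hypothetical placeholder namespace

# Register prefixes so the serialized XML uses readable names.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)
ET.register_namespace("mic", MIC_NS)


def build_record(about: str, title: str, org_id: str) -> str:
    """Wrap elements from two registries in a single RDF description."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": about}
    )
    ET.SubElement(description, f"{{{DC_NS}}}title").text = title
    ET.SubElement(description, f"{{{MIC_NS}}}orgID").text = org_id
    return ET.tostring(root, encoding="unicode")


xml_text = build_record(
    "http://example.org/mic/record/1", "Example Film", "ORG-EXAMPLE"
)
print(xml_text)
```

Each element's namespace identifies the registry that defines it, which is what would let software look up the element's meaning and formatting rules automatically.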
5. Element Tables
Element Tables comprise the bibliographic database and will contain the data
values (e.g. the "information") associated with fields or data element
labels. I recommend individual tables for each data element to allow for repeatable
elements. A multivalue database would make individual tables unnecessary. Also,
the structured database approach for a relational database architecture that
I discussed earlier would reduce the element tables to just those indexed elements
in the minimal locator record.
One significant issue for search and retrieval will be the lack of standardization for data values that can be expected with the heterogeneous data from moving image organizations. Even a fairly simple data element, like "date of publication," will not be straightforward. The date in MARC21 field 260 $c is usually a single four-digit year (e.g. 2001), while Dublin Core recommends a YYYY-MM-DD format (e.g. 2001-04-06). A sophisticated search engine with tolerance for data ambiguity is important to improve precision of search and retrieval.
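One modest way to tolerate the date variation described above is to index only the year, whatever the source format. The following Python sketch is an assumption-laden illustration, not a MIC requirement: the `index_year` function is hypothetical, and the accepted range of 1800 to 2099 is an arbitrary choice for the example.

```python
import re
from typing import Optional

# A four-digit year from 1800 to 2099 that is not part of a longer number.
_YEAR = re.compile(r"(?<!\d)(1[89]\d\d|20\d\d)(?!\d)")


def index_year(raw: str) -> Optional[int]:
    """Extract the first plausible year from a heterogeneous date value."""
    match = _YEAR.search(raw)
    return int(match.group(1)) if match else None


for value in ["2001", "2001-04-06", "c1998.", "[n.d.]"]:
    print(repr(value), "->", index_year(value))
```

With this approach a MARC21-style value of 2001 and a Dublin Core-style value of 2001-04-06 land on the same index entry, while the original value stays untouched in the element table for display and export.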
The bibliographic database will be developed in an open-source RDBMS supporting export in XML for a migration path to native XML. Two RDBMS implementations are under consideration: MySQL and PostgreSQL. Both RDBMSs are in test as of December 2002, with a final decision in January 2003.
6. Indexing Tables
Indexing tables are automatically created by search utilities and by DBMSs with on-board search utilities. The indexing table should provide automatic term de-duping and also thesaurus support to provide some level of authority control. Indexing tables should be exposed to end users for browsing to identify terms for precision searching.
Organizations should identify controlled vocabularies used for any element as part of the initial mapping and data load process. Given that many different controlled vocabularies (LCSH, MeSH, TGN, etc.) as well as uncontrolled keywords may be used, I recommend that minimal, or no, authority control be provided, at least in the early stages of the MIC. Authority control can also affect efficiency, particularly when the database exceeds six figures.
Search engines can also provide thesauri. Thesauri support will not be provided in Phase I and it is recommended that, at most, authority control when implemented should consist of concatenating synonymous and related terms into relation tables in a thesaurus to create "see also" references. See also references expand or refine the scope of a search without replacing one term for another. For example, "cats" and "pets" could be considered related terms, as would "cats" and "tigers" (considered members of the "big cats" grouping), so that a search on "cats" would include see also references to "pets" and "tigers," but a search on pets would not reference "tigers."
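The "see also" behavior in the cats, pets and tigers example can be sketched with a simple relation table in which relations are followed one step only and never replace the search term. The class and method names below are hypothetical and serve only to illustrate the idea.

```python
from collections import defaultdict


class SeeAlsoThesaurus:
    """Relation table of directly related terms, with no term replacement."""

    def __init__(self):
        self._related = defaultdict(set)

    def relate(self, term_a: str, term_b: str) -> None:
        """Record that two terms are related, in both directions."""
        self._related[term_a].add(term_b)
        self._related[term_b].add(term_a)

    def see_also(self, term: str) -> list:
        """Return directly related terms only; relations are not chained."""
        return sorted(self._related.get(term, set()))


thesaurus = SeeAlsoThesaurus()
thesaurus.relate("cats", "pets")
thesaurus.relate("cats", "tigers")

print(thesaurus.see_also("cats"))  # related to both pets and tigers
print(thesaurus.see_also("pets"))  # related to cats only
```

Because relations are not followed transitively, a search on "pets" offers "cats" but never "tigers", matching the example in the text.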
Given that different controlled vocabularies offer precision and richness in specific subject areas, it can be dangerous to replace a term from a specialized vocabulary with an apparently equivalent term from a more general thesaurus. In the absence of a shared thesaurus and standardized formatting, implementing the "match and replace" functionality of an authority control system is difficult, if not impossible, to achieve.
It is also recommended that search terms in a retrieved record be hyperlinked to the indexes, so that selecting the subject heading "cats" in a record would launch a search on all records that include the subject term "cats." This is standard for most online catalog implementations.
7. Open Archives Initiative Protocol (OAI) Table
The Open Archives Initiative protocol is an HTTP-based protocol that queries metadata in distributed repositories based on archiveID, recordID and datestamps for creation, modification and deletion of the metadata record. OAI transactions involve a data provider, which is the repository supplying metadata, and a service provider, which issues the "get" command that the data provider answers. A metadata record supplied on request must include the archiveID, recordID and datestamp, and must support Dublin Core simple as a metadata format, but beyond that can include any metadata format with an OAI profile. OAI profiles have been established for Dublin Core simple (unqualified Dublin Core) and MARC21, among other formats. OAI is important for the MIC for the following reasons:
§ Required for participation in the NSF-sponsored National Science Digital Library (NSDL). Phase I MIC development is being funded by the NSF for participation in the NSDL. We are committed to the concepts and principles behind NSDL, namely the use of standards-based interoperability protocols like OAI to provide an integrated and easy-to-use portal for digital educational resources.
§ Value-added service for participants. OAI provides the means for MIC participants to participate in other union catalog initiatives. Every participant has a unique archiveID ("orgID") and can participate independently, and almost transparently, in data harvesting initiatives. A geophysical film organization could participate in a virtual "earthquake research repository." A dance film organization could participate in a "virtual dance repository" consisting of dance films, dance photographs and art work, books on dance, etc. Virtual exhibitions involving MIC organizations and CIMI museums could become a reality. Eventually, MIC will be able to use its OAI implementation to pull together dynamically "virtual collections" of digital materials related to film and video, such as movie posters, stills, scripts, etc. from other organizations and museums.
§ Interoperability with other communities. Digital library initiatives, such as the Digital Library Federation, are early adopters of OAI. These libraries contain considerable textual material that could be integrated with AMIA moving image materials to create media-rich information communities.
The OAI table contains the HTTP fields required for data harvesting. The OAI table will be updated dynamically whenever a datestamp changes for a record: when the record is first loaded; when it is modified, whether by a record load or by the owning organization through the cataloging facility; and when it is deleted. Although the deleted record will be removed, the archiveID, recordID and date-deleted datestamp will remain to document the history of the record for data harvesting requests.
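The update rules for the OAI table, including the retention of an entry after a record is deleted, can be sketched as below. This is a minimal illustration using Python's built-in sqlite3 module; the `oai` table layout, the `touch` and `changed_since` functions, and the sample identifiers and dates are assumptions for the example rather than the MIC design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE oai ("
    " archive_id TEXT,"
    " record_id TEXT,"
    " datestamp TEXT,"            # ISO dates compare correctly as text
    " deleted INTEGER DEFAULT 0,"
    " PRIMARY KEY (archive_id, record_id))"
)


def touch(conn, archive_id, record_id, datestamp, deleted=False):
    """Record a load, modification or deletion by refreshing the datestamp.

    A deletion keeps the row, so harvesters can still learn that the
    record was removed and when.
    """
    conn.execute(
        "INSERT OR REPLACE INTO oai (archive_id, record_id, datestamp, deleted)"
        " VALUES (?, ?, ?, ?)",
        (archive_id, record_id, datestamp, int(deleted)),
    )


def changed_since(conn, archive_id, from_date):
    """Return (record_id, datestamp, deleted) rows changed on or after from_date."""
    return conn.execute(
        "SELECT record_id, datestamp, deleted FROM oai"
        " WHERE archive_id = ? AND datestamp >= ? ORDER BY record_id",
        (archive_id, from_date),
    ).fetchall()


touch(conn, "ORG-A", "r1", "2003-01-10")                # first load
touch(conn, "ORG-A", "r2", "2003-01-10")                # first load
touch(conn, "ORG-A", "r1", "2003-02-01")                # modification
touch(conn, "ORG-A", "r2", "2003-03-05", deleted=True)  # deletion
print(changed_since(conn, "ORG-A", "2003-02-01"))
```

Keeping one row per archiveID and recordID means a harvester asking for changes since a given date sees both the modified record and the deletion notice.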
8. Search and Retrieval Facility
Search & Retrieval for bibliographic records requires certain features particularly
relevant to organizations and libraries, including:
§ Fielded and full-text searching. Full-text searching will not only accommodate lengthy notes fields, as described in the structured database architecture previously discussed, but also searching of training and outreach materials provided elsewhere on the MIC website.
§ Z39.50 compliance. Z39.50 compliance can support a hybrid union catalog/distributed gateway architecture; a focused Z39.50 search based on unique record ID to retrieve the most current record from Z39.50-enabled organizations; and value-added searching of related materials in a subject domain, as described in the Research, Education and Outreach section.
§ Thesaurus support. A thesaurus can provide useful cross-references for enhanced searching.
MIC developers are looking very seriously at the YAZ/Zebra/ZAP! search facility for providing the capabilities required, including integrated Z39.50 support, in an open-source environment.
The MIC's search and retrieval facility will provide significant value-added services to users and to organizations. Both global and organization-specific searching will be supported. End users should be able to select an organization or multiple organizations for searching from a graphical or text-based organizations map, from a topical list based on collection areas documented in the organization directory record, or from an alphabetical drop-down list. In Phase II of MIC development, the organization directory database's Z39.50 flag will be used to integrate federated searches against Z39.50-compliant organizations with the holdings of those organizations loaded locally in the union catalog, producing a hybrid union catalog/distributed catalog with improved flexibility and scalability.
For each organization retrieved through a search, a checkbox should allow the end user to select that organization, multiple organizations or all organizations retrieved by the directory search, for collection searching. Those organizations with only an organization record for the directory search but without data would have a grayed out checkbox and a note indicating that catalog data is not currently available.
In addition, searches by broad format (film, video, digital) or subject/genre should be supported. Selecting a subject or genre would launch a search against those organizations that selected those subject and genre areas from the controlled list in the organization database record input form. Broad format would be similarly generated from the organization directory database.
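The directory-driven selection described in the two paragraphs above (filter organizations by genre or broad format, then offer a checkbox that is grayed out when no catalog data is loaded) can be sketched as follows. The `Org` class, the `directory_search` function, the sample organizations and the note text are hypothetical and purely illustrative.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Org:
    org_id: str
    genres: Set[str] = field(default_factory=set)
    formats: Set[str] = field(default_factory=set)
    has_catalog_data: bool = False  # False: directory record only


def directory_search(orgs: List[Org], genre: Optional[str] = None,
                     fmt: Optional[str] = None) -> List[dict]:
    """Return matching organizations with the state of their selection checkbox."""
    hits = []
    for org in orgs:
        if genre is not None and genre not in org.genres:
            continue
        if fmt is not None and fmt not in org.formats:
            continue
        hits.append({
            "org_id": org.org_id,
            "checkbox_enabled": org.has_catalog_data,
            "note": "" if org.has_catalog_data
                    else "Catalog data is not currently available.",
        })
    return hits


orgs = [
    Org("ORG-A", {"broadcast news"}, {"video"}, has_catalog_data=True),
    Org("ORG-B", {"broadcast news", "medical"}, {"film"}, has_catalog_data=False),
    Org("ORG-C", {"feature film"}, {"film", "digital"}, has_catalog_data=True),
]
for hit in directory_search(orgs, genre="broadcast news"):
    print(hit)
```

Only organizations whose checkbox is enabled would be passed on to the collection search; the others remain visible so that users can still find their directory information.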
A global search should result in a standardized display to facilitate as far as possible "apples to apples" record display.
9. Dynamic Web Pages for Individual Organizations and Research Initiatives
Dynamic Web Pages are an important component for value-added information and functionality for individual organizations participating in the MIC. Selecting an individual organization to search will take the end user to a dynamically generated "home page" and "search page" for the organization. Participating organizations will be able to customize the home page with some style choices in font, logo/image and layout, and with information derived from the organization database. The organization database record will include free-text fields for additional data to be displayed on the organization's "home page" or "search page." Each organization will be able to have a minimum of three pages via the MIC: a "home page," a search page and a results page. Displayed records will use the organization's own data element labels and will appear in the data element order of the original record.
Many smaller organizations currently have no web presence. Their database resides on a standalone computer and is searched by the organization's staff in response to walk-in and telephone queries. The MIC will immediately facilitate a web presence with a searchable web database for all participating organizations, giving them another avenue to reach their users, advertise their services and expose their collections. In addition, each organization will be able to update its directory record at any time to keep the data displayed on dynamic web pages current and relevant. Each organization will be required to update the directory page at least annually to ensure some level of currency and accuracy. A static web page of participating organizations, hyperlinked to each dynamic home page, will be generated and submitted to general web search engines to ensure that all participating organizations are indexed and discoverable. Organizations will be able to "opt out" of the static web page.
The search page can have value-added services specific to an organization's clientele, such as federated searching of the databases of similar organizations. The curator of a local television news organization, for example, who frequently searches other organizations for materials her organization doesn't have, could include the option to search those organizations in a federated search, to provide 24/7 reference assistance to her customers.
Other organizations have a web presence, or can easily arrange a web presence with an ISP, but lack the expertise or resources to mount a web-enabled database. Those organizations might choose to use only the search and results pages, hyperlinking from their own home page to those pages, to maintain their existing web presence with greatly enhanced collection accessibility. MIC should have the ability to reference and retrieve the independent home page URL in place of a dynamically generated home page.
All web pages will be dynamically generated with XSLT displays of the data, delivered in HTML or XML at the organization's option. Another significant benefit to all organizations will be the ability to display and export their data in multiple XML formats: Dublin Core, MIC MARC21 Core, MPEG-7 and the organization's proprietary record format. As new metadata standards emerge, including those developed by the community, they can also be accommodated by the MIC.
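To make the idea of one record with several presentations concrete, here is a minimal Python sketch. It is not the XSLT pipeline described above: plain Python functions stand in for the stylesheets, and the record labels and the `CROSSWALK` mapping are hypothetical. Only the Dublin Core namespace URI is a real, published value.

```python
import xml.etree.ElementTree as ET
from html import escape

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# Hypothetical crosswalk from one organization's local labels to Dublin Core.
CROSSWALK = {"Program Title": "title", "Producer": "creator", "Air Date": "date"}


def to_dublin_core(record):
    """Serialize the mapped fields of a record as Dublin Core XML."""
    root = ET.Element("record")
    for label, value in record.items():
        dc_name = CROSSWALK.get(label)
        if dc_name is None:
            continue  # unmapped local fields are skipped in this sketch
        child = ET.SubElement(root, f"{{{DC_NS}}}{dc_name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


def to_html(record):
    """Render a record with the organization's own labels, in original field order."""
    rows = "".join(
        f"<dt>{escape(label)}</dt><dd>{escape(value)}</dd>"
        for label, value in record.items()
    )
    return f"<dl>{rows}</dl>"


record = {"Program Title": "Evening News", "Tape Number": "T-17", "Air Date": "1999-05-01"}
print(to_dublin_core(record))
print(to_html(record))
```

The HTML view keeps every local field, while the Dublin Core view carries only what the crosswalk maps. A production crosswalk would need a policy for unmapped fields.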
Dynamic web pages will also enable the creation of significant subsets of organizations' collections, based on commonalities such as subject and type of organization, that support collaborations and result in customized portals. An obvious example would be local news organizations, which could use a federated search portal to implement controlled vocabularies, data elements and other commonalities for indexing television station logbooks. Legal and medical moving image organizations might use the portal to explore privacy and rights metadata issues peculiar to their clientele by concatenating their collections for shared rights metadata development and testing. Another consortial area might be a digital video portal for those organizations with linked video objects for download or streaming. A MIC digital video portal could be managed by the AMIA Digital Organizations Interest Group committee, and could provide a research space for that committee to explore digital access strategies such as MPEG-7 nontextual indexing implementations as part of MIC's research mandate. Other portals might be based on specific services or issues, such as moving image programming or digital rights management. The portal architecture can thus serve as a technology platform for participating organizations in the pursuit of grants to fund collaborative activities.
Value-added retrieval services will include the ability to select multiple records for saving or printing, and to email search results to the end user.
Dynamic web pages will utilize several possible templates and some design objects that might not be stored in a database (logo, picture of organization, etc.) but will draw most information from the organization directory database. This will allow MIC to support any participating organization without significant storage or management overhead.
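A template-driven page of this kind might look like the following sketch, which fills a single HTML template from a directory record using Python's standard `string.Template`. The template text and the record keys `name`, `logo_url` and `description` are illustrative assumptions, not MIC design decisions.

```python
from html import escape
from string import Template

# One of several possible templates; only the logo lives outside the database.
HOME_PAGE = Template(
    "<html><head><title>$name</title></head>"
    '<body><img src="$logo_url" alt="$name logo">'
    "<h1>$name</h1><p>$description</p></body></html>"
)


def render_home_page(record):
    """Build an organization's home page from its directory record."""
    fields = {
        "name": escape(record["name"]),
        "logo_url": escape(record.get("logo_url", "")),
        "description": escape(record.get("description", "")),
    }
    return HOME_PAGE.substitute(fields)


page = render_home_page({
    "name": "Smith & Sons Film Archive",
    "logo_url": "/logos/smith.png",
    "description": "Regional newsreels and home movies.",
})
print(page)
```

Because the page is rebuilt from the record on each request, an update to the directory record appears on the page without any separate page maintenance.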
10. Export Utility
The final functionality will be an export utility that exports records in XML and XML/RDF, with associated XML Schemas for the MIC registry, MARC21, Dublin Core and MPEG-7 core, as well as in comma-delimited and field-delimited formats for loading into database systems such as FileMaker Pro and Access, and as an ASCII flat file. Exporting can be automatically requested by the organization or by a service provider, with the permission of the organization, as part of an OAI data mining initiative.
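The delimited exports could be produced along the following lines. This is a minimal sketch using Python's standard `csv` module; the function name, the field names and the choice of a tab as the alternative delimiter are assumptions for illustration.

```python
import csv
import io


def export_delimited(records, fieldnames, delimiter=","):
    """Write records as delimited text with a header row.

    Fields missing from a record are left empty, and fields not listed
    in `fieldnames` are ignored.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fieldnames,
        delimiter=delimiter,
        extrasaction="ignore",
        lineterminator="\n",
    )
    writer.writeheader()
    for record in records:
        writer.writerow(record)
    return buffer.getvalue()


records = [{"title": "News, Evening Edition", "date": "1999-05-01", "shelf": "B4"}]
print(export_delimited(records, ["title", "date"]))                  # comma-delimited
print(export_delimited(records, ["title", "date"], delimiter="\t"))  # tab-delimited
```

Values that contain the delimiter are quoted by the writer, so a title with an embedded comma still loads as a single field.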
11. Education, Outreach and Research
During the research for this project, many excellent suggestions were received for training and outreach. Most organizations felt that the MIC should support the goal of universal collection access. Suggestions that support this goal include:
A. Education:
1. Cataloging:
§ Glossary, overviews, webliographies and white papers on key metadata concepts such as registries, data elements, controlled vocabularies, format mapping, RDF, and XML.
§ A searchable knowledgebase of archivists and organizations with expertise in different formats, database designs, etc. (Note: the directory database data elements for cataloging have been designed to support this search).
§ Webliographies for metadata and resource discovery formats, such as MARC21, Dublin Core, EAD and MPEG-7.
§ A FAQ for metadata formats.
§ Presentations and training on cataloging in the major formats (MARC21 Core, Dublin Core and MPEG-7) using MIC, particularly using MIC-supported database implementations, presented regionally and at AMIA national conferences.
§ An alert for conferences and training opportunities in each format that are offered by other organizations, such as the SAA workshop on EAD and the OCLC Institute's various metadata and cataloging workshops.
§ Integration of the AMIA Cataloging and Documentation Committee liaison program into MIC. Incorporate identified activities into a "Moving Image Organizations at a glance" with other groups, such as OLAC, SAA and the ALCTS Media Committee.
§ Partnerships with other organizations providing cataloging education, such as OCLC and IMAP, for outreach to small and nontraditional organizations currently lacking online cataloging capabilities.
§ Links to online training, such as Dublin Core implementation guidelines and XML.
§ Funding opportunities for cataloging collections.
§ Field-by-field instructions for completing the MIC MARC21 Core and Dublin Core records.
§ Links to web-enabled databases for moving image materials cataloged in different formats.
B. Education and Outreach:
1. Outreach to Moving Image Organizations
§ A "match-making" service for owners of private collections wanting to donate those collections to organizations to help them discover and contact the most appropriate organization, based at least in part on queries against the organization directory database.
§ A mentoring program, to match experienced archivists with museums, organizations, etc. that hold archival collections but lack the expertise to preserve and document them.
§ A speaker's bureau of archivists with experience in different formats, for presentations at local and regional conferences.
§ Training in digitizing, streaming, storing and managing digital video files and digital video repositories.
2. Outreach to the public
§ As a high profile website, the MIC has the opportunity to educate the public about the fragility of the nation's moving image materials. A photo essay of a film preservation procedure is one possible web exhibit. A "memorial wall" of moving image materials that are permanently lost is another possibility. The MIC will be inherently interesting for those who enjoy searching for films, etc. and thus can serve to publicize the goals and concerns of AMIA.
§ A "how-to" page, with a sample customizable letter, educating end users on obtaining rights to use or license footage.
§ A primer on researching moving image materials.
§ "Virtual reference desk" technologies that dynamically forward reference questions to the appropriate organization, based on knowledgebase attributes, are emerging from the library world. Virtual reference desk technologies should be evaluated and implemented in Phase II or Phase III of MIC Implementation. The Library of Congress and OCLC have partnered in implementing a virtual reference desk technology that could perhaps be implemented for this service.
3. Focused AMIA Community Building
Another innovative suggestion was that the MIC could serve as a forum for focused communities exploring issues with description and access, such as news organizations looking at finding aids, organizations with digital video assets exploring MPEG-7 and other nontextual indexing strategies, etc. This role could be supported with:
§ Dynamic web pages for specific communities, with a federated search screen and durable education and outreach pages to create a portal.
§ An "email discussion list" facility that allows any group to establish an email list for discussion of community-level issues. The list discussions could be organized and searchable to benefit any visitor to the website.
C. Research
The MIC has an opportunity to provide an information space for research into improved description, indexing and access methodologies, including:
§ Use of the MIC search facility to test new search strategies and displays, particularly the integration of textual and nontextual indexing strategies.
§ Collaborations with established AMIA committees, such as the AMIA Preservation Committee to develop a preservation data registry, and the AMIA Digital Organizations Interest Group to explore MPEG-7 and other nontextual indexing for digital files.