Andrei Lopatenko homepage

Andrei Lopatenko's

Resource  Guide to Metadata

for Science, Research, Education and Technology

 

Enhanced with SHOE

The original page of the Resource Guide is http://purl.org/NET/cerif-daml

INTRODUCTION

  1. Introduction to metadata for Research and Technology

CERIF-2000

  1. About CERIF-2000
  2. CERIF-2000 implementation
    1. RDBMS (Relational Database Management System)
      1. Database schema
      2. What about CERIF-2000 vocabularies
      3. CERIF and Multimedia
      4. CERIF and Community of Science (COS) Research Profiles
      5. CERIF and description of research information sites and services
    2. RDF and knowledge technologies for CERIF-2000
      1. Research and Technology metadata ontologies and schemas
      2. Information retrieval of annotated knowledge about research
      3. Metadata harvesting
        1. Open Archives Initiative
        2. RDF Crawling
  3. Description of Research Information providers and services, searching and browsing information about CRIS, semantic description of CRIS

 

 

RESOURCE GUIDE

  1. Resource Guide to Metadata for Science and Research
    1. Metadata for Science News
    2. Metadata. General issues.
    3. Metadata. Science, Research, Technology
      1. General research metadata standards
      2. Educational metadata standards
      3. Specific scientific metadata standards
      4. Industrial research, innovations
      5. Metadata databases.
      6. Value-added metadata
      7. Glossaries, Vocabularies, Thesauri
    4. SHOE Ontologies for Research and Technology
    5. OIL Ontologies for CRIS
    6. DAML Ontologies for Research and Education
    7. Ontolingua Ontologies for Science
    8. Implementation of metadata for science
    9. Development and usage of metadata

 

Metadata for Research and Technology

The exponential growth of information creates a number of problems in information access, management, and retrieval. Tackling these problems requires more advanced forms of information organization.

One such form is the description of information by metadata - structured data describing information or the characteristics of information items. Metadata standards define how information may be described, which values may be attributed to information items and how, and which classifications can be used for information items or for the values of their attributes.

A number of metadata standards and classifications for Research and Technology have appeared. References to some of them are given below.

About CERIF-2000

CERIF, or the Common European Research Information Format, is a data model and a set of guidelines for developing information systems for research and technology.

CERIF is recommended by CORDIS.

Introductions to CERIF

  1. K. Jeffery, "ERGO: European Research Gateways Online and CERIF: Computerised Exchange of Research Information Format", ERCIM News No.35 - October 1998
  2. Presentation of CERIF by Keith Jeffery

CERIF homepage.   CERIF Task Group

CERIF  

"The major design objectives for the CERIF 2000 data model are to provide:

  1. A full CRIS data model with flexibility to allow the majority of existing CRIS to accommodate their own database structures;
  2. A base framework for data exchange.

As with all information systems these objectives are not 100% compatible in terms of a single data model. For example, a data model covering all existing CRIS database structures would not be a very practical model for data exchange, as not all CRIS would be interested in every piece of data from every other CRIS.

The approach therefore in meeting the design objectives is threefold:

  1. To define a full CRIS data model which will cover the database structures of the majority of existing CRIS;

  2. To define a set of data models which could provide examples for data exchange (since there are an infinity of possible exchange data models between CRIS). These example data models also illustrate that it is not necessary to implement the full CRIS data model if the requirement is for only a particular subset;

  3. To define a metadata data model to provide a uniform summary-level view over heterogeneous information sources."

The CERIF data model covers entity types such as persons, projects, publications, research results, patents, and equipment.

CERIF-2000 implementation

CORDIS suggests several kinds of CERIF-2000 implementation: relational database, object-oriented, information retrieval, and RDF/XML. The advantages and disadvantages of each kind of implementation are described in the CERIF-2000 Guidelines, chapter "7.4 Implementation scenarios".

The current main directions of CERIF development, led by Andrei Lopatenko

CERIF-2002

  1. CERIF-GAV - implementation of an information retrieval system based on distributed database technology with the Global-As-View approach, for efficient retrieval of data from distributed databases that do not differ much in semantics and structure
    1. draft
  2. CERIF-SW - Semantic Web for transparent access to distributed heterogeneous sources of scientific information. This solution also includes a Description Logic approach for advanced information retrieval.
    1. see the Semantic Web and CERIF part of this guide
    2. DL approach
  3. CERIF-WS - Web Services - to provide a transport-level protocol for the Semantic Web solution and for compatibility of CERIF with emerging enterprise standards
    1. current state - CERIF SOAP server implemented
  4. CERIF-EP - enterprise portal - building an enterprise portal for R&D organizations. Led by Rutherford Appleton Lab - Prof. Keith Jeffery, Dr. R. Stewart.
  5. CERIF-XML Data Exchange. Closely related to the CERIF-SW and CERIF-WS solutions; provides data exchange facilities for R&S systems
    1. XML schemas implemented for the prototype; see GAV draft
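
The Global-As-View idea behind CERIF-GAV can be illustrated with a small sketch. It is not the CERIF-GAV implementation: the two source databases, their column names, and the global "project" relation are all hypothetical. The only point shown is that each global relation is defined as a view over the sources, and a query on the global schema is answered by unfolding that view.

```python
from typing import Callable, Dict, Iterable, List

Row = Dict[str, str]

# Two hypothetical source databases with slightly different column names.
SOURCE_VIENNA: List[Row] = [
    {"proj_title": "Semantic Web for CRIS", "leader": "A. Example"},
]
SOURCE_GRAZ: List[Row] = [
    {"name": "Metadata Harvesting", "head": "B. Example"},
]


def vienna_projects() -> Iterable[Row]:
    """Wrapper: rename the Vienna columns to the (hypothetical) global schema."""
    for r in SOURCE_VIENNA:
        yield {"title": r["proj_title"], "manager": r["leader"], "source": "vienna"}


def graz_projects() -> Iterable[Row]:
    """Wrapper: rename the Graz columns to the (hypothetical) global schema."""
    for r in SOURCE_GRAZ:
        yield {"title": r["name"], "manager": r["head"], "source": "graz"}


# Global-As-View: every global relation is defined as a view over the sources,
# here simply the union of the wrapped source relations.
GLOBAL_VIEWS: Dict[str, List[Callable[[], Iterable[Row]]]] = {
    "project": [vienna_projects, graz_projects],
}


def query(relation: str, predicate: Callable[[Row], bool]) -> List[Row]:
    """Answer a query on the global schema by unfolding the view definition."""
    rows: List[Row] = []
    for view in GLOBAL_VIEWS[relation]:
        rows.extend(r for r in view() if predicate(r))
    return rows
```

For example, query("project", lambda r: "Metadata" in r["title"]) returns only the Graz row, already expressed in the global column names. This works well when the sources are close in structure and semantics, which is the case the CERIF-GAV item describes.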

RDBMS

Database schema 

CORDIS suggests a ready-made relational database for CERIF-2000. Several versions of the CERIF database have been published.

In AURIS-MM we tried to develop a relational database implementation of a CRIS based on CERIF-2000 and found a number of errors in the CERIF-2000 Oracle DDL file suggested by CORDIS.

The errors I found and the actions taken to correct them are described here (6 July 2001).

My revised version of CERIF (version of 6 July 2001). This version of CERIF can be loaded into an Oracle database without errors at the database creation stage.

The next set of errors found (9 July 2001) and the actions taken to correct them.

My latest revised version of the CERIF-2000 database for Oracle (9 July 2001). There are no inconsistencies between check constraints and data types in this version.

Description of CERIF-2000 entities and attributes generated by Oracle Designer 6i

What about CERIF-2000 vocabularies?

CERIF-2000 makes active use of vocabularies for classifying events, publications, expertise skills, and other entities. A complete description of the CERIF-2000 vocabularies is given in the CERIF guidelines, CERIF-2000 Subject Indexing.

In the CERIF SQL file for Oracle, vocabularies are defined as check constraints that restrict the allowed sets of values, and as type and status tables. But the CERIF DDL contains no actual content for the vocabulary tables, so it has to be entered manually, which takes time and effort.

Here you can download SQL files for populating the CERIF-2000 vocabulary tables with real content. You can run these files in Oracle SQL*Plus.
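
The two ways of expressing a vocabulary mentioned above (a check constraint and a vocabulary table that must be populated) can be sketched as follows. This is only an illustration: it uses SQLite from the Python standard library instead of Oracle, and the table names, column names, and codes are hypothetical, not taken from the CERIF-2000 DDL.

```python
import sqlite3

# Hypothetical vocabulary content: illustrative codes, not actual CERIF-2000 values.
EVENT_TYPES = [("CONF", "Conference"), ("WORK", "Workshop"), ("SEM", "Seminar")]

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    -- Vocabulary table: empty right after schema creation.
    CREATE TABLE event_type (
        code  TEXT PRIMARY KEY,
        label TEXT NOT NULL
    );
    CREATE TABLE event (
        id        INTEGER PRIMARY KEY,
        title     TEXT NOT NULL,
        -- Small fixed value set expressed directly as a check constraint.
        status    TEXT NOT NULL CHECK (status IN ('planned', 'held')),
        type_code TEXT NOT NULL REFERENCES event_type(code)
    );
""")

# Populate the vocabulary table from a prepared list instead of typing it in by hand.
conn.executemany("INSERT INTO event_type (code, label) VALUES (?, ?)", EVENT_TYPES)


def try_insert(title, status, type_code):
    """Return True if the row satisfies both vocabulary constraints."""
    try:
        conn.execute(
            "INSERT INTO event (title, status, type_code) VALUES (?, ?, ?)",
            (title, status, type_code),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

A row with a known type code and an allowed status is accepted; a row with an unknown type code or an unlisted status is rejected. Until the vocabulary table is populated, every insert into the dependent table fails, which is why ready-made population scripts save time.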

CERIF and Multimedia

One of the ways to present information about research is through multimedia presentations. More and more information is now presented as video (advanced visualization), audio, and so on. CERIF-2000 does not allow multimedia elements to be described. To describe multimedia information about research, an ER (Entity-Relationship) diagram, a database schema (an extension to CERIF), an RDF Schema, and Java Server Pages have been developed.

The multimedia enhancement for CERIF is not part of the CERIF-2000 standard but an independent module, which can be used with any CERIF-2000 compatible database or as an independent system. The only dependencies on CERIF are the foreign keys from the table MultimediaElements to the Person, Project, Event, and Orgunit tables of CERIF, and the foreign key from multimedia_titles to cerif.language.

Some attributes of the ER diagram for Multimedia are not included in the RDF Schema. For these we advise using the Dublin Core RDF Schema.

  1. ER Model for Multimedia
  2. Oracle Designer Report for ER diagram for Multimedia 1, 2
  3. Database Schema for multimedia
  4. RDF Schema for multimedia  
  5. Visualization of RDF Schema for multimedia by Rudolh
  6. Visualization of RDF Schema for multimedia by W3 RDF Validator
  7. Example of RDF description of multimedia
  8. Visualization of RDF example
  9. Classification and description of multimedia. Ontology (version 0.1). html, OilEd RDFS, oil, daml + oil, sql
  10. Graphical presentation of multimedia classification (generated from DAML + OIL ontology by FRODO RDFSViz )
  11. Multimedia creation in AURIS-MM
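
As a rough illustration of the advice to use Dublin Core for describing multimedia, the sketch below builds a small RDF/XML description with the Python standard library. It is not the RDF Schema for multimedia listed above: the resource URI and the values are hypothetical, and only the RDF and Dublin Core element namespaces are real.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def describe_multimedia(uri, title, creator, media_format, language):
    """Build an RDF/XML description of one multimedia element using Dublin Core."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    desc = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": uri}
    )
    properties = (
        ("title", title),
        ("creator", creator),
        ("format", media_format),
        ("language", language),
    )
    for name, value in properties:
        ET.SubElement(desc, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical multimedia element; the URI and values are placeholders.
xml_text = describe_multimedia(
    "http://example.org/media/demo-video",
    "Project demonstration video",
    "A. Example",
    "video/mpeg",
    "en",
)
```

The resulting text can be passed to any RDF parser or validator. Attributes that Dublin Core does not cover would still need the multimedia-specific schema.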

CERIF and Community of Science (COS) Research Profiles

Community of Science is one of the leading providers of research information. COS has a number of services that help researchers find funding and promote their results and expertise. For describing researchers, COS has more sophisticated researcher profiles than CERIF-2000. To make it possible to describe a COS-compatible researcher's expertise in CERIF databases, we suggest an extension module for CERIF-2000.

The RDF examples might seem incorrect according to the RDF Model and Syntax specification (W3C Recommendation 22 February 1999). For example, containers are specified in a different way. In creating the RDF examples, newer RDF developments were used: rdf-containers-syntax-ambiguity, rdf-containers-syntax-vs-schema. The Jena parser already handles these new RDF developments.

 

  1. ER Model for COS-compatible researcher's profile
  2. Entity and Attributes report
  3. Database definition report
  4. Database schema for COS-compatible researcher's profile (Oracle) sql, sqs, tab, ind, con
  5. RDF Schema for COS-like researcher's profile
    1. To specify contact, name, organization name, role, and title information, I recommend using the "Representing vCard v3.0 in RDF" schema
  6. Simple RDF example for COS-compatible researcher's profile.
  7. Visualization of RDF Schema for COS-compatible researcher's profile
  8. Visualization of simple RDF example for COS-compatible researcher's profile by Rudolh
  9. Architecture of AURIS-MM for Community of Science proposal

CERIF and description of research information sites

CERIF CRIS as gateway and portal

  1. ER Model for description of research information site  (based on Dublin Core, UDDI, RSS)
  2. UML Model  for description of research information site  (based on Dublin Core, UDDI, RSS)
  3. Entity and attributes report
  4. Database schema  for description of research information site  (based on Dublin Core, UDDI, RSS) 1 ,2 ,3, 4
  5. RDF Schema for description of research information site  (based on Dublin Core, UDDI, RSS)
  6. RDF Examples
  7. Very simple N3 RDF Example of description  http://www.tuwien.ac.at according to RSS
  8. Syndication methods, examples
  9. Software Library for very simple and easy to learn RSS generation

CERIF and description of research policies, activities

  1. DAML + OIL ontology (version of 14 November 2001, very short)

RDF, DAML + OIL and knowledge technologies for CERIF-2000

News

26 January 2001. CERIF-2000 ontology revised. New version, version for OilEd (RDFS + some OIL extensions), HTML. Sample data [projects, persons, organizations] for the CERIF-2002 ontology. CERIF-2002

Commentary: The CERIF ontology does not have classes such as PhD student, faculty member, or professor, which are defined in other ontologies for research information (the University ontology of J. Heflin, the Semantic Web Research Community ontology, the science ontology, or the Ontology of Science by Fred Freitas). The aim of the CERIF ontology is to provide a general description of metadata for interoperable solutions between different research bodies in different countries, research communities, or research organizations (both public and private), so all specific terms are excluded. Mappings or definitions of specific terms in terms of CERIF will be included in future CERIF ontology extensions.

For research information systems, the new knowledge technologies can help to solve the following problems:

  • use of research data that comes from domains with different architectures and is published by different groups of researchers,
  • use of different vocabularies in different research and policy-maker communities, and the complexity of requests for information,
  • the need to understand how current and how complete the information is,
  • organizing large sets of heterogeneous data objects.

It is also very important for R&D services to deliver information in a targeted way to the persons who need it, without information noise. A solution would be to formalize the users' demands for information and match them against the available information.

Generally, knowledge technologies are technologies for knowledge presentation, retrieval, and reasoning.

Excellent results in solving the above problems - integration of heterogeneous sources, development of knowledge bases, matching of user demands with stored information, congruence of information systems - were achieved in the global information systems, federated databases, and Description Logics communities. Projects such as TSIMMIS, Observer, Carnot, Information Manifold, InfoSleuth, SIMS, HERMES, Garlic, and many others are good examples.

There is a strong need to develop such a knowledge solution for Research Information Systems. But because of the large number of such systems, their integration with the web, and the need for simple technologies, the best way is to use Semantic Web technologies.

Our first objective is to test Semantic Web technologies for CRIS: can these technologies be used for knowledge presentation, retrieval, and reasoning over research information? If they can, our next goal is to develop a knowledge structure embracing CRIS systems, so that Austrian public research information will better serve the information needs of researchers, policy-makers, and investors.

From "Scientific publishing on the 'semantic web'" by T. Berners-Lee and James Hendler:

"In the long run, the effects on publishing may be far more profound. There is an eternal conflict between operating rapidly as a small group and taking the time to communicate more widely. The former is more efficient but produces a subculture whose concepts and results are not understood by others. The latter can be painfully slow. The world works as a spectrum between these extremes, with a tendency to start small - from the personal idea - and filter over time towards a wider commonality of concept. The joining together of subcultures when there is a need for a wider common language is an essential process in the development of human communication.

The semantic web will facilitate the development of automated methods for helping users to understand the content produced by those in other scientific disciplines. On the semantic web, one will be able to produce machine-readable content that will provide, say, automated translation between the output of a scientific device and the input of a datamining package used in some other discipline, or a self-evolving translator that allows one group of scientists to directly interact with the technical data produced by another.

These new products will allow users to create relationships that allow communication when the commonality of concept has not (yet) led to a commonality of terms. The semantic web will provide unifying underlying technologies to allow these concepts to be progressively linked into a universal web of knowledge, and will therefore help to break down the walls erected by lack of communication, and allow researchers to find and understand products from other scientific disciplines. The very notion of a journal of medicine separate from a journal of bioinformatics, separate from the writings of physicists, chemists, psychologists and even kindergarten teachers, will someday become as out of date as the print journal is becoming to our graduate students."

Research metadata ontologies and schemas

Semantic Web Content Accessibility Guidelines for Current Research Information Systems (CRIS) and Web content developers of research relevant information at the universities and research institutions   (MS Word version) (June 2001) 

OIL Ontology for CERIF-2000, DAML-OIL Ontology for CERIF-2000, HTML Presentation of CERIF-2000 Ontology (06 July 2001)

Simple demonstration of publishing data from CERIF database into RDF

The main intention of joint CERIF + XXX ontology development (where XXX is Math-Net, KA2, UMD.edu for universities, or EPA - the USA Environmental Protection Agency) is to provide access to data described in metadata formats other than CERIF-2000. Describing the terms and vocabularies of those metadata formats in CERIF terms will allow data described in those formats to be included in the AURIS-MM network.

For example, homepages, publications, projects, and project pages can be described in German using the Math-Net metadata format, which is based on the Dublin Core metadata set. The Math-Net metadata format, intended for the description of research information, differs from CERIF. But German information resources have real value for Austrian researchers - potential users of AURIS-MM - and it would be worthwhile to make them accessible through AURIS-MM as well. Some researchers in Austria may use the Math-Net metadata set, and AURIS-MM should not lack information about their research.

If a page is tagged as "Software Library" in the Math-Net schema, researchers should be able to find it when searching for "Result products" in CERIF. If a page is tagged as "Bibliographic Search" in the Math-Net schema, it is of interest to researchers looking for services of institutes and to those looking for publications.

Even if we have gathered information about pages and other information resources together with their Math-Net descriptions, the question remains how to make this information classified and searchable in terms of AURIS-MM (CERIF). To do this, we develop ontologies that include terms from different vocabularies (CERIF and others, for example Math-Net) and describe the relations between the terms of these vocabularies.

For example:

In CERIF + Math-Net it is stated that Math-Net News is covered by CERIF Event.

This means that any page tagged as "News" in Math-Net (in Math-Net, News may be Conferences, Position, or ScheduleOfEvents) contains information about one or more resources that are Events for CERIF. So if a researcher asks AURIS-MM to find pages about scientific or academic events, then along with the data and pages entered and registered according to CERIF, they will also get links to the Math-Net tagged pages.
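
The effect of such a "covered by" relation on searching can be sketched in a few lines. This is a simplified illustration, not the AURIS-MM implementation: the page records and URLs are hypothetical, and the CERIF term names in the mapping are approximations of the examples given in the text.

```python
# Inter-vocabulary relations: Math-Net term -> CERIF terms that cover it.
# The CERIF term names are approximations used only for this illustration.
COVERED_BY = {
    "News": {"Event"},
    "Software Library": {"Result product"},
    "Bibliographic Search": {"Service", "Publication"},
}

# Hypothetical registered pages, each tagged in one of the two vocabularies.
PAGES = [
    {"url": "http://example.org/a", "schema": "CERIF", "term": "Event"},
    {"url": "http://example.org/b", "schema": "Math-Net", "term": "News"},
    {"url": "http://example.org/c", "schema": "Math-Net", "term": "Software Library"},
]


def find_pages(cerif_term):
    """Return URLs of pages that match a CERIF term directly or via the mapping."""
    hits = []
    for page in PAGES:
        if page["schema"] == "CERIF":
            terms = {page["term"]}
        else:
            # Translate the foreign term into the CERIF terms that cover it.
            terms = COVERED_BY.get(page["term"], set())
        if cerif_term in terms:
            hits.append(page["url"])
    return hits
```

A search for "Event" returns both the page registered according to CERIF and the Math-Net page tagged "News". A real system would keep these relations in the ontology itself rather than in a hand-written table.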

There are several methods for developing ontology-based integration of information systems (see E. Mena, V. Kashyap, A. Sheth, A. Illarramendi, "Observer: An Approach for Query Processing in Global Information Systems based on Interoperation across Pre-existing Ontologies", International Journal Distributed and Parallel Databases, 1998):

  1. A global ontology-based approach. The domain ontologies are integrated into one global ontology, which is partitioned into microtheories. This approach is very difficult to maintain and may be impossible for current CRIS.
  2. A group of "loosely coupled" approaches advocated in "Observer:...", where instead of integrating the pre-existing ontologies, interoperation between them is achieved via terminological relationships represented between terms across the ontologies.

In AURIS-MM we try to implement the second approach for CRIS.

OIL Ontology for CERIF + DublinCore and Math-Net metadata set (22 July 2001)

DAML + OIL Ontology for CERIF + DublinCore and Math-Net metadata set (22 July 2001)

HTML presentation of CERIF + DC and Math-Net ontology (22 July 2001) (Unfortunately, OilEd did not generate an HTML presentation of the full ontology, so it is truncated on this page)

RDF for researcher's profiles (based on Community of Science's researcher description). Example, Graphical representation of example

RDF Schema for CERIF (specified in W3 and Ontoknowledge terms)(06 July 2001)

RDF Schema for CERIF (specified in W3 terms)(06 July 2001)

RDF Schema for CERIF (1 May 2001  )

OIL Ontology for CERIF and KA2(Ontobroker) combined (included ontology of Dieter Fensel) 

HTML presentation of Ontology for CERIF and KA2(Ontobroker) combined (included ontology of Dieter Fensel) 

The old version of CERIF RDF Schema (May 2001)

RDF schema and extended application profiles framework for developing domain specific metadata for research and science. RDF schema was proposed as a data encoding tool for data presentation and data exchange between Current Research Information Systems   (CRIS) at 9th EuroCRIS platform meeting. RDF Schema for CERIF is a part of framework (html presentation) for developing data integration solutions for Information System for Science. 

Some RDF about Austrian projects (more human-readable, but some parsers do not understand the abbreviated syntax), another RDF presentation. Some projects and persons of Vienna University of Technology DAML file, projects and orgunits DAML file

Ortelius presentation as ontology, ontology for CRIS description

Distributed database technologies for CERIF-2000

Information retrieval of annotated knowledge about research

Additional information about semantic retrieval of research data (paper for conference)

Toolkit for querying RDF Data (RDQL (Jena), RQL (Sesame, FORTH) compatible)
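
To give an idea of what querying RDF data involves, here is a minimal triple-pattern matcher. It is not the toolkit mentioned above and it does not implement RDQL or RQL; it only shows the basic idea those languages share, namely matching patterns with variables against a set of triples. All identifiers and data are hypothetical.

```python
# Hypothetical RDF data as (subject, predicate, object) triples.
TRIPLES = [
    ("ex:project1", "ex:title", "Semantic Web for CRIS"),
    ("ex:project1", "ex:manager", "ex:person1"),
    ("ex:person1", "ex:name", "A. Example"),
    ("ex:project2", "ex:manager", "ex:person1"),
]


def is_var(term):
    """Variables are written with a leading question mark, e.g. ?proj."""
    return term.startswith("?")


def match(pattern, triple, binding):
    """Extend a variable binding so that pattern equals triple, or return None."""
    new_binding = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            if new_binding.setdefault(p, t) != t:
                return None  # variable already bound to a different value
        elif p != t:
            return None  # constant does not match
    return new_binding


def query(patterns, triples=TRIPLES):
    """Return all variable bindings that satisfy every pattern (a conjunction)."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for binding in bindings:
            for triple in triples:
                result = match(pattern, triple, binding)
                if result is not None:
                    extended.append(result)
        bindings = extended
    return bindings
```

For instance, the patterns ("?proj", "ex:manager", "?who") and ("?who", "ex:name", "A. Example") together ask for all projects managed by the person with that name.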

How to deal with vocabulary.

How to accumulate new thesaurus, provided by data source

If information seeker vocabulary is different from data source vocabulary

How to investigate relation between terms inside vocabulary and between vocabularies

How to represent vocabulary terms in visual forms for data request.

Visual interface for querying RDF data

Metadata harvesting

Once the metadata describing research information are published, they should be harvested to build a research information base. There are several possible ways to do this:

Use of content dissemination technologies based on a functional API or on sets of requests over HTTP, such as the Metadata Harvesting protocol of the Open Archives Initiative.

Crawling the web using information about scattered RDF/metadata databases.

Open Archives Initiative (OAI)

The Open Archives Initiative develops technology to facilitate the efficient dissemination of content. The current primary application of OAI is interoperability of research information systems, opening up much broader access to digital materials about research, especially e-prints.

OAI is managed by a Steering Committee (policy decisions) and a Technical Committee (interoperability infrastructure). The main coordinators are Herbert Van de Sompel and Carl Lagoze, at Cornell University.

"Support for Open Archives Initiative activities comes from the Digital Library Federation and the Coalition for Networked Information. Additional support for work on Open Archives Initiatives protocols comes from National Science Foundation Grant No. IIS-9817416 and Defense Advanced Projects Agency Grant No. N66001-98-1-8908." (OAI FAQ)

The OAI protocol allows information systems to

  • browse each other's collections of metadata describing information resources
  • find out which metadata standards the other system uses
  • interchange metadata

OAI is implemented as a protocol over HTTP. OAI and other participants in OAI provide a number of software tools for publishing data through OAI, for data access, and for harvesting data from other repositories.

A CRIS can use OAI as a

  • Data Provider - publishing data so that end users with a repository browser, or other CRIS, can use its data
  • Service Provider - using the metadata of other CRIS for building value-added services
  • Data Harvester - harvesting research data from other CRIS for its own needs

OAI is not intended to replace technologies such as Z39.50 or LDAP, but it is an easy-to-implement and easy-to-deploy alternative for developing interoperable solutions (for CRIS, for example).

Currently about 35 or more systems use OAI, among them the American Philosophical Society, Caltech Computer Science Technical, and arXiv (an advanced system supported by the National Science Foundation for access to articles and e-prints in Physics, Mathematics and Computer Science).

OAI homepage http://www.openarchives.org
OAI FAQ http://www.openarchives.org/faq.htm
Overview paper http://www.cs.cornell.edu/lagoze/papers/oai-jcdl.pdf
OAI protocol http://www.openarchives.org/OAI/openarchivesprotocol.htm
Information Systems, supporting OAI (repositories)
http://oaisrv.nsdl.cornell.edu/Register/BrowseSites.pl
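
Because OAI is a protocol over HTTP, a harvesting request is just a URL with a verb and a few arguments. The sketch below only builds such URLs and makes no network calls. The repository address is hypothetical; the verb and argument names follow the published OAI protocol as I understand it and should be checked against the protocol version a given repository supports.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical repository address, used only for illustration.
BASE_URL = "http://repository.example.org/oai"


def oai_request_url(base_url, verb, **arguments):
    """Build the URL for one OAI request: a verb plus optional arguments."""
    params = {"verb": verb}
    params.update({k: v for k, v in arguments.items() if v is not None})
    return f"{base_url}?{urlencode(params)}"


# Find out which metadata formats the other system offers.
formats_url = oai_request_url(BASE_URL, "ListMetadataFormats")

# Browse the identifiers of the records in the collection.
identifiers_url = oai_request_url(BASE_URL, "ListIdentifiers")

# Interchange metadata: ask for records in unqualified Dublin Core.
records_url = oai_request_url(BASE_URL, "ListRecords", metadataPrefix="oai_dc")
```

A harvester would fetch these URLs over HTTP and parse the XML responses, which is the part that the existing OAI software tools take care of.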


 

RDF Crawling

Another way to collect research information from different information providers is the following:

  • information providers publish information about research as RDF (possibly DAML+OIL) files or as marked-up HTML files
  • Research Knowledge Bases, or simply CRIS, collect that information with crawlers - software agents that can analyze the content of a page, determine whether the page is relevant, download it if it is, find which pages it references, and then analyze and download the referenced pages
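
The crawling loop described above can be sketched as follows. This is not the Ontobroker RDF Crawler: the "web" is a small in-memory dictionary so that the example runs without network access, and the relevance test is a deliberately naive keyword check. All URLs and page contents are hypothetical.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page content, URLs the page references).
WEB = {
    "http://a.example/index.rdf": (
        "rdf research projects",
        ["http://a.example/people.rdf", "http://b.example/shop.html"],
    ),
    "http://a.example/people.rdf": (
        "rdf research persons",
        ["http://a.example/index.rdf"],
    ),
    "http://b.example/shop.html": (
        "buy cheap things",
        ["http://b.example/more.html"],
    ),
    "http://b.example/more.html": ("rdf research hidden", []),
}


def is_relevant(content):
    """Naive relevance test, standing in for real content analysis."""
    return "rdf" in content and "research" in content


def crawl(seeds, fetch=WEB.get, max_pages=100):
    """Breadth-first crawl: keep relevant pages and follow only their references."""
    collected = {}
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(collected) < max_pages:
        url = queue.popleft()
        page = fetch(url)
        if page is None:
            continue  # page could not be retrieved
        content, links = page
        if not is_relevant(content):
            continue  # irrelevant pages are neither stored nor followed
        collected[url] = content
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return collected
```

Starting from the index page, the crawl collects the two relevant pages of the first site and stops at the irrelevant shop page, so the page behind it is never reached. Whether to follow references from irrelevant pages is a design choice; following them finds more but costs more downloads.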

RDF Crawler by Ontobroker

Some RDF information from this site gathered by RDF Crawler

 

Description of Research Information providers and services, searching and browsing information about CRIS, semantic description of CRIS

Sorry, this section will appear a little later.

Resource Guide to Metadata for Science and Research

Metadata  for Science News

07 January 2002 [Ontologies]. People ontology. A general ontology of people by gender, age, occupation, and beliefs; it can be used for sociological research information systems. The ontology is not expressed in a formal language like DAML + OIL.

07 January 2002[Usage study]. How Do Physicists Use an E-Print Archive?. D-Lib, Dec-2001. Analysis of use of e-print services

07 January 2002 [Ontologies; tools]. KAON - The Karlsruhe Ontology and Semantic Web Tool Suite. An integrated set of tools for creating and maintaining DAML + OIL ontologies, ontology-based annotation of knowledge, creating RDF, and publishing the content of relational databases as RDF. A web layer is planned. Comment: I have already tried Ontomat and found it a rather good tool for end users, despite some problems with understanding ontologies and with the presentation of large sets of instances. REVERSE is also an excellent tool, and I do not know of anything like it, but it does not provide enough flexibility. If they have an open API, I will integrate my toolset into REVERSE (to support some operational semantics, domain mappings, different presentations of objects in the database, and some problems with many-to-many relations and their mapping to relations between resources). I highly recommend this toolset. I also used the RDF Crawler from the same team, and it works perfectly in creating an RDF model from files scattered on the web. I have an application that connects RDF Crawler to the Jena RDF toolset and integrates it with a JSP platform for web site registration and crawling.

07 January 2002. [Ontologies; specific scientific metadata standards] An use of DAML + OIL: an ontology in the ophtalmology domain. Application of DAML + OIL to a heuristic application in the ophthalmology domain. The article shows a real application of DAML + OIL to a scientific domain.

07 January 2002. [Ontologies; distributed information retrieval; specific scientific metadata standards]Project TAMBIS.  Transparent access to heterogeneous distributed sources of biological information. Ontology-driven user interface, description logics. Ontology for biological data. see also. Online demonstration

Baker, P.G., Goble, C.A., Bechhofer, S., Paton, N.W., Stevens, R. and Brass, A. An Ontology for Bioinformatics Applications, Bioinformatics, Vol 15, No. 6, 510--520, 1999.[zipped postscript 103K] [PDF 270K].

07 January 2002 [Ontologies; specific scientific metadata standards]. Biological Ontology Committee in Japan. A set of biological ontologies accepted by Japanese scientists. Also a collection of bio-ontologies from around the world.

04 October 2001. (CIDX) The Chemical Industry Data Exchange has announced a public release of new XML format for chemical industry (Chemical eStandard). The new transactions developed include: certificate of analysis, report of testing results, invoice response, shipment status request, shipment instructions, price and availability request, and price and availability response. Announcement, XML Cover reference page

29 September 2001. ETD-ms. Metadata Standard for Electronic Theses and Dissertations. A set of metadata elements, based on Dublin Core, for describing electronic theses and dissertations.

23 September 2001. Vocabulary: Universal Standard Products and Services Classification (UNSPSC). A vocabulary developed to classify both products and services throughout the global marketplace. In Current Research Information Systems the vocabulary can be used to describe the industrial relevance of applied research projects and results. View or search vocabulary, crosswalk files


10 September. Ontology of Science (Stanford Protege project) by Fred Freitas. An ontology for scientific and educational information. Base terms: Scientific Document (20 subterms), Organization (6 subterms), Project (3 subterms), Event (12 subterms), Person (15 subterms), Product (1 subterm). Based on the KA2 ontology developed by the Knowledge Annotation Initiative of the Knowledge Acquisition Community. I generated an HTML presentation of the Ontology of Science (using Protege): HTML

09 September. Some very specific ontologies for science (algebra, matrices, etc.) are provided by KSL at Stanford. Please look at the Ontolingua Ontologies part of the Resource Guide.

09 September. Guideline Interchange Format (latest version: Draft 3.4, April 2001, InterMed Laboratory). A specification for the structured representation of medical guidelines. It allows various types of guidelines and guideline content to be described, and is excellent for describing complicated decision processes (such as what to do to cure a patient of illness XXX if ...). Description of guideline documents. RDF Schema, Ontology, UML diagrams, ER diagrams, and GLIF Expression Language (GEL) are provided.

07 September. Preliminary agenda of the EuroCRIS 10th meeting published at the EuroCRIS web site. On 31 October 2001: Workshop on CERIF-2000 by the CERIF Task Group.

07 September. RDF Calendar - a metadata standard based on iCalendar for scheduling, events, and itinerary descriptions. Based on RDF; an RDF Schema for RDF Calendar is available. The format could serve for describing itineraries of research conferences, scheduling of research projects, etc. A live data example, the RDF Calendar itinerary of the International Semantic Web Working Symposium (SWWS), can be seen at the SWWS page. There is also a demonstration of querying calendar data. The format is developed and discussed on the RDF Calendar mailing list.

04 September. EuroCRIS Newsletter. Edition 6. Editor Eric Zimmerman. Research Information systems, technologies, funding opportunities, standards, conferences. Newsletter of EuroCRIS

03 September.  Math-Net Page Application Profile 1.1. created 2000-12-07. Application profile for description mathematic research in  web pages. Contains vocabulary for describing researchers, results, publications. RDF Schema proposed and set of vocabularies, guidelines how to create pages and process information. Part of complete Math-Net Research information RDF Schema. Please, look at Math-Net  + CERIF ontology to see relations between CERIF and Math-Net terms.

03 September 2001. C. Lagoze, J.Hunter, "The ABC Ontology and Model", accepted at DC-2001, Tokyo, October 2001. The ABS is ontology and framework for describing information resources. Contains temporality Category, Actuality Category, Abstraction.   Could be very useful for CRIS which support the lifecycle of objects, or object of which might be changed and how they we changed should be described in CRIS, actual/non-actual. 

03 September 2001. J. Hunter, "Adding Multimedia to the Semantic Web - Building an MPEG-7 Ontology", SWWS, Stanford, July 2001. An ontology for describing multimedia information is proposed. The ontology contains a vocabulary of multimedia types and of actions over multimedia content. It can be used for CRIS that contain multimedia information for presenting research results (especially in technology, exact sciences and multimedia for the humanities).

02 September 2001. RDFWeb - an information system providing access to information about the people and research of the Semantic Web community. Implemented in two versions: as an RDBMS-based system and as an RDF-based system. It also serves as an RDF metadata format for describing researchers and their research. Sample RDF description of Dan Brickley. The vocabulary is still in flux. Sample page for describing researchers and generating RDF. Simple SquishQL query to the metadata database.

28 August 2001. MiREG Metadata Framework issued: a draft of extensions to the Dublin Core elements, based on the draft UK e-Government metadata standards.

28 August 2001. New version 0.9.10 of the Redland RDF Application Framework issued. Redland is a set of tools and interfaces for storing, querying and manipulating RDF models. Among other things, a Java API was added. The framework is issued by Dave Beckett of the Institute for Learning and Research Technology. The Institute does research in a number of areas closely related to CRIS, such as virtual learning tools and environments and digital libraries.

1st August. EPrints 2 Alpha-1 has been released. EPrints is self-archiving software for e-print management and publication, developed at the Electronics and Computer Science Department of the University of Southampton. It is fully compatible with the OAI protocol as a data provider.

29 June 2001. During the ALA 2001 Annual Convention, NISO (the National Information Standards Organization) and BASIC (the Book and Serial Industry Communications) jointly presented a forum on the challenges of integrating electronic journals into library collections and on the standards now in development that will enable publishers and librarians to deliver better access to, and more information about, these resources. In his presentation "Developments in Metadata: ONIX for Serials", Brian Green, Managing Agent of BIC/EDItEUR (Book Industry Communication), presented an international metadata standard for describing e-serials and e-publications.

L. Hill, S. Crosier, T. Smith, M. Goodchild, "A Content Standard for Computational Models", D-Lib Magazine, June 2001. The article describes the Content Standard for Computational Models (CSCM), a metadata standard developed for describing scientific computational models.

22 June 2001. Ariadne 28 published. Ariadne is a quarterly magazine for information science professionals in academia and higher education.

Philip Hunter, "The Management of Content: Universities and the Electronic Publishing Revolution." On digital publication and content management systems (CMS) in the university sector, and the role of university publication in the current electronic publishing revolution.

Michael Day, "E-print Services and Long-term Access to the Record of Scholarly and Scientific Research". A review of e-print services (free access to the scholarly and scientific literature). Metadata for e-prints.

14 June 2001. Caltech Registers Two Repositories with the Open Archives Initiative. Caltech registered repositories of conference proceedings and technical reports.

1st June 2001. The UK Collection Description Focus (CD-Focus) launched. The aim of the Focus is to develop metadata schemas and tools for the description of collections.

April 2001. John S. Erickson, "Information Objects and Rights Management, A Mediation-based Approach to DRM Interoperability", D-Lib Magazine. The article discusses architectures and metadata standards for Digital Rights.

11 Oct 2009. The Defense Virtual Library (DVL) issued metadata schemas for the description of moving images, still images and sound. Notification

Metadata. General issues.

Metadata Activity Statement by W3 Consortium.

Metadata course by European Schoolnet, excellent first introduction to metadata issues

DIGITAL LIBRARIES: Metadata Resources, Collection of metadata tools, standards, projects

Metadata.net. Projects, tools, standards in metadata

IEEE Meta-Data. Forum on metadata issues, collection of links to metadata events, projects

Metadata. Set of links to standards, tools, catalogs  

An Introduction to Metadata, Chris Taylor, University of Queensland Library

Schemas-forum metadata watch.  The Metadata Watch is an activity that intends to build a broad and comprehensive list of projects, programmes, software tools and guidelines that use -- or describe how to use -- metadata schemas. The SCHEMAS partners will compile this in collaboration with a network of domain specialists.

Metadata. Science, Research, Technology 

General research metadata standards. (For publishing general information about research activities, using such terms as project, researcher, article, university)

Tim Berners-Lee, James Hendler,  "Scientific publishing on the 'semantic web'", Nature

Short review of Semantic Web potential for scientific publishing. Main idea

M. Grotschel, L. Lugger, "Scientific Information systems and Metadata", Classification in the Information Age. Proc. of the 22nd Annual GfKl Conference, Dresden, March 4-6, 1998.

Report of researchers from Konrad-Zuse-Zentrum für Informationstechnik Berlin about research metadata formats and vocabularies

Lopatenko A. S. "Current Research Information Systems. Review" (Russian only), expanded version of Lopatenko A. S., Kulagin M. V. "Current Research Information Systems and Digital Libraries. Needs for integration", to appear in the proceedings of "Digital Libraries: Advanced Methods and Technologies, Digital Collections", Sep. 2001, html (rus)

Review of Research Information System projects in Europe, the US and other countries. Having analyzed many CRIS (42 articles from EuroCRIS conferences and others), the author describes general CRIS applications, the main technical requirements for CRIS, and implementation methods for CRIS. The main metadata standards for Research and Technology are also described in the article.

Descriptions of dissertations in RDF (Germany)

RDF Schema - Metadatensatz für die Implementierung in RDF (metadata set for implementation in RDF, in German)

My Meta Maker for Theses2000 Metadata creation toolset

Math-Net Page Application Profile 1.1, created 2000-12-07. An application profile for describing mathematical research in web pages. It contains a vocabulary for describing researchers, results and publications. An RDF Schema is proposed, together with a set of vocabularies and guidelines on how to create pages and process the information. Part of the complete Math-Net Research Information RDF Schema. Please look at the Math-Net + CERIF ontology to see the relations between CERIF and Math-Net terms.

Math-Net metadata sets (preprints, researchers' homepages, software, documents on research)

Math-Net tools for metadata creation

The home page of this server is tagged with Math-Net schema metadata elements. You can see a visual description of the page as shown by the Math-Net metadata tools.

 

KA2 - Knowledge Acquisition Community Ontology

ETD-ms. Metadata Standard for Electronic Theses and Dissertations. A set of metadata elements, based on Dublin Core, for describing electronic theses and dissertations.

EPA Scientific Metadata Standards Project

EULER Project: European Libraries and Electronic Resources in Mathematical Sciences

Joint Committee on Atomic and Molecular Physical Data Exchange Standards (JCAMP-DX)

LBNL EPA Scientific Metadata Standards Project

RDF Calendar - a metadata standard based on iCalendar for describing schedules, events and itineraries. It is based on RDF; an RDF Schema for RDF Calendar is available. The format could serve for describing research conference itineraries, scheduling of research projects, etc. A live data example, the RDF Calendar itinerary of the International Semantic Web Working Symposium (SWWS), can be seen at the SWWS page. There is also a demonstration of querying calendar data. Development of the format is discussed on the RDF Calendar mailing list.
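
As a rough illustration of how iCalendar properties might be restated as RDF-style statements, the following Python sketch turns one VEVENT block into (subject, property, value) triples. The namespace http://example.org/calendar#, the event URI and the sample event are assumptions made for this example; they are not taken from the RDF Calendar schema, whose actual property names should be checked in the schema itself.

```python
# Hypothetical namespace used only for this sketch; NOT the RDF Calendar schema.
EX = "http://example.org/calendar#"


def vevent_to_triples(ical_text, subject):
    """Turn the 'NAME:value' lines of the first VEVENT block into triples."""
    triples = []
    inside = False
    for raw in ical_text.splitlines():
        line = raw.strip()
        if line == "BEGIN:VEVENT":
            inside = True
        elif line == "END:VEVENT":
            break
        elif inside and ":" in line:
            name, value = line.split(":", 1)
            # Drop iCalendar parameters such as ';TZID=...' from the property name.
            name = name.split(";", 1)[0]
            triples.append((subject, EX + name.lower(), value))
    return triples


# Invented sample event, for illustration only.
SAMPLE = """BEGIN:VCALENDAR
BEGIN:VEVENT
SUMMARY:Workshop on research metadata
DTSTART:20020101T090000Z
DTEND:20020101T170000Z
LOCATION:Example City
END:VEVENT
END:VCALENDAR"""

triples = vevent_to_triples(SAMPLE, "http://example.org/events/workshop")
for triple in triples:
    print(triple)
```

The sketch ignores iCalendar line folding and nested components, so it is only suitable for simple, flat event descriptions.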
 

RDFWeb - an information system providing access to information about the people and research of the Semantic Web community. Implemented in two versions: as an RDBMS-based system and as an RDF-based system. It also serves as an RDF metadata format for describing researchers and their research. Sample RDF description of Dan Brickley. The vocabulary is still in flux. Sample page for describing researchers and generating RDF. Simple SquishQL query to the metadata database.
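
The idea behind querying such a metadata database can be sketched in a few lines of Python: a query is a list of triple patterns with variables, and the result is every set of variable bindings that satisfies all patterns. This is only a toy illustration of the principle, not SquishQL itself and not the RDFWeb implementation; the identifiers (person:1, ex:name, ex:worksOn, etc.) and the data are hypothetical.

```python
# Toy in-memory triple store: (subject, predicate, object).
# All identifiers and values below are hypothetical.
TRIPLES = [
    ("person:1", "ex:name", "Alice Example"),
    ("person:1", "ex:mbox", "mailto:alice@example.org"),
    ("person:1", "ex:worksOn", "project:rdf-tools"),
    ("person:2", "ex:name", "Bob Example"),
    ("person:2", "ex:worksOn", "project:rdf-tools"),
    ("person:3", "ex:name", "Carol Example"),
]


def is_var(term):
    """Variables are written with a leading '?'."""
    return term.startswith("?")


def match(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`; None if impossible."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            if new.setdefault(p, t) != t:
                return None  # variable already bound to a different value
        elif p != t:
            return None  # constant term does not match
    return new


def query(patterns, triples):
    """Return all bindings satisfying every pattern (a conjunctive query)."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in triples:
                extended = match(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


# "Names of everybody who works on project:rdf-tools"
results = query(
    [("?who", "ex:worksOn", "project:rdf-tools"), ("?who", "ex:name", "?name")],
    TRIPLES,
)
names = sorted(b["?name"] for b in results)
print(names)
```

Because the variable ?who appears in both patterns, the two patterns are joined on the same person, which is the essential behaviour of pattern-based RDF query languages.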

Background about GRIDs, cut from the ERGO BBS
ERCIM news (April 2001) Special theme: GRIDS: e-Science to e-Business

Network Common Data Format (NetCDF)

Planetary Data System (PDS)

DESIRE project  - the project's focus was on enhancing existing European information networks for research users across Europe through research and development in three main areas of activity: Caching, Resource Discovery and Directory Services.

Educational metadata standards

UNIVERSAL data model - a model for describing educational (learning) resources, their scheduling, offers of learning resources, annotations, requirements for using resources, and delivery systems. Developed for the exchange of information about learning resources and for their shared use by European universities. Part of the Universal project. The model was developed by the New Media Working Group of the Department of Information Systems at the Vienna University of Economics and Business Administration.
 

DCMI (Dublin Core Metadata Initiative) Education Metadata Set.

IEEE Learning Technology Standards Committee (LTSC) - develops IEEE standards to facilitate the development, deployment, maintenance and interoperation of computer implementations of education and training components and systems. The LOM (Learning Object Metadata) standard was developed by the Committee. Standard description (draft, 7 Nov 1999), Requirements specification (Learning Object Metadata Framework, 19 Nov 1997), IMS Learning Resource Meta-data Best Practices and Implementation Guide (2001, Final specification), IMS Learning Resource Meta-data XML Binding specification (2001, Final specification)

European SchoolNet metadata standard - a standard for exchanging information about scholarly resources, based on Dublin Core. It contains vocabularies for classifying persons, homepages, publications, software and datasets (results of university projects). Developed by European SchoolNet.

NGfL Scotland (National Grid for Learning Scotland) metadata standard - primarily intended for the description of information resources related to learning and scholarship. An adaptation of the European SchoolNet metadata standard.

eduPerson - defines an LDAP object class that includes person attributes widely used in higher education. A proposal of Net@EDU. Net@EDU was created in July 1998 with the merger of the Networking and Telecommunications Task Force (NTTF) and the Federation of American Research Networks (FARNET).
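
To give an impression of what a directory entry using this object class may look like, the Python sketch below renders one entry as LDIF text. eduPersonAffiliation and eduPersonPrincipalName are attributes of the eduPerson class; the person, the DN, the values and the choice of inetOrgPerson as the accompanying class are assumptions made for this example only.

```python
def to_ldif(dn, attributes):
    """Render one directory entry as LDIF: a 'dn:' line, then 'attribute: value' lines."""
    lines = [f"dn: {dn}"]
    for name, values in attributes.items():
        for value in values:
            lines.append(f"{name}: {value}")
    return "\n".join(lines) + "\n"


# Hypothetical entry; all values are invented for illustration.
entry = {
    "objectClass": ["inetOrgPerson", "eduPerson"],
    "cn": ["Jane Researcher"],
    "sn": ["Researcher"],
    "mail": ["jane@example.edu"],
    "eduPersonAffiliation": ["faculty", "member"],
    "eduPersonPrincipalName": ["jane@example.edu"],
}

ldif_text = to_ldif("uid=jane,ou=people,dc=example,dc=edu", entry)
print(ldif_text)
```

The helper handles only plain ASCII values; values that LDIF requires to be base64-encoded are outside the scope of this sketch.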

Specific science metadata standards (very special, describing computational models, geographical data, nuclear particles)

International Mathematic Metadata Task Force (IMU) - a committee to create a uniform semantic base for metadata for the whole of mathematics and to combine existing metadata sets for various types of mathematical resources into a consistent set covering all objects relevant to mathematics

Subject Classification for School and College Mathematics - a thesaurus of mathematical objects by the American Mathematic Metadata Task Force

Base Schema for Mathematic Metadata, derived from the IMS/LOM schemes by the American Mathematic Metadata Task Force

A number of documents by AMMTF useful for learning about and using metadata for mathematics

Metadata for mathematical preprints - a set of metadata elements defined as qualifiers to Dublin Core, with embedded HTML encoding (meta tags). Part of the Math-Net application profile
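
A minimal Python sketch of how Dublin Core elements embedded in HTML meta tags can be read back by a harvester is given below. The sample page and the qualifier name DC.Subject.MSC are assumptions made for illustration; the element names actually used for preprints should be taken from the Math-Net application profile.

```python
from html.parser import HTMLParser


class DublinCoreMetaParser(HTMLParser):
    """Collect <meta name="DC...." content="..."> pairs from an HTML page."""

    def __init__(self):
        super().__init__()
        self.elements = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = attr.get("name") or ""
        content = attr.get("content")
        # Keep only names with the conventional "DC." prefix (case-insensitive).
        if name.lower().startswith("dc.") and content is not None:
            self.elements.setdefault(name, []).append(content)


# Hypothetical page; element names and values are invented for illustration.
SAMPLE_PAGE = """
<html><head>
<title>A preprint</title>
<meta name="DC.Title" content="On a hypothetical theorem">
<meta name="DC.Creator" content="Example, A.">
<meta name="DC.Creator" content="Sample, B.">
<meta name="DC.Subject.MSC" content="11A41">
<meta name="keywords" content="not Dublin Core">
</head><body></body></html>
"""

parser = DublinCoreMetaParser()
parser.feed(SAMPLE_PAGE)
dc = parser.elements
for name, values in dc.items():
    print(name, values)
```

Repeated elements such as DC.Creator are kept as lists, since Dublin Core elements are repeatable.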

Biological Data Profile, October 1999, Biological Data Working Group, Federal Geographic Data Committee and USGS Biological Resources Division. A common set of terminology and definitions to describe biological data. Text on FGDC web site, Biological Data Profile Workbook, USGS page for BDP. Application profile for Z39.50

Spatial Data Standards for Facilities, Infrastructure, and Environment (SDSFIE). The SDSFIE have focused on the development of graphic and nongraphic standards for GIS implementations at Air Force, Army, Navy, and Marine Corps installations, U.S. Army Corps of Engineers Civil Works activities, and other Government organizations. From the press release: "NCITS 353 is a nonproprietary geographic information (GI) standard for use with off-the-shelf Geographic Information System (GIS), Computer Aided Design and Drafting (CADD), and relational database software. The standard coupled with this software supports comprehensive master planning, environmental planning, and site planning, engineering, and lifecycle maintenance for facilities/installations, infrastructure, and environmental applications."  Usability study for geospatial data

Some very specific ontologies for science (algebra, matrices, etc.) are provided by KSL at Stanford. Please look at the Ontolingua Ontologies part of the Resource Guide.

The OpenMath standard. A standard for encoding mathematical information. It is the result of the Esprit OpenMath Project and is currently being used by the OpenMath Society. It consists of standards for encoding mathematical objects and for encoding content dictionaries. See also:

Xuehong Li, "XML and the Communication of Mathematical Objects", Master's Thesis, University of Western Ontario, London, Ontario, 1999.

OMDoc - Open Mathematical Documents. An extension of the OpenMath and MathML standards for the description of mathematical documents and formulae; it concentrates on the meaning of formulae. An XML-based standard.

Biological Ontology Committee in Japan. A set of biological ontologies accepted by Japanese scientists. Also a collection of bio-ontologies from around the world

Content Standard for Digital Geospatial Metadata (CSDGM). The objectives of the standard are to provide a common set of terminology and definitions for the documentation of digital geospatial data. The standard establishes the names of data elements and compound elements (groups of data elements) to be used for these purposes, the definitions of these compound elements and data elements, and information about the values that are to be provided for the data elements.

Content Standard for Computational Models (CSCM). Metadata standard to describe scientific computational models. Developed in Alexandria Digital Earth Prototype (ADEPT), NSF Digital Library II project 

Guideline Interchange Format (GLIF; latest version Draft 3.4, April 2001, InterMed Laboratory). A specification for the structured representation of medical guidelines. It can describe various types of guidelines and guideline content, and is particularly good at describing complicated decision processes (such as what to do to cure a patient of illness XXX if ...). It also covers the description of guideline documents. An RDF Schema, ontology, UML diagrams, ER diagrams and the GLIF Expression Language (GEL) are provided.

ISO Space. Set of ISO data and metadata standards for space data and information transfer system

Unclassified metadata formats useful for research

Collection descriptions - review of collection description projects, formats by Andy Powell

Industrial research, innovations, management

This section describes metadata formats intended for industrial research, technologies, research management and processes, as well as metadata formats that can be applied to industrial research, innovation and research processes.

StraDiWare - a project supervised by Prof. Keith Jeffery, Rutherford Appleton Laboratory. "The aim of the project is to formalize linkage between business objectives, decision to implement IT systems and IT systems themselves". The project is aimed at IST development lifecycles, but its well-developed data models, logical models, intentional models, conceptual models, philosophy and architecture could be useful for innovation and industrial research projects.

The Business  Process Management Initiative - initiative to standardize the management of business processes that span applications, departments.

Metadata databases.

Epaminondas Kapetanios and Moira C. Norrie, Semantic Querying of Scientific Data through a Context Meta-data Database, ERCIM News 35,

Value-added metadata

This section describes metadata standards intended for providing value-added services, such as collection description metadata or content rating metadata.

Quality rating in RDF. A report of Desire.org. The report discusses the use of metadata for web indexing and quality rating. It could be useful for CRIS with expert evaluation of research results or input data, and for information systems in innovation.

C. Lagoze, J. Hunter, "The ABC Ontology and Model", accepted at DC-2001, Tokyo, October 2001. ABC is an ontology and framework for describing information resources. It contains the categories Temporality, Actuality and Abstraction. It could be very useful for CRIS that support the lifecycle of objects, i.e. where objects may change and the CRIS has to record how they were changed and whether they are actual or non-actual.

Glossaries, Vocabularies, Thesauri.

Research Methods Glossary - an index of terms for describing the research methods of a wide range of sciences. Terms such as Experimental research, Focus group, Naturalistic paradigm, Statistical significance, Theoretical framework and others (more than 100) are defined and described in the index. Developed in the GOLD project (development of distance learning in the UK)

American Mathematical Society (AMS) Mathematics Subject Classification: search web interface, browseable version, or PDF

Mathematics Subject Classification. Presentation, review

Universal Standard Products and Services Classification (UNSPSC) vocabulary. The vocabulary was developed to classify both products and services throughout the global marketplace. In Current Research Information Systems the vocabulary can be used to describe the industrial relevance of applied research projects and results. View or search vocabulary, crosswalk files

J. Hunter, "Adding Multimedia to the Semantic Web - Building an MPEG-7 Ontology", SWWS, Stanford, July 2001. An ontology for describing multimedia information is proposed. The ontology contains a vocabulary of multimedia types and of actions over multimedia content. It can be used for CRIS that contain multimedia information for presenting research results (especially in technology, exact sciences and multimedia for the humanities).

 

SHOE Ontologies for Research and Technology

SHOE - (Simple HTML Ontology Extension).

SHOE is a small extension to HTML that allows web page authors to annotate knowledge about web page content. SHOE is a very simple language for declaring ontologies and defining classifications, relationships, inference rules, categories, etc. SHOE was developed at the Department of Computer Science, University of Maryland. The SHOE specification, tools, SHOE ontologies in plain text and DAML, and examples are accessible at the SHOE home page.

Computer Science Department Ontology, v.1.1, by Jeff Heflin. An ontology for computer science departments; it contains a large set of concepts that can be used in most university CRIS for research and education.

University ontology by Jeff Heflin. An ontology for describing universities and their activities, education and research. It provides a good list of categories and can be used for CRIS.

Dublin Core ontology in SHOE - an ontology to deal with informational resources described with Dublin Core. Developed by Sean Luke, Department of Computer Science, University of Maryland. Dublin Core is not a standard specifically for research data, but it can be used for basic classification of researchers, projects and university pages.

OIL (Ontology Inference Layer) Ontology for  Current Research Information systems and Education

OIL - "is a proposal for a web-based representation and inference layer for ontologies, which combines the widely used modelling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics. It is compatible with RDF Schema (RDFS), and includes a precise semantics for describing term meanings (and thus also for describing implied information)."  OIL was sponsored by the European Community via the IST projects Ibrow and On-To-Knowledge. Tools for ontology development and reasoning, specifications, publications and case studies are available at the OIL home page.

KA2 Ontology of the Knowledge Acquisition community. The ontology can serve for the description of university and research data. Terms such as Faculty staff, PhD student, Research project, software, publications, etc. are included. The ontology is part of the distribution of OntoEdit and OntoMat. Relationships between KA2 terms and CERIF terms, please

SWRC Semantic Web Research Community Ontology. An ontology for CRIS-like applications. It includes such classes as Event, Publication, Project, Organization, Product and numerous subclasses of these classes. Not very rich in the description of each term. Ontology in OIL from SemanticWeb.org

DAML Ontologies for Current Research Information systems and Education

DAML (DARPA Agent Markup Language) - an ontology markup language, developed as an extension to XML and RDF. DAML makes it possible to specify ontologies and mark up pages for automatic knowledge extraction. The latest version of DAML is named DAML + OIL. DAML specifications, examples, tools and ontologies are published at the DAML home page. A short step-by-step presentation of the DAML project: "DAML Status and Tools" by Mike Dean.

 

http://www.semanticweb.org/ontologies/swrc-onto-2000-09-10.daml

The SWRC (Semantic Web Research Community) ontology models the semantic web research community (its researchers, topics, publications, tools, etc. and the relations between them). This ontology will form the basis for annotating documents in order to enable semantic access to these documents.

http://www.ksl.stanford.edu/projects/DAML/ksl-daml-desc.daml

Ontology used for markup for the initial DAML homework assignment. It covers topics of interest to project pages - people, papers, research programs, etc. It was done in ontolingua and translated into DAML-ONT. It has a companion file - the instance data file available from http://www.ksl.stanford.edu/projects/DAML/ksl-daml-instances.daml

http://www.cs.umd.edu/projects/plus/DAML/onts/cs1.1.daml

An ontology for describing universities and the activities that occur at them. This is the DAML version of a SHOE ontology.

http://www.w3.org/2000/10/swap/pim/doc.rdf

Documentation control, versioning, digital rights management and access control primitives.

http://www.cs.umd.edu/projects/plus/DAML/onts/univ1.0.daml

An ontology for describing universities and the activities that occur at them. This is the DAML version of a SHOE ontology

http://www.daml-atlas.org/ontologies/atlas-cmu.daml

Main ontology describing relationships between research groups, research projects, and individuals (including basic support for bios).

http://ubot.lockheedmartin.com/ubot/2000/10/baby-shoe/shoeproj-ont.daml

A manually generated DAML translation of the SHOE Project ontology. This ontology was used as a starting point for annotation of the UBOT website.

http://www.daml-atlas.org/ontologies/cmu-ri-courses-ont.daml

Framework for describing classes taught at the RI at CMU. Based on the current year's course list (00-01)

Ontolingua  Ontologies

Abstract algebra. Defines the basic vocabulary for describing algebraic operators, domains, and structures such as fields, rings, and groups (25 definitions). Accessible via Stanford KSL Network Services

Basic Matrix Algebra. This ontology attempts to capture basic concepts in linear algebra, with emphasis on matrix operations (26 definitions). Accessible via Stanford KSL Network Services

Chemical crystals. This ontology describes the different types of crystalline structures of substances (121 definitions). Accessible via Stanford KSL Network Services

Chemical Elements. The 103 elements of the periodic table. Accessible via Stanford KSL Network Services

Enterprise ontology. A collection of terms and definitions relevant to business enterprises (177 definitions). Accessible via Stanford KSL Network Services

Physical quantities. Terms for measurement in engineering and physics (28 definitions). Accessible via Stanford KSL Network Services

Simple geometry. Basic geometric concepts (8 definitions). Accessible via Stanford KSL Network Services

 

Implementations of metadata for science

DSpace  - a digital archive created to capture and distribute the intellectual output of MIT. Stable, long-term storage of documents, with support of different formats, rights management.

An use of DAML + OIL: an ontology in the ophtalmology domain. Application of DAML + OIL for a heuristic application in the ophthalmology domain. The article shows a real application of DAML + OIL to a scientific problem.

Project TAMBIS Transparent access to heterogeneous distributed sources of biological information. Ontology-driven user interface, description logics. Ontology for biological data. see also. Online demonstration

Development and usage of metadata

The concept of application profiles. Links to articles   1,   2,   3

The Semantic Web: A new form of Web content that is meaningful to computers will unleash a revolution of new possibilities

Research Summary - XML Adoption: Benefits & Challenges

Issues for Science and Engineering Researchers in the Digital Age

Metadata

MetaNet - A Metadata Term Thesaurus to Enable Semantic Interoperability Between Metadata Domains

W3 Semantic Web activity 

Semantic Web community portal 

Dave Beckett's Resource Description Framework (RDF) Resource Guide

The SCHEMAS Project provides a forum for metadata schema designers involved in projects under the IST Programme and national initiatives in Europe.

 

Tools for RDF parsing, ontology development, reasoning

SiLRI (Simple Logic-based RDF Interpreter) - Java-implemented logic-based engine with RDF support (Frame-logic)
Description: http://www.w3.org/TandS/QL/QL98/pp/queryservice.html
SiLRI: http://ontobroker.semanticweb.org/silri/
http://www.ontoprise.de/co_produ_tool4.htm

SWI-Prolog system
http://www.swi.psy.uva.nl/projects/SWI-Prolog/
Good article about it: http://www.xml.com/lpt/a/2001/07/25/prologrdf.html

Java RDF API and parsers
http://www-uk.hpl.hp.com/people/bwm/rdf/jena/
S. Melnik's parser, somewhere at http://www-diglib.stanford.edu/diglib/ginf/

RDF Query language and engine
http://swordfish.rdfweb.org/rdfquery/

If you are interested in collecting RDF information scattered across web pages, RDF Crawler could be a good tool for you
http://ontobroker.semanticweb.org/rdfcrawl/index.html