StrainInfo catalog schema

Introduction

This document describes the XML format used for synchronizing biological resource centers (BRCs) with StrainInfo. The format is an XML implementation of a subset of the Microbiological Common Language (MCL). This document explains how to create files for this use case of the MCL standard. The resulting files can be validated against the StrainInfo catalog schema using any XML validator.

General structure

The root element of the document is the mcl:Catalog tag. It carries several namespace declarations and the location of the StrainInfo catalog schema. These attributes are obligatory. The mcl:Catalog root element contains a header (mcl:CatalogDescription) that holds metadata about the export. The bulk of the document consists of mcl:Culture elements, which contain the catalog entries. Each mcl:Culture corresponds to one strain number and its associated information.

<?xml version="1.0"?>
<mcl:Catalog xmlns:mcl="http://www.straininfo.net/ns/mcl/2.0/"
          xmlns:dc="http://purl.org/dc/elements/1.1/"
          xmlns:dcterms="http://purl.org/dc/terms/"
          xmlns:prism="http://prismstandard.org/namespaces/basic/2.0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://www.straininfo.net/ns/mcl/2.0/ http://www.straininfo.net/schema/2.0/si-catalog.xsd">

    <mcl:CatalogDescription>
          <!-- Meta information about this catalog export -->
    </mcl:CatalogDescription>
    
    <mcl:Culture>
          <!-- Description of a catalog entry corresponding with one strain number. -->
    </mcl:Culture>
    
    <mcl:Culture>
          <!-- ... -->
    </mcl:Culture>
    
    <mcl:Culture>
          <!-- ... -->
    </mcl:Culture>
    
    <!-- And so on. Contains an mcl:Culture for each strain number in the catalog. -->
    
</mcl:Catalog>

The mcl:CatalogDescription and mcl:Culture elements are described in more detail in the next two sections.

Structure of mcl:CatalogDescription

The mcl:CatalogDescription element contains meta information about the generated XML document. The following example shows its typical content:

<mcl:CatalogDescription>
    <dc:creator>Bert Verslyppe</dc:creator>
    <dcterms:created>2010-01-01T09:05:37</dcterms:created>
    
    <mcl:catalogVersion>v9.2b 20091231</mcl:catalogVersion>
    <mcl:catalogLastUpdateDate>2009-12-31T13:30:00</mcl:catalogLastUpdateDate>
    
    <mcl:BRC>
        <mcl:WDCMNumber>1024</mcl:WDCMNumber>
        <mcl:fullName>StrainInfo Demo Collection</mcl:fullName>
        <mcl:acronym>SD</mcl:acronym>
    </mcl:BRC>
</mcl:CatalogDescription>

The mcl:catalogVersion element is a free-text field for BRCs that maintain an internal version number, which can for example be associated with printed catalog releases. Its form and content are not prescribed, and the element can be omitted if no internal version number is used. The mcl:catalogLastUpdateDate element (the date of the last change to the catalog), however, is obligatory. Do not confuse it with the dcterms:created element, which contains the creation date of the XML document itself.

The mcl:acronym elements document the 'official' acronyms in use. An acronym consists of all characters that precede the number part of the strain numbers in the collection, i.e. the BRC acronym plus any supplemental prefixes of the strain number. For example, for strain numbers with a structure like "FOO B. 166", the acronym is "FOO B." (including the trailing period). The mcl:acronym element can be repeated if several prefixes or acronyms are used.
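
As a minimal sketch of the two points above, the header below omits mcl:catalogVersion and lists two acronyms. The collection name and the acronyms "FOO" and "FOO B." are hypothetical; the remaining values are copied from the example above. The namespace prefixes are assumed to be declared on the mcl:Catalog root element.

```xml
<mcl:CatalogDescription>
    <dc:creator>Bert Verslyppe</dc:creator>
    <dcterms:created>2010-01-01T09:05:37</dcterms:created>

    <!-- mcl:catalogVersion is left out: this hypothetical BRC has no internal version number. -->
    <mcl:catalogLastUpdateDate>2009-12-31T13:30:00</mcl:catalogLastUpdateDate>

    <mcl:BRC>
        <mcl:WDCMNumber>1024</mcl:WDCMNumber>
        <mcl:fullName>FOO Demo Collection</mcl:fullName>
        <!-- One mcl:acronym per prefix that occurs before the number part of a strain number. -->
        <mcl:acronym>FOO</mcl:acronym>
        <mcl:acronym>FOO B.</mcl:acronym>
    </mcl:BRC>
</mcl:CatalogDescription>
```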

Structure of mcl:Culture

The mcl:Culture elements contain the actual catalog content. Each element describes one strain number and all corresponding data, such as the name of the organism, information on its isolation, growth conditions and other features. The following example is taken from the LMG collection.

<mcl:Culture>
    <mcl:strainNumber>LMG 24056</mcl:strainNumber>
    <mcl:otherStrainNumber>DSM 44871</mcl:otherStrainNumber>
    <mcl:otherStrainNumber>Trujillo LUPAC 09</mcl:otherStrainNumber>
    <mcl:catalogURL>http://bccm.belspo.be/db/lmg_strain_details.php?NUM=24056</mcl:catalogURL>
    <mcl:cultureLastUpdateDate>2008-08-05T12:30:00</mcl:cultureLastUpdateDate>
    
    <mcl:speciesName>Micromonospora saelicesensis</mcl:speciesName>
    <mcl:nomenclaturalPublication>
        <dcterms:bibliographicCitation>Trujillo, Kroppenstedt, Fernandez-Molinero, Schumann and Martinez-Molina 2007</dcterms:bibliographicCitation>
    </mcl:nomenclaturalPublication>      
    
    <mcl:isolationDate>2003</mcl:isolationDate>
    <mcl:isolator>M.Trujillo</mcl:isolator>
    <mcl:isolatorInstitute>Dep. de Microbiologia y Genetica Universidad de Salamanca</mcl:isolatorInstitute>
    <mcl:Sample>
        <mcl:sampleLocationCountry>Spain</mcl:sampleLocationCountry>
        <mcl:sampleLocationPlace>Salamanca</mcl:sampleLocationPlace>
        <mcl:sampleHabitat>Lupinus angustifolius, root nodule</mcl:sampleHabitat>
    </mcl:Sample>
    <mcl:Deposit>
        <mcl:resultingStrainNumber>LMG 24056</mcl:resultingStrainNumber>
        <mcl:depositDate>2007</mcl:depositDate>
        <mcl:depositor>M.Trujillo</mcl:depositor>
        <mcl:depositorInstitute>Dep. de Microbiologia y Genetica Universidad de Salamanca</mcl:depositorInstitute>
    </mcl:Deposit>
    <mcl:history>&lt;- 2007, M.Trujillo Dep. de Microbiologia y Genetica Universidad de Salamanca Spain (2003)</mcl:history>
   
    <mcl:Medium>
        <mcl:mediumNumber>LMG Medium 185</mcl:mediumNumber>
        <mcl:mediumName>Bacteria Culture Medium 185</mcl:mediumName>
        <mcl:mediumURL>http://bccm.belspo.be/db/media_search_results.php?COLL=LMG&amp;FIELD=NUM&amp;TEXT1=185</mcl:mediumURL>
    </mcl:Medium>
    <mcl:growthTemperature>28</mcl:growthTemperature>
</mcl:Culture>

The mcl:Culture element starts with the strain number of the culture being described. The identification section also contains a direct link to the online catalog (mcl:catalogURL) and a list of equivalent strain numbers (mcl:otherStrainNumber). Implement these elements carefully, as they are essential for the visibility of the BRC in StrainInfo: the mcl:otherStrainNumber elements link your cultures to cultures of other BRCs. If the equivalent strain numbers cannot be split into separate elements, the mcl:otherStrainNumbers element can be used instead. Note that special characters (<, &, >, ...) must be escaped (&lt;, &amp;, &gt;, ...) in XML, for example in URLs.
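
The escaping rule and the single-element fallback could look as follows. This is only a sketch: the strain numbers and the URL are hypothetical, and the semicolon used as a separator inside mcl:otherStrainNumbers is an assumption, so check the MCL reference for the expected form.

```xml
<!-- Fragment of an mcl:Culture; namespace prefixes are declared on mcl:Catalog. -->
<mcl:strainNumber>SD 1234</mcl:strainNumber>
<!-- Hypothetical fallback: equivalent strain numbers that could not be split into separate elements. -->
<mcl:otherStrainNumbers>FOO B. 166; BAR 42</mcl:otherStrainNumbers>
<!-- Hypothetical URL: the ampersand in the query string is written as an entity reference. -->
<mcl:catalogURL>http://www.example.org/catalog?coll=SD&amp;num=1234</mcl:catalogURL>
```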

The species name (mcl:speciesName) is accompanied by its authors in the mcl:nomenclaturalPublication element. Often only a list of authors and a year is available. This can be expressed using the dcterms:bibliographicCitation element. More information on how to cite publications can be found in the section "Structure of mcl:Publication".

Information about the isolation and the environmental sample from which the strain was isolated can be added using several terms. StrainInfo uses this information to link strains to their environment, so including as much detail as possible gives your strains more visibility. Only the most important terms are illustrated in the example; please refer to the Microbiological Common Language (MCL) reference for additional terms and their definitions. Note that the sample information is grouped in the mcl:Sample element.

The mcl:Deposit element describes a deposit into a BRC. Intermediate deposits (i.e. the complete history of the lineage since isolation) can be expressed by repeating the mcl:Deposit element. Sort the elements chronologically, with the last element being the most recent deposit (i.e. the deposit in the BRC that resulted in the current strain number). The example above describes a deposit by the isolator. If the strain was transferred from another BRC, include the original strain number using the mcl:originatingStrainNumber element. The full history can also be given as text in the mcl:history element, which corresponds to the legacy notation using arrows pointing left (remember to escape special characters).
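
A lineage with one intermediate deposit might be written as in the sketch below. All strain numbers, names and dates are hypothetical, and the position of mcl:originatingStrainNumber inside mcl:Deposit is an assumption; the schema determines the actual element order.

```xml
<!-- Fragment of an mcl:Culture: oldest deposit first, most recent deposit last. -->
<mcl:Deposit>
    <!-- Hypothetical first deposit, made by the isolator into another BRC. -->
    <mcl:resultingStrainNumber>FOO B. 166</mcl:resultingStrainNumber>
    <mcl:depositDate>2005</mcl:depositDate>
    <mcl:depositor>A. Isolator</mcl:depositor>
    <mcl:depositorInstitute>Example Institute</mcl:depositorInstitute>
</mcl:Deposit>
<mcl:Deposit>
    <!-- Hypothetical transfer from that BRC, resulting in the current strain number. -->
    <mcl:originatingStrainNumber>FOO B. 166</mcl:originatingStrainNumber>
    <mcl:resultingStrainNumber>SD 1234</mcl:resultingStrainNumber>
    <mcl:depositDate>2007</mcl:depositDate>
    <mcl:depositorInstitute>FOO Demo Collection</mcl:depositorInstitute>
</mcl:Deposit>
```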

Growth conditions can also be included. Culture media are referenced using the mcl:Medium element, and a direct link to the description of the medium can be given in the mcl:mediumURL element. This URL must not refer to the current culture. If that is unavoidable (e.g. the medium description is only available from the culture catalog page), special placeholders can be used to construct the URL (see the MCL reference on mcl:mediumURL).

Structure of mcl:Publication

First, this section describes the elements which can be used to describe a publication. Afterwards, the elements used for referring to publications are introduced.

A publication can be described using the following elements (the mcl:Publication placeholder element should be ignored):

<mcl:Publication id="1000">
    <dcterms:bibliographicCitation>Nakamura, L K, Blumenstock, I, Claus, D, Taxonomic study of Bacillus coagulans Hammer 1915 with a proposal for Bacillus smithii sp. nov, Int J Syst Bacteriol, 38, 63-73, 1988</dcterms:bibliographicCitation>
    <dc:title>Taxonomic study of Bacillus coagulans Hammer 1915 with a proposal for Bacillus smithii sp. nov</dc:title>
    <dc:creator>Nakamura, L K</dc:creator>
    <dc:creator>Blumenstock, I</dc:creator>
    <dc:creator>Claus, D</dc:creator>
    <prism:publicationName>Int J Syst Bacteriol</prism:publicationName>
    <prism:number>38</prism:number>
    <prism:pageRange>63-73</prism:pageRange>
    <dcterms:issued>1988</dcterms:issued>
</mcl:Publication>

The dcterms:bibliographicCitation element is obligatory. It holds the human-readable citation of the work, with enough information to enable the user to find the intended publication. If the components of this citation are available separately (or if the citation can easily be split into components), it is recommended to include the separate components as well. Note that the dc:creator element is repeated for each individual author. More information on the definition of the terms can be found in the MCL reference.

In the mcl:Culture element, references to publications can be made using the following publication reference terms:

  • mcl:nomenclaturalPublication: references a publication in which the species name was introduced.
  • mcl:taxonomicPublication: references a publication in which taxonomy is described.
  • mcl:historyPublication: references a publication in which the history of the strain is described.
  • mcl:environmentPublication: references a publication in which the sample environment is described.
  • mcl:preservationPublication: references a publication in which preservation methods are described.
  • mcl:publication: references a publication relevant to the culture, but for which the type of publication is not known. (Note the lowercase 'p', which indicates a relation.)

In XML, the mcl:Publication element is redundant and must be omitted. In other words, the publication reference elements directly contain the publication description elements. Multiple distinct references to the same publication are allowed. The id attribute is reserved for use by StrainInfo; it corresponds to the id used in the publication passport URLs.
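
For illustration, a reference inside an mcl:Culture might then look like the sketch below. It reuses the publication data from the example above; the choice of mcl:taxonomicPublication as the reference term is an assumption made for this sketch. The mcl:Publication wrapper and the id attribute are left out, as described above.

```xml
<!-- Fragment of an mcl:Culture: the reference element holds the description elements directly. -->
<mcl:taxonomicPublication>
    <dcterms:bibliographicCitation>Nakamura, L K, Blumenstock, I, Claus, D, Taxonomic study of Bacillus coagulans Hammer 1915 with a proposal for Bacillus smithii sp. nov, Int J Syst Bacteriol, 38, 63-73, 1988</dcterms:bibliographicCitation>
    <dc:title>Taxonomic study of Bacillus coagulans Hammer 1915 with a proposal for Bacillus smithii sp. nov</dc:title>
    <!-- dc:creator is repeated once per author. -->
    <dc:creator>Nakamura, L K</dc:creator>
    <dc:creator>Blumenstock, I</dc:creator>
    <dc:creator>Claus, D</dc:creator>
    <prism:publicationName>Int J Syst Bacteriol</prism:publicationName>
    <prism:number>38</prism:number>
    <prism:pageRange>63-73</prism:pageRange>
    <dcterms:issued>1988</dcterms:issued>
</mcl:taxonomicPublication>
```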

xmlformat.txt · Last modified: 2014/10/08 16:35 (external edit)