Unitary taxonomies

Taxonomy is one of the oldest branches of biology and in its modern form dates back to the work of Linnaeus in the middle of the 18th century. Prior to Linnaeus, naturalists named animals and plants informally with no agreed common system. Linnaeus's binomial nomenclature provided taxonomy with a universal system, first for plants and shortly afterwards for animals. The binomial system has evolved over the years to become the current Codes of animal and plant nomenclature. Today there is much discussion about the "bioinformatics crisis": the massive amount of DNA sequence and other molecular information that biologists have to tame and access. We consider this the second bioinformatics crisis; the Linnaean system provided a solution to the first.

The solution that evolved from the work of Linnaeus and his successors involved several components. The first was that every species of plant and animal should have a binomial (e.g. Homo sapiens, where Homo is the genus, which may contain several species, and sapiens is in this case its single extant species). The second was that each species should be associated with a particular specimen preserved in a museum or other collection (the type). Recent years have seen several proposals to amend these foundations of taxonomy (e.g., Phylocode, Tautz et al., 2003) but we believe they should remain substantially untouched since they have functioned effectively for so long and continue to do so.

When a new species is discovered, or if a new classification is proposed, the next step is to publish the results in a journal or other permanent resource. A name does not become valid ("available" in zoological terminology) until it is published in accordance with the requirements of the relevant Code, and if the same species is described twice then the name with the earlier publication date takes priority (unless this would upset well-embedded usage of a name). The taxonomy of a group is thus not any particular treatment but the sum of all the papers published on the taxon. This is a unique aspect of taxonomy: unlike workers in other branches of science, taxonomists need to refer to material published in the 19th and even the 18th century. A taxonomist therefore needs to be very familiar with the plants or animals he or she works on, and also to have an intimate knowledge of the often complex literature on the group. This dependence on the literature was the price that had to be paid for stability, and it has served the subject well for 250 years.
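The priority rule described above can be sketched in a few lines of Python. This is a minimal illustration only, not a statement of how the Codes operate in detail: the class name PublishedName, the function valid_name and the conserved flag (standing in for "well-embedded usage") are all assumptions made for the example, and the placeholder names and dates are invented.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class PublishedName:
    """A name made available by publication (hypothetical structure)."""
    binomial: str
    published: date
    # Assumed flag: the name is protected because overturning it
    # would upset well-embedded usage.
    conserved: bool = False


def valid_name(synonyms: List[PublishedName]) -> Optional[PublishedName]:
    """Return the name that takes priority among names for one species."""
    if not synonyms:
        return None
    # Protected names, if any, override simple date priority.
    protected = [n for n in synonyms if n.conserved]
    candidates = protected or synonyms
    # Otherwise the earliest publication date wins.
    return min(candidates, key=lambda n: n.published)


# Placeholder names and dates, purely illustrative.
older = PublishedName("Aus bus", date(1805, 1, 1))
newer = PublishedName("Aus cus", date(1850, 6, 1))
print(valid_name([newer, older]).binomial)
```

Even in this toy form, the point made in the text is visible: the answer depends on the complete list of published names, so a worker must know the whole literature before the valid name can be determined.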

But there are costs to this way of doing taxonomy, and the question is whether these can be mitigated using modern means of communicating and sharing information, especially the web. The costs are that:

  • Taxonomic information is highly scattered and difficult to access, which is an impediment to people who are not experts in the field.
  • Taxonomic research requires access to an extensive and often very old literature that is frequently available only in major institutions; research is made particularly difficult in developing countries (which include major biodiversity hotspots).
  • Paper literature is often a poor medium for taxonomy; it is expensive, which in particular restricts the use of illustrations.
  • In some groups, much time is taken in the study and reinterpretation of 17th and 18th century types, a process often leading to changes in nomenclature.
  • There is very frequently a disconnect between taxonomists and the people who use their taxonomic products. End-users normally need to know the current consensus taxonomy of a particular group, but consensus is often difficult to achieve because, typically, there are multiple taxonomic hypotheses of species in the literature. End-users also need to know when there are alternative hypotheses, so that their data are associated with a particular taxonomic concept. Here again, this information can be difficult to extract from the literature.
  • All taxonomies are hypotheses and change as research progresses; but tracing how a particular name or classification evolves is often difficult and can lead to ambiguities as to what concept was being referred to at any particular time.
  • It is often difficult for non-taxonomists to contribute to the taxonomic research programme, although they will have invaluable observations and material to add.

One proposal to try to avoid these problems is to create for each major group a unitary taxonomy on the web. A unitary taxonomy has two main goals. The first is to serve as a "one-stop shop" providing all the resources that a taxonomist needs to work on a particular group beyond the physical specimens themselves. The second is to provide an interface to allow non-specialists not only to gain access to information about a particular group but also to enable them to contribute information.

The first stage of a unitary taxonomy is the creation of the first web revision. In many ways this is the equivalent of a traditional taxonomic revision and would include all the species accepted by the reviser plus significant taxonomic concepts that he or she rejects. A draft first web revision would be placed on the web for refereeing by the taxonomic community (also done through the web), and then the final version would be prepared after revision in the light of the comments received. An editorial board is likely to play an important role in this process.

One model is that once a first web revision of a suitable standard is agreed, it would circumscribe the species and other taxa that need be considered in future revisions. For example, were new 18th century types to be discovered (or reinterpreted) they would not necessitate a change in nomenclature. This remains contentious, even within the CATE team (Godfray, 2002, Nature; Scoble, 2004, Phil. Trans.), and it is also precluded at present by the Codes. The alternative model (which we are adopting, at least provisionally) is that discoveries of early descriptions retain priority and are incorporated in future editions of the web taxonomy.

After the first web revision is posted on the internet, taxonomic research would continue with new species being described and new classifications being proposed. At present, the Codes require that taxonomic changes are published on paper (or other fixed format). But under the ideal unitary proposal, additions and other changes would be posted on the unpublished part of the unitary website and made open to review for a fixed period (again on the web). After the refereeing process, a committee, the equivalent of an editorial board, would decide whether the changes should be incorporated and published in the next edition of the current web revision. But, and this is very important, even if not incorporated, the alternative hypothesis would be maintained on the unitary website for further research.
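The review cycle just described can be sketched as a small state model. This is a hedged illustration under stated assumptions rather than a description of any CATE software: the names Status, Proposal and UnitarySite, and the methods submit, decide and publish_next_edition, are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Status(Enum):
    UNDER_REVIEW = "under review"  # posted on the unpublished part of the site
    ACCEPTED = "accepted"          # to be included in the next edition
    ALTERNATIVE = "alternative"    # not incorporated, but kept for research


@dataclass
class Proposal:
    title: str
    status: Status = Status.UNDER_REVIEW


@dataclass
class UnitarySite:
    edition: int = 1
    consensus: List[Proposal] = field(default_factory=list)
    unpublished: List[Proposal] = field(default_factory=list)

    def submit(self, title: str) -> Proposal:
        """Post a proposed change for open review."""
        proposal = Proposal(title)
        self.unpublished.append(proposal)
        return proposal

    def decide(self, proposal: Proposal, accept: bool) -> None:
        """Record the committee's decision after the review period."""
        proposal.status = Status.ACCEPTED if accept else Status.ALTERNATIVE

    def publish_next_edition(self) -> int:
        """Move accepted proposals into the consensus; keep the rest."""
        accepted = [p for p in self.unpublished if p.status is Status.ACCEPTED]
        self.consensus.extend(accepted)
        self.unpublished = [
            p for p in self.unpublished if p.status is not Status.ACCEPTED
        ]
        self.edition += 1
        return self.edition


site = UnitarySite()
described = site.submit("describe a new species")
split = site.submit("split an existing genus")
site.decide(described, accept=True)
site.decide(split, accept=False)
site.publish_next_edition()
```

The design choice worth noting is that a rejected proposal is never deleted: it changes status and remains on the unpublished part of the site, which mirrors the requirement that alternative hypotheses stay available for further research.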

The consensus current web revision would be the version of the taxonomy of the group with which non-specialists would interact. It would provide a "product" of taxonomy that would be far easier for other biologists and non-biologists to work with than the current situation allows. The incremental approach of adding information to a web platform would help both users and those building and improving the taxonomy. It would increase the number of users of taxonomy, and would enable taxonomists to be credited more properly for their work than happens at the moment.

The notion of a consensus taxonomy is controversial and jars with the traditional independence of action of individual taxonomists. The case against consensus taxonomies was put eloquently by Thiele and Yates (2002) and we share their concern. We believe, however, that it is possible to have all the benefits of a consensus taxonomy, in particular the much greater engagement with end users, while maintaining the independence of individual taxonomists and recognising the provisional nature of taxonomic hypotheses. In particular, all taxonomic hypotheses (that conform to the current Codes) will by right be included on the unpublished part of the unitary website, though not necessarily in the consensus taxonomy. Paper-based revisions often act, de facto, as consensus taxonomies because a single comprehensive work, with arguments as to why the taxonomic concepts proposed therein should be accepted, tends, until revised, to be adopted as a standard work and one that is at least reasonably accessible to users. But believing in the idea of a consensus taxonomy is not proof that it will work, and one of the aims of CATE is to test whether this dual approach will function and command support from the taxonomic and wider communities.