Introduction to the Plant DNA C-values database
Nuclear DNA C-value and genome size are important biodiversity characters with fundamental biological significance and many uses (Bennett and Leitch, 1995; Bennett et al., 2000). The DNA C-value of an organism is the amount of nuclear DNA in its unreplicated gametic nucleus (Swift 1950), irrespective of the ploidy level of the taxon. (For papers discussing the terms 'genome size' and 'C-value' see Greilhuber et al., 2005 and Greilhuber and Dolezel 2009.)
Estimating DNA C-values (genome sizes)
The first C-values estimated for a few plants were made in the 1950s using tedious chemical extraction methods. The subsequent development of other techniques, including Feulgen microdensitometry, flow cytometry and DNA image cytometry, has made estimating DNA amounts both easier and faster, such that C-value data are now available for over 8,500 plant species.
The need for C-value data
Information about C-values is used in a wide range of biological fields (e.g. see Bennett et al., 2000 and Bennett and Leitch, 2011). Interest in such data, judging by the numbers of new estimates published, has remained high in recent years (Bennett and Leitch, 2011). Indeed, the data are increasingly being used for large-scale comparative analyses, including studies of the relationship between genome size and B-chromosomes (Levin et al., 2005), duration of the cell cycle (Francis et al., 2008), seed size and mass (Beaulieu et al., 2007), plant growth form and distribution (Ohri, 2005), leaf cell size and stomatal density (Beaulieu et al., 2008; Hodgson et al., 2010), patterns of invasiveness (Kubesova et al., 2010; Lavergne et al., 2010) and patterns of genome size evolution (Leitch et al., 2005, 2009, 2010; Beaulieu et al., 2010; Leitch and Leitch, 2013).
The need for reference lists
Information on DNA amounts in different plant taxa is often hard to locate as published data are widely scattered in a diverse range of journals, whilst a significant proportion are unpublished and unavailable.
By 1976 Bennett had decided to start compiling lists of DNA C-values in angiosperms and to publish them. Nine extensive collected lists are now published in hard copy form, primarily for reference purposes (Bennett and Smith, 1976; Bennett et al., 1982; Bennett and Smith, 1991; Bennett and Leitch, 1995, 1997, 2005, 2011; Bennett et al., 2000; Zonneveld et al. 2005). Together these list C-values for over 6,200 species, cited from 617 original references, and represent c. 1.8% of the global angiosperm flora.
Work on C-values in other plant groups lagged behind that in angiosperms. This issue was addressed in September 1997 when the Angiosperm Genome Size workshop and discussion meeting was held at the Royal Botanic Gardens, Kew. A key aim of the workshop was to identify major gaps in our knowledge of plant DNA C-values and to recommend targets and priorities for new work to fill them through international collaboration. At the workshop Murray reviewed our knowledge of C-values in non-angiosperm groups and produced the first list of DNA amounts in gymnosperms (Murray 1998). The list contained C-value estimates for 117 species (corresponding to 16% of all gymnosperm species), cited from 24 original references. Since this list, C-value estimates in a further 24 original references have been published and these include data for nearly 238 species not previously reported, but no new compiled list in hard copy has been published.
Pteridophytes (ferns, fern allies and lycophytes) and bryophytes
In 1997 Murray also reviewed data for pteridophytes and bryophytes and noted that the situation was much worse, with estimates for only ~0.42% of pteridophytes and ~0.1% of bryophytes. He also noted that locating the C-value data had been very difficult. There was clearly a need to collate DNA amount data in these groups and to make them easily accessible.
For pteridophytes, estimates of DNA amounts were pooled into one user-friendly reference list by Bennett and Leitch (2001). This contained DNA C-values for just 48 species from eight original references and highlighted the ongoing need for work to increase knowledge in this area. Later, Obermayer et al. (2002) reported new C-values for thirty species. Since then, further estimates from 13 original references have also been located. However, no further compiled list has been published in hard copy.
In bryophytes there is still no equivalent pooled list of C-values published in hard copy. C-value data are available but scattered in the literature. The largest source of data is Voglmayr (2000), who estimated C-values in 138 moss taxa in a carefully targeted study that aimed to cover a representative spectrum of moss taxa. Voglmayr's paper also reviewed C-value estimates made by previous workers and is thus the closest approximation to a single reference source for C-values in bryophytes currently available in hard copy. A survey of genome sizes in 43 liverwort species was published by Temsch et al. (2010), providing a first insight into genome size diversity in this group, while the first insights into genome sizes for hornworts were published by Bainard et al. (2013).
The 1997 Angiosperm Genome Size workshop did not assess knowledge of C-values in algae. This was not because they were seen as unimportant, but because the gaps identified for several other groups seemed daunting enough. However, once first compilations of DNA C-values for gymnosperms (Murray, 1998), pteridophytes (Obermayer et al., 2002) and bryophytes (Voglmayr, 2000) were available, the lack of readily accessible C-value data in algae became more apparent. This major gap has now been addressed, as Kapraun (2005) and Phillips and Kapraun (2011) have published compilations of genome size estimates for red (Rhodophyta), green (Chlorophyta) and brown (Phaeophyta) algae. In these papers they also review the considerable diversity in this character and its possible evolutionary significance.
The Need for an Electronic Database
From the above, it became clear that there was a growing need to combine available C-value estimates for different plant groups into a single, easily accessible, electronic database.
The first electronic database of DNA C-values in angiosperms (release 1.0, April 1997)
The collected lists of angiosperm DNA amounts were produced to make data more accessible for both reference and analysis purposes. However, as the number of such lists rose, finding whether an estimate for a particular species was listed, and if so, its reported size, took longer. To overcome this, in 1997 we decided to pool all the data from collected lists that had been published into one combined database for about 2,800 angiosperm species, and to make this available in an electronic form. A first version of the electronic Angiosperm DNA C-values database (release 1.0) went live in April 1997. This contained C-value data but lacked many associated details and explanatory footnotes (e.g. chromosome number, ploidy level, life cycle type, and calibration standard and techniques used) given in the original published lists.
Updating the Angiosperm DNA C-values database
Release 2.0, October 1998
On 31st October 1998 an updated version of the Angiosperm DNA C-values database (release 2.0) was released. A new user-friendly format made it possible to search and query the database for the first time.
Release 3.0, December 2000
Following publication of a sixth list of DNA amounts (Bennett et al., 2000), which included first DNA C-value estimates for 691 species not included in any of the previous five lists, it became necessary to update the database again. Release 3.0 went live in December 2000. Not only did it contain C-value data for an additional 691 species, but it also gave 1C DNA amounts expressed in megabase pairs (Mbp) for the first time. (The factor used to convert picograms to Mbp was 1 pg = 980 Mbp (Cavalier-Smith, 1985; Bennett et al., 2000).)
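The conversion quoted above can be illustrated with a minimal sketch. It assumes only the stated factor of 1 pg = 980 Mbp; the function name `pg_to_mbp` and the sample value are hypothetical and are not taken from the database itself.

```python
# Conversion factor quoted in the text: 1 pg = 980 Mbp.
MBP_PER_PG = 980.0


def pg_to_mbp(c_value_pg: float, factor: float = MBP_PER_PG) -> float:
    """Convert a 1C DNA amount in picograms to megabase pairs.

    `factor` defaults to the value quoted in the text; it is exposed as a
    parameter (an assumption of this sketch) so other factors can be tried.
    """
    if c_value_pg < 0:
        raise ValueError("A DNA amount cannot be negative")
    return c_value_pg * factor


# Hypothetical example: a 1C-value of 2.5 pg expressed in Mbp.
example_mbp = pg_to_mbp(2.5)
```

Keeping the factor as a named constant makes it easy to see which conversion was applied to any given Mbp figure.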
Further releases
Since 2000 further releases have gone live. In release 4.0 (January 2003) new systematic and reference data were added, whereas in releases 5.0, 6.0, 7.0 and 8.0 additional C-value data have been added, including over 4,000 species new to the database. Data for releases 5.0, 6.0 and 7.0 were taken from Bennett and Leitch (2005), Zonneveld et al. (2005) and Bennett and Leitch (2011), respectively. In contrast, data for releases 8.0 and 9.0 have not been published in an accompanying paper.
The Pteridophyte DNA C-values database (release 1.0, December 2000)
To complement the growing Angiosperm DNA C-values database, the first electronic database for a non-angiosperm group, the pteridophytes, was released in December 2000. This contained C-value estimates for the 48 species given in Bennett and Leitch (2001). Additional information on spore type (i.e. homosporous or heterosporous), sperm flagella number (i.e. biflagellate or multiflagellate), sporangial type (i.e. eusporangiate or leptosporangiate), chromosome number and ploidy level was also given where known.
The Plant DNA C-values Database
Key recommendation 2 of the Angiosperm Genome Size workshop (1997) identified a clear need 'To improve accessibility to plant C-value data by making them readily available for reference purposes as published reference lists and/or on the internet in one plant DNA C-values database'.
Release 1.0, September 2001
Release 1.0 of the Plant DNA C-values database contributed significantly to fulfilling this recommendation. By incorporating C-value data for all embryophyte plant groups (i.e. angiosperms, pteridophytes, gymnosperms and bryophytes) into the Plant DNA C-values database, a single compilation of C-value data for land plants was available for the first time.
Release 2.0, January 2003
Following Release 1.0, work continued to develop and extend the Plant DNA C-values database and a second release went live in January 2003 with four major improvements:
C-value estimates for additional pteridophytes and gymnosperms were made available
Additional C-value estimates for a species could be viewed
The original reference sources for C-value estimates could be viewed
In angiosperms, a new systematic option using the APG (Angiosperm Phylogeny Group) families for searching the database was made available
Release 3.0, December 2004
As further C-value data became available, these were added to the database and release 3.0 went live in December 2004.
New data for angiosperms, gymnosperms, pteridophytes and bryophytes were added
For angiosperms, C-values for 804 species (628 of which were new to the database) were collated from 88 original references into a sixth supplementary list of DNA amounts (Bennett and Leitch 2005). These were added to the database.
For gymnosperms, pteridophytes and bryophytes no new compiled lists have been published, but new data from individual research papers and communications have continued to be added to the database.
Algal C-values now available online
Release 3.0 also included 253 algal DNA C-values for the first time. Following the publication of a compiled list of 247 algal species by Kapraun (2005), these values, together with six additional species, were entered into the database and made available online. Further data have since become available and the latest release now comprises 445 species.
Release 4.0, October 2005
Following the publication of Zonneveld et al. (2005), C-values for a further 411 species (308 of which are new to the database) have been added.
Release 5.0, December 2010
This release contained C-value data for a further 1,908 species, including 1,860 angiosperm species not listed in previous releases of the database.
Release 6.0, December 2012
This release contains C-value data for 8,510 species, including 1,860 species not listed in previous releases of the database.
Release 7.1, April 2019
The most recent release of the C-values database contains data for 12,273 species, comprising:
Angiosperm DNA C-values (10,770 species)
Gymnosperm DNA C-values (421 species)
Pteridophyte DNA C-values (303 species: 246 species of ferns and fern allies and 57 lycophytes)
Bryophyte DNA C-values (334 species)
Algal DNA C-values (445 species)
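As a quick consistency check, the per-group counts quoted for release 7.1 can be tallied against the stated total. The sketch below uses only the figures given above; the variable names are illustrative assumptions and do not reflect the database's actual schema.

```python
# Species counts per plant group as quoted for release 7.1 (April 2019).
group_counts = {
    "angiosperms": 10_770,
    "gymnosperms": 421,
    "pteridophytes": 303,
    "bryophytes": 334,
    "algae": 445,
}

# Breakdown of the pteridophyte figure given in the text.
pteridophyte_breakdown = {"ferns and fern allies": 246, "lycophytes": 57}

# Total across all groups; this should match the 12,273 species stated.
total_species = sum(group_counts.values())

# Share of each group in the database, as a percentage of the total.
group_shares = {
    group: round(100 * count / total_species, 1)
    for group, count in group_counts.items()
}
```

The group figures sum to the stated 12,273 species, and the two pteridophyte subtotals sum to 303.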
Users can choose between analysing C-value data across different groups of plants (using the Plant DNA C-values database), or searching just part of the database (e.g. angiosperms) by selecting the specific plant group of interest (i.e. angiosperms, gymnosperms, pteridophytes, bryophytes or algae) from the Plant DNA C-values homepage.