Tardigrades have unique stress adaptations that allow them to survive extremes of cold, heat, radiation and vacuum. To study this, encoded protein clusters and pathways from an ongoing transcriptome study on the tardigrade Milnesium tardigradum were analyzed using bioinformatics tools and compared to expressed sequence tags (ESTs) from Hypsibius dujardini, revealing major pathways involved in resistance against extreme environmental conditions. ESTs are available on the Tardigrade Workbench along with software and databank updates. Our analysis reveals that RNA stability motifs for M. tardigradum are different from typical motifs known from higher animals. M. tardigradum and H. dujardini protein clusters and conserved domains imply metabolic storage pathways for glycogen, glycolipids and specific secondary metabolism as well as stress response pathways (including heat shock proteins, bmh2, and specific repair pathways). Redox-, DNA-, stress- and protein protection pathways complement specific repair capabilities to achieve the strong robustness of M. tardigradum. These pathways are partly conserved in other animals and their manipulation could boost stress adaptation even in human cells. However, it is the unique combination of resistance and repair pathways that makes tardigrades, and M. tardigradum in particular, so highly stress resistant.
Keywords: RNA, expressed sequence tag, cluster, protein family, adaptation, tardigrada, transcriptome
Tardigrades represent one of the four main invertebrate taxa in which anhydrobiotic populations are wide-spread, the other three groups being crustaceans, nematodes and rotifers. In many of these species, the anhydrobiotic state may be induced over the whole life cycle, from the egg to the adult stage.1–3 The mechanisms by which these anhydrobiotic organisms preserve their cellular integrity under extreme desiccation have remained a mystery to biologists. Previous research on the biochemistry of anhydrobiotic organisms focussed strongly on the role of polyhydroxy compounds, mainly carbohydrates, as membrane dry state stabilizers.4,5 For instance the eutardigrade Richtersius coronifer was shown to accumulate the disaccharide trehalose at about 2.3% dry weight on entering the anhydrobiotic state.6 In all other eutardigrades and heterotardigrades the accumulation is generally lower, and in Milnesium tardigradum trehalose is not detectable.7
Furthermore, physiological and molecular studies indicated the role of several stress proteins (heat shock proteins and late embryogenesis abundant proteins) in protection against desiccation damage.8,9 Stress and late embryogenesis abundant proteins (LEAs) act as molecular chaperones and bind to other proteins, thereby preventing aggregation or unfolding of the protein or promoting protein folding.10–13 Stress proteins may also protect cells from oxidative damage both in vivo and in vitro.14–16 Heat-shock protein genes and proteins (Hsp70 family) have been studied in the tardigrade species M. tardigradum in the stage of dehydration, rehydration17,18 and under irradiation.19 The first verification of LEAs in tardigrades was reported by McGee et al.20
However, a more diversified approach is needed to fully capture the different facets of superior stress adaptation in these organisms. Advances in transcriptomics allow one to perform diversified large-scale comparisons exploiting different expressed sequence tag (EST) resources for tardigrades. The present work is based on the analysis of a new large-scale EST dataset generated from the cosmopolitan terrestrial tardigrade M. tardigradum, a species that is well known for its remarkable resistance to heat, radiation and cold.21–23 To identify general tardigrade stress adaptations as well as specific pathways unique to M. tardigradum and its extremely high stress resistance, we compared these data to another tardigrade, Hypsibius dujardini, which is well studied (for instance regarding development)24 and, in addition, has a large number of publicly available ESTs. In comparison, it is clearly less resistant to environmental stress than M. tardigradum. To make the most of this accumulating EST evidence, all sequences compared are made available on the Tardigrade Workbench25 including software updates and various comparison options to different tardigrade species and other organisms. The accumulated ESTs allow researchers to identify tardigrade-specific RNA motifs and tardigrade-specific (M. tardigradum and/or H. dujardini) ESTs as well as ESTs where the implied function occurs in other species, including Homo sapiens. The analysis of encoded proteins permits positive predictions on the presence of pathways. However, tardigrade proteome, metabolome, transcriptome, and genome data have to be accumulated further for detailed analysis. Our data establish and show a rich repertoire of tardigrade stress protective pathways (eg, LEAs, heat shock, and bmh2 pathway) including protective metabolism (storage pathways, fatty acid biosynthesis). Furthermore, the repertoire for M. tardigradum is clearly richer than in H. dujardini, in particular regarding repair functions (oxidative damage, DNA repair, RNA and protein repair). Using EST clustering and annotation of conserved domains, we have annotated tardigrade-specific protein clusters and attempted to determine their involvement in stress protection while verifying suggested pathways by further tardigrade ESTs.
The analyses support the concept that tardigrades combine several strategies to achieve high stress resistance. Because protection is never perfect, damage is also efficiently repaired, particularly in M. tardigradum. Such a combination enhances protein protection and turnover, redox protection and specific protection of DNA. In addition to tardigrade-specific adaptations, several of the stress-protective pathways are more generally used and also occur in human cells.
To understand the extremely strong tardigrade stress adaptation capabilities we conducted a detailed analysis of their transcriptome in terms of encoded proteins and pathways. This analysis shows that a multitude of repair pathways are critical, including tardigrade-specific ones in combination with classical stress resistance pathways.
Cultures of the cosmopolitan eutardigrade species M. tardigradum Doyère, 1840 (Apochela, Milnesidae) were kept and reared on Petri dishes (diameter 9.4 cm) filled with a small layer of agarose (3%; peqGOLD Universal Agarose, peqLAB, Erlangen, Germany) and covered with spring water (Volvic™ water, Danone Waters Deutschland, Wiesbaden, Germany) at 20 °C ± 2 °C and a light/dark cycle of 12 h. Rotifers of the species Philodina citrina and nematodes Panagrellus sp. were provided as a food source to adults, while juvenile tardigrades were additionally fed with the green alga Chlorogonium elongatum. For all experiments adult animals in good condition were taken from the culture and starved for three days. This avoids co-preparation of RNA originating from incompletely digested food in the intestinal system.
We obtained all ESTs of M. tardigradum from our ongoing sequencing project. For H. dujardini we obtained all available EST sequences from GenBank.26 Proteins were predicted using a BLASTX27 search against the UniProtKB/Swiss-Prot, UniProtKB/TrEMBL and NR databases. The open reading frames (ORFs) for nucleotide sequences showing significant results (E-value <0.001) were extracted. To be more efficient, only sequences without results were searched against the more extensive databases (UniProtKB/TrEMBL and NR). For sequences without homology the longest ORF was extracted. All available new and existing sequence information (RNA, encoded proteins) was integrated on the Tardigrade Workbench. For this a new server was hosted at http://waterbear.bioapps.biozentrum.uni-wuerzburg.de. It includes enhanced management capabilities, new databases and rewritten source code. All new EST sequences have been deposited with GenBank and have GenBank accession numbers (refer to supplementary material 2 for the accession numbers of all sequences discussed in this study).
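The fallback step for sequences without homology (taking the longest ORF) can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: it assumes the standard genetic code, scans all six reading frames, and treats an ORF as the longest stop-free stretch, since ESTs are often partial and may lack a start codon. The function names are hypothetical.

```python
from itertools import product

# Standard genetic code, built in TCAG order (assumption: no alternative code needed).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons; unknown codons (e.g. containing N) become 'X'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def longest_orf(seq):
    """Longest stop-free protein stretch over all six reading frames."""
    seq = seq.upper()
    best = ""
    for strand in (seq, reverse_complement(seq)):
        for frame in range(3):
            for segment in translate(strand[frame:]).split("*"):
                if len(segment) > len(best):
                    best = segment
    return best


if __name__ == "__main__":
    print(longest_orf("ATGGCCTGA"))
```

Requiring a start codon would be a stricter alternative, but it would discard 5'-truncated ESTs, which is why the sketch does not enforce it.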
Identification of regulatory elements in M. tardigradum mRNAs
Here, all known ESTs of M. tardigradum were systematically screened using the software UTRscan.28 This software screens 30 regulatory elements for RNA regulation (translational elements, stability elements and other well known regulatory elements). The default settings for batch mode were used and the reported elements were collected.
Primers for different heat shock proteins (HSP10: forward 5′-CCACTTCGCTACAAACAG-3′, reverse 5′-GTGATGCCAATAGGAGTG-3′, product length 171 bp; HSP17.2: forward 5′-GTCGATGCAGTCAAACAG-3′, reverse 5′-GCGTGCTCAGTTACTCTC-3′, product length 99 bp; HSP40: forward 5′-CTCTGGGAAGCCATTACT-3′, reverse 5′-TGAGACTTGCTTCTCTCC-3′, product length 119 bp; HSP90: forward 5′-GTCGATGCAGTCAAACAG-3′, reverse 5′-GCGTGCTCAGTTACTCTC-3′, product length 99 bp; HSP70: forward 5′-CCAAGTCTTCGAGTGATC-3′, reverse 5′-TCTGCTCAGACAACAGTC-3′, product length 84 bp) were used to validate EST data of M. tardigradum by PCR. The PCR protocol used consists of an 8 min denaturation step at 94 °C, 35 cycles of 30 s denaturation (94 °C) followed by 30 s annealing (56 °C) and 30 s elongation at 72 °C and a final elongation at 72 °C for 10 min in a reaction volume of 20 μl. A total amount of 1 μl of cDNA was added. We used primer3 for the design of the oligonucleotides.29
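A quick sanity check on the primer list and cycling program can be scripted. The sketch below takes the primer sequences and protocol times exactly as printed above; the melting temperature uses the Wallace rule of thumb for short oligonucleotides, which is an assumption here and not necessarily the model applied by primer3. It also groups targets that share an identical primer pair, which is useful when proofreading a primer table.

```python
from collections import defaultdict

# Primer pairs (forward, reverse), 5'->3', copied from the text above.
PRIMERS = {
    "HSP10": ("CCACTTCGCTACAAACAG", "GTGATGCCAATAGGAGTG"),
    "HSP17.2": ("GTCGATGCAGTCAAACAG", "GCGTGCTCAGTTACTCTC"),
    "HSP40": ("CTCTGGGAAGCCATTACT", "TGAGACTTGCTTCTCTCC"),
    "HSP90": ("GTCGATGCAGTCAAACAG", "GCGTGCTCAGTTACTCTC"),
    "HSP70": ("CCAAGTCTTCGAGTGATC", "TCTGCTCAGACAACAGTC"),
}


def gc_fraction(primer):
    """Fraction of G and C bases in the primer."""
    return sum(primer.count(b) for b in "GC") / len(primer)


def wallace_tm(primer):
    """Rule-of-thumb Tm in deg C: 2*(A+T) + 4*(G+C); assumes only A/C/G/T."""
    gc = sum(primer.count(b) for b in "GC")
    return 2 * (len(primer) - gc) + 4 * gc


def shared_pairs(primers):
    """Groups of targets listed with an identical (forward, reverse) pair."""
    by_pair = defaultdict(list)
    for target, pair in primers.items():
        by_pair[pair].append(target)
    return [sorted(targets) for targets in by_pair.values() if len(targets) > 1]


def program_minutes(cycles=35, initial_s=8 * 60, step_s=(30, 30, 30), final_s=10 * 60):
    """Total programmed time in minutes, ignoring ramping between temperatures."""
    return (initial_s + cycles * sum(step_s) + final_s) / 60


if __name__ == "__main__":
    for target, (fwd, rev) in PRIMERS.items():
        print(target, wallace_tm(fwd), wallace_tm(rev))
    print("identical pairs:", shared_pairs(PRIMERS))
    print("program length (min):", program_minutes())
```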
Total protein lysate from M. tardigradum (10 μg) was separated on a NuPAGE™ 4%–12% Bis-Tris mini gel (Invitrogen) using MES running buffer. A voltage of 200 V was applied until the bromophenol blue front had reached the bottom of the gel (approx. 40 min). Separated proteins were electro-transferred onto PVDF membrane for 1.5 h at maximum 50 mA (0.8 mA cm⁻²) in a semi-dry transfer unit (Hoefer™ TE 77) using the following transfer solutions: concentrated anode buffer (300 mM Tris-HCl pH 10.4), anode buffer (25 mM Tris-HCl pH 10.4) and cathode buffer (25 mM Tris-HCl pH 9.4, 40 mM aminohexanoic acid). The PVDF membrane was incubated in blocking buffer containing 5% non-fat milk and 0.1% Tween20 in PBS.
Detection of Hsp70 was carried out using anti-Hsp70 (BD Biosciences Pharmingen) as the primary antibody and a horseradish peroxidase conjugated anti-mouse IgG (GE-Healthcare) as the secondary antibody. For detection of LEA proteins we used Ari LEA Antibody as the primary antibody and a horseradish peroxidase conjugated anti-rabbit IgG (GE-Healthcare) as the secondary antibody. ECL DualVue marker (Western Blotting Detection kit, GE-Healthcare) was used to determine the molecular mass of the target proteins on film. Images were acquired using an Image Scanner Model UTA-1100 (Amersham Biosciences).
Sequence and domain analysis
A number of different iterative sequence alignment procedures and sequence analysis methods were used as previously described30 including gene context, interaction predictions, domain analysis and phylogenetic tests. Similar sequences were clustered by two methods: CLANS and Inparanoid. While this clustering is described in more detail below, other methods were also considered, including COG/KOG clustering, as well as clusters containing specific domains evident by PFAM31 membership and different domain families according to Interpro32 and SMART.33,34
Clustering using the CLANS method
CLANS performed all-against-all pairwise sequence comparisons using TBLASTX.27 Afterwards, similar proteins were clustered in 3D applying the CLANS method35 with a cut-off E-value of 0.001. The program calculates pairwise attraction values based on the HSP P-values obtained from the BLAST run. Finally, clusters were identified using convex clustering and NJ-trees with standard parameters.
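The idea of turning pairwise P-values into attraction values and then grouping sequences can be illustrated with a deliberately simplified sketch. It is not the CLANS algorithm: the 3D force-directed layout and the convex clustering step are replaced here by plain connected components over all pairs that pass the cut-off, and the attraction is assumed to be the negative log10 of the P-value. Sequence names are placeholders.

```python
import math
from collections import defaultdict


def attraction(p_value, floor=1e-200):
    """Attraction value for a pair: -log10(P), with a floor to avoid log(0)."""
    return -math.log10(max(p_value, floor))


def connected_clusters(pairs, cutoff=1e-3):
    """Group sequences linked by any pair with P <= cutoff (union-find).

    pairs: iterable of (seq_a, seq_b, p_value) tuples.
    Returns clusters as sorted lists, largest first.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b, p in pairs:
        root_a, root_b = find(a), find(b)
        if p <= cutoff and root_a != root_b:
            parent[root_a] = root_b

    groups = defaultdict(set)
    for name in list(parent):
        groups[find(name)].add(name)
    return sorted((sorted(g) for g in groups.values()),
                  key=lambda g: (-len(g), g))


if __name__ == "__main__":
    hits = [("a", "b", 1e-30), ("b", "c", 1e-5), ("c", "d", 0.5), ("e", "d", 1e-10)]
    print(connected_clusters(hits))
```

Connected components tend to merge families that are bridged by a single weak hit, which is one reason layout-based methods such as CLANS use the full set of attraction values rather than a single threshold.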
Clustering using the Inparanoid method
After all-against-all pairwise sequence comparisons using TBLASTX,27 the results of the comparison were imported into Inparanoid36 for prediction of orthologs within the ESTs. As the EST sets cover a substantial fraction of the encoded proteins but do not yet represent the whole protein set of the compared tardigrades, we called the orthologs given by Inparanoid “shared candidate orthologs sequences (sCOS)” and the remaining sequences “candidate single sequence ESTs”. The sCOS represent the lower bound of identified orthologs, as further identified proteins will certainly increase the number of orthologs between the two species of tardigrades.
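Inparanoid seeds its ortholog groups from reciprocal (two-way) best hits between the two sequence sets. The sketch below shows only that seeding step under simplifying assumptions: hits are given as (query, subject, score) tuples, higher scores are better, and the in-paralog expansion and confidence values of the real program are omitted. Sequence identifiers are invented for illustration.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject."""
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Pairs (a, b) where a's best hit is b and b's best hit is a."""
    forward = best_hits(a_vs_b)
    backward = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if backward.get(b) == a)


if __name__ == "__main__":
    # Hypothetical identifiers: Mt = M. tardigradum, Hd = H. dujardini.
    mt_vs_hd = [("Mt1", "Hd1", 250), ("Mt1", "Hd2", 90),
                ("Mt2", "Hd2", 120), ("Mt3", "Hd1", 80)]
    hd_vs_mt = [("Hd1", "Mt1", 245), ("Hd2", "Mt2", 118), ("Hd2", "Mt1", 85)]
    pairs = reciprocal_best_hits(mt_vs_hd, hd_vs_mt)
    singles = {q for q, _, _ in mt_vs_hd} - {a for a, _ in pairs}
    print("candidate orthologs:", pairs)
    print("candidate single sequence ESTs:", sorted(singles))
```

In this toy input, Mt3 has a hit but no reciprocal partner, so it falls into the "candidate single sequence EST" group described above.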
Sequence comparisons to identify tardigrade proteins and stress pathways
We conducted systematic sequence to sequence comparisons starting from verified orthologs of a number of organisms (Drosophila melanogaster, Caenorhabditis elegans; different plants; fungi; as well as vertebrates) and used these to screen tardigrade proteins for similarity. Good hits to stress proteins and best hits to LEA proteins were further examined for functionality by a number of sequence analysis methods comprising sensitive back-searches,37 Prosite motifs and domain composition.30 Furthermore, we conducted a large-scale mapping of all available sequence data from M. tardigradum and H. dujardini to clusters of orthologous groups found in complete prokaryotic genomes (COGs38) or in complete eukaryotic genomes (KOGs39). This mapping allowed us to directly predict individual protein functions and protein families by comparing the sequence to well established families from known genome sequences. For this, the latest version (last major update in 2008) of the COG/KOG database was used.40 Briefly, proteins were classified into the following major COG/KOG classes and functions: (A) RNA processing and modification; (B) chromatin structure and dynamics; (C) energy production and conversion; (D) cell cycle control and mitosis; (E) amino acid metabolism and transport; (F) nucleotide metabolism and transport; (G) carbohydrate metabolism and transport; (H) coenzyme metabolism; (I) lipid metabolism; (J) translation; (K) transcription; (L) replication and repair; (M) cell wall/membrane/envelope biogenesis; (N) cell motility; (O) post-translational modification, protein turnover, chaperone functions; (P) inorganic ion transport and metabolism; (Q) secondary metabolites biosynthesis, transport and catabolism; (T) signal transduction; (U) intracellular trafficking and secretion; (V) defence mechanisms; (Y) nuclear structure; (Z) cytoskeleton. The classification also considers unresolved cases: (R) general functional prediction only; (S) function unknown.
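Once each EST has been mapped to a COG/KOG, the class letters can be tallied to give an overview of the functional repertoire. The following sketch assumes the mapping is already available as a dictionary from EST identifier to one or more category letters; the identifiers are invented, and the category names paraphrase the public COG/KOG one-letter functional classification.

```python
from collections import Counter

# One-letter COG/KOG functional classes (names paraphrased from the public classification).
COG_CLASSES = {
    "A": "RNA processing and modification",
    "B": "chromatin structure and dynamics",
    "C": "energy production and conversion",
    "D": "cell cycle control and mitosis",
    "E": "amino acid metabolism and transport",
    "F": "nucleotide metabolism and transport",
    "G": "carbohydrate metabolism and transport",
    "H": "coenzyme metabolism",
    "I": "lipid metabolism",
    "J": "translation",
    "K": "transcription",
    "L": "replication and repair",
    "M": "cell wall/membrane/envelope biogenesis",
    "N": "cell motility",
    "O": "post-translational modification, protein turnover, chaperone functions",
    "P": "inorganic ion transport and metabolism",
    "Q": "secondary metabolites biosynthesis, transport and catabolism",
    "R": "general functional prediction only",
    "S": "function unknown",
    "T": "signal transduction",
    "U": "intracellular trafficking and secretion",
    "V": "defence mechanisms",
    "Y": "nuclear structure",
    "Z": "cytoskeleton",
}


def summarize(assignments):
    """Count ESTs per class; an EST with several letters counts once per letter.

    assignments: dict mapping EST id -> string of class letters, e.g. "O" or "KL".
    Returns (letter, description, count) tuples, most frequent first.
    """
    counts = Counter()
    for letters in assignments.values():
        counts.update(set(letters))
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(letter, COG_CLASSES.get(letter, "unknown class"), n)
            for letter, n in ordered]


if __name__ == "__main__":
    demo = {"est1": "O", "est2": "O", "est3": "KL", "est4": "S"}  # hypothetical ESTs
    for row in summarize(demo):
        print(*row, sep="\t")
```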
Phylogenetic comparison of LEA proteins
LEA proteins from H. dujardini and R. coronifer were collected after the annotation of ESTs. LEA sequences from other organisms were collected from NCBI.40
These sequences were aligned and distances were calculated applying ClustalW.41 Bootstrap supports for the different branches were computed. The final unrooted tree (Fig. 4) was drawn using the software FigTree (version 1.2.3; http://tree.bio.ed.ac.uk/software/figtree).
Phylogenetic tree of different tardigrade LEA-like proteins.
Transcriptome generation and analysis platform
H. dujardini ESTs were generated using a directional cDNA library. The library was constructed from adults and juveniles from an established laboratory culture derived from a single parthenogenetic female (culture ED52). As H. dujardini releases eggs in the shed cuticle at each post-larval moult, the sample also includes egg and embryo transcripts. A total of 5235 sequences are deposited in dbEST.
M. tardigradum ESTs were generated as part of a research program to study stress adaptations for this species. The ESTs were generated from a directional cDNA library. For these experiments adult animals in a good physical condition were taken directly from the culture and starved for three days to avoid preparation of additional RNA originating from incompletely digested food in the intestinal system. The derived M. tardigradum ESTs are a mixture of inactive stage animals and active motile animals.
A total of 9982 M. tardigradum sequences were generated in a first round. After filtering out low quality sequences we finally obtained 6593 sequences which were deposited in dbEST. Further newly determined sequences were added in the same way. All GenBank accession identifiers for the sequences analyzed in this manuscript are listed in Additional file 2. Statistical sampling predicts that this represents a substantial fraction of the total transcriptome. In particular, the majority of the highly expressed pathways are already sampled.
Extended and updated tardigrade workbench
The tardigrade workbench server25 was updated accordingly and now includes the new data mentioned above, as well as all previous public EST data on tardigrades. Furthermore, by adding systematic translation, all encoded proteins and standard protein databases are available for comparison at the workbench. An online tutorial was added at http://waterbear.bioapps.biozentrum.uni-wuerzburg.de/cgi-bin/about.pl. Moreover, the data structure and the management of the database were improved by changes in database implementation. New software includes a faster application-specific pattern-search, a new version (2.0) of the regular expression and Perl search as well as the possibility to load a fasta-format file with multiple sequences to screen for promoter patterns (including tardigrade-specific or general promoter motifs of interest).
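The kind of regular-expression screen over a multi-sequence FASTA file described above can be sketched in a few lines. This is an independent illustration and not the workbench code (which, per the text, is Perl-based); the function names and the example pattern are hypothetical, and the pattern is not a motif reported in this study.

```python
import re


def read_fasta(text):
    """Parse FASTA-formatted text into {identifier: sequence} (upper case)."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {name: "".join(parts) for name, parts in records.items()}


def find_pattern(records, pattern):
    """Return {identifier: [(0-based start, matched text), ...]} for matching sequences."""
    regex = re.compile(pattern)
    found = {}
    for name, seq in records.items():
        matches = [(m.start(), m.group()) for m in regex.finditer(seq)]
        if matches:
            found[name] = matches
    return found


if __name__ == "__main__":
    fasta = ">est1 demo\nGGCTATAAAAGG\nCC\n>est2\nGGGCCC\n"
    # Hypothetical TATA-like pattern, used only to exercise the search.
    print(find_pattern(read_fasta(fasta), r"TATA[AT]A[AT]"))
```

Note that `finditer` reports non-overlapping matches only; overlapping occurrences would need a lookahead pattern.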
We included novel options for COG statistics (see also below). Information for any COG can be examined and suitable ESTs can be downloaded. The new server is available at http://waterbear.bioapps.biozentrum.uni-wuerzburg.de and serves as an analysis platform for further studies of tardigrade molecular makeup.
Regulatory RNA motifs involved in post transcriptional regulation in M. tardigradum
We systematically searched for regulatory RNA motifs, comparing all ESTs (new and old) available (Table 1) within the EST data from M. tardigradum. From a screen of 30 well characterized regulatory motifs (see materials and methods) only twelve were detected in M. tardigradum sequences. Frequently used regulatory motifs found included mRNA stability signals, such as the LOX 15-DICE element,42 brd box,43 and the alcohol dehydrogenase 3′UTR down-regulation control element.44 In contrast, other elements, such as the widely used AU elements in vertebrates,45 were not found in the M. tardigradum ESTs. Translational elements such as the IRE were found, but are rare in M. tardigradum sequences.
Regulatory motifs found by UTR-Scan in 3264 Milnesium tardigradum unique genes.
Detailed comparison of different protein clusters in tardigrades
Results from different clustering methods were combined to get maximum information about detected tardigrade EST clusters (details are given in supplementary material).
Inparanoid36 clustered similar sequences with exactly the same domain composition in each species (candidate ortholog sequences, COS). This identified sCOSs present in both the H. dujardini and M. tardigradum EST data (Fig. 1, top). Inparanoid also identified sequences with similar domain composition that occur within the same species (paralogues). There were 178 sCOSs, corresponding to 214 M. tardigradum and 234 H. dujardini EST clusters.
Sequence similar protein clusters in Milnesium tardigradum and Hypsibius dujardini.
The set of sCOSs is a conservative estimate, as only the full genome sequences of both organisms would allow us to identify all orthologs. However, these sCOSs already show that a substantial fraction of homologous proteins exists in both species, with a major fraction including adaptations against environmental stress (Fig. 1, top).
CLANS clustering,35 for comparison, identifies de novo protein families in any collection of sequences by sequence similarity (cut-off see materials and methods) alone. CLANS clustering does not demand a sequence to have an ortholog in another species or complete genomes (Fig. 1, bottom).
There are 13 CLANS clusters consisting of 203 proteins for M. tardigradum, 16 CLANS clusters consisting of 246 proteins in H. dujardini, and 37 CLANS clusters consisting of 684 proteins when all available ESTs from both species are pooled (Additional File 1). In each comparison there are tardigrade-specific protein clusters (a total of 314 proteins in different clusters; twelve clusters are shared). Within the M. tardigradum clusters, there are genes related to stress detoxification (cytochrome P450), protein turnover (cathepsin, metalloproteinase-like cluster and three ubiquitin-like clusters), protein secretion (rab-like), redox protection (glutathione S-transferase, cystatin-like) and regulation (ras, calmodulin-like). The second largest cluster is M. tardigradum specific. H. dujardini also features species-specific clusters (four with no homologies including the top cluster) and, as in M. tardigradum, clusters of ras-like proteins, cathepsins, glutathione S-transferase-like and cystatin-like proteins, dehydrogenases and a number of diapause-specific proteins. Diapause is a reversible state of developmental suspension occurring in plants, insects and vertebrates, including marsupials and some other mammals.46 In tardigrades the presence of diapause-specific proteins may support either the tun form or the regulation of other metabolically inactive states (eg, in development). Cuticle protection is probably supported by chitin binding proteins. As with all further analyses below, full details on cluster results are given in Additional File 1.
Specific encoded protein families and those shared with other organisms occurring in M. tardigradum and H. dujardini ESTs
To predict the repertoire of functions already visible from the currently assembled transcriptome, we compared it with the full transcriptome available from the sequencing effort on M. tardigradum ESTs and available data regarding prokaryotic COGs38 and eukaryotic KOGs.39 These are conserved genes grouped across a large number of species (the different classes are explained in material and methods; complete lists in Additional File 1, S-Table II). Functions in H. dujardini include (Table 3A): molecular chaperons [category O], uncharacterized proteins [category R] including unusual protein kinases and membrane ATPases [category C]. While ribosomal proteins occur in many organisms [category J], there is COG/KOG evidence that there are specific proteins used here and that there is an increased number of specific ribosomal proteins in M. tardigradum. Furthermore, M. tardigradum (Table 2B) has even more Milnesium-specific, as yet poorly characterized proteins [categories S and R], metabolic enzymes [category C] and secondary metabolic enzymes involved in protein protection and turnover [category O]. Furthermore, there is evidence for specific DNA turnover and protection [category K] and transport processes including, for instance, guanine nucleotide exchange factors and SNARE proteins47 [category U].
Important COGs/KOGs identified uniquely in Milnesium tardigradum.
Enzymes in metabolic pathways predicted to be present in Hypsibius dujardini and Milnesium tardigradum according to the EST data.1
Shared COGs and KOGs are tardigrade-specific reinforcements for protein repair and turnover, proteasome components, cytochrome P450-like, GTPases and GTP-binding proteins (for the cytoskeleton). There are only four (KOG2733-membrane, KOG3098, KOG4604, KOG4431) uncharacterized protein clusters that are shared. The KOG4431 encodes proteins induced by hypoxia that could be involved in tardigrade-specific adaptations.
Experimental validation and phylogenetic analysis of heat shock and LEA proteins
We directly tested the presence of stress proteins in M. tardigradum by PCR (heat shock proteins shown in Fig. 2) and by Western blot analysis using antibodies against heat shock protein Hsp70 and LEA protein (Fig. 3). A strong band was detected for Hsp70 slightly below 75 kDa, as expected. Some degradation products are visible (fainter bands, particularly in lane 3, showing the LEA protein). Furthermore, the thin upper band visible in lane 2 may indicate an isoform of Hsp70 (for instance, a minor dimer form83 has been described for Hsp70). When using the antibody against LEA proteins, one prominent band appeared at approx. 60 kDa accompanied by several faint bands in different regions. Thus, the presence of Hsp70 and LEA proteins could be experimentally confirmed. Further details on heat shock protein distribution and phylogeny are given below and in Additional File 1.
PCR validation of heat shock protein transcripts in Milnesium tardigradum and Hypsibius dujardini.
Western blot of heat shock and LEA proteins in Milnesium tardigradum.
It is known from previous studies that LEA-like proteins48 occur in tardigrade species. They are known to improve stress resistance in other organisms, particularly in plants. As there are currently no complete sequences available for the M. tardigradum LEA proteins, we compared (Fig. 4) all identified LEA proteins from H. dujardini (13 sequences, red) and R. coronifer (3 sequences, orange). The identified sequences were compared to a number of other bona fide LEA proteins from other organisms (PFAM37 seed alignment). We used the new classification by Tunnacliffe and Wise,48 and compared our results to their LEA protein groups 1, 2 and 3. The remaining LEA groups (4–6)48 have no similarity to the tardigrade proteins. Groups 1, 2 and 3 correspond to the PFAM domains Lea 1, Lea 2, and Lea 4 and are shown in Figure 4 (full seed alignment).
Apart from one protein, all H. dujardini proteins cluster in two branches, but have only Lea 4 proteins as relatives from other organisms. Thus, all tardigrade sequences, including all three R. coronifer sequences, are found in the area of Lea 4-like proteins. In addition, a potential independent LEA protein family is present in H. dujardini (Fig. 4; GenBank identifier CO508035.1).
Predicted metabolic enzymes and resulting pathways from M. tardigradum and H. dujardini EST analysis
The EST sequencing effort on M. tardigradum as well as the existing EST sequences of H. dujardini provide better insights on central metabolic pathways involved in metabolic protection and stress resistance (Table 3, details in S-Table IV). Thus H. dujardini and M. tardigradum have key glycolytic enzymes (the list in M. tardigradum is even more complete). However, aldolase is found only in H. dujardini. Starch and sucrose metabolism is represented by a number of enzymes. On the other hand, basic fatty acid biosynthesis has only one (H. dujardini) or two (M. tardigradum) general enzymes, but there is clear evidence for fatty acid elongation in mitochondria and fatty acid conversion into different CoA-enzyme intermediates. The latter is again more completely demonstrated in M. tardigradum and there is also clear evidence for ketone body synthesis and conversion (which should allow long term adaptation against carbohydrate shortage). Furthermore, linoleic acids synthesis pathways are present and there are several ESTs encoding enzymes of the glycerophospholipid and sphingolipid metabolism.
We found several metabolite pathways implicated in enhancing stress defence, through the provision of specific metabolites. The trehalose pathway is not fully evident from EST data in M. tardigradum. However, H. dujardini has an EST encoding trehalase-like protein 1 [GenBank identifier CZ258529.1]. Furthermore, there are clear pathways for glycogen storage and glycolipids (found in both organisms), or specific protein and membrane modifications (several such enzymatic pathways available for both organisms).
Stress pathways predicted from M. tardigradum and H. dujardini ESTs compared to other organisms
The regulatory stress pathway around the 14-3-3 related protein bmh2, known to exist in yeast49 and man, is also conserved in tardigrades (Fig. 5; Additional File 1: S-Table V and VI). The conserved regulatory input protein Sck150 is found in H. dujardini as well as in M. tardigradum. Function predictions from sequence were verified by additional sequence analysis such as back-searches, Prosite motifs and SMART domain composition.30 Bmh2-dependent metabolic enzymes found in both tardigrades include trehalase and acyltransferases such as LCB1 or LCB2.49 Furthermore, proteomics data have already identified glycerol-3-phosphate acyltransferases in M. tardigradum.51 The data suggest the presence of a complete stress pathway, including the associated metabolic enzymes in H. dujardini as well as key switches in M. tardigradum.
Bmh2 stress pathway is conserved in tardigrades.
Alpert52 compiled a collection of pathways involved in desiccation tolerance, which was investigated in the present work. For tardigrades, we found no genes indicating stress protection by photopigments.53,54 This is in accordance with morphological data. Using systematic sequence to sequence comparisons, starting with verified orthologs of a number of organisms (see material and methods), we obtained further results regarding stress pathway proteins using the available EST data from H. dujardini and M. tardigradum (Table 4, details in Additional File 1, S-Table Va and b). Repair pathways include RNA helicases55 as well as peroxiredoxin, peroxidase and superoxide dismutase.56 Several LEA protein57,58 candidates were found, including very good matches to known LEA proteins (for validation see Fig. 4). Different larger and smaller heat shock proteins59 could be identified, including Hsp70 in H. dujardini and M. tardigradum.17,19,51 DNA repair appears to be mediated by the MutS pathway.60 Furthermore, M. tardigradum contains specific DNA repair (RAD51, DnaJ family61) as well as other DNA protection pathway proteins (DNA helicases). According to the analyzed EST data in tardigrades a specific animal signalling pathway is present, the HOG (high osmolarity glycerol) signalling pathway, including PKA-mediated regulation.62
Specific stress pathway proteins predicted to be present in Hypsibius dujardini according to the analyzed EST data.1
Tardigrade-specific stress adaptations versus general adaptations shared with other animals
There are two other desiccation resistant animal phyla, rotifera and nematodes. To get an insight into the general and tardigrade specific adaptations we compared our inventory of stress pathways found in M. tardigradum and/or H. dujardini with stress pathways known from rotifera or nematodes. Conservation of key pathways in man is shown in Figure 6. We considered all stress protection proteins found in either M. tardigradum or H. dujardini and examined whether these proteins are also found in nematodes, rotifers or man. We also examined specific genes involved in nematode or rotifer stress response and whether these are found in tardigrades.
Comparison of COGs/KOGs for Milnesium tardigradum, Hypsibius dujardini (Tardigrades), Homo sapiens and Caenorhabditis elegans (Nematodes).
A first overview on this is shown by COGs/KOGs38,39 and using available genome information regarding protein clusters in nematodes (C. elegans) and H. sapiens.40 We considered three major COGs/KOGs40 categories involved in protective functions by comparison of our tardigrade data and protein clusters to the other two organisms regarding the COGs/KOGs categories V (defence mechanisms), O (posttranslational modification) and L (replication, recombination and repair). The overview (Venn diagram, Fig. 6) shows that there are many shared functions (172), but only 81 tardigrade-“specific” protein clusters (ie, they were neither found in man nor in nematodes). The largest overlap between organism groups is between nematodes and man. To better identify unique tardigrade adaptations we subtracted from these clusters any clusters common for tardigrades and yeast. Compared to rotifers, nematodes, human and yeast these are proteins occurring only in tardigrades (Fig. 7) yielding a list of 46 specific adaptations and COGs/KOGs clusters found only in tardigrades within the context of this comparison (Table 5; detailed results in Additional File 1, S-Table VII). They include COGs for glutaredoxin-related proteins and molecular chaperons as well as KOGs involved in molecular chaperoning. This includes the DnaJ superfamily with a large number of different clusters (Milnesium-specific, Table 4C; Additional File, S-Tables II and V). The COG clusters 4973, 4974 supply site-specific recombinases.63 Collectively, these strongly enhance DNA repair and explain the recently documented resistance of M. tardigradum against DNA damage.64,65 Details on all involved functions and protein clusters are given in Additional File 1, including basic COGs/KOGs shared with yeast in tardigrades (Table S7). Figure 8 summarizes all EST predicted adaptations and compares their distribution in other model organisms, focussing on the predicted encoded protein families. 
Tardigrades combine adaptations found in nematodes, rotifers and vertebrates with tardigrade-specific adaptations. Moreover, there are a number of tardigrade-specific EST clusters in which sequence similarity does not allow the prediction of any specific function. Interestingly, some protein functions (11) are shared only between tardigrades and H. sapiens in this comparison (Fig. 8 and Additional File 1, Table 7D): these are pathways including the DNA repair protein RHP5766 and ubiquitin protein ligase, as well as proteasome maturation factor.67 Only limited sequence information is available for rotifers. A mitochondrial chaperonin is shared with man, rotifers and C. elegans. A Hsp90 family-type molecular chaperone68,69 is also shared between rotifers and tardigrades (see Additional File 1, S-Table VIIf and g).
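The subtraction logic behind such a Venn-style comparison can be sketched with plain set operations. The following is only a minimal illustration under stated assumptions: the function names and all cluster identifiers are hypothetical toy values, not the clusters or the software used in this study.

```python
def specific_clusters(target, *others):
    """Return cluster IDs present in `target` but in none of `others`."""
    remaining = set(target)
    for other in others:
        remaining -= set(other)
    return remaining


def shared_by_all(*groups):
    """Return cluster IDs present in every one of the given groups."""
    sets = [set(g) for g in groups]
    return set.intersection(*sets) if sets else set()


# Hypothetical toy cluster IDs (assumption: placeholders, not real COG/KOG IDs).
tardigrade = {"cluster_A", "cluster_B", "cluster_C", "cluster_D", "cluster_E"}
nematode = {"cluster_C", "cluster_F"}
human = {"cluster_C", "cluster_D", "cluster_F"}
yeast = {"cluster_E"}

# Step 1: clusters found neither in man nor in nematodes.
not_in_animals = specific_clusters(tardigrade, nematode, human)

# Step 2: additionally subtract clusters shared with yeast.
tardigrade_only = specific_clusters(not_in_animals, yeast)

# Functions shared by all three animal groups.
core = shared_by_all(tardigrade, nematode, human)

print(sorted(tardigrade_only))
print(sorted(core))
```

Applying the subtraction in two steps mirrors the order described in the text (first man and nematodes, then yeast); the result is the same as subtracting all comparison organisms at once.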
Comparison of COGs/KOGs specific for Milnesium tardigradum, Hypsibius dujardini (both tardigrades) and Saccharomyces cerevisiae and Saccharomyces pombe (both yeast).
Tardigrade stress adaptation: Which are specific, which are shared?
Specific stress pathways derived from proteins predicted to be present in Milnesium tardigradum.1
Tardigrade specific cluster domain annotation.
Tardigrade-specific EST clusters
Annotation of the tardigrade-specific EST clusters is non-trivial, as major parts of the sequence have only a weak similarity to other organisms. Here, we used a combination of EST sequencing, clustering, annotation of conserved domains and a search for ESTs encoding proteins related to the domain-predicted pathways to make functional predictions of these particular tardigrade-specific EST sequences.
Annotated domains in tardigrade-specific protein clusters (TSPs) are shown in Figure 9. We looked for further ESTs in M. tardigradum which supported these predictions using the pathway context (middle panel, at the bottom). In particular, we examined whether some ESTs encoded a regulatory protein for the annotated TSP putative protein or an interacting protein (eg, another enzyme of the predicted pathway). For this, all tardigrade clusters identified by CLANS (including all TSPs published previously25) were systematically screened for domains with a known function (applying HMM models from PFAM31 for different domains, latest version; see Additional File, S-Table VIII). Note that these TSPs include sequences from H. dujardini, R. coronifer and M. tardigradum. However, a search for PFAM domains contained in the sequences associated with M. tardigradum was successful for only a few TSP clusters (Table 5). This means that the molecular function conferred by the respective domains is utilized in these clusters and can be predicted to be present in M. tardigradum. However, as the rest of the sequences could not be annotated (otherwise these would not be TSPs but clusters shared with other organisms), there are additional molecular functions, not yet recognized, that modify the function of the complete protein (eg, regulate it in response to stress) or may confer completely novel functions. On the other hand, the domain hits themselves are all highly significant (E-value < 10^-6 and the significant domain hit covers at least 20% of the cluster sequence; see Table 5 TSPs and Additional File 1, S-Table VIII).
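The two acceptance criteria stated above (E-value below 10^-6 and a domain hit covering at least 20% of the cluster sequence) can be expressed as a simple filter. This is a hedged sketch only: the `DomainHit` record, the function name and the example hits are hypothetical and do not reproduce the actual screening pipeline or its data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainHit:
    """One domain match on a cluster sequence (hypothetical record layout)."""
    cluster: str
    domain: str
    evalue: float
    hit_start: int       # 1-based, inclusive
    hit_end: int         # 1-based, inclusive
    cluster_length: int  # length of the cluster sequence in residues

    @property
    def coverage(self) -> float:
        """Fraction of the cluster sequence covered by the domain hit."""
        return (self.hit_end - self.hit_start + 1) / self.cluster_length


def significant_hits(hits, max_evalue=1e-6, min_coverage=0.20):
    """Keep hits with E-value below the cutoff and sufficient coverage."""
    return [
        h for h in hits
        if h.evalue < max_evalue and h.coverage >= min_coverage
    ]


# Toy hits (assumption: invented values for illustration only).
hits = [
    DomainHit("toy_cluster_1", "domain_X", 1e-12, 10, 135, 400),  # passes both
    DomainHit("toy_cluster_2", "domain_Y", 1e-3, 1, 200, 300),    # E-value too high
    DomainHit("toy_cluster_3", "domain_Z", 1e-9, 1, 30, 500),     # coverage too low
]

kept = significant_hits(hits)
print([h.cluster for h in kept])
```

Keeping the two thresholds as keyword arguments makes it easy to check how sensitive the list of annotated clusters is to the chosen cutoffs.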
Tardigrade stress adaptation: General and unique adaptations as well as adaptations shared with man.
Additional domains, such as regulatory or binding domains, modify these predicted functions encoded in the TSPs. Specific domains identified in TSPs as parts of the complete sequence include similarity to osteopontin (TSP8), lipolysis stimulated receptor (TSP18), sulfite exporter TauE (TSP19), signal peptidase subunit domain (TSP37), GRP (TSP51), as well as similarity to the DUF19 domain where we can annotate the DUF domain function (TSP55). TSPs 30, 35 and 42 have similarities to uncharacterized domain families (DUFs: DUF568, DUF1481 and DUF2543, respectively) and TSP33 is completely uncharacterized.
Table 5 shows that this partial information is nevertheless enough to predict part of the tentative protein function and to make an initial estimate on potential interactors or pathways that the TSP may be involved in:
In detail, TSP8 encodes domains similar to osteopontin (eg, tardigrade sequence GenBank identifier GE637688.1). Osteopontin enhances secretion, cell motility, and extracellular matrix invasion and is a general cytokine.82
TSP18 encodes domains with similarity to acetyl-CoA dehydrogenase C-terminal domain and sequences similar to lipolysis stimulated receptor (eg, GenBank identifier CO741640.1). This suggests a possible role in lipolysis and fatty acid metabolism, supported by the presence of enzymes for lipid metabolism as noted above.
TSP19 contains domains similar to TauE, a sulfite exporter (eg, GenBank identifier CO741254.1). The E-value is significant (5.4e-06) and the resulting metabolite, taurine, is known to be a strong antioxidant. However, we did not find ESTs pertaining to this pathway in M. tardigradum (eg, taurine dehydrogenase and sulfite transferase). As the sequences are clearly glycine-rich, these sequences may be implicated in similar tasks as TSP51 (see below).
TSP37 encodes sequences with domains having significant similarity to a signal peptidase subunit domain spc22/23, a well conserved signal peptidase (eg, GenBank identifier CK326237.1). Hence a tentative function prediction would be that the cluster is involved in protein secretion and transport.
TSP51 contains domains related to GRP (glycine-rich protein family; see eg, GenBank identifier CO741993.1). This domain is known to occur in proteins required for the general stress response. These are cytokeratin-like proteins and glycine-rich, RNA-binding proteins. Particular regulators of the general stress response are heat shock proteins, and a number of them have been identified in M. tardigradum (this paper and Reuner et al17).
TSP55 could be annotated as it contains the DUF19 domain (match over 126 amino acids in alignment; see eg, GenBank identifier GE637487.1). The DUF19 domain is found in one or two copies in several C. elegans proteins. It is roughly 130 amino acids long and has 12 conserved cysteines, which form six intradomain disulphide bridges, suggesting that this domain is an extracellular one. The GO annotation for this protein indicates that it has a function in nematode larval development and is a positive regulator of growth rate. Interaction database predictions70 suggest that the function of the domain is connected to the metabolism of methylations (tRNA methylation, C1-methylation). Further annotation of the DUF19 domain is possible due to the homologs in the genome of the sea-ice bacterium Psychromonas ingrahamii. The neighbouring gene in this genome is known to be involved in tetrahydropterin synthesis. This activity supports amino acid and thymidine biosynthesis and would be important for larval development, as well as growth in general. Furthermore, this neighbour combination is conserved across genomes. We conclude that these genes are probably involved in tetrahydropterin synthesis.
We found several further genes for tetrahydrofolate synthesis in M. tardigradum annotated by their homology. These are the genes sepiapterin reductase (including 7,8-dihydrobiopterin:NADP+ oxidoreductase [EC:1.1.1.153]) and GTP cyclohydrolase 1 [EC:3.5.4.16] as well as DHFR [EC:1.5.1.3] ([GenBank: GR865643.1], [GenBank: GR865643.1], [GenBank: EZ758936.1], [GenBank: EZ761524.1], respectively). The NO synthase [GenBank: EZ762352.1] also occurs in M. tardigradum (long, significant alignment (E-value < 10^-72) over more than 600 residues). Tetrahydropterin synthase regulators in vertebrates are NO as well as NO-Synthase.71
Finally, we analyse a new TSP-specific cluster identified from the ongoing sequencing72 where partial annotation was possible (see materials and methods). For this TSP a chitin binding domain type 2 could be annotated.61 Chitinases hydrolyze chitin (N-acetyl-β-D-glucosamine) in biological structures such as insect exoskeleton, fungal cell wall, nematode egg shell, and other biological structures involved in protection and self-defence. Furthermore, chitinases along with chitin synthetases are essential for remodelling chitin-containing structures during growth and development as well as for defence from chitinous pathogens.73 In tardigrades we suggest that they will be involved in remodelling chitin during the shift between active and tun stage and that the identified specific tardigrade cluster is also involved in physical protection by chitin. With the identified TSP cluster, interactors and regulators of chitinase in the form of binding partners are present in M. tardigradum. Chitinase itself is also encoded (GenBank: AAB81847.1; 117 residue alignment, E-value < 10^-6).
Furthermore, regulation74 may be similar to insects, as there is an ecdysone receptor-related EST present (GenBank identifier EZ761204.1). Insect regulation relies upon ecdysone-producing enzymes and the ecdysone receptor. Supporting evidence to place the tardigrades closer to this super-phylum is provided by extensive steroid synthesis (Table 3).
To combine all the different pathways and adaptations analyzed, Figure 9 summarizes general stress pathways for M. tardigradum (left), tardigrade-specific pathways (in the sense of the comparison man/rotifers/nematodes/Drosophila/tardigrades, middle panel, top) and pathways shared only with human in this comparison (right). In each of these categories a number of different pathways specifically boost repair and protection of proteins and DNA. In our opinion, it is clearly this multitude and overlap of different proteins that protect M. tardigradum and allow its extreme stress tolerance in the tun stage. All human pathways in Figure 9 (right) are found to be shared with tardigrades but not found in the other organisms compared here. These are potential pharmacological targets to enhance stress resistance in human cells (eg, tissue conservation, cryoprotection, surgical interventions).
Tardigrades are an animal phylum with a surprising stress-resistance and adaptation potential, using phylum-specific pathways as well as enhancing general pathways to increase stress resistance. As a first step towards a comprehensive transcriptome and full genome analysis this study assesses currently available transcriptome information to probe the tardigrade adaptation potential and addresses a number of specific questions already answerable from the current set of EST data. These data cover a substantial fraction of the whole transcriptome with mainly highly expressed ESTs, but also some of the poorly expressed ESTs.
Furthermore, the Tardigrade Workbench was updated with new management options and software added to enhance comparative effort on tardigrade species.
The current tardigrade transcriptome is put into its pathway context looking at all implied levels including RNA motifs, encoded proteins, pathways and implied phenotypic metabolism, signalling and stress protection.
Tardigrade adaptation is strong but there are significant differences between the compared species. The comparison between H. dujardini and the particularly stress-resistant M. tardigradum suggests that M. tardigradum possesses larger protein families for stress resistance. Importantly, it invests more in a repair strategy. This allows recovery from more extensive stress in addition to the direct protective strategy (H. dujardini but also M. tardigradum).
Comparisons between COGs from important stress tolerant animal groups delineate shared and common COGs between tardigrades and some model organisms (D. melanogaster, C. elegans). Furthermore, we show that some stress protective pathways from tardigrades occur in human cells although they were not found in the other organisms.
The function of tardigrade-specific clusters is investigated by analysis of conserved domains and validation of implicated pathways by looking for ESTs encoding further proteins involved in these pathways.
For two protein families (LEA proteins, heat shock proteins), there is also complementary experimental evidence from PCR and Western blotting. By direct comparison of the different clusters, the predicted proteins shared by both species could be identified. Many protective strategies are found and combined in tardigrades. Interestingly, we see clearly in M. tardigradum that the protection is not perfect but rather a high degree of damage is tolerated (eg, DNA lesions and oxidation). This damage is then subsequently repaired by a sophisticated combination of pathways, including several tardigrade-specific sequence clusters and pathways. The same is true in protein damage response, controlled by specific protein turnover pathways and boosted by oxidative damage repair as well as heat shock proteins. Both EST libraries are independent efforts and their accessions deposited with GenBank. The EST library of H. dujardini was generated in the laboratory of M. Blaxter (Institute of Evolutionary Biology, University of Edinburgh).
The M. tardigradum transcriptome is newly generated as a part of a general effort (www.funcrypta.de
While melatonin biosynthesis is frequently studied and considered to be highly relevant, the awareness of melatonin catabolism is largely restricted to very few compounds, some of which are often thought to be more or less irrelevant because they are mostly found in low quantities. The relatively fast hepatic catabolism by cytochrome P450 subforms that preferably leads to 6-hydroxymelatonin is known to limit the presence of melatonin in the blood. Therefore, this process is often believed to represent the only route of biological significance. However, at a closer look, such a view turns out to be too much centered on the circulation and on mammals. The disregard of melatonin catabolism in other organs can be misleading, since metabolites other than 6-hydroxymelatonin may attain relevant concentrations in some tissues and also in the cerebrospinal fluid (CSF), as will be discussed in this article. Moreover, conditions can exist under which another catabolic route can become more important, as has been found to occur under the influence of inflammation [1,2]. Generally, it seems important to distinguish between the different roles of melatonin within an organism, roles that exceed that of a hormone in the classic definition as being released from a specific gland and distributed via the circulation. Melatonin is known to be synthesized in numerous organs and cells, and the quantities of extrapineal melatonin are by orders of magnitude higher than those in the pineal gland and in the circulation [4,5]. The role of 6-hydroxymelatonin is closely associated with melatonin’s function as a short-lived chronobiological signaling factor that is rapidly eliminated by conversion to an easily excretable compound. 6-Hydroxylation is the prerequisite of conjugation, mainly by sulfation, which leads to the urinary metabolite 6-sulfatoxymelatonin.
These requirements are not necessarily valid for many other tissues that produce melatonin in high quantities, but often release it only in low amounts [4,6]. The fate of extrapineal melatonin in tissues is often incompletely understood.
Another reason to consider other routes of melatonin catabolism follows from the discovery of this compound in almost all taxa tested, including bacteria, various phyla of eukaryotic unicells, plants, fungi and invertebrate animals [7,8,9]. As will be outlined in this article, considerable differences exist concerning the quantitative prevalence of specific catabolic pathways between major taxa. The differences will also be discussed in terms of the properties of the respective major metabolites. These properties can be decisive for understanding the biological meaning of the pathways in ecologically different groups of organisms. Moreover, these considerations will not be restricted to enzymatic mechanisms, but will also comprise nonenzymatic reactions based on reactive oxygen and nitrogen species (ROS, RNS) and, in particular, photochemical reactions that are of importance at sites exposed to UV light.
2. CYP-Based Metabolism
CYPs are the major enzymes of vertebrate melatonin metabolism, especially in the liver, but also in other tissues. In quantitative terms, hydroxylation by hepatic CYP1A2 to 6-hydroxymelatonin represents the prevailing mechanism, but other isoforms, in particular, CYP1A1 and the nonhepatic CYP1B1, to a smaller extent also CYP2C19, can catalyze the same reaction (Figure 1). Hydroxylation at ring atom 6 is required for conjugation, mostly by sulfation, to a minor amount by glucuronidation. In either case, the hydrophilic conjugate is easily excreted. Urinary levels of 6-sulfatoxymelatonin are usually regarded as an indirect, temporally integrating measure of melatonin production by the pineal gland.
An alternate catabolic route catalyzed by CYP isoforms is that of dealkylation, a general property of many CYPs. However, this pathway is usually regarded to be of minor importance for melatonin. The formation of N-acetylserotonin by demethylation (Figure 1) leads to the difficulty that this compound is also the precursor of melatonin, two roles that cannot be easily distinguished under experimental settings. Concerning subforms that accept melatonin as a substrate, demethylation reactions are known for CYP2C19 and CYP1A2 and may, perhaps, also be possible with CYP1A1. A third reaction type that can be catalyzed by CYP1A2 using other substrates, epoxide formation, has not been directly studied with melatonin and appears rather unlikely. An epoxide is easily hydrolyzed to a dihydroxylated compound. Theoretically, epoxidation would only be possible at two places in the indolic moiety of melatonin, at ring atoms 2 and 3 or at ring atoms 6 and 7. No report exists for a formation of 6,7-dihydroxymelatonin. Hydrolysis of a 2,3-epoxide would lead to a dihydroxylated product that would immediately turn into the respective keto tautomer, 3-acetamidoethyl-3-hydroxy-5-methoxyindolin-2-one. This compound, which is sometimes misleadingly called 2,3-dihydroxymelatonin, is known as a melatonin metabolite, but its formation has been described to be catalyzed by cytochrome c rather than CYPs. This product easily undergoes a rearrangement to a kynuric metabolite, as will be discussed in another section.
Several of the melatonin-catabolizing CYPs are also expressed in nonhepatic tissues. Although their quantitative contribution to circulating 6-hydroxymelatonin levels and to urinary 6-sulfatoxymelatonin is only minor, local metabolism in other organs may not be entirely irrelevant. This is particularly valid for the central nervous system (CNS). In the brain, the expression of CYP1A2 [14,15,16], CYP1A1 [14,16], CYP1B1 [10,16,17], and CYP2C19 [18,19,20] has been demonstrated. The contribution of CYP2C19 to melatonin demethylation and that of CYP1B1 to 6-hydroxylation in the brain may be higher than in other tissues. Interestingly, 6-hydroxymelatonin is also sulfated in the CNS [1,21], although the role of 6-sulfatoxymelatonin in the brain as well as its elimination from there has remained entirely unclear.
Although CYP-mediated 6-hydroxylation represents a major route of vertebrate melatonin catabolism, this does not seem to play any relevant role in other organisms. At least, no reports on relevant quantities of 6-hydroxy- or 6-sulfatoxymelatonin exist to date for invertebrate animals, fungi, plants and algae. This difference is particularly remarkable in the case of plants, which express high numbers of different CYP isoforms that might be suspected to also catalyze 6-hydroxylation and O-demethylation of melatonin. Instead, another hydroxylation reaction at ring atom 2, which will be described in the subsequent section, has turned out to be the prevailing catabolic route and to be independent of CYPs. CYP-independent hydroxylations of melatonin are also possible by nonenzymatic processes, e.g., by interaction with free radicals, as will be discussed next.
3. Other Hydroxylation Mechanisms
Hydroxylations are possible by enzymatic and nonenzymatic reactions. From a fundamental point of view, nonenzymatic hydroxylation by free radicals should be possible in any aerobic species, in any cell type and also in the extracellular space. However, this does not imply by any means that the rates of these reactions are always more or less the same. In fact, the quantitative relevance of these reactions can strongly vary, depending on free radical-generating processes that may be strongly increased, in vertebrates, e.g., by inflammation or, in many organisms, at sites exposed to environmental stress, such as UV radiation, in phototroph species even by visible light that causes free radical formation in the photosystems.
Hydroxylation by interaction with free radicals, in particular, via consecutive reactions with two hydroxyl radicals (•OH), is in principle possible at any unsubstituted C-atom of the indole moiety. However, not all of the possible products are of equal biological or medical interest. Nonenzymatically formed 6-hydroxymelatonin represents only a very minor fraction relative to the CYP-generated quantities. 4-Hydroxymelatonin has recently attracted some attention, as it was reported to be an excellent radical scavenger with potency for eliminating peroxyl radicals higher than that of melatonin. Previously, 4-hydroxymelatonin was detected as a metabolite in keratinocytes, in which its formation was strongly enhanced by UV B.
Another compound of relevance is formed by hydroxylation at ring atom 3. This change causes an immediate intramolecular rearrangement that leads to the formation of a third ring, to give a metabolite denominated as cyclic 3-hydroxymelatonin [26,27] (Figure 2). This had first been discovered as a combination of hydrogen abstraction by a •OH and addition of a second one [26,27]. Its formation was increased by administration of melatonin and, strongly, by exposure to ionizing radiation. Cyclic 3-hydroxymelatonin was shown to also be a potent free radical scavenger and to be converted by two •OH to another key metabolite of melatonin, N1-acetyl-N2-formyl-5-methoxykynuramine (AFMK), which will be discussed in detail in a following section. Until recently, no enzymatic formation of cyclic 3-hydroxymelatonin was known. This gap has now been closed, but to date this information is restricted to bacteria. In Escherichia coli, three enzymes belonging to the 2-oxoglutarate-dependent dioxygenase (2-ODD) superfamily, 2-ODD 11 (most active form), 2-ODD 26, and 2-ODD 33, were shown to be capable of converting melatonin into cyclic 3-hydroxymelatonin. Rice plants carrying such a melatonin 3-hydroxylase (M3H) transgene produced the expected metabolite and also its secondary product, AFMK. Exposure to cadmium, which is known to increase melatonin levels in rice, also increased the apparent M3H activity. However, product specificities of the M3H subforms 2-ODD 11 and 2-ODD 33 were somewhat incomplete, since these isoenzymes hydroxylated melatonin also at ring atom 2, though at comparably low rates. At present, the demonstration of enzymatic 3-hydroxylation of melatonin is valid only for E. coli, but may soon be extended to other bacteria. As melatonin is present in bacteria and has also been detected in E. coli, a functional role of M3H enzymes may exist, but this remains to be demonstrated.
Concerning other organisms, the possibility of enzymatic 3-hydroxylation has not yet been tested. With regard to low basal quantities of cyclic 3-hydroxymelatonin, this may be rather unlikely in vertebrates, but many other major taxa have not been investigated in this regard.
Hydroxylation of melatonin at ring atom 2 has also been shown to exist both enzymatically and nonenzymatically. As with comparable hydroxylations at other ring atoms, this is also possible by consecutive interactions with two •OH [32,33]. These findings are in line with the observation that 2-hydroxymelatonin is a cutaneous photoproduct that is strongly increased by UV B, although certain amounts of this compound were also detected in non-irradiated keratinocytes. Moreover, formation of 2-hydroxymelatonin was reported to occur under the influence of HOCl and, later, of taurine chloramine, a reactive intermediate generated by activated neutrophils from HOCl and taurine. Therefore, activation of myeloperoxidase during local inflammatory responses seems to cause nonenzymatic 2-hydroxylation of melatonin.
2-Hydroxylation of melatonin by side reactions of enzymes with different main functions has occasionally been reported. For instance, 2-hydroxymelatonin has been found to represent an intermediate metabolite formed by cytochrome c in an AFMK-generating pathway, a result that might explain the formation of 2-hydroxymelatonin in non-irradiated keratinocytes. Horseradish peroxidase was also reported to hydroxylate melatonin at ring atom 2, although substantial rates were only observed at pH 5.5. Additionally, a dimer of 2-hydroxymelatonin was detected under these conditions.
In plants, a considerably higher physiological relevance exists for 2-hydroxylation by another 2-ODD that has turned out to be rather specific, exhibits relatively high activities and is responsible for the prevailing melatonin-catabolizing pathway in, at least, angiosperms, perhaps also in all plants (but not in all phototrophs). This enzyme has been denominated as melatonin 2-hydroxylase (M2H) and exists, in species studied in this regard, in multiple subforms. In rice, three subforms were present in the cytosol, whereas another one was expressed in chloroplasts. Studies on cloned M2H unequivocally showed that, in plants, the 2-hydroxymelatonin pathway was predominantly an enzymatic one. In quantitative terms, the amounts of metabolites formed in this route were remarkable and, for most experts, surprising. The products attained levels by orders of magnitude higher (average 368-fold) than those of the parent compound melatonin, at least in several plants studied.
Of course, this unexpected proportion requires explanation and raises the question of its biological significance. First, it is important to be aware of the tautomerism of a 2-hydroxylated indolic compound (Figure 2). In most literature, authors use the term “2-hydroxymelatonin”, when they mean, in fact, another compound, namely, its keto tautomer, 2-acetamidoethyl-5-methoxyindolin-2-one. This indolinone [33,40], sometimes also referred to as an oxindole, has long been known to be the prevailing tautomer relative to the enolic hydroxyindole and has been recently reported to represent almost 100% of the keto/enol mixture. Using the expression “2-hydroxymelatonin” for its indolinone tautomer is highly misleading, since it inappropriately seems to indicate properties of this molecule reminiscent of those known for melatonin. To discriminate the indolinone from melatonin and other hydroxylated melatonin derivatives, especially in terms of properties, it would be preferable to apply a more correct terminology. Instead of the relatively long chemical name, an abbreviation may be used, such as “AMIO” (Figure 2), as has also been done and is now customary with other melatonin metabolites such as AFMK.
The differences in properties mainly concern two aspects, those of reactivity and of lipophilicity. Contrary to melatonin and several of its hydroxylated derivatives, such as cyclic 3-hydroxymelatonin, 4-hydroxy- and 6-hydroxymelatonin, the metabolite AMIO is much less reactive and a relatively poor direct antioxidant. This means that AMIO will not be easily removed by oxidants, but the low reactivity may also disfavor a rapid enzymatic elimination. The removal of AMIO from biological material remains to be clarified. One possibility was detected in a dermatological context, which will be discussed below, but the quantitative relevance of this route in plants remains questionable, especially with regard to the extremely high amounts found there. Moreover, the change from a hydroxyindole to an indolinone leads to a substantial increase in lipophilicity, which can be easily seen in chromatograms. Therefore, it would be of interest to know how AMIO is distributed in plant cells that contain high amounts of this metabolite. In particular, the possibility should be tested that AMIO might be trapped in lipids, such as compartments with a high amount of membranes, e.g., chloroplasts, or in lipid droplets. The recently started and currently progressing studies on the functional role of AMIO may shed light on this problem. A study testing MAP kinase activation in the context of pathogen resistance revealed activation of MAPK3 and MAPK6 by AMIO, although to a smaller extent than observed with melatonin. At least, one would assume from this report that AMIO should be sufficiently available in the aqueous phase to act on other proteins. Another investigation showed that AMIO contributed to cold and drought resistance, upregulated respective transcription factors, caused increases in the osmoprotectant amino acid proline, and supported mitochondrial integrity under cold and drought stress.
These initial findings indicate that AMIO is, in fact, a bioactive compound derived from melatonin and contributes to overall effects of the parent compound.
4. The Deacetylation Pathway
Another catabolic route is initiated by deacetylation of melatonin to 5-methoxytryptamine (5-MT) (Figure 3). The existence of this pathway is remarkable insofar as 5-MT can also serve as a precursor of melatonin in the alternate pathway of melatonin synthesis that seems to prevail in various organisms outside the animals. This duality in the role of 5-MT was particularly evident in Saccharomyces. When starved yeast cells were supplied with exogenous melatonin, a large fraction was converted to 5-MT, but when these cells received 5-MT, they formed melatonin. Substantial amounts of 5-MT were also obtained when cells were supplemented with N-acetylserotonin.
Deacetylation of melatonin has been described for several enzymes. In older literature, such enzymes were generally referred to as aryl acylamidases (AAAs). Moreover, AAA side activities were described for acetylcholinesterase, also under the name of AAA-2, and butyrylcholinesterase [48,49,50], but their roles as melatonin-converting enzymes seem to be low or irrelevant. Other AAAs were detected in liver [51,52], brain and pineal gland. However, the brain enzyme did not catabolize melatonin, whereas this was demonstrated for the hepatic isoform [51,52] and, indirectly, for that from the pineal. 5-MT was also shown to appear in the blood after injection of melatonin. In eyes and pineal glands of the European mole (Talpa europaea) and in pineals of Syrian hamsters [55,56], 5-MT had been measured. However, safe determinations required the inhibition of 5-MT catabolism by blocking MAO A. Under such conditions, a circadian rhythm of 5-MT was detected in the hamster pineals [55,56]. This rhythm strongly differed from that of melatonin and was rather reminiscent of the serotonin rhythm. As 5-MT exerted several biological effects, especially in the reproductive system, it was for a while discussed as another pineal hormone.
Our understanding of 5-MT formation from melatonin, which had also been observed in the eyes of Xenopus laevis, was considerably improved by attributing it to a specific melatonin deacetylase. This was shown to also apply to findings on retina and/or pineal gland in other nonmammalian species, such as the teleost fish Carassius auratus, the lizards Anolis carolinensis and Sceloporus jarovi, and the chicken, Gallus domesticus [61,62]. In Xenopus