Global evolution and expression analysis of BTB-containing ankyrin repeat genes in plants

The ankyrin (ANK) repeat and bric-a-brac, tram-track, broad complex (BTB) domains, which are among the most common protein motifs in eukaryotes, regulate diverse developmental and biological processes in plants. In this study, 230 BTB-containing ANK (ANK-BTB) homologs were identified in plants and categorized into two groups (class I and class II). Phylogenetic and comparative analyses indicated that ANK-BTB genes originated in bryophytes and ferns and that their number expanded through segmental duplications. All of the selected ANK-BTB genes were expressed in two or more of the tested tissues, indicating that these genes are involved in various developmental processes in Arabidopsis. Furthermore, the ANK-BTB genes responded to abiotic stresses (NaCl, mannitol, heat and cold) and ABA treatment. To our knowledge, this study is the first report of a genome-wide analysis of ANK-BTB genes. It also provides valuable information for understanding the classification, evolution and putative functions of the gene family.


INTRODUCTION
The ankyrin (ANK) repeat domain is a common protein motif widely present in animals and plants [1]. ANK repeats consist of 33 residues repeated in tandem that form α-helices separated by loops [2]. These repeats were initially discovered in two yeast cell-cycle regulators, Swi6 and Cdc10, and in the Drosophila signaling protein Notch [3]. Several amino acids in the ANK motifs are conserved and correspond to hydrophobic positions required to maintain the secondary structure [2,4]. Binding within the ANK repeat is a common feature of inter- and/or intramolecular protein interactions [5,6].
In plants, ANK proteins are involved in various developmental and biological processes. AtAKR, the first reported ANK protein in Arabidopsis, is regulated by light and plays important roles in cell differentiation [7]. AtEMB506 is critical for embryonic development [8]; AtNPR1 is a key regulator of systemic acquired resistance against P. syringae [9]; AtACD6 is a regulator and an effector of the salicylic acid-mediated defense response [10]; AtITN1 affects the ABA-mediated production of reactive oxygen species, which is important for salt-stress tolerance [11]; XBAT32 targets ethylene biosynthetic enzymes for proteasomal degradation to maintain appropriate levels of ethylene [12]; and OsBIANK1 regulates the disease resistance response in rice [13].
The bric-a-brac, tram-track, broad complex (BTB) domain, also known as the POZ domain, is widely distributed in eukaryotes [14]. This domain is an evolutionarily conserved protein interaction motif containing approximately 100 amino acid residues that form four α-helices connected by β-folds [15,16]. The BTB domain is mainly present at the N-terminus of zinc finger and Kelch motif-containing proteins and is generally involved in homodimerization and heterodimerization [12,17-19]. BTB proteins have roles in diverse processes, including developmental programs, defense and abiotic stress responses [20-22]. For example, AtETO1, AtEOL1 and AtEOL2 regulate ethylene biosynthesis [23]; AtBT2 mediates multiple responses to nutrients, stresses and hormones [24]; and AtBOP1, which contains both the ANK and BTB domains, plays a key role in morphogenesis [25].
Previous studies carried out genome-wide identification and phylogenetic analysis of the BTB or ANK gene families in Arabidopsis and rice [16,26,27]. In the present study, the evolution and expression patterns of BTB-containing ANK genes (ANK-BTB) were analyzed in 41 genome-sequenced plant species to investigate their potential functions in plant development and abiotic stress tolerance.

MATERIALS AND METHODS
Identification of ANK-BTB genes in 41 plant species
Genome information for all 41 plant species was downloaded from the Phytozome database [28] and used to construct a stand-alone database. The stand-alone version of the Basic Local Alignment Search Tool (BLAST) [29], which is available from the National Center for Biotechnology Information (NCBI), was used with an E-value cutoff of 1e-3. The known Arabidopsis ANK-BTB genes were used as query sequences to search for similar sequences in the proteome and genome files.
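The E-value filtering step can be illustrated with a minimal Python sketch. It assumes that the BLAST results were saved in the standard 12-column tabular format (-outfmt 6), which the text does not state; the function name and the sequence identifiers in the example are hypothetical placeholders and not data from this study.

```python
import csv
import io

# Default column order of BLAST tabular output (-outfmt 6).
BLAST6_FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
                 "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

def candidate_hits(handle, max_evalue=1e-3):
    """Return the set of subject IDs with at least one hit at or below max_evalue."""
    hits = set()
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        record = dict(zip(BLAST6_FIELDS, row))
        if float(record["evalue"]) <= max_evalue:
            hits.add(record["sseqid"])
    return hits

# Hypothetical two-hit example: only protA passes the 1e-3 cutoff.
example = io.StringIO(
    "query1\tprotA\t45.0\t300\t150\t5\t1\t300\t10\t310\t1e-50\t200\n"
    "query1\tprotB\t25.0\t60\t40\t2\t1\t60\t5\t65\t0.5\t30\n"
)
print(sorted(candidate_hits(example)))
```

Candidates retained at this stage still need to be confirmed by domain analysis, because a permissive E-value cutoff also returns proteins that share only one of the two domains with the query.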
All of the protein sequences derived from the collected candidate genes were examined using the domain analysis programs Protein family (Pfam) [30] and Simple Modular Architecture Research Tool (SMART) [31], with the default cutoff parameters. We analyzed the domains of all of the peptide sequences using hidden Markov model (HMM) [32,33] searches against Pfam. We then obtained the sequences containing both the typical BTB domain (Pfam accession PF00651) and the ANK domain (PF12796, PF13857, PF13606 or PF13637) from the 41 genome sequences using a Perl-based script. All of the protein sequences were compared with a known peptide using ClustalX to verify the candidate genes [33].
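The selection of proteins carrying both domains can be sketched as follows. This is an illustrative Python version and not the Perl script used in the study; it assumes that the Pfam results have already been reduced to (protein ID, Pfam accession) pairs, and the protein IDs in the example are hypothetical.

```python
from collections import defaultdict

# Pfam accessions named in the text.
BTB_PFAM = {"PF00651"}
ANK_PFAM = {"PF12796", "PF13857", "PF13606", "PF13637"}

def ank_btb_proteins(domain_hits):
    """Return sorted IDs of proteins with at least one BTB and one ANK hit.

    domain_hits: iterable of (protein_id, pfam_accession) pairs.
    """
    by_protein = defaultdict(set)
    for protein_id, accession in domain_hits:
        # Drop an optional version suffix, e.g. "PF00651.22" -> "PF00651".
        by_protein[protein_id].add(accession.split(".")[0])
    return sorted(
        protein_id
        for protein_id, accessions in by_protein.items()
        if accessions & BTB_PFAM and accessions & ANK_PFAM
    )

# Hypothetical input: geneA has both domains, geneB and geneC have only one.
hits = [
    ("geneA", "PF00651.22"), ("geneA", "PF12796"),
    ("geneB", "PF00651"),
    ("geneC", "PF13637"),
]
print(ank_btb_proteins(hits))
```

Grouping by protein before testing the two accession sets means that the order and number of domain hits per protein do not affect the result.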

Chromosomal location and gene structure of the ANK-BTB genes in Arabidopsis
Annotations of gene locations were retrieved from the .gff file of the Arabidopsis genome and mapped to the chromosomes using the chromosome mapping tool [34]. The gene structure was generated with the Gene Structure Display Server (GSDS) [35].

Sequence alignment and phylogenetic analysis
The peptide sequences were aligned using the ClustalX program with BLOSUM30 as the protein weight matrix [33]. The multiple sequence comparison by log-expectation (MUSCLE) program (version 3.52) was used to confirm the ClustalX results [36]. Phylogenetic trees of the protein sequences were constructed with the neighbor-joining (NJ) method using the Molecular Evolutionary Genetics Analysis program (MEGA5) [37]. The reliability of the obtained trees was tested by bootstrapping with 1000 replicates. Phylogenetic and chromosomal location analyses were used to identify duplicated genes. The numbers of nonsynonymous substitutions per nonsynonymous site (Ka) and synonymous substitutions per synonymous site (Ks) were calculated with DnaSP [38,39].

Plant materials
Arabidopsis thaliana (Col-0) seeds were surface-sterilized and sown on Murashige and Skoog (MS) medium. The seeds were stratified at 4°C for 2 days prior to germination. The seedlings were grown on MS medium or soil under a long-day regime (16 h light/8 h dark cycle) at 23±1°C. All stress treatments were carried out using 2-week-old seedlings grown on MS medium. For the salt, osmotic and hormone treatments, whole seedlings were placed on filter paper soaked with 150 mM NaCl, 300 mM mannitol or 10 µM ABA for 3 or 12 h. For the heat and cold treatments, the seedlings were placed at 37°C or 4°C for 3 or 12 h.

RNA extraction and qRT-PCR analysis
Total RNA was isolated from different A. thaliana seedlings or tissues using the RNeasy plant mini kit (Qiagen, Germany) according to the manufacturer's instructions. Real-time PCR analyses were performed with SYBR® Premix Ex Taq™ (Takara) on the Bio-Rad CFX96 real-time PCR system. UBQ10 served as an internal control. The primers used for qRT-PCR analysis are presented in Table S1. All the experiments were repeated three times, and similar results were obtained.
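The text does not state how relative transcript levels were calculated from the threshold cycle (Ct) values. A minimal sketch, assuming the widely used 2^(-ΔΔCt) method with UBQ10 as the reference gene, is given below; the function name and the Ct values in the example are hypothetical and are not measurements from this study.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^(-ddCt) method (assumed here).

    Each argument is a threshold cycle (Ct) value; the reference gene
    (UBQ10 in the text) normalizes for differences in template amount.
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_treated - delta_control
    return 2.0 ** (-delta_delta)

# Hypothetical Ct values: the target crosses threshold two cycles earlier
# after treatment while the reference is unchanged, giving a 4-fold increase.
print(relative_expression(24.0, 18.0, 26.0, 18.0))
```

This calculation assumes amplification efficiencies close to 100% for both primer pairs; if efficiencies differ, an efficiency-corrected model would be needed instead.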

RESULTS
Identification of ANK-BTB genes in 41 plant species
Comprehensive bioinformatics analysis indicated the presence of 7128 genes with the ANK domain in 41 species from algae to angiosperms (Table S2). Meanwhile, 3220 genes with the BTB domain were identified by conserved domain searching (Table S3). Several BTB members were observed in Micromonas pusilla and Ostreococcus lucimarinus. We also used a Perl script to count the numbers and distributions of genes containing both the BTB and ANK domains. A total of 230 genes were found, distributed among 35 species ranging from mosses and ferns to cruciferous species, but not in single-celled algae. The highest number of genes was observed in Citrus clementina (12 members), whereas only one gene was found in Vitis vinifera. Seven members were obtained from Arabidopsis. To distinguish among the species, we provisionally named the Arabidopsis genes AthANK-BTB1 to AthANK-BTB7. The open reading frame (ORF) length, peptide length, genomic location and exon numbers are shown in Fig. 1 and Table S4.

Phylogenetic relationships and comparative analysis of ANK-BTB genes in 41 species
In this study, the ANK-BTB genes originated in bryophytes and ferns and were identified in 35 species of land plants. This result suggests that the ANK-BTB genes may be involved in the morphological characteristics of land plants and their adaptations for survival in certain environments (Fig. 1). To clarify the phylogenetic relationships among the ANK-BTB genes and infer the evolutionary history of the gene family, we used the full-length protein sequences of the family members in plants to construct a joint unrooted phylogenetic tree (Fig. 2). Based on the analysis of the tree, the proteins were categorized into two major groups (classes I and II) with well-supported bootstrap values. Class I was divided into three subgroups (Ia, Ib and Ic), which were confirmed by maximum likelihood (ML) trees built from the full-length and conserved domain sequences (Figs. S1 and S2). Classes I and II contained 199 and 31 members, respectively (Fig. 2).
In class Ia, 74 (84%) genes were obtained from eudicots and only 14 genes were found in monocots (Fig. S3). Within class Ia, 9 genes were obtained from Citrus clementina and 7 members were detected in Citrus sinensis, which implies a distinct expansion of subclass Ia in Citrus. Genes of classes Ib, Ic and II were distributed across a wide range of species, including eudicots, monocots, ferns and bryophytes. Except for several species (Medicago truncatula, Phaseolus vulgaris, Malus domestica and Vitis vinifera), all embryophytes contained only one class II gene, which suggests that this class is highly conserved in evolution (Table S5).

Structure classification of ANK-BTB genes in plants
The protein structure of each ANK-BTB gene was also analyzed with SMART, Pfam and Multiple EM for Motif Elicitation (MEME) (Fig. 3). Four known domains (ANK, BTB, DUF3420 and NPR1-C) were identified by SMART and Pfam (Fig. 3A). Class I contained only one BTB domain in the N-terminus, and two BTB domains were found in the C-terminus of class II. In classes Ia and Ib, the specific DUF3420 domain was located between the BTB and ANK domains. In classes Ia and Ic, the proteins contained the NPR1-C domain in the C-terminus, which contained two conserved sequence motifs, LENRV and DLN. As shown in Figs. 3B and S3, we identified 20 conserved motifs using MEME. Motif 17 was found only in classes Ia and Ib; motifs 9, 10, 12, 15 and 18 were in the C-terminus of classes Ia and Ic. Moreover, motifs 8, 13, 14, 19 and 20 were specifically identified in class II.
Finally, we proposed an evolution model of the conserved domains based on the evolutionary tree of the ANK-BTBs (Fig. 3C). The ANK or BTB domain was widespread in Viridiplantae. Genes with both ANK and BTB domains were first identified in bryophytes. Eventually, monocots and dicots acquired the specific class Ia-type ANK-BTB structures.

Chromosomal location and gene structure analysis of ANK-BTB genes in Arabidopsis
To investigate the relationships between genetic divergence within the AthANK-BTB family and gene duplication in Arabidopsis, we depicted the physical chromosomal location of each AthANK-BTB member (Fig. 4A). Three segmental duplication events among the seven genes were found in the Arabidopsis genome. All of these segmentally duplicated genes were observed as paralogs in the phylogenetic analysis. These results indicated that segmental duplications played important roles in AthANK-BTB expansion in the genome. To explore the selective constraints on the duplicated AthANK-BTB genes, we calculated the Ks and Ka/Ks ratios for each duplicated pair. A Ka/Ks ratio higher than 1 generally indicates accelerated evolution under positive selection, a ratio equal to 1 corresponds to neutral selection, and a ratio less than 1 indicates negative or purifying selection. The Ka/Ks ratios of all duplicated pairs were less than 1, thereby implying strong purifying selection (Table S6). These results suggested that the functions of the duplicated genes did not diverge over the course of genome evolution following the duplication events.
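The interpretation of the Ka/Ks ratio described above can be expressed as a short Python sketch. The function name, the tolerance used to treat a ratio as equal to 1, and the Ka and Ks values in the example are hypothetical; the actual values for the duplicated pairs are those reported in Table S6.

```python
import math

def selection_mode(ka, ks, tol=1e-9):
    """Return (Ka/Ks, label) following the thresholds described in the text.

    ratio > 1 -> positive selection, ratio = 1 -> neutral,
    ratio < 1 -> purifying (negative) selection.
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to compute Ka/Ks")
    ratio = ka / ks
    if math.isclose(ratio, 1.0, abs_tol=tol):
        label = "neutral"
    elif ratio > 1.0:
        label = "positive"
    else:
        label = "purifying"
    return ratio, label

# Hypothetical duplicated pair with far fewer nonsynonymous than
# synonymous substitutions per site.
ratio, label = selection_mode(ka=0.12, ks=0.60)
print(f"Ka/Ks = {ratio:.2f} ({label} selection)")
```

The check for a non-positive Ks is included because the ratio is undefined when no synonymous substitutions are observed, which can occur for very recent duplicates.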
Structural analyses provide valuable information concerning duplication events when interpreting phylogenetic relationships within gene families. In the AthANK-BTB family, the number of exons varied from two (AthANK-BTB2 and AthANK-BTB6) to five (AthANK-BTB1) (Fig. 4B). Additionally, the members of each duplicated pair shared a similar exon structure and length (Table S4).

Expression pattern of the ANK-BTB genes in different tissues
To investigate the spatial expression profiles of ANK-BTB genes in Arabidopsis developmental tissues, we analyzed their expression in root, rosette leaf, cauline leaf, stem, flower, silique and seed using quantitative RT-PCR (qRT-PCR). As shown in Fig. 5A, most of the ANK-BTB genes were detected at different transcript levels in all tissues. However, relatively high expression levels were observed in specific tissues, such as AthANK-BTB1 in seed, AthANK-BTB2 in root and flower, AthANK-BTB4 and AthANK-BTB5 in leaf and AthANK-BTB7 in stem. Furthermore, only two members showed tissue-restricted expression: AthANK-BTB3 was detected in the root and leaf and at extremely high levels in dry seed, whereas AthANK-BTB6 was detected in the root and seed and at a relatively higher level in the flower. These results suggested specific functions for AthANK-BTB3 and AthANK-BTB6 in seed storage and flower development, respectively.

Abiotic responsiveness of the ANK-BTB gene families
To explore the functional potential of AthANK-BTB genes under various environmental abiotic stresses, we examined their expression levels under different treatments, namely NaCl, mannitol, 37°C, 4°C and ABA, in two-week-old wild-type seedlings through qRT-PCR analysis (Fig. 5B). For temperature stress, AthANK-BTB1, AthANK-BTB2 and AthANK-BTB4 were expressed at 3 h of heat treatment and 12 h of cold treatment. Specifically, the mRNA level of AthANK-BTB3 was suppressed by heat treatment and low temperature and gradually declined over time. The expression of AthANK-BTB6 was rapidly inhibited at 3 h and significantly increased at 12 h by heat and cold treatments. Surprisingly, AthANK-BTB7 accumulated slightly at 3 h, and its expression was rapidly decreased by cold. In addition, the expression of AthANK-BTB3 was upregulated by mannitol and that of AthANK-BTB5 was downregulated by NaCl. Among all AthANK-BTB members, AthANK-BTB2, AthANK-BTB3, AthANK-BTB4 and AthANK-BTB6 were slightly suppressed by ABA. Thus, expression analysis indicated diverse roles of AthANK-BTB genes in responses to different stresses, especially in temperature signaling.

DISCUSSION
The BTB and ANK proteins play important roles in the development and stress resistance of animals and plants. In humans, the ZBTB1 protein, which contains the BTB domain, acts as a transcription repressor in the activation of CREB and the cAMP-mediated signal transduction pathway to regulate cellular physiology [40]. The BTB-type protein CIBZ is involved in spinal cord injury in mouse [41]; ZBTB20 functions as a molecular switch for a pathway that induces invariant pyramidal neuron morphogenesis and suppression of cell fate transitions in newborn neurons [42]. In Arabidopsis, BACH1 (BTB-type gene) target genes are involved in the oxidative stress response and the control of the cell cycle [43]. The BTB protein Keap1 is an adaptor participating in oxidative stress sensing [44]. Although the BTB domain is present in both higher flowering plants and lower single-celled plants, single-celled algae, particularly Micromonas pusilla and Ostreococcus lucimarinus, encode fewer BTB proteins. The large number of BTB members in vascular plants suggests that this domain plays important roles in these species.
ANK genes also have important functions in eukaryotes; for example, TRPA1 is upregulated in colitis and its activation exerts protective roles in humans [45]. Ankyrin-1 improves long-term beta-globin expression in hematopoietic stem cells for gene therapy of hemoglobinopathies in mouse [46]. In Arabidopsis, AtAKR2 regulates antioxidant metabolism during disease resistance and stress responses [47]. ACD6 can control defense responses against virulent bacteria [10,48]. NPR1, a positive regulator of acquired resistance responses, is a central activator of SA-regulated gene expression [9]. In rice, XB3 plays roles in Xa21-mediated immunity [49-51]. CaKR1, a pepper ANK gene, plays roles in both biotic and abiotic stress responses [52]. Evolutionary analysis showed that the ANK domain was widespread from algae to dicots, with large and conserved numbers of members. Therefore, the ANK domain, which has important functions in protein interactions, has remained highly conserved during plant evolution. Phylogenetic analysis of the genes containing both ANK and BTB domains showed that ANK-BTB genes originated in mosses and ferns; these homologs were divided into two groups, namely classes I and II (Figs. 2 and 3C). Class I-type ankyrin proteins contained two ANK domains at the C-terminus and one BTB domain at the N-terminus (Fig. 3A). Further analysis of the Arabidopsis ANK-BTB members showed a collinear relationship among three pairs of class I genes, which suggested that large fragment duplication contributed to the expansion of class I-type genes (Fig. 4A). The Ka/Ks ratios of the duplicated pairs were all below 1, which indicated that the class I ANK-BTB genes evolved under strong purifying selection (Table S6).
AthANK-BTB genes, except AthANK-BTB3 and AthANK-BTB6, were ubiquitously expressed in all the examined tissues, including roots, rosette leaves, stems, cauline leaves, floral organs, siliques and seeds. AthANK-BTB3 was mostly expressed in seeds, and AthANK-BTB6 in floral organs (Fig. 5A). Hence, we suppose that AthANK-BTB3 may play a role in seed storage and that AthANK-BTB6 may function in flower development. The other five AthANK-BTBs may play important roles throughout the life cycle of Arabidopsis and may be involved in plant growth and development processes. The first characterized ANK protein in plants was AKR, which is associated with the regulation of chloroplast differentiation [53]. The AKRP interacting partner protein EMB506 also consists of five ANK repeats, and it is essential for plant organogenesis and morphogenesis during developmental stages [54]. Several studies have elucidated the function of ANK proteins during plant growth and development [55-58], but few of these proteins have been shown to play a regulatory role under stress conditions [59]. A comprehensive investigation of the expression of Arabidopsis ANK-BTB genes under stress implicated them in abiotic stress responses (Fig. 5B). Remarkably, most of the AthANK-BTB members were regulated by heat or cold treatment, thereby suggesting a possible involvement in temperature signaling. AthANK-BTB2, AthANK-BTB3, AthANK-BTB4 and AthANK-BTB6 were regulated by ABA, so they may be involved in the ABA signaling pathway. AthANK-BTB3 and AthANK-BTB5 were upregulated by mannitol after 3 h of treatment and subsequently downregulated, returning toward the control level after 12 h of treatment. Thus, we predict that these genes may be involved in osmotic stress adaptation. AKT1 contains five ANK repeats toward its C-terminus and plays a significant role in root K+ uptake [60-64]. The akt1 mutants display ABA hypersensitivity and enhanced drought tolerance [65]. The ANK protein OXIDATIVE STRESS 2 plays a role under oxidative stress conditions [66]. In humans, ANK and BTB domain-containing protein 2 inhibits the aggregation of α-synuclein and plays a role in Parkinson's disease [67]. BPOZ is involved in the development of leukemia through protein-protein interaction [68]. Thus, we propose that the Arabidopsis ANK-BTB genes may have important functions in developmental and biological processes.

Fig. 1. The phylogenetic relationships of plants with completely sequenced genomes. The number in parentheses corresponds to the number of ANK-BTB genes in each species.

Fig. 2. Phylogenetic relationship of ANK-BTB in plants. The phylogenetic tree was constructed based on a complete protein sequence alignment of ANK-BTB by the neighbor-joining method with bootstrapping analysis (1000 replicates). The subgroups are marked by a colored background.

Fig. 3. Conserved domain analysis of the ANK-BTB proteins in plants. A - Pfam conserved domains of the four types of ANK-BTB proteins. B - MEME motif model of the four types of ANK-BTB proteins. C - The evolution model of the ANK and BTB domains in plants.

Fig. 4. The genomic location and gene structure of ANK-BTBs. A - The genomic location of ANK-BTB genes in Arabidopsis. Sister paralogous pairs are indicated by a red line. B - The gene structure of Arabidopsis ANK-BTBs. Introns and exons are represented by black lines and green boxes, respectively.