PROTEIN SEQUENCE ANALYSIS OF CONTAGIOUS CAPRINE PLEUROPNEUMONIA

A total of twenty (20) contagious caprine pleuropneumonia (CCPP) proteins were retrieved from GenBank (www.ncbi.nlm.nih.gov). The protein sequences were used to investigate the molecular identity of the various CCPP proteins. The physico-chemical properties of the CCPP proteins were computed using the ProtParam tool: isoelectric point (pI), molecular weight (MW), extinction coefficient (EC), instability index (II), aliphatic index (AI) and grand average of hydropathicity (GRAVY). The pI values showed that some of the CCPP proteins are acidic and others basic in nature. The EC and II values indicate that most of the CCPP proteins are stable, which is interpreted as resistance to mutation and thermal stability. The GRAVY values were positive for some proteins and negative for others. A positive value indicates a hydrophobic protein (poorly soluble in water), while a negative value indicates a hydrophilic protein (soluble in water). The amino acid composition indicates that the CCPP proteins are rich in isoleucine, leucine and lysine. The three-dimensional (3D) structures of the CCPP proteins were determined using the Phyre2 server. The amino acid sequences were subjected to secondary structure prediction using ExPASy's SOPMA tool, which showed that the proteins are predominantly alpha-helical. The genetic information emanating from this study may bring insight into mutagenesis and pharmacogenetics.


Introduction
Contagious caprine pleuropneumonia (CCPP) is a devastating disease of goats included in the list of notifiable diseases of the World Organization for Animal Health (OIE). The first description of the disease dates back to 1873, in Algeria (Thomas, 1873). CCPP is a contagious disease that occurs in peracute, acute or chronic forms and is characterized by fibrinous pneumonia, pleurisy and profuse pleural exudates (Edelsten et al., 1990). Mortality rates of 60-100% are common (Edelsten et al., 1990). The disease is reported to occur in many countries in West and Eastern Africa and in Pakistan and India (OIE, 2001). The infectious agent, Mycoplasma capricolum subspecies capripneumoniae, formerly known as the F38-like group, is difficult to isolate and has only been identified in a few of the countries where the disease has been reported (Bolske et al., 1995a).

Materials and Methods
A total of twenty (20) CCPP proteins of goat were retrieved from GenBank (www.ncbi.nlm.nih.gov). The GenBank accession numbers of the sequences and the sequence variations are shown in Table 1. The ProtParam tool was used to compute various physical and chemical properties of the CCPP proteins from their amino acid sequences. The computed parameters were molecular weight, theoretical isoelectric point (pI), amino acid composition, extinction coefficient, estimated half-life, instability index, aliphatic index and grand average of hydropathicity (GRAVY) (Gasteiger, 2005). The amino acid sequences of the CCPP proteins were subjected to secondary structure prediction using ExPASy's SOPMA tool, which correctly predicts 69.5% of amino acids for a three-state description of the secondary structure (alpha helix, beta sheet and coil). The Phyre2 server was used to predict the 3D structure of the CCPP proteins. This server predicts the three-dimensional structure of a protein sequence using the principles and techniques of homology modeling (Kelley and Sternberg, 2009). Currently, the most powerful and accurate methods for detecting and aligning remotely related sequences rely on profiles or Hidden Markov Models (HMMs). 3DLigandSite, which is coupled to Phyre2 for protein binding site prediction (Wass et al., 2010), was used to predict the binding sites of the 3D structures of the CCPP proteins.
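Three of the parameters listed above follow from simple published formulas, and the sketch below illustrates how they can be reproduced from a sequence. It is a minimal illustration, not the ProtParam implementation: the function names are hypothetical, the hydropathy values are those of the Kyte-Doolittle scale, the aliphatic index uses the usual coefficients (2.9 for valine, 3.9 for isoleucine and leucine), and the extinction coefficient uses the commonly quoted molar values at 280 nm (5500 for tryptophan, 1490 for tyrosine, 125 per cystine). The sequences in the usage note are made up and are not CCPP proteins.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq: str) -> float:
    """Grand average of hydropathicity: mean hydropathy over all residues.

    Raises KeyError for non-standard residue codes (e.g. X, B, Z).
    """
    seq = seq.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq: str) -> float:
    """Aliphatic index from the mole percentages of Ala, Val, Ile and Leu."""
    seq = seq.upper()

    def mole_percent(aa: str) -> float:
        return 100.0 * seq.count(aa) / len(seq)

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


def extinction_coefficient(seq: str, cystines: int = 0) -> int:
    """Molar extinction coefficient at 280 nm (M^-1 cm^-1).

    `cystines` is the assumed number of disulfide-bonded cysteine pairs;
    the default of 0 treats every cysteine as reduced.
    """
    seq = seq.upper()
    return 5500 * seq.count("W") + 1490 * seq.count("Y") + 125 * cystines


if __name__ == "__main__":
    peptide = "MKKLLIAVWYG"  # hypothetical peptide, for illustration only
    print(gravy(peptide), aliphatic_index(peptide), extinction_coefficient(peptide))
```

With these definitions a sequence containing six tyrosines and no tryptophan or cystine gives an extinction coefficient of 8940, which shows how strongly the value depends on the aromatic residue count.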

Results
The physico-chemical characteristics of the CCPP proteins predicted by ProtParam are shown in Table 2. The computed isoelectric point (pI) values revealed that Phosphoglycerate kinase, Glycyl-tRNA synthetase, ATP-dependent protease La, GTP-binding protein, tRNA modification GTPase, Lysine-tRNA ligase and Chromosome segregation ATPase are acidic (pI<7), while the rest are basic in nature (pI>7). The net charge of the CCPP proteins revealed that only Phosphoglycerate kinase is neutral (no charge). Glycyl-tRNA synthetase, ATP-dependent protease La, GTP-binding protein, tRNA modification GTPase, Lysine-tRNA ligase and Chromosome segregation ATPase are negatively (-) charged, while the rest of the proteins are positively (+) charged. The extinction coefficient of a protein at 280 nm depends almost exclusively on the number of aromatic residues, particularly tryptophan (Gill and Von-Hippel, 1989). Among the CCPP proteins, the extinction coefficient at 280 nm was lowest for Signal recognition particle protein (8940) and highest for Prolipoprotein diacylglyceryl transferase (Table 2). The half-life of a protein is the time it takes for half of the amount of the protein in a cell to disappear after its synthesis. In this study the estimated half-life of all the CCPP proteins is 30 hours. The instability index provides an estimate of the stability of a protein in a test tube. A protein whose instability index is smaller than 40 is predicted to be stable, while a value above 40 predicts that the protein will be unstable (Guruprasad et al., 1990). The results showed that ATP-dependent protease La, Excinuclease ABC subunit B, Prolipoprotein diacylglyceryl transferase and Cell division protein FtsY have values >40, while the rest of the proteins have values <40. The aliphatic index (AI) of a protein is defined as the relative volume occupied by aliphatic side chains (alanine, valine, isoleucine and leucine). Amino acid permease, DNA primase, GTP-binding protein, tRNA modification GTPase, PTS system-IIBC component, Hypothetical protein Mccp 3340, Dihydrofolate-folylpolyglutamate synthase and Prolipoprotein diacylglyceryl transferase have AI>100, while the rest of the CCPP proteins have AI<100. The grand average of hydropathicity (GRAVY) was positive for Amino acid permease, tRNA modification GTPase and PTS system-IIBC component, and negative for the rest of the CCPP proteins.

The predicted secondary structure of the CCPP proteins is shown in Table 3. Signal recognition particle protein showed the highest alpha helix content (53.91%) and Chaperone protein DnaJ the lowest (26.88%). For the extended strand, Dihydrofolate-folylpolyglutamate synthase gave the highest value (28.18%) and Signal recognition particle protein the lowest (13.87%). For the beta turn, Chaperone protein DnaJ gave the highest value (14.25%) and Chromosome segregation ATPase the lowest (5.67%). For the random coil, Chaperone protein DnaJ gave the highest value (35.22%) and Chromosome segregation ATPase the lowest (20.04%). In all the CCPP proteins, the alpha helix was the most abundant structural element.

The amino acid composition (percentage) of the CCPP proteins is shown in Table 4. All the CCPP proteins used in this study have a similar amino acid composition, with higher percentages of isoleucine, leucine and lysine. Isoleucine and leucine are aliphatic amino acids, and lysine is a polar amino acid. None of the CCPP proteins contains selenocysteine or pyrrolysine.
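The cut-offs applied in these results (pI of 7, instability index of 40, aliphatic index of 100, sign of GRAVY) can be written down as a small rule set. The sketch below is only an illustration of those rules; the function name and the example values are hypothetical and are not taken from Table 2. Treating an instability index of exactly 40 as unstable is an assumption, since the published criterion only covers values below and above 40.

```python
def classify_protein(pi: float, instability: float,
                     aliphatic: float, gravy: float) -> dict:
    """Apply the threshold rules used in the Results to one protein."""
    if pi < 7:
        nature = "acidic"
    elif pi > 7:
        nature = "basic"
    else:
        nature = "neutral"

    if gravy > 0:
        hydropathy = "hydrophobic"   # positive GRAVY
    elif gravy < 0:
        hydropathy = "hydrophilic"   # negative GRAVY
    else:
        hydropathy = "neutral"

    return {
        "nature": nature,
        # Assumption: a value of exactly 40 is reported as unstable.
        "stability": "stable" if instability < 40 else "unstable",
        "aliphatic_index_above_100": aliphatic > 100,
        "hydropathy": hydropathy,
    }


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(classify_protein(pi=5.6, instability=33.2, aliphatic=104.5, gravy=-0.21))
```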

Discussion
CCPP is a disease notifiable to the World Organization for Animal Health (OIE) because it has a major impact on livestock production and a potential for rapid spread across national borders. As a result, CCPP-infected countries are excluded from international trade. At present, the disease causes vast problems in Africa with severe socio-economic consequences.

The computed isoelectric points (pI) of the CCPP proteins will be useful for developing buffer systems for purification by isoelectric focusing. The isoelectric point is of significance in protein purification because it is the pH at which solubility is generally minimal and at which mobility in an electrofocusing system is zero, and therefore the point at which the protein will accumulate (Fennema, 2008). The extinction coefficient of a protein at 280 nm depends almost exclusively on the number of aromatic residues, particularly tryptophan (Gill and Von-Hippel, 1989). This indicates that the higher the EC value of a CCPP protein, the higher its number of aromatic residues (Gasteiger, 2003; Munduganore et al., 2012). In particular, hydrophobic amino acids can be involved in the binding or recognition of hydrophobic ligands such as lipids (Betts et al., 2003). None of the CCPP proteins contains selenocysteine or pyrrolysine, the two residues encoded by codons that are otherwise interpreted as stop codons (Suchanek et al., 2005).

Many important biological processes such as cell signaling, transport of membrane-impermeable molecules, cell-cell communication, cell recognition and cell adhesion are mediated by membrane proteins (Jones, 2007). Although there has been some recent progress in predicting the full 3D structure of transmembrane proteins (e.g. Yarov-Yarovoy et al., 2006), the most widely applied prediction technique for these proteins is to determine the transmembrane topology, i.e. the inside-outside location of the N and C termini relative to the cytoplasm, along with the number and sequence locations of the membrane-spanning regions. This will facilitate the understanding of the structure and function of CCPP proteins. Determining the structure and function of a novel protein is a cornerstone of many aspects of modern biology. The accuracy of protein structure prediction depends critically on the sequence similarity between the query and the template, as observed in the present study. If a template is detected with >30% sequence identity to the query, then usually most or all of the alignment will be accurate and the resulting relative positions of structural elements in the model will be reliable (Kelley et al., 2015). The practical applications of CCPP protein structure prediction include guiding the development of functional hypotheses about hypothetical proteins, improving phasing signals in crystallography and selecting sites for mutagenesis (Qian et al., 2007; Rava and Hussain, 2007).
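The 30% sequence identity criterion mentioned above can be made concrete with a short calculation on a pairwise alignment. The sketch below is an assumption-laden illustration rather than the measure reported by Phyre2: the function name is hypothetical, both inputs are assumed to be already aligned to equal length with "-" marking gaps, and identity is taken as the number of matching columns divided by the number of columns in which neither sequence has a gap. Other definitions use a different denominator and give different percentages.

```python
def percent_identity(aligned_query: str, aligned_template: str) -> float:
    """Percent identity over the gap-free columns of a pairwise alignment."""
    if len(aligned_query) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")

    matches = 0
    compared = 0
    for q, t in zip(aligned_query.upper(), aligned_template.upper()):
        if q == "-" or t == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if q == t:
            matches += 1

    if compared == 0:
        raise ValueError("alignment contains no gap-free columns")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical aligned fragments, for illustration only.
    identity = percent_identity("ACDE-F", "ACDQGF")
    print(identity, identity > 30.0)
```

In this example four of the five gap-free columns match, so the identity is 80% and the template would pass the 30% criterion.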

Conclusion
The physico-chemical properties, amino acid composition and secondary structure of the CCPP proteins indicated physical, chemical and thermal stability of the protein molecules. This suggests that the proteins are resistant to mutation and can withstand a wide range of temperatures. The genetic data from this study will bring new insights into epidemiological questions. Molecular typing has been instrumental in determining the population structure and evolution of pathogens. Since CCPP has both economic and nutritional consequences, efforts should be intensified towards finding sustainable genomic solutions to this deadly disease, which continues to ravage the livestock industry. New typing tools may help to improve the surveillance and control of the disease, as well as to trace new epidemics.