Data mining and bioinformatics of bacterial virulence factors
Bacterial virulence factors are moieties produced by bacterial pathogens that are essential for causing disease in a host. An important class of virulence factors is the bacterial exotoxins, which are secreted by viable pathogenic cells. Toxins play an important role in the strategies that pathogenic bacteria have developed to cause disease, since these proteins are responsible for the majority of symptoms and lesions during infection. These bacterial proteins are among the most potent toxins known. The majority are A/B or binary toxins, in which a receptor-binding component (B subunit) binds to the target membrane and delivers a second moiety (A subunit) directly into the cytoplasm. Bacterial exotoxins can be classified into two major groups according to the toxic program these proteins unleash in host cells: those that are
- cell surface active and
- intracellularly active.
One large group of A/B toxins is the mono-ADP-ribosyltransferase (mART) family. These toxins are enzymes that kill target eukaryotic cells by covalently modifying an essential protein within the host organism. In protein evolution generally, as primary sequences diverge, structures also diverge, with an exponential dependence. It is now recognized, however, that proteins with similar three-dimensional folds do not always possess similar amino acid sequences. The mART enzymes are a classic example: the family catalyzes the same enzymatic reaction (although the target nucleophiles and proteins differ), and the three-dimensional structure has been preserved even though the primary sequences are not related. These enzymes bind NAD+, facilitate scission of the glycosidic (C-N) bond between nicotinamide and the N-ribose of NAD+, and transfer the ADP-ribose group to a specific target protein.
In addition, this family of enzymes possesses NAD+ glycohydrolase (NADase) activity, but the physiological relevance of this activity is not known. The ADP-ribosyl transfer reaction, by contrast, usually has a dramatic effect on the function of the target protein. This modification occurs for the bacterial enzyme dinitrogenase reductase, resulting in regulation of the important process of biological nitrogen fixation. Diphtheria toxin modifies eukaryotic elongation factor 2 (eEF2), blocking its function on the ribosome and thereby inhibiting protein synthesis in the host cell. Cholera toxin acts on the Gαs protein, trapping it in its active conformation so that the signaling pathway is perpetually activated; this leads to several physiological responses in the host, including massive loss of body fluids and even death. Pertussis toxin also adds an ADP-ribose moiety to a signaling protein, a Gαi protein that inhibits adenylyl cyclase, closes K+ channels and opens Ca2+ channels. The effect is to lower the affinity of the G protein for GTP, trapping the protein in its inactive conformation, which adversely affects the pulmonary system of the host. mART activity has also been detected in a number of eukaryotic species; in mammals it is an important means of regulating key cellular events by modulating the biological activity of several proteins. Interestingly, it has been shown that RNA, DNA, and even antibiotics such as rifampicin can be ADP-ribosylated. A related family of endogenous eukaryotic enzymes, the PARPs, possesses poly-ADP-ribosyltransferase activity and has recently stimulated medical interest because of its role in cell death processes following pathological injury or noxious insult from conditions such as heart attack (ischemia) and stroke.
The first bacterial genome sequence (Haemophilus influenzae) was published in 1995, and today more than 200 microbial genome sequences are in the public domain. Alarmingly, it has been estimated that approximately 40% of these represent important human pathogens. Comparative in silico methods, along with large-scale approaches such as transcriptomics and proteomics, are beginning to reveal insights into new virulence genes, pathogen-host interactions, and the molecular basis of host specificity. The rapidly expanding number of sequenced bacterial genomes provides a solid basis for extensive searches for novel virulence factors. It is well established that mARTs are a class of enzymes important for the pathogenesis of the microorganisms that produce them. Several recent reports have focused on the in silico identification of novel members of this class of enzymes. In our data mining approach, an amino acid profile of each group of mARTs will be constructed from known consensus sequences and used in a pattern-based search strategy. We will use ScanProsite to screen the two resulting patterns against the bacterial genomes in which no putative mART has previously been discovered. The matching open reading frames (ORFs) from this pattern search will be subjected to secondary structure prediction with three different programs (PHD, HNN and PSIPRED), and the predictions from the three methods will be superimposed to give a consensus. Only ORFs that show good agreement with both the primary and secondary structure requirements in the region lining the mART active site will be considered putative mART enzymes.
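The two computational steps described above (a pattern-based scan of ORFs, then a consensus over three secondary structure predictions) can be illustrated with a minimal Python sketch. It is a stand-in for the logic only and does not call ScanProsite, PHD, HNN or PSIPRED. The function names, the toy pattern, the toy sequences and the majority-vote rule are all assumptions made for illustration; the toy pattern is not a real mART signature, and the translator handles only a small subset of PROSITE pattern syntax.

```python
import re
from collections import Counter


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE-style pattern into a Python regex.

    Supported subset only: single residues, x (any residue), [ABC] (allowed),
    {ABC} (forbidden) and repeat counts such as x(2) or x(2,4).
    """
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        if match is None:
            raise ValueError(f"empty element in pattern: {pattern!r}")
        core, low, high = match.groups()
        if core == "x":
            core = "."
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"
        # [ABC] and single letters are already valid regex fragments.
        if low and high:
            core += "{" + low + "," + high + "}"
        elif low:
            core += "{" + low + "}"
        parts.append(core)
    return "".join(parts)


def scan_orfs(orfs: dict[str, str], pattern: str) -> dict[str, list[tuple[int, int]]]:
    """Return (start, end) spans of non-overlapping pattern hits per ORF."""
    regex = re.compile(prosite_to_regex(pattern))
    hits = {}
    for name, sequence in orfs.items():
        spans = [(m.start(), m.end()) for m in regex.finditer(sequence)]
        if spans:
            hits[name] = spans
    return hits


def consensus_ss(predictions: list[str]) -> str:
    """Per-residue majority vote over equal-length H/E/C strings.

    Assumption: positions without a strict majority are reported as coil ('C').
    """
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("all predictions must have the same length")
    consensus = []
    for column in zip(*predictions):
        state, count = Counter(column).most_common(1)[0]
        consensus.append(state if count > len(predictions) / 2 else "C")
    return "".join(consensus)


if __name__ == "__main__":
    toy_pattern = "R-x(2)-[ST]-{P}-E"  # hypothetical, not a real mART motif
    toy_orfs = {"orf1": "MKRAASGEL", "orf2": "MKRAASPEL"}  # made-up sequences
    print(scan_orfs(toy_orfs, toy_pattern))
    # Three made-up predictions for the same five residues.
    print(consensus_ss(["HHHEC", "HHCEC", "HECEE"]))
```

In a real screen, the spans returned by the scan would be compared with the consensus string over the same residues to judge whether the predicted secondary structure fits the region lining the active site; how strict that agreement must be is a choice this sketch leaves open.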
The ORFs identified in the genome mining experiments will then be amplified by PCR from the corresponding bacterial genomic DNA and cloned into a pET E. coli expression vector, followed by overexpression and purification by affinity chromatography. The purified proteins will then be tested for activity against several known synthetic and natural mART substrates. Proteins exhibiting mART activity will receive a full kinetic evaluation according to our own protocols.
We are currently characterizing two members of the mART family, exotoxin A (ETA) and diphtheria toxin (DT). A recent high-resolution structure of ETA in complex with its target protein, eEF2, has provided the structural details needed to pursue the characterization of the substrate-binding/recognition loops within these enzymes.
Previously, Bazan and Koch-Nolte suggested, on the basis of sequence and structural links between distant members of the mART family, that the loop connecting beta strand 4 to beta strand 5 is likely responsible for protein substrate specificity. Furthermore, the active site of the newly identified members of the mART family will be studied by site-directed mutagenesis, along with other structural regions within these enzymes' active sites that are important for the transfer of the ADP-ribose of NAD+ to the incoming nucleophile on the target protein substrate.


