Protein-protein interactions play a pivotal role in the regulation of various cellular processes. The formation of higher-order protein complexes is frequently accompanied by extensive structural remodeling of the individual components, varying from domain re-orientation to induced folding of unstructured elements. Nuclear Magnetic Resonance (NMR) spectroscopy is a powerful tool for macromolecular structure determination in solution. It has the unique advantage of being capable of elucidating the dynamic behavior of proteins during the process of recognition. Recent advances in NMR techniques have enabled the study of significantly larger proteins and protein complexes. These innovations have also led to faster and more accurate structure determination. My research interests focus on the exploration of molecular recognition and conformational variability of protein complexes in crucial biomedical processes. To achieve this, new NMR methodology must also be developed.
A crucial step in determining solution structures of proteins using NMR spectroscopy is sequential assignment, which correlates backbone resonances with the corresponding residues in the primary sequence of a protein. Today this is typically done using data from triple-resonance NMR experiments. We have developed a novel computer-assisted method for sequential assignment, using an algorithm that conducts an exhaustive search of all spin systems, first to establish sequential connectivities and then to make the assignment. By running the program iteratively with user intervention after each cycle, ambiguities in the assignments can be eliminated efficiently and backbone resonances can be assigned rapidly. The efficiency and robustness of this approach have been tested with 27 proteins ranging in size from 76 to 723 amino acids, and with data of varying quality. We further examine how the complexity of sequential assignment depends on the size of the protein, the completeness of the NMR data sets, and the uncertainty in resonance positions.
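The connectivity-search step can be illustrated with a toy example. The sketch below is not the program described above; it only shows the general idea of exhaustively comparing every pair of spin systems. The `SpinSystem` class, the use of a single CA shift pair, the 0.2 ppm tolerance, and the shift values are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpinSystem:
    """Hypothetical spin system holding two CA shifts (ppm)."""
    name: str
    ca_own: float   # CA shift of the residue itself (i)
    ca_prev: float  # CA shift of the preceding residue (i-1)


def find_connectivities(systems, tol=0.2):
    """Exhaustively test every ordered pair of spin systems.

    A pair (a, b) is reported when the intra-residue CA shift of `a`
    matches the preceding-residue CA shift of `b` within `tol` ppm,
    i.e. `a` is a candidate predecessor of `b`.
    """
    links = []
    for a in systems:
        for b in systems:
            if a is not b and abs(a.ca_own - b.ca_prev) <= tol:
                links.append((a.name, b.name))
    return links


# Made-up shifts for three spin systems.
systems = [
    SpinSystem("s1", ca_own=58.10, ca_prev=55.00),
    SpinSystem("s2", ca_own=62.40, ca_prev=58.15),
    SpinSystem("s3", ca_own=54.30, ca_prev=62.35),
]
links = find_connectivities(systems)
```

With real data, several candidates usually match within the tolerance, which is where additional nuclei and iterative user intervention help to remove ambiguities.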
Studies of biological macromolecules by NMR spectroscopy rely upon multidimensional, multinuclear experiments to separate the large number of resonances present, and to provide correlations between these resonances to aid in their assignment. Reconstructing multidimensional NMR spectra from 2-D projections significantly reduces the time needed for data collection over conventional methodology. Kupče and Freeman have recently proposed and demonstrated that one could reconstruct full 3-D spectra and 4-D spectra by collecting a small subset of 2-D spectra using the lower-value algorithm. Here, we provide a generalization of the projection-reconstruction process to spectra of arbitrary dimensionality, using a concept of coordinate rotation to produce explicit expressions for reconstruction. These expressions allow one to reconstruct subsets of the higher-dimensionality space without producing the full spectrum, permitting convenient analysis of the data. We demonstrate the effectiveness of these procedures in the reconstruction of the 5-D HACACONH spectrum of protein G B1 domain, from twelve 2-D projections collected in five experiments. We further demonstrate that the base spectra of GFT-NMR are equivalent to projections of the 5-D spectrum at fixed angles.
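The lower-value idea can be shown on a toy problem. The sketch below is a simplified analog, not the procedure used in the work above: it reconstructs a small 2-D grid from 1-D projections rather than a higher-dimensional spectrum from 2-D projections, it uses only one tilted projection at 45 degrees so that no interpolation is needed, and it assumes non-negative intensities (so the plain minimum is used instead of the lowest absolute value). The function names and the test spectrum are hypothetical.

```python
def project(spectrum):
    """Return the 0-degree, 90-degree and 45-degree projections of a square grid."""
    n = len(spectrum)
    rows = [sum(spectrum[i]) for i in range(n)]
    cols = [sum(spectrum[i][j] for i in range(n)) for j in range(n)]
    diag = [0.0] * (2 * n - 1)  # sums along anti-diagonals (i + j constant)
    for i in range(n):
        for j in range(n):
            diag[i + j] += spectrum[i][j]
    return rows, cols, diag


def lower_value_reconstruct(rows, cols, diag=None):
    """Back-project each projection and keep the lowest value at every point."""
    n = len(rows)
    recon = []
    for i in range(n):
        line = []
        for j in range(n):
            candidates = [rows[i], cols[j]]
            if diag is not None:
                candidates.append(diag[i + j])
            line.append(min(candidates))
        recon.append(line)
    return recon


# Toy spectrum with two peaks of different intensity.
spectrum = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [2.0, 0.0, 0.0, 0.0],
]
rows, cols, diag = project(spectrum)
two_views = lower_value_reconstruct(rows, cols)          # orthogonal views only
three_views = lower_value_reconstruct(rows, cols, diag)  # plus the tilted view
```

In this example the two orthogonal projections alone leave false peaks at the crossing points of the back-projected ridges. Adding the tilted projection removes them, because a true peak must be present in every view.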
Similar concepts can also be used for reconstruction of sidechain experiments. However, for large proteins, the lack of signal accumulation in the lower-value algorithm can be an issue. We further developed a hybrid algorithm that enables partial signal accumulation and have demonstrated it on the sequential assignment of 30 kDa proteins.
Recently, we applied the Filtered Backprojection (FBP) method to solution NMR and obtained the first quantitative reconstruction of high-resolution 4D CH3-NH NOESY spectra within 88 hrs. This approach was later shown to be the first Fourier Transform of radially-sampled time domain data.
We showed that the Filtered Backprojection method is equivalent to a Fourier Transform in polar coordinates, and that the associated artifacts can be separated into the familiar truncation artifacts and aliasing artifacts.
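The equivalence can be sketched by changing variables in the 2-D Fourier integral. The notation here is an assumption for illustration: s is the time-domain signal (idealized as defined over the whole plane), S is the spectrum, and the time-domain point is written as t1 = r cos θ, t2 = r sin θ.

```latex
S(\omega_1,\omega_2)
  = \iint s(t_1,t_2)\, e^{-i(\omega_1 t_1 + \omega_2 t_2)}\, dt_1\, dt_2
  = \int_{0}^{\pi} \left[ \int_{-\infty}^{\infty}
      s(r\cos\theta,\, r\sin\theta)\, |r|\,
      e^{-i r (\omega_1\cos\theta + \omega_2\sin\theta)}\, dr \right] d\theta
```

The inner integral is a 1-D Fourier transform along one radial line, weighted by the Jacobian |r|, which plays the role of the ramp filter. The outer integral over θ is the backprojection step. In this picture, cutting off the radial integral at a finite maximum r gives truncation artifacts, and replacing the angular integral by a sum over a finite set of angles gives aliasing artifacts.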
Based on the Fourier Transform of individual rings, we proposed a new sampling scheme, concentric ring sampling, that shows improved signal-to-aliasing artifact ratios and can be readily extended to higher dimensions.
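As a rough illustration of what sampling on concentric rings means, the sketch below generates time-domain sampling points (t1, t2) that lie on rings of constant radius. It is a generic polar grid, not the published scheme: the restriction to one quadrant, the equal number of points on every ring, the even angular spacing, and all names and parameter values are assumptions.

```python
import math


def concentric_ring_points(n_rings, ring_spacing, points_per_ring):
    """Return (t1, t2) points on rings of radius k * ring_spacing, k = 1..n_rings.

    Angles are spaced evenly from 0 to 90 degrees (assumed here), so each
    ring is sampled within a single quadrant of the (t1, t2) plane.
    """
    if points_per_ring < 2:
        raise ValueError("points_per_ring must be at least 2")
    points = []
    for k in range(1, n_rings + 1):
        radius = k * ring_spacing
        for m in range(points_per_ring):
            theta = (math.pi / 2) * m / (points_per_ring - 1)
            points.append((radius * math.cos(theta), radius * math.sin(theta)))
    return points


# Hypothetical settings: 4 rings, 1 ms apart, 5 points on each ring.
points = concentric_ring_points(n_rings=4, ring_spacing=0.001, points_per_ring=5)
```

In contrast to radial sampling, where points are organized along spokes at fixed angles, this layout organizes them by radius, so the angular positions on each ring can in principle be chosen independently.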
In conjunction with the development of fast NMR algorithms, we are developing optimized pulse sequences for fast NMR and novel pulse sequences for large proteins. For example, the "Just-in-time" TROSY HNCACO experiment shows great improvement in providing complete CO resonance information regardless of residue type. Additionally, we are developing software tools for evaluating the quality of NMR structures.
LpxC, the zinc-dependent UDP-3-O-(acyl)-N-acetylglucosamine deacetylase, catalyzes the committed step in the biosynthesis of lipid A, an amphiphilic lipid that constitutes the outermost monolayer of Gram-negative bacteria. A minimal structure of lipid A with two sugar-like moieties (Kdo2-lipid A) is essential for the viability of Gram-negative bacteria, providing them with crucial protection from external agents. Enzymes involved in lipid A biosynthesis are strictly conserved, have only been identified in Gram-negative species, and not surprisingly, have become attractive targets for the design of novel antibiotics. Among these enzymes, LpxC plays a central role in regulating lipid A synthesis.
Intriguingly, LpxCs from different Gram-negative organisms possess distinct ligand specificities: L-161,240, one of the most potent inhibitors of the E. coli LpxC, is 100-fold less active against the LpxC from P. aeruginosa and completely inactive against LpxCs from other Gram-negative organisms, such as A. aeolicus. We hypothesize that protein dynamics close to the active site, or conformational changes induced by ligand binding, may explain the different behaviors of various LpxCs in response to distinct small molecule inhibitors, despite the conserved mechanism of LpxC catalysis. It is thus an ideal system to study the mobility and dynamics during the process of molecular recognition. Recently, we have determined the solution structures of Aquifex aeolicus LpxC (AaLpxC) in complex with a substrate analog inhibitor TU-514 and in complex with CHIR-090, a slow-binding, time-dependent inhibitor.
The C-terminal domain (CTD) of RNA polymerase II (RNAPII) plays a pivotal role in orchestrating RNA processing and other co-transcriptional events to achieve proper gene expression. It consists of multiple heptad repeats (Y1S2P3T4S5P6S7) that are highly conserved from yeast to human. The predominant form of CTD modification is the phosphorylation of Ser2 and/or Ser5 within the heptad repeats. The level and pattern of CTD phosphorylation are regulated by the concerted action of CTD kinases and phosphatases during the transcription cycle. The vast number of CTD phosphorylation states, also known as the “CTD code,” forms the basis for recruitment of specific macromolecular complexes to the transcribing polymerase. Recent identification of novel phosphoCTD-associating proteins has expanded the known functions of the CTD from mediating RNA processing to coordinating other co-transcriptional events, such as chromatin remodeling. Compared with the rapid progress in CTD biology, structural knowledge of CTD recognition by CTD-associating proteins and CTD-modifying enzymes is very limited. Our long-term goal is to understand the structures and mechanisms of CTD-modifying enzymes and CTD-mediated assembly of co-transcriptional complexes. As a first step, we have recently determined the solution structure of the human Set2-Rpb1-Interacting (hSRI) domain and have identified five important residues that affect phosphoCTD binding.
Post-translational modification by ubiquitin plays an important role in many cellular processes such as transcription, translation, DNA repair, virus budding, protein re-localization, protein degradation by the 26S proteasome, and cell cycle progression. For ubiquitin tags to signal in divergent pathways, they must be specifically recognized by distinct ubiquitin-binding domains that transmit the proper signals. To date, more than 16 distinct motifs have been identified as ubiquitin-binding domains, yet the molecular details of ubiquitin recognition by many of these domains remain to be elucidated. Recently, we have determined the solution structure of the UBZ domain of the Y-family DNA polymerase eta and proposed a model for its ubiquitin recognition based on NMR titration and spin-labeling experiments.
We are also studying ubiquitin-binding domains that recognize ubiquitin through novel binding areas. For example, our recent structural studies on the BUZ domain of Ubp-M revealed a novel zinc-finger architecture of the BUZ domain and a unique mode of ubiquitin-recognition.