28747 cd00113: PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight-stranded beta-barrel. The domain can be found in various domain architectures, as in the case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates.
28748 cd01751: PLAT/LH2 domain of plant lipoxygenase related proteins. Lipoxygenases are nonheme, nonsulfur iron dioxygenases that act on lipid substrates containing one or more (Z,Z)-1,4-pentadiene moieties. In plants, the immediate products are involved in defense mechanisms against pathogens and may be precursors of metabolic regulators. The generally proposed function of PLAT/LH2 domains is to mediate interaction with lipids or membrane bound proteins.
28749 cd01752: PLAT/LH2 domain of polycystin-1 like proteins. Polycystins are a large family of membrane proteins composed of multiple domains, present in fish, invertebrates, mammals, and humans that are widely expressed in various cell types and whose biological functions remain poorly defined. In humans, mutations in polycystin-1 (PKD1) and polycystin-2 (PKD2) have been shown to be the cause of autosomal dominant polycystic kidney disease (ADPKD). The generally proposed function of PLAT/LH2 domains is to mediate interaction with lipids or membrane bound proteins.
28750 cd01753: PLAT domain of 12/15-lipoxygenase. As a unique subfamily of the mammalian lipoxygenases, they catalyze enzymatic lipid peroxidation in complex biological structures via direct dioxygenation of phospholipids and cholesterol esters of biomembranes and plasma lipoproteins. Both types of enzymes are cytosolic but need this domain to access their sequestered membrane or micelle bound substrates.
28751 cd01754: PLAT/LH2 domain of plant-specific single domain protein family with unknown function. 
Many of its members are stress induced. In general, PLAT/LH2 consists of an eight-stranded beta-barrel and its proposed function is to mediate interaction with lipids or membrane bound proteins.
28752 cd01755: PLAT/LH2 domain present in connection with a lipase domain. This family contains two major subgroups, the lipoprotein lipase (LPL) and the pancreatic triglyceride lipase. LPL is a key enzyme in catabolism of plasma lipoprotein triglycerides (TGs). The central role of triglyceride lipases is in energy production. In general, the proposed function of the PLAT/LH2 domain is to mediate interaction with lipids or membrane bound proteins.
28753 cd01756: PLAT/LH2 domain repeats of family of proteins with unknown function. In general, PLAT/LH2 consists of an eight-stranded beta-barrel and its proposed function is to mediate interaction with lipids or membrane bound proteins.
28754 cd01757: PLAT/LH2 domain present in RAB6 interacting protein 1 (Rab6IP1)_like family. PLAT/LH2 domains consist of an eight-stranded beta-barrel. In Rab6IP1, this domain may participate in lipid-mediated modulation of Rab6IP1's function via its generally proposed function of mediating interaction with lipids or membrane bound proteins.
28755 cd01758: PLAT/LH2 domain present in lipoprotein lipase (LPL). LPL is a key enzyme in catabolism of plasma lipoprotein triglycerides (TGs) and therefore has a profound influence on triglyceride and high-density lipoprotein (HDL) cholesterol levels in the blood. In general, the proposed function of the PLAT/LH2 domain is to mediate interaction with lipids or membrane bound proteins.
28756 cd01759: PLAT/LH2 domain of pancreatic triglyceride lipase. Lipases hydrolyze phospholipids and triglycerides to generate fatty acids for energy production or for storage and to release inositol phosphates that act as second messengers. The central role of triglyceride lipases is in energy production. 
The proposed function of PLAT/LH2 domains is to mediate interaction with lipids or membrane bound proteins.
28757 cd02899: Scavenger receptor protein. A subfamily of PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight-stranded beta-barrel. The domain can be found in various domain architectures, as in the case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates. This subfamily contains Toxoplasma gondii Scavenger protein TgSR1.
28758 cd00326: Carbonic anhydrase alpha (vertebrate-like) group. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionarily distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer.
28759 cd03117: Carbonic anhydrase alpha, CA_IV, CA_XV, like isozymes. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. 
They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionarily distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues. This subgroup, restricted to animals, contains isozyme IV and similar proteins such as mouse CA XV. Isozyme IV is attached to membranes via a glycosylphosphatidylinositol (GPI) tail. In mammals, isozyme IV plays crucial roles in kidney and lung function, amongst others. This subgroup also contains the dual domain CA from the giant clam, Tridacna gigas. T. gigas CA plays a role in the movement of inorganic carbon from the surrounding seawater to the symbiotic algae found in the clam's tissues. CA XV is expressed in several species but not in humans or chimps. Similar to isozyme CA IV, CA XV attaches to membranes via a GPI tail.
28760 cd03118: Carbonic anhydrase alpha, CA isozyme V_like subgroup. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidines. This vertebrate subgroup comprises isozyme V. CA V is the mitochondrial isozyme, which may play a role in gluconeogenesis and ureagenesis and possibly also in lipogenesis.
28761 cd03119: Carbonic anhydrase alpha, isozymes I, II, and III and XIII. 
Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidines. This vertebrate subgroup comprises isozymes I, II, and III, which are cytoplasmic enzymes. CA I, for example, is expressed in erythrocytes of many vertebrates. CA II is the most active cytosolic isozyme; while it is expressed nearly ubiquitously, it comprises 95% of the renal carbonic anhydrase and is required for renal acidification. CA III has been implicated in protection from the damaging effect of oxidizing agents in hepatocytes. CA XIII may play important physiological roles in several organs.
28762 cd03120: Carbonic anhydrase alpha related protein, group VIII. Carbonic anhydrase related proteins (CARPs) are similar in sequence to carbonic anhydrases. Carbonic anhydrases are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism. CARPs have lost conserved histidines involved in zinc binding and consequently their catalytic activity. CARP VIII may play roles in various biological processes of the central nervous system, and could be involved in protein-protein interactions. CARP VIII has been shown to bind inositol 1,4,5-triphosphate (IP3) receptor type I (IP3RI), reducing the affinity of the receptor for IP3. IP3RI is an intracellular IP3-gated Ca2+ channel located on intracellular Ca2+ stores. IP3RI converts IP3 signaling into Ca2+ signaling, thereby participating in a variety of cell functions. 
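The two-step mechanism referred to in the carbonic anhydrase entries can be written out as a reaction scheme. The following is a textbook-style sketch, not taken from this listing: the label E-Zn for the enzyme-bound zinc is notation introduced here, and the transient zinc-bound bicarbonate intermediate is folded into the first step for brevity.

```latex
\begin{align*}
\mathrm{E{-}Zn^{2+}{-}OH^-} + \mathrm{CO_2} + \mathrm{H_2O}
  &\rightleftharpoons \mathrm{E{-}Zn^{2+}{-}OH_2} + \mathrm{HCO_3^-}
  && \text{(nucleophilic attack on } \mathrm{CO_2}\text{)} \\
\mathrm{E{-}Zn^{2+}{-}OH_2}
  &\rightleftharpoons \mathrm{E{-}Zn^{2+}{-}OH^-} + \mathrm{H^+}
  && \text{(regeneration of the active site)} \\
\mathrm{CO_2} + \mathrm{H_2O}
  &\rightleftharpoons \mathrm{HCO_3^-} + \mathrm{H^+}
  && \text{(net reaction)}
\end{align*}
```

Adding the first two lines cancels the enzyme species and gives the net reversible hydration of carbon dioxide. The CARPs described above, having lost zinc-binding histidines, would not be expected to support the first step.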
28763 cd03121: Carbonic anhydrase alpha related protein: groups X, XI and related proteins. This subgroup contains carbonic anhydrase related proteins (CARPs) X and XI, which have been implicated in various biological processes of the central nervous system. CARPs are similar in sequence to carbonic anhydrases. Carbonic anhydrases are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism. CARPs have lost conserved histidines involved in zinc binding and consequently their catalytic activity. CARP XI plays a role in the development of gastrointestinal stromal tumors.
28764 cd03122: Carbonic anhydrase alpha related protein, receptor_like subfamily. Carbonic anhydrase related proteins (CARPs) are similar in sequence to carbonic anhydrases. Carbonic anhydrases are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism. CARPs have lost conserved histidines involved in zinc binding and consequently their catalytic activity. This sub-family of carbonic anhydrase-related domains found in tyrosine phosphatase receptors may play a role in cell adhesion.
28765 cd03123: Carbonic anhydrase alpha, isozymes VI, IX, XII and XIV. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionarily distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Alpha CAs are mostly monomeric enzymes. The zinc ion is complexed by three histidine residues. 
This sub-family comprises the secreted CA VI, which is found in saliva, for example, and the membrane proteins CA IX, XII, and XIV.
28766 cd03124: Carbonic anhydrase alpha, prokaryotic-like subfamily. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidines. This sub-family includes bacterial carbonic anhydrase alpha, as well as plant enzymes such as tobacco nectarin III and yam dioscorin, and carbonic anhydrases from molluscs, such as nacrein, which are part of the organic matrix layer in shells. Other members of this family may be involved in maintaining pH balance, in facilitating transport of carbon dioxide or carbonic acid, or in sensing carbon dioxide levels in the environment. Dioscorin is the major storage protein of yam tubers and may play a role as an antioxidant. Tobacco nectarin may play a role in the maintenance of pH and oxidative balance in nectar. Mollusc nacrein may participate in calcium carbonate crystal formation of the nacreous layer. This subfamily also includes three alpha carbonic anhydrases from Chlamydomonas reinhardtii (CAH1-3). CAH1 and CAH2 are localized in the periplasmic space. CAH1 facilitates the movement of carbon dioxide across the plasma membrane when the medium is alkaline. CAH3 is localized to the thylakoid lumen and provides CO2 to Rubisco.
28767 cd03125: Carbonic anhydrase alpha, isozyme VI. 
Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionarily distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues. This sub-family comprises the secreted CA VI, which is found in saliva.
28768 cd03126: Carbonic anhydrase alpha, isozymes XII and XIV. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionarily distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues. This sub-family comprises the membrane proteins CA XII and XIV.
28769 cd03149: Carbonic anhydrase alpha, CA isozyme VII_like subgroup. 
Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidines. This vertebrate subgroup comprises isozyme VII. CA VII is the most active cytosolic enzyme after CA II, and may be highly expressed in the brain. Human CA VII may be a target of antiepileptic sulfonamides/sulfamates.
28770 cd03150: Carbonic anhydrase alpha, isozyme IX. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionarily distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Alpha CAs are strictly monomeric enzymes. The zinc ion is complexed by three histidine residues. This sub-family comprises the membrane protein CA IX. CA IX is functionally implicated in tumor growth and survival. CA IX is mainly present in solid tumors and its expression in normal tissues is limited to the mucosa of the alimentary tract. CA IX is a transmembrane protein with two extracellular domains: a carbonic anhydrase domain and a proteoglycan-like segment mediating cell-cell adhesion. 
There is evidence for an involvement of the MAPK pathway in the regulation of CA9 expression.
28771 cd00332: Phenylalanine ammonia-lyase (PAL) and histidine ammonia-lyase (HAL). PAL and HAL are members of the Lyase class I_like superfamily of enzymes that catalyze similar beta-elimination reactions and are active as homotetramers. The four active sites of the homotetrameric enzyme are each formed by residues from three different subunits. PAL, present in plants and fungi, catalyzes the conversion of L-phenylalanine to E-cinnamic acid. HAL, found in several bacteria and animals, catalyzes the conversion of L-histidine to E-urocanic acid. Both PAL and HAL contain the cofactor 3,5-dihydro-5-methylidene-4H-imidazol-4-one (MIO), which is formed by autocatalytic excision/cyclization of the internal tripeptide, Ala-Ser-Gly. PAL is being explored as an enzyme substitution therapy for phenylketonuria (PKU), a disorder which involves an inability to metabolize phenylalanine. HAL failure in humans results in the disease histidinemia.
28772 cd01334: Lyase class I family of the Lyase_I superfamily. This family contains proteins similar to class II fumarase, aspartase, adenylosuccinate lyase (ASL), argininosuccinate lyase (ASAL), and 3-carboxy-cis,cis-muconate lactonizing enzyme (CMLE). Proteins of this family for the most part catalyze similar beta-elimination reactions in which a C-N or C-O bond is cleaved with the release of fumarate as one of the products. These proteins are active as tetramers. The four active sites of the homotetrameric enzyme are each formed by residues from three different subunits.
28773 cd01357: Aspartase_like. This group contains proteins similar to aspartase (L-aspartate ammonia-lyase) and fumarase class II enzymes. These proteins are members of the Lyase class I family. Members of this family for the most part catalyze similar beta-elimination reactions in which a C-N or C-O bond is cleaved with the release of fumarate as one of the products. 
These proteins are active as tetramers. The four active sites of the homotetrameric enzyme are each formed by residues from three different subunits. Aspartase catalyzes the reversible deamination of aspartic acid.
28774 cd01359: Argininosuccinate lyase (argininosuccinase, ASAL). This group contains proteins similar to ASAL, a member of the Lyase class I family. Members of this family for the most part catalyze similar beta-elimination reactions in which a C-N or C-O bond is cleaved with the release of fumarate as one of the products. These proteins are active as tetramers. The four active sites of the homotetrameric enzyme are each formed by residues from three different subunits. ASAL is a cytosolic enzyme which catalyzes the reversible breakdown of argininosuccinate to arginine and fumarate during arginine biosynthesis. In ureotelic species, ASAL also catalyzes a reaction involved in the production of urea. Included in this group are the major soluble avian eye lens proteins from duck, delta 1 and delta 2 crystallin. Of these two isoforms, only delta 2 has retained ASAL activity. These crystallins may have evolved by gene recruitment of ASAL followed by gene duplication. In humans, mutations in ASAL result in the autosomal recessive disorder argininosuccinic aciduria.
28776 cd01362: Class II fumarases. This group contains proteins similar to Escherichia coli fumarase C and the human mitochondrial fumarase. These proteins are members of the Lyase class I family. Members of this family for the most part catalyze similar beta-elimination reactions in which a C-N or C-O bond is cleaved with the release of fumarate as one of the products. These proteins are active as tetramers. The four active sites of the homotetrameric enzyme are each formed by residues from three different subunits. Fumarase catalyzes the reversible hydration/dehydration of fumarate to L-malate during the Krebs cycle.
28779 cd01596: Aspartase_like. 
This group contains proteins similar to aspartase (L-aspartate ammonia-lyase) and fumarase class II enzymes. These proteins are members of the Lyase class I family. Members of this family for the most part catalyze similar beta-elimination reactions in which a C-N or C-O bond is cleaved with the release of fumarate as one of the products. These proteins are active as tetramers. The four active sites of the homotetrameric enzyme are each formed by residues from three different subunits. Aspartase catalyzes the reversible deamination of aspartic acid. Fumarase catalyzes the reversible hydration/dehydration of fumarate to L-malate during the Krebs cycle.
28780 cd01597: pCLME: prokaryotic 3-carboxy-cis,cis-muconate cycloisomerase (CMLE)_like. This group contains proteins similar to pCLME, a member of the Lyase class I family. Members of this family for the most part catalyze similar beta-elimination reactions in which a C-N or C-O bond is cleaved with the release of fumarate as one of the products. These proteins are active as tetramers. The four active sites of the homotetrameric enzyme are each formed by residues from three different subunits. CMLE catalyzes the cyclization of 3-carboxy-cis,cis-muconate (3CM) to 4-carboxy-muconolactone in the beta-ketoadipate pathway. This pathway is responsible for the catabolism of a variety of aromatic compounds into intermediates of the citric acid cycle in prokaryotic and eukaryotic micro-organisms.
28781 cd01598: PurB: PurB_like adenylosuccinases (adenylsuccinate lyase, ASL). This group contains proteins similar to EcASL, the product of the gene purB in Escherichia coli. ASL is a member of the Lyase class I family of the Lyase_I superfamily. Members of the Lyase class I family function as homotetramers to catalyze similar beta-elimination reactions in which a Calpha-N or Calpha-O bond is cleaved with the subsequent release of fumarate as one of the products. 
The four active sites of the homotetrameric enzyme are each formed by residues from three different subunits. ASL catalyzes two non-sequential steps in the de novo purine biosynthesis pathway: the conversion of 5-aminoimidazole-(N-succinylocarboxamide) ribotide (SAICAR) into 5-aminoimidazole-4-carboxamide ribotide (AICAR) and the conversion of adenylsuccinate (SAMP) into adenosine monophosphate (AMP).
28782 cd03302: Adenylsuccinate lyase_2: Adenylsuccinate lyase (ASL)_subgroup 2. This subgroup contains mainly eukaryotic proteins similar to ASL, a member of the Lyase class I family. Members of this family for the most part catalyze similar beta-elimination reactions in which a C-N or C-O bond is cleaved with the release of fumarate as one of the products. These proteins are active as tetramers. The four active sites of the homotetrameric enzyme are each formed by residues from three different subunits. ASL catalyzes two steps in de novo purine biosynthesis: the conversion of 5-aminoimidazole-(N-succinylocarboxamide) ribotide (SAICAR) into 5-aminoimidazole-4-carboxamide ribotide (AICAR) and the conversion of adenylsuccinate (SAMP) into adenosine monophosphate (AMP). ASL deficiency has been linked to several pathologies including psychomotor retardation with autistic features, epilepsy and muscle wasting.
28783 cd00415: NAD(P)H-dependent flavin oxidoreductase (oxidored) FMN-binding superfamily domain. The members of this family reduce a range of alternative electron acceptors. Most use FAD/FMN as a cofactor and NAD(P)H as electron donor. Some contain a 4Fe-4S cluster to transfer electrons from FAD to FMN. Members of this super-family have a TIM barrel structure. It includes the old yellow enzyme from yeast, pentaerythritol tetranitrate reductase, morphinone reductase, trimethylamine dehydrogenase, dimethylamine dehydrogenase, histamine dehydrogenase, enoate reductases and 2,4-dienoyl-CoA reductase. 
28784 cd02801: Dihydrouridine synthase-like (DUS-like) FMN-binding domain. Members of this family catalyze the reduction of the 5,6-double bond of a uridine residue on tRNA. Dihydrouridine modification of tRNA is widely observed in prokaryotes and eukaryotes, and also in some archaea. Most dihydrouridines are found in the D loop of tRNAs. The role of dihydrouridine in tRNA is currently unknown, but it may increase conformational flexibility of the tRNA. It is likely that different family members have different substrate specificities, which may overlap. 1VHN, a putative flavin oxidoreductase, has high sequence similarity to DUS. The enzymatic mechanism of 1VHN is not known at present.
28785 cd02803: Old yellow enzyme (OYE)-like FMN binding domain. OYE was the first flavin-dependent enzyme identified; however, its true physiological role remains elusive to this day. Each monomer of OYE contains FMN as a non-covalently bound cofactor and uses NADPH as a reducing agent; oxygen, quinones, and alpha,beta-unsaturated aldehydes and ketones can act as electron acceptors in the catalytic reaction. Members of the OYE family include trimethylamine dehydrogenase, 2,4-dienoyl-CoA reductase, enoate reductase, pentaerythritol tetranitrate reductase, xenobiotic reductase, and morphinone reductase.
28786 cd02808: Glutamate synthase (GltS) FMN-binding domain. GltS is a complex iron-sulfur flavoprotein that catalyzes the reductive synthesis of L-glutamate from 2-oxoglutarate and L-glutamine via intramolecular channelling of ammonia, a reaction in the plant, yeast and bacterial pathway for ammonia assimilation. It is a multifunctional enzyme that functions through three distinct active centers, carrying out L-glutamine hydrolysis, conversion of 2-oxoglutarate into L-glutamate, and electron uptake from an electron donor.
28788 cd02810: Dihydroorotate dehydrogenase (DHOD) and Dihydropyrimidine dehydrogenase (DHPD) FMN-binding domain. 
DHOD catalyzes the oxidation of (S)-dihydroorotate to orotate. This is the fourth step and the only redox reaction in the de novo biosynthesis of UMP, the precursor of all pyrimidine nucleotides. DHOD requires FMN as a cofactor. DHODs are divided into class 1 and class 2 based on their amino acid sequences and cellular location. Members of class 1 are cytosolic enzymes and multimers, while class 2 enzymes are membrane associated and monomeric. The class 1 enzymes can be further divided into subtypes 1A and 1B, which are homodimers and heterotetrameric proteins, respectively. DHPD catalyzes the first step in pyrimidine degradation: the NADPH-dependent reduction of uracil and thymine to the corresponding 5,6-dihydropyrimidines. DHPD contains two FAD, two FMN and eight [4Fe-4S] clusters, arranged in two electron transfer chains that pass its homodimeric interface twice. Two of the Fe-S clusters show a hitherto unobserved coordination involving a glutamine residue.
28789 cd02811: Isopentenyl Diphosphate:dimethylallyl diphosphate isomerase type 2 (IDI-2) FMN-binding domain. Two types of IDIs have been characterized at present. The long known IDI-1 is only dependent on divalent metals for activity, whereas IDI-2 requires a metal, FMN and NADPH. IDI-2 catalyzes the interconversion of isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP) in the mevalonate pathway.
28790 cd02812: PcrB FMN-binding domain. This family has unknown function. It belongs to the NAD(P)H-dependent flavin oxidoreductase super-family, whose members reduce a range of alternative electron acceptors. Most use FAD/FMN as a cofactor and NAD(P)H as electron donor. Some contain a 4Fe-4S cluster to transfer electrons from FAD to FMN. Members of this super-family have a TIM barrel structure.
28791 cd02911: Dihydropyrimidine dehydrogenase-like (DHPD-like) FMN-binding domain. 
DHPD catalyzes the first step in pyrimidine degradation: the NADPH-dependent reduction of uracil and thymine to the corresponding 5,6-dihydropyrimidines. DHPD contains two FAD, two FMN and eight [4Fe-4S] clusters, arranged in two electron transfer chains that pass the dimer interface twice. Two of the Fe-S clusters show a hitherto unobserved coordination involving a glutamine residue.
28792 cd02922: Flavocytochrome b2 (FCB2) FMN-binding domain. FCB2 (AKA L-lactate:cytochrome c oxidoreductase) is a respiratory enzyme located in the intermembrane space of yeast mitochondria which catalyzes the oxidation of L-lactate to pyruvate. FCB2 also participates in a short electron-transport chain involving cytochrome c and cytochrome oxidase which ultimately directs the reducing equivalents gained from L-lactate oxidation to oxygen, yielding one molecule of ATP for every L-lactate molecule consumed. FCB2 is composed of 2 domains: a C-terminal flavin-binding domain, which includes the active site for lactate oxidation, and an N-terminal b2-cytochrome domain, required for efficient cytochrome c reduction. FCB2 is a homotetramer and contains two noncovalently bound cofactors, FMN and heme, per subunit.
28793 cd02925: Glycolate oxidase-like (GOX-like) FMN-binding domain. This protein family includes a widespread family of homologous FMN-dependent alpha-hydroxyacid oxidizing enzymes. This family occurs in both prokaryotes and eukaryotes. Members of this family include flavocytochrome b2 (FCB2), glycolate oxidase (GOX), lactate monooxygenase (LMO), mandelate dehydrogenase (MDH), and long chain hydroxyacid oxidase (LCHAO). In green plants, glycolate oxidase is one of the key enzymes in photorespiration where it oxidizes glycolate to glyoxylate. LMO catalyzes the oxidation of L-lactate to acetate and carbon dioxide. MDH oxidizes (S)-mandelate to phenylglyoxalate. 
It is an enzyme in the mandelate pathway, which occurs in several strains of Pseudomonas and converts (R)-mandelate to benzoate.
28794 cd02929: Trimethylamine dehydrogenase (TMADH) and histamine dehydrogenase (HD) FMN-binding domain. TMADH is an iron-sulfur flavoprotein that catalyzes the oxidative demethylation of trimethylamine to form dimethylamine and formaldehyde. The protein forms a symmetrical dimer with each subunit containing one 4Fe-4S cluster and one FMN cofactor. It contains a unique flavin, in the form of a 6-S-cysteinyl FMN which is bent by ~25 degrees along the N5-N10 axis of the flavin isoalloxazine ring. This modification of the conformation of the flavin is thought to facilitate catalysis. C30 plays a role not only in substrate binding, but also in the optimal geometrical alignment of substrate with the 6-S-cysteinyl FMN in the enzyme active site. Histamine dehydrogenase catalyzes oxidative deamination of histamine. The amino acid sequence of histamine dehydrogenase is closely related to those of TMADH and dimethylamine dehydrogenase, which contain an unusual covalently bound flavin mononucleotide, 6-S-cysteinyl-flavin mononucleotide, and one 4Fe-4S cluster as redox active cofactors in each subunit of the homodimer. The presence of the identical redox cofactors in histamine dehydrogenase has been confirmed by sequence alignment analysis, mass spectral analysis, UV-vis and EPR spectroscopy, and chemical analysis of iron and acid-labile sulfur. These results suggest that the structure of histamine dehydrogenase in the vicinity of the two redox centers is almost identical to that of trimethylamine dehydrogenase as a whole. The structure modeling study, however, demonstrated that a putative substrate-binding cavity in histamine dehydrogenase is quite distinct from that of trimethylamine dehydrogenase.
28795 cd02930: 2,4-dienoyl-CoA reductase (DCR) FMN-binding domain. DCR in E. coli is an iron-sulfur flavoenzyme which contains FMN, FAD and a 4Fe-4S cluster. 
It is also a monomer, unlike its eukaryotic counterparts, which form homotetramers and lack the flavin and iron-sulfur cofactors. Metabolism of unsaturated fatty acids requires auxiliary enzymes in addition to those used in beta-oxidation. After a given number of cycles through the beta-oxidation pathway, those unsaturated fatty acyl-CoAs with double bonds at even-numbered carbon positions contain 2-trans, 4-cis double bonds that cannot be modified by enoyl-CoA hydratase. DCR then uses NADPH to remove the C4-C5 double bond. DCR can catalyze the reduction of both natural fatty acids with cis double bonds as well as substrates containing trans double bonds. The reaction is initiated by hydride transfer from NADPH to FAD, which in turn transfers electrons, one at a time, to FMN via the 4Fe-4S cluster. The fully reduced FMN provides a hydride ion to the C5 atom of substrate, and Tyr-166 and His-252 are proposed to form a catalytic dyad that protonates the C4 atom of the substrate and completes the reaction. 28796 cd02931: Enoate reductase (ER)-like FMN-binding domain. Enoate reductase catalyzes the NADH-dependent reduction of carbon-carbon double bonds of nonactivated 2-enoates as well as of alpha,beta-unsaturated aldehydes, cyclic ketones, and methylketones. ERs are similar to 2,4-dienoyl-CoA reductase from E. coli and to old yellow enzyme from Saccharomyces cerevisiae. 28797 cd02932: Old yellow enzyme (OYE) YqjM-like FMN binding domain. YqjM is involved in the oxidative stress response of Bacillus subtilis. Like the other OYE members, each monomer of YqjM contains FMN as a non-covalently bound cofactor and uses NADPH as a reducing agent. The YqjM enzyme exists as a homotetramer that is assembled as a dimer of catalytically dependent dimers, while other OYE members exist only as monomers or dimers. 
Moreover, the protein displays a shared active site architecture where an arginine finger at the COOH terminus of one monomer extends into the active site of the adjacent monomer and is directly involved in substrate recognition. Another remarkable difference in the binding of the ligand in YqjM is represented by the contribution of the NH2-terminal tyrosine instead of a COOH-terminal tyrosine in OYE and its homologs. 28798 cd02933: Old yellow enzyme (OYE)-like FMN binding domain. OYE was the first flavin-dependent enzyme identified; however, its true physiological role remains elusive to this day. Each monomer of OYE contains FMN as a non-covalently bound cofactor and uses NADPH as a reducing agent, with oxygen, quinones, and alpha,beta-unsaturated aldehydes and ketones acting as electron acceptors in the catalytic reaction. Members of OYE family include trimethylamine dehydrogenase, 2,4-dienoyl-CoA reductase, enoate reductase, pentaerythritol tetranitrate reductase, xenobiotic reductase, and morphinone reductase. 28799 cd02940: Dihydropyrimidine dehydrogenase (DHPD) FMN-binding domain. DHPD catalyzes the first step in pyrimidine degradation: the NADPH-dependent reduction of uracil and thymine to the corresponding 5,6-dihydropyrimidines. DHPD contains two FAD, two FMN and eight [4Fe-4S] clusters, arranged in two electron transfer chains that pass the dimer interface twice. Two of the Fe-S clusters show a hitherto unobserved coordination involving a glutamine residue. 28801 cd00524: Superoxide reductase-like (SORL) domain; present in a family of mononuclear non-heme iron proteins that includes superoxide reductase and desulfoferrodoxin. Superoxide reductase-like proteins scavenge superoxide anion radicals as a defense mechanism against reactive oxygen species and are found in anaerobic bacteria and archaea, and microaerophilic Treponema pallidum. The SORL domain contains an active iron site, Fe[His4Cys(Glu)], which in the reduced state loses the glutamate ligand. 
Superoxide reductase (class II) forms a homotetramer with four Fe[His4Cys(Glu)] centers. Desulfoferrodoxin (class I) is a homodimeric protein, with each protomer comprised of two domains, the N-terminal desulforedoxin (DSRD) domain and C-terminal SORL domain. Each domain has a distinct iron center: the DSRD iron center I, Fe(S-Cys)4; and the SORL iron center II, Fe[His4Cys(Glu)]. 28802 cd03171: Superoxide reductase-like (SORL) domain, class I; SORL-domains are present in a family of mononuclear non-heme iron proteins that includes superoxide reductase and desulfoferrodoxin. Superoxide reductase-like proteins scavenge superoxide anion radicals as a defense mechanism against reactive oxygen species and are found in anaerobic bacteria and archaea, and microaerophilic Treponema pallidum. Desulfoferrodoxin (class I) is a homodimeric protein, with each protomer comprised of two domains, the N-terminal desulforedoxin (DSRD) domain and C-terminal SORL domain. Each domain has a distinct iron center: the DSRD iron center I, Fe(S-Cys)4; and the SORL iron center II, Fe[His4Cys(Glu)]. 28803 cd03172: Superoxide reductase-like (SORL) domain, class II; SORL-domains are present in a family of mononuclear non-heme iron proteins that includes superoxide reductase and desulfoferrodoxin. Superoxide reductase-like proteins scavenge superoxide anion radicals as a defense mechanism against reactive oxygen species and are found in anaerobic bacteria and archaea, and microaerophilic Treponema pallidum. The SORL domain contains an active iron site, Fe[His4Cys(Glu)], which in the reduced state loses the glutamate ligand. Superoxide reductase (class II) forms a homotetramer with four Fe[His4Cys(Glu)] centers. 28804 cd00395: Tyrosinyl-tRNA synthetase (TyrRS)/Tryptophanyl-tRNA synthetase (TrpRS) catalytic core domain. These enzymes attach Tyr or Trp, respectively, to the appropriate tRNA. 
These class I enzymes are homodimers, which aminoacylate the 2'-OH of the nucleotide at the 3' end of the appropriate tRNA. The core domain is based on the Rossmann fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. It contains the class I characteristic HIGH and KMSKS motifs, which are involved in ATP binding. 28806 cd00517: ATP-sulfurylase (ATPS), also known as sulfate adenylate transferase, catalyzes the transfer of an adenylyl group from ATP to sulfate, forming adenosine 5'-phosphosulfate (APS). This reaction is generally accompanied by a further reaction, catalyzed by APS kinase, in which APS is phosphorylated to yield 3'-phospho-APS (PAPS). In some organisms the APS kinase is a separate protein, while in others it is incorporated with ATP sulfurylase in a bifunctional enzyme that catalyzes both reactions. In bifunctional proteins, the domain that performs the kinase activity can be attached at the N-terminal end of the sulfurylase unit or at the C-terminal end, depending on the organism. While the reaction is ubiquitous among organisms, the physiological role of the reaction varies. In some organisms it is used to generate APS from sulfate and ATP, while in others it proceeds in the opposite direction to generate ATP from APS and pyrophosphate. ATP sulfurylase can be a monomer, a homodimer, or a homo-oligomer, depending on the organism. ATPS belongs to a large superfamily of nucleotidyltransferases that includes pantothenate synthetase (PanC), phosphopantetheine adenylyltransferase (PPAT), and the amino-acyl tRNA synthetases. The enzymes of this family are structurally similar and share a dinucleotide-binding domain. 28807 cd00560: PanC Pantoate-beta-alanine ligase, also known as pantothenate synthase, catalyzes the formation of pantothenate from pantoate and beta-alanine. 
PanC belongs to a large superfamily of nucleotidyltransferases that includes ATP sulfurylase (ATPS), phosphopantetheine adenylyltransferase (PPAT), and the amino-acyl tRNA synthetases. The enzymes of this family are structurally similar and share a dinucleotide-binding domain. 28809 cd00671: This is the catalytic core domain of Arginyl tRNA synthetase (ArgRS). This class I enzyme is a monomer, which aminoacylates the 2'-OH of the nucleotide at the 3' end of the appropriate tRNA. The core domain is based on the Rossmann fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. There are at least three subgroups of ArgRS. One type contains both characteristic class I HIGH and KMSKS motifs, which are involved in ATP binding. A second subtype lacks the KMSKS motif; however, it has a lysine N-terminal to the HIGH motif, which serves as the functional counterpart to the second lysine of the KMSKS motif. A third group, which is found primarily in archaea and a few bacteria, lacks both the KMSKS motif and the HIGH loop lysine. 28811 cd00674: This is the catalytic core domain of lysyl tRNA synthetase (LysRS). This class I enzyme is a monomer, which aminoacylates the 2'-OH of the nucleotide at the 3' end of the appropriate tRNA. The core domain is based on the Rossmann fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. It contains the characteristic class I HIGH and KMSKS motifs, which are involved in ATP binding. The class I LysRS is found only in archaea and some bacteria and has evolved separately from class II LysRS, as the two do not share structural or sequence similarity. 28813 cd00805: Tyrosinyl-tRNA synthetase (TyrRS) catalytic core domain. TyrRS is a homodimer, which attaches Tyr to the appropriate tRNA. TyrRS is a class I tRNA synthetase, so it aminoacylates the 2'-OH of the nucleotide at the 3' end of the tRNA. 
The core domain is based on the Rossmann fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. It contains the class I characteristic HIGH and KMSKS motifs, which are involved in ATP binding. 28814 cd00806: Tryptophanyl-tRNA synthetase (TrpRS) catalytic core domain. TrpRS is a homodimer, which attaches Trp to the appropriate tRNA. TrpRS is a class I tRNA synthetase, so it aminoacylates the 2'-OH of the nucleotide at the 3' end of the tRNA. The core domain is based on the Rossmann fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. It contains class I characteristic HIGH and KMSKS motifs, which are involved in ATP binding. 28821 cd02039: Cytidylyltransferase-like domain. Many of these proteins are known to use CTP or ATP and release pyrophosphate. Protein families that contain at least one copy of this domain include citrate lyase ligase, pantoate-beta-alanine ligase, glycerol-3-phosphate cytidyltransferase, ADP-heptose synthase, phosphocholine cytidylyltransferase, lipopolysaccharide core biosynthesis protein KdtB, the bifunctional protein NadR, and a number whose function is unknown. 28822 cd02064: Riboflavin kinase (Flavokinase). This family represents the C-terminal region of the bifunctional riboflavin biosynthesis protein riboflavin kinase / FAD synthetase. These enzymes have both ATP:riboflavin 5'-phosphotransferase and ATP:FMN-adenylyltransferase activities. The C-terminal domain has FMN-adenylyltransferase activity. They catalyse the 5'-phosphorylation of riboflavin to FMN and the adenylylation of FMN to FAD. A domain has been identified in the N-terminal region that is well conserved in all the bacterial FAD synthetases. This domain has remote similarity to nucleotidyl transferases and, hence, it may be involved in the adenylylation reaction of FAD synthetases. 
28823 cd02156: nt_trans (nucleotidyl transferase). This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain. 28824 cd02158: PanC_ATPS. Pantothenate synthetase (PanC) and ATP-sulfurylase (ATPS) share a similar dinucleotide-binding domain. 28825 cd02163: Phosphopantetheine adenylyltransferase (PPAT) is an essential enzyme in bacteria that catalyses a rate-limiting step in coenzyme A (CoA) biosynthesis, by transferring an adenylyl group from ATP to 4'-phosphopantetheine, yielding dephospho-CoA (dPCoA). Each phosphopantetheine adenylyltransferase (PPAT) subunit displays a dinucleotide-binding fold that is structurally similar to that in class I aminoacyl-tRNA synthetases. Superposition of bound adenylyl moieties from dPCoA in PPAT and ATP in aminoacyl-tRNA synthetases suggests nucleophilic attack by the 4'-phosphopantetheine on the alpha-phosphate of ATP. The proposed catalytic mechanism implicates transition state stabilization by PPAT without involving functional groups of the enzyme in a chemical sense in the reaction. The homologous active site attachment of ATP and the structural distribution of predicted sequence-binding motifs in PPAT classify the enzyme as belonging to the nucleotidyltransferase superfamily. 28826 cd02164: The PPAT domain of the bifunctional enzyme with PPAT and DPCK functions. The final two steps of the CoA biosynthesis pathway are catalysed by phosphopantetheine adenylyltransferase (PPAT) and dephospho-CoA (dPCoA) kinase (DPCK). The PPAT reaction involves the reversible adenylation of 4'-phosphopantetheine to form 3'-dPCoA and PPi, and DPCK catalyses phosphorylation of the 3'-hydroxy group of the ribose moiety of dPCoA. In eukaryotes the two enzymes are part of a large multienzyme complex. 
Studies in Corynebacterium ammoniagenes suggested that separate enzymes were present, and this was subsequently confirmed on identification of the bacterial PPAT/coaD. 28827 cd02165: This family contains the predominant bacterial/eukaryotic adenylyltransferases for nicotinamide-nucleotide and for the deamido form, nicotinate nucleotide. Nicotinamide/nicotinate mononucleotide (NMN/NaMN) adenylyltransferase (NMNAT) is an indispensable enzyme in the biosynthesis of NAD(+) and NADP(+). Nicotinamide-nucleotide adenylyltransferase synthesizes NAD via the salvage pathway, while nicotinate-nucleotide adenylyltransferase synthesizes the immediate precursor of NAD via the de novo pathway. Human NMNAT displays unique dual substrate specificity toward both NMN and NaMN, and thus can participate in both de novo and salvage pathways of NAD synthesis. 28828 cd02166: This family of archaeal proteins exhibits nicotinamide-nucleotide adenylyltransferase (NMNAT) activity utilizing the salvage pathway to synthesize NAD. In some cases, the enzyme was tested and found also to have the activity of nicotinate-nucleotide adenylyltransferase, an enzyme of NAD de novo biosynthesis, although with a higher Km. In some archaeal species, a number of proteins which are uncharacterized with respect to activity are also present. 28829 cd02167: The NMNAT domain of NadR protein. The NadR protein (hiNadR) is a bifunctional enzyme possessing both NMN adenylyltransferase (NMNAT) and ribosylnicotinamide kinase (RNK) activities. Its function is essential for the growth and survival of H. influenzae and thus may present a new highly specific anti-infectious drug target. The N-terminal domain that hosts the NMNAT activity is closely related to archaeal NMNAT. The bound NAD at the active site of the NMNAT domain reveals several critical interactions between NAD and the protein. The NMNAT domain of hiNadR defines yet another member of the pyridine nucleotide adenylyltransferase family. 
28830 cd02168: This domain represents the N-terminal NMNAT (Nicotinamide/nicotinate mononucleotide adenylyltransferase) domain of a novel bifunctional enzyme endowed with NMN adenylyltransferase and 'Nudix' hydrolase activities. This domain is highly homologous to the archaeal NMN adenylyltransferase that catalyzes NAD synthesis from NMN and ATP. NMNAT is an indispensable enzyme in the biosynthesis of NAD(+) and NADP(+). Nicotinamide-nucleotide adenylyltransferase synthesizes NAD via the salvage pathway, while nicotinate-nucleotide adenylyltransferase synthesizes the immediate precursor of NAD via the de novo pathway. The C-terminal domain of this enzyme shares homology with the archaeal ADP-ribose pyrophosphatase, a member of the 'Nudix' hydrolase family. 28831 cd02169: Citrate lyase ligase, also known as [Citrate (pro-3S)-lyase] ligase, is responsible for acetylation of the (2-(5''-phosphoribosyl)-3'-dephosphocoenzyme-A) prosthetic group of the gamma subunit of citrate lyase, converting the inactive thiol form of this enzyme to the active form. The acetylation of 1 molecule of deacetyl-citrate lyase to enzymatically active citrate lyase requires 6 molecules of ATP. The adenylyltransferase activity of the enzyme involves the formation of AMP and pyrophosphate in the acetylation reaction. 28832 cd02170: The cytidylyltransferase family includes cholinephosphate cytidylyltransferase (CCT), glycerol-3-phosphate cytidylyltransferase, RfaE and phosphoethanolamine cytidylyltransferase (ECT). All enzymes catalyze the transfer of a cytidylyl group from CTP to various substrates. 28833 cd02171: These sequences describe glycerol-3-phosphate cytidylyltransferase, also called CDP-glycerol pyrophosphorylase. A closely related protein assigned a different function experimentally is a human ethanolamine-phosphate cytidylyltransferase. Glycerol-3-phosphate cytidylyltransferase acts in pathways of teichoic acid biosynthesis. 
Teichoic acids are substituted polymers, linked by phosphodiester bonds, of glycerol, ribitol, etc. An example is poly(glycerol phosphate), the major teichoic acid of the Bacillus subtilis cell wall. Most but not all species encoding proteins in this family are Gram-positive bacteria. 28834 cd02172: RfaE is a protein involved in the biosynthesis of ADP-L-glycero-D-manno-heptose, a precursor for LPS inner core biosynthesis. RfaE is a bifunctional protein in Escherichia coli, and separate proteins in other organisms. Domain I is suggested to act in D-glycero-D-manno-heptose 1-phosphate biosynthesis, while domain II (this family) adds ADP to yield ADP-D-glycero-D-manno-heptose. 28835 cd02173: CTP:phosphoethanolamine cytidylyltransferase (ECT) catalyzes the conversion of phosphoethanolamine to CDP-ethanolamine as part of the CDP-ethanolamine biosynthesis pathway. ECT expression in hepatocytes is localized predominantly to areas of the cytoplasm that are rich in rough endoplasmic reticulum. Several ECTs, including yeast and human ECT, have large repetitive sequences located within their N- and C-termini. 28836 cd02174: CTP:phosphocholine cytidylyltransferase (CCT) catalyzes the condensation of CTP and phosphocholine to form CDP-choline as the rate-limiting and regulatory step in the CDP-choline pathway. CCT is unique in that its enzymatic activity is regulated by the extent of its association with membrane structures. A current model posits that bilayer curvature elastic stress is sensed by CCT and governs the degree of membrane association, thus providing a mechanism for both positive and negative regulation of activity. 28837 cd00924: Cytochrome c oxidase subunit Vb. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. 
It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Found only in eukaryotes, subunit Vb is one of three mammalian subunits that lack a transmembrane region. Subunit Vb is located on the matrix side of the membrane and binds the regulatory subunit of protein kinase A. The abnormally extended conformation is stable only in the CcO assembly. 28838 cd00925: Cytochrome c oxidase subunit VIa. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Found only in eukaryotes, subunit VIa is expressed in two tissue-specific isoforms in mammals but not fish. VIa-H is the heart and skeletal muscle isoform; VIa-L is the liver or non-muscle isoform. Mammalian VIa-H induces a slip in CcO (decrease in proton/electron stoichiometry) at high intramitochondrial ATP/ADP ratios, while VIa-L induces a permanent slip in CcO, depending on the presence of cardiolipin and palmitate. 28839 cd00974: Desulforedoxin (DSRD) domain; a small non-heme iron domain present in the desulforedoxin (rubredoxin oxidoreductase) and desulfoferrodoxin proteins of some archaeal and bacterial methanogens and sulfate/sulfur reducers. 
Desulforedoxin is a small, single-domain homodimeric protein; each subunit contains an iron atom bound to four cysteinyl sulfur atoms, Fe(S-Cys)4, in a distorted tetrahedral coordination. Its metal center is similar to that found in rubredoxin type proteins. Desulforedoxin is regarded as a potential redox partner for rubredoxin. Desulfoferrodoxin forms a homodimeric protein, with each protomer comprised of two domains, the N-terminal DSRD domain and C-terminal superoxide reductase-like (SORL) domain. Each domain has a distinct iron center: the DSRD iron center I, Fe(S-Cys)4; and the SORL iron center II, Fe[His4Cys(Glu)]. 28840 cd02749: Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes. 28841 cd02900: Macro domain, Appr-1"-pase family. The macro domain is a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. The yeast protein Ymx7 and related proteins in this family contain a stand-alone macro domain and may be specific phosphatases catalyzing the conversion of ADP-ribose-1"-monophosphate (Appr-1"-p) to ADP-ribose. Appr-1"-p is an intermediate in a metabolic pathway involved in pre-tRNA splicing. 28842 cd02901: Macro domain, Poa1p_like family. 
The macro domain is a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes. Members of this family show similarity to the yeast protein Poa1p, reported to be a phosphatase specific for Appr-1"-p, a tRNA splicing metabolite. Poa1p may play a role in tRNA splicing regulation. 28843 cd02903: Macro domain, BAL_like family. The macro domain is a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes. Members of this family show similarity to BAL (B-aggressive lymphoma) proteins, which contain one to three macro domains. Most BAL family macro domains belong to this family except for the most N-terminal domain in multiple-domain containing proteins. Most BAL proteins also contain a C-terminal PARP active site and are also named as PARPs. 
Human BAL1 (or PARP-9) was originally identified as a risk-related gene in diffuse large B-cell lymphoma that promotes malignant B-cell migration. Some BAL family proteins exhibit PARP activity. Poly (ADP-ribosyl)ation is an immediate DNA-damage-dependent post-translational modification of histones and other nuclear proteins. BAL proteins may also function as transcription repressors. 28844 cd02904: Macro domain, Macro_H2A_like family. The macro domain is a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes. Members of this family are similar to macroH2A, a variant of the major-type core histone H2A, which contains an N-terminal H2A domain and a C-terminal nonhistone macro domain. Histone macroH2A is enriched on the inactive X chromosome of mammalian female cells. It does not bind poly ADP-ribose, but does bind the monomeric SirT1 metabolite O-acetyl-ADP-ribose (OAADPR) with high affinity through its macro domain. In addition, the macro domain of macroH2A associates with histone deacetylases and affects the acetylation status of nucleosomes. MacroH2A-containing nucleosomes are repressive toward transcription. 28845 cd02905: Macro domain, GDAP2_like family. The macro domain is a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). 
Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes. This family contains proteins similar to human GDAP2, the ganglioside induced differentiation associated protein 2, whose gene is expressed at a higher level in differentiated Neuro2a cells compared with non-differentiated cells. GDAP2 contains an N-terminal macro domain and a C-terminal Sec14p-like lipid binding domain. It is specifically expressed in brain and testis. 28846 cd02906: Macro domain, Unknown family 1. The macro domain is a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes. This family is composed of uncharacterized proteins containing a macro domain, either as a stand-alone domain or in addition to a C-terminal SIR2 (silent information regulator 2) domain. 28847 cd02907: Macro domain, Af1521- and BAL-like family. The macro domain is a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). 
Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes. The macro domains in this family show similarity to Af1521, a protein from Archaeoglobus fulgidus containing a stand-alone macro domain. Af1521 binds ADP-ribose and exhibits phosphatase activity toward Appr-1"-p. Also included in this family are the N-terminal (or first) macro domains of BAL (B-aggressive lymphoma) proteins which contain multiple macro domains. Most BAL proteins also contain a C-terminal PARP active site and are also named as PARPs. Human BAL1 (or PARP-9) was originally identified as a risk-related gene in diffuse large B-cell lymphoma that promotes malignant B-cell migration. Some BAL family proteins exhibit PARP activity. Poly (ADP-ribosyl)ation is an immediate DNA-damage-dependent post-translational modification of histones and other nuclear proteins. BAL proteins may also function as transcription repressors. 28848 cd02908: Macro domain, Appr-1"-pase_like family. The macro domain is a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes. 
This family is composed of uncharacterized proteins that show similarity to Appr-1"-pase, containing conserved putative active site residues. Appr-1"-pase is a phosphatase specific for ADP-ribose-1"-monophosphate. 28849 cd03330: Macro domain, Unknown family 2. The macro domain is a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes. This family is composed of uncharacterized proteins containing a stand-alone macro domain. 28850 cd03331: Macro domain, Poa1p_like family, SNF2 subfamily. The macro domain is a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes. Members of this subfamily contain a C-terminal macro domain that shows similarity to the yeast protein Poa1p, reported to be a phosphatase specific for Appr-1"-p, a tRNA splicing metabolite. 
In addition, they also contain an SNF2 domain, defined by the presence of seven motifs with sequence similarity to DNA helicases. SNF2 proteins have the capacity to use the energy released by their DNA-dependent ATPase activity to stabilize or perturb protein-DNA interactions and play important roles in transcriptional regulation, maintenance of chromosome integrity and DNA repair. 28852 cd01740: Type 1 glutamine amidotransferase (GATase1)-like domain found in Formylglycinamide ribonucleotide amidotransferase (FGAR-AT). FGAR-AT catalyzes the ATP-dependent conversion of formylglycinamide ribonucleotide (FGAR) and glutamine to formylglycinamidine ribonucleotide (FGAM), ADP, Pi, and glutamate in the fourth step of the purine biosynthetic pathway. FGAR-AT is a glutamine amidotransferase. Glutamine amidotransferase activity catalyses the transfer of ammonia from the amide side chain of glutamine to an acceptor substrate. FGAR-AT belongs to the triad family of amidotransferases having a conserved Cys-His-Glu catalytic triad in the glutaminase active site. 28854 cd01742: Type 1 glutamine amidotransferase (GATase1) domain found in GMP synthetase. GMP synthetase is a glutamine amidotransferase from the de novo purine biosynthetic pathway. Glutamine amidotransferase (GATase) activity catalyses the transfer of ammonia from the amide side chain of glutamine to an acceptor substrate. GMP synthetase catalyses the amination of the nucleotide precursor xanthosine 5'-monophosphate to form GMP. GMP synthetase belongs to the triad family of amidotransferases having a conserved Cys-His-Glu catalytic triad in the glutaminase active site. 28855 cd01743: Type 1 glutamine amidotransferase (GATase1) domain found in Anthranilate synthase (ASase). This group contains proteins similar to para-aminobenzoate (PABA) synthase and ASase. These enzymes catalyze similar reactions and produce similar products, PABA and ortho-aminobenzoate (anthranilate). 
Each enzyme is composed of non-identical subunits: a glutamine amidotransferase subunit (component II) and a subunit that produces an aminobenzoate product (component I). ASase catalyses the synthesis of anthranilate from chorismate and glutamine and is a tetrameric protein comprising two copies each of components I and II. Component II of ASase belongs to the family of triad GATases which hydrolyze glutamine and transfer nascent ammonia between the active sites. In some bacteria, such as Escherichia coli, component II can be much larger than in other organisms, due to the presence of phosphoribosyl-anthranilate transferase (PRTase) activity. PRTase catalyses the second step in tryptophan biosynthesis and results in the addition of 5-phosphoribosyl-1-pyrophosphate to anthranilate to create N-5'-phosphoribosyl-anthranilate. In E. coli, the first step in the conversion of chorismate to PABA involves two proteins, PabA and PabB, which co-operate to transfer the amide nitrogen of glutamine to chorismate, forming 4-amino-4-deoxychorismate (ADC). PabA acts as a glutamine amidotransferase, supplying an amino group to PabB, which carries out the amination reaction. A third protein, PabC, then mediates elimination of pyruvate and aromatization to give PABA. Several organisms have bipartite proteins containing fused domains homologous to PabA and PabB, commonly called PABA synthases. These hybrid PABA synthases may produce ADC and not PABA. 28856 cd01744: This group of sequences represents the small chain of the glutamine-dependent form of carbamoyl phosphate synthase, CPSase II. CPSase II catalyzes the production of carbamoyl phosphate (CP) from bicarbonate, glutamine and two molecules of MgATP. The reaction is believed to proceed by a series of four biochemical reactions involving a minimum of three discrete highly reactive intermediates. The synthesis of CP is critical for the initiation of two separate biosynthetic pathways. 
In one, CP is coupled to aspartate, and its carbon and nitrogen nuclei are ultimately incorporated into the aromatic moieties of pyrimidine nucleotides. In the second pathway, CP is condensed with ornithine at the start of the urea cycle and is utilized for the detoxification of ammonia and biosynthesis of arginine. CPSases may be encoded by one or by several genes, depending on the species. The E. coli enzyme is a heterodimer consisting of two polypeptide chains referred to as the small and large subunit. Ammonia, an intermediate during the biosynthesis of carbamoyl phosphate produced by the hydrolysis of glutamine in the small subunit of the enzyme, is delivered via a molecular tunnel to the remotely located carboxyphosphate active site in the large subunit. CPSase IIs belong to the triad family of amidotransferases having a conserved Cys-His-Glu catalytic triad in the glutaminase active site. This group also contains the sequence from the mammalian urea cycle form which has lost the active site Cys, resulting in an ammonia-dependent form, CPSase I. 28858 cd01746: Type 1 glutamine amidotransferase (GATase1) domain found in Cytidine Triphosphate Synthetase (CTP). CTP is involved in pyrimidine ribonucleotide/ribonucleoside metabolism. CTPs produce CTP from UTP and glutamine and regulate intracellular CTP levels through interactions with four ribonucleotide triphosphates. The enzyme exists as a dimer of identical chains that aggregates as a tetramer. CTP is derived from UTP in three separate steps involving two active sites. In one active site, the UTP O4 oxygen is activated by Mg-ATP-dependent phosphorylation, followed by displacement of the resulting 4-phosphate moiety by ammonia. At a separate site, ammonia is generated via rate-limiting glutamine hydrolysis (glutaminase) activity. A gated channel that spans between the glutamine hydrolysis and amidoligase active sites provides a path for ammonia diffusion. 
CTPs belong to the triad family of amidotransferases having a conserved Cys-His-Glu catalytic triad in the glutaminase active site. 28859 cd01747: Type 1 glutamine amidotransferase (GATase1) domain found in gamma-Glutamyl Hydrolase. gamma-Glutamyl Hydrolase catalyzes the cleavage of the gamma-glutamyl chain of folylpoly-gamma-glutamyl substrates and is a central enzyme in folyl and antifolyl poly-gamma-glutamate metabolism. GATase activity involves the removal of the ammonia group from a glutamine molecule and its subsequent transfer to a specific substrate, thus creating a new carbon-nitrogen group on the substrate. gamma-Glutamyl hydrolases belong to the triad family of amidotransferases having a conserved Cys-His-Glu catalytic triad in the glutaminase active site. 28860 cd01748: Type 1 glutamine amidotransferase (GATase1) domain found in imidazole glycerol phosphate synthase (IGPS). IGPS incorporates ammonia derived from glutamine into N1-[(5'-phosphoribulosyl)-formimino]-5-aminoimidazole-4-carboxamide ribonucleotide (PRFAR) to form 5'-(5-aminoimidazole-4-carboxamide) ribonucleotide (AICAR) and imidazole glycerol phosphate (IGP). The glutamine amidotransferase domain generates the ammonia nucleophile, which is channeled from the glutaminase active site to the PRFAR active site. IGPS belongs to the triad family of amidotransferases having a conserved Cys-His-Glu catalytic triad in the glutaminase active site. 28861 cd01749: Glutamine Amidotransferase (GATase_I) involved in pyridoxine biosynthesis. Glutamine amidotransferase (GATase) activity involves the removal of the ammonia group from a glutamine molecule and its subsequent transfer to a specific substrate, thus creating a new carbon-nitrogen group on the substrate. This group contains proteins like Bacillus subtilis YaaE and Plasmodium falciparum Pdx2, which are members of the triad glutamine amidotransferase family and function in a pathway for the biosynthesis of vitamin B6. 
28862 cd01750: Type 1 glutamine amidotransferase (GATase1) domain found in Cobyric Acid Synthase (CobQ). CobQ plays a role in cobalamin biosynthesis. CobQ catalyses amidations at positions B, D, E, and G on adenosylcobyrinic A,C-diamide in the biosynthesis of cobalamin. CobQ belongs to the triad family of amidotransferases. Two of the three residues of the catalytic triad that are involved in glutamine binding, hydrolysis and transfer of the resulting ammonia to the acceptor substrate in other triad amidotransferases are conserved in CobQ. 28863 cd03128: Type 1 glutamine amidotransferase (GATase1)-like domain. This group contains proteins similar to Class I glutamine amidotransferases, the intracellular PH1704 from Pyrococcus horikoshii, the C-terminal of the large catalase: Escherichia coli HP-II, Sinorhizobium meliloti Rm1021 ThuA, the A4 beta-galactosidase middle domain and peptidase E. The majority of proteins in this group have a reactive Cys found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. For Class I glutamine amidotransferase proteins, which transfer ammonia from the amide side chain of glutamine to an acceptor substrate, this Cys forms a Cys-His-Glu catalytic triad in the active site. Glutamine amidotransferase activity can be found in a range of biosynthetic enzymes included in this cd: glutamine amidotransferase, formylglycinamide ribonucleotide, GMP synthetase, anthranilate synthase component II, glutamine-dependent carbamoyl phosphate synthase (CPSase), cytidine triphosphate synthetase, gamma-glutamyl hydrolase, imidazole glycerol phosphate synthase and cobyric acid synthase. For Pyrococcus horikoshii PH1704, the Cys of the nucleophile elbow together with a different His and a Glu from an adjacent monomer form a catalytic triad different from the typical GATase1 triad. 
Peptidase E is believed to be a serine peptidase having a Ser-His-Glu catalytic triad which differs from the Cys-His-Glu catalytic triad of typical GATase1 domains by having a Ser in place of the reactive Cys at the nucleophile elbow. The E. coli HP-II C-terminal domain, S. meliloti Rm1021 ThuA and the A4 beta-galactosidase middle domain lack the catalytic triad typical of GATase1 domains. GATase1-like domains can occur either as single polypeptides, as in Class I glutamine amidotransferases, or as domains in a much larger multifunctional synthase protein, such as CPSase. Peptidase E has a circular permutation in the common core of a typical GATase1 domain. 28864 cd03129: Type 1 glutamine amidotransferase (GATase1)-like domain found in peptidase E_like proteins. This group contains proteins similar to the aspartyl dipeptidases Salmonella typhimurium peptidase E and Xenopus laevis peptidase E, and extracellular cyanophycinases from Pseudomonas anguilliseptica BI (CphE) and Synechocystis sp. PCC 6803 CphB. In bacteria, peptidase E is believed to play a role in degrading peptides generated by intracellular protein breakdown or imported into the cell as nutrient sources. Peptidase E uniquely hydrolyses only Asp-X dipeptides (where X is any amino acid), and one tripeptide Asp-Gly-Gly. Cyanophycinases are intracellular exopeptidases which hydrolyze the polymer cyanophycin (multi L-arginyl-poly-L-aspartic acid) to the dipeptide beta-Asp-Arg. Peptidase E and cyanophycinases are thought to have a Ser-His-Glu catalytic triad which differs from the Cys-His-Glu catalytic triad typical of GATase1 domains by having a Ser in place of the reactive Cys at the nucleophile elbow. Xenopus peptidase E is developmentally regulated in response to thyroid hormone, and it is thought to play a role in apoptosis during tail reabsorption. 28865 cd03130: Type 1 glutamine amidotransferase (GATase1) domain found in Cobyrinic Acid a,c-Diamide Synthase. 
CobB plays a role in cobalamin biosynthesis, catalyzing the conversion of cobyrinic acid to cobyrinic acid a,c-diamide. CobB belongs to the triad family of amidotransferases. Two of the three residues of the catalytic triad that are involved in glutamine binding, hydrolysis and transfer of the resulting ammonia to the acceptor substrate in other triad amidotransferases are conserved in CobB. 28866 cd03131: Type 1 glutamine amidotransferase (GATase1)-like domain found in homoserine trans-succinylase (HTS). HTS, the first enzyme in methionine biosynthesis in Escherichia coli, transfers a succinyl group from succinyl-CoA to homoserine, forming succinyl homoserine. It has been suggested that the succinyl group of succinyl-CoA is initially transferred to an enzyme nucleophile before subsequent transfer to homoserine. The catalytic triad typical of GATase1 domains is not conserved in this GATase1-like domain. However, in common with GATase1 domains, a reactive Cys residue is found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. It has been proposed that this Cys is in the active site of the molecule. However, as succinyl has been found bound to a conserved lysine residue, this conserved Cys may play a role in dimer formation. HTS activity is tightly regulated by several mechanisms including feedback inhibition and proteolysis. It represents a critical control point for cell growth and viability. 28867 cd03132: Type 1 glutamine amidotransferase (GATase1)-like domain found at the C-terminal of several large catalases. Catalase catalyzes the dismutation of hydrogen peroxide (H2O2) to water and oxygen. This group includes the large catalases Neurospora crassa Catalase-1 and Catalase-3 and Escherichia coli HP-II. This GATase1-like domain has an essential role in HP-II catalase activity. However, it lacks enzymatic activity and the catalytic triad typical of GATase1 domains. 
Catalase-1 and -3 are homotetrameric; HP-II is homohexameric. It has been proposed that this domain may facilitate the folding and oligomerization process. The interface between this GATase1-like domain of HP-II and the core of the subunit forms part of a channel which provides access to the deeply buried catalase active sites of HP-II. Catalase-1 is associated with non-growing cells; Catalase-3 is associated with growing conditions. HP-II is produced in stationary phase. Catalase-1 is induced by ethanol and heat shock. Catalase-3 is induced under stress conditions such as hydrogen peroxide, paraquat, cadmium, heat shock, uric acid and nitrate treatment. 28868 cd03133: Type 1 glutamine amidotransferase (GATase1)-like domain found in zebrafish ES1. This group includes proteins similar to ES1, Escherichia coli enhancing lycopene biosynthesis protein 2, Azospirillum brasilense iaaC and human HES1. The catalytic triad typical of GATase1 domains is not conserved in this GATase1-like domain. However, in common with GATase1 domains, a reactive Cys residue is found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. Zebrafish ES1 is expressed specifically in adult photoreceptor cells and appears to be a cytoplasmic protein. A. brasilense iaaC is involved in controlling IAA biosynthesis. 28869 cd03134: A type 1 glutamine amidotransferase (GATase1)-like domain found in PfpI from Pyrococcus furiosus. This group includes proteins similar to PfpI from P. furiosus and PH1704 from Pyrococcus horikoshii. These enzymes are ATP-independent intracellular proteases and may hydrolyze small peptides to provide a nutritional source. Only Cys of the catalytic triad typical of GATase1 domains is conserved in this group. This Cys residue is found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. 
For PH1704, it is believed that this Cys together with a different His in one monomer and Glu (from an adjacent monomer) forms a different catalytic triad from the typical GATase1 domain. PfpI is homooligomeric. Protease activity is only found for oligomeric forms of PH1704. 28870 cd03135: Type 1 glutamine amidotransferase (GATase1)-like domain found in Human DJ-1. DJ-1 is involved in multiple physiological processes including cancer, Parkinson's disease and male fertility. It is unclear how DJ-1 functions in these. DJ-1 has been shown to possess chaperone activity. DJ-1 is preferentially expressed in the testis and moderately in other tissues; it is induced together with genes involved in oxidative stress response. The Drosophila homologue (DJ-1A) plays an essential role in oxidative stress response and neuronal maintenance. Inhibition of DJ-1A function through RNAi results in the cellular accumulation of reactive oxygen species, organismal hypersensitivity to oxidative stress, and dysfunction and degeneration of dopaminergic and photoreceptor neurons. DJ-1 lacks enzymatic activity and the catalytic triad of typical GATase1 domains; however, it does contain the highly conserved cysteine located at the nucleophile elbow region typical of these domains. This cysteine has been proposed to be a site of regulation of DJ-1 activity by oxidation. DJ-1 is a dimeric enzyme. 28871 cd03136: A subgroup of AraC transcriptional regulators having an N-terminal Type 1 glutamine amidotransferase (GATase1)-like domain. This group contains proteins similar to the Pseudomonas aeruginosa ArgR regulator. ArgR functions in the control of expression of certain genes of arginine biosynthesis and catabolism. AraC regulators are defined by an AraC-type helix-turn-helix DNA binding domain at their C-terminal. 
AraC family transcriptional regulators are widespread among bacteria and are involved in regulating diverse and important biological functions, including carbon metabolism, stress responses and virulence in different microorganisms. The catalytic triad typical of GATase1 domains is not conserved in this GATase1-like domain. However, in common with typical GATase1 domains, a reactive Cys residue is found in some sequences in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. 28872 cd03137: A subgroup of AraC transcriptional regulators having a Type 1 glutamine amidotransferase (GATase1)-like domain. AraC regulators are defined by an AraC-type helix-turn-helix DNA binding domain at their C-terminal. AraC family transcriptional regulators are widespread among bacteria and are involved in regulating diverse and important biological functions, including carbon metabolism, stress responses and virulence in different microorganisms. The catalytic triad typical of GATase1 domains is not conserved in this GATase1-like domain. However, in common with typical GATase1 domains, a reactive Cys residue is found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. 28873 cd03138: A subgroup of AraC transcriptional regulators having a Type 1 glutamine amidotransferase (GATase1)-like domain. AraC regulators are defined by an AraC-type helix-turn-helix DNA binding domain at their C-terminal. AraC family transcriptional regulators are widespread among bacteria and are involved in regulating diverse and important biological functions, including carbon metabolism, stress responses and virulence in different microorganisms. The catalytic triad typical of GATase1 domains is not conserved in this GATase1-like domain. However, in common with typical GATase1 domains, a reactive Cys residue is found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. 
28874 cd03139: Type 1 glutamine amidotransferase (GATase1)-like domain found in a subgroup of proteins similar to PfpI from Pyrococcus furiosus. PfpI is an ATP-independent intracellular protease which may hydrolyze small peptides to provide a nutritional source. Only Cys of the catalytic triad typical of GATase1 domains is conserved in this group. This Cys residue is found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. 28875 cd03140: Type 1 glutamine amidotransferase (GATase1)-like domain found in a subgroup of proteins similar to PfpI from Pyrococcus furiosus. PfpI is an ATP-independent intracellular protease which may hydrolyze small peptides to provide a nutritional source. Only Cys of the catalytic triad typical of GATase1 domains is conserved in this group. This Cys residue is found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. 28876 cd03141: Type 1 glutamine amidotransferase (GATase1)-like domain found in proteins similar to Escherichia coli Hsp31 protein (EcHsp31). This group includes EcHsp31 and Saccharomyces cerevisiae Ydr533c protein. EcHsp31 has chaperone activity. Ydr533c is upregulated in response to various stress conditions along with the heat shock family. EcHsp31 coordinates a metal ion using a 2-His-1-carboxylate motif present in various ions that use iron as a cofactor such as Carboxypeptidase A. The catalytic triad typical of GATase1 domains is not conserved in this GATase1-like domain. However, in common with a typical GATase1 domain, a reactive Cys residue is found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. For EcHsp31, this Cys together with a different His and an Asp (rather than a Glu) residue form a different catalytic triad from the typical GATase1 domain. For Ydr533c, a catalytic triad forms from the conserved Cys together with a different His and Glu from that of the typical GATase1 domain. 
Ydr533c protein and EcHsp31 are homodimers. 28877 cd03142: Type 1 glutamine amidotransferase (GATase1)-like domain found in Sinorhizobium meliloti Rm1021 ThuA (SmThuA). This group includes proteins similar to SmThuA, which plays a role in a major pathway for trehalose catabolism. SmThuA is induced by trehalose but not by related, structurally similar disaccharides like sucrose or maltose. Proteins in this group lack the catalytic triad of typical GATase1 domains: a His replaces the reactive Cys found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. S. meliloti Rm1021 thuA mutants are impaired in competitive colonization of Medicago sativa roots but are more competitive than the wild-type Rm1021 in infecting alfalfa roots and forming nitrogen-fixing nodules. 28880 cd03145: Type 1 glutamine amidotransferase (GATase1)-like domain found in cyanophycinase. This group contains proteins similar to the extracellular cyanophycinases from Pseudomonas anguilliseptica BI (CphE) and Synechocystis sp. PCC 6803 CphB. Cyanophycinases are intracellular exopeptidases which hydrolyze the polymer cyanophycin (multi L-arginyl-poly-L-aspartic acid) to the dipeptide beta-Asp-Arg. Cyanophycinase is believed to be a serine-type exopeptidase having a Ser-His-Glu catalytic triad which differs from the Cys-His-Glu catalytic triad typical of GATase1 domains by having a Ser in place of the reactive Cys at the nucleophile elbow. 28881 cd03146: Type 1 glutamine amidotransferase (GATase1)-like domain found in peptidase E. This group contains proteins similar to the aspartyl dipeptidases Salmonella typhimurium peptidase E and Xenopus laevis peptidase E. In bacteria, peptidase E is believed to play a role in degrading peptides generated by intracellular protein breakdown or imported into the cell as nutrient sources. Peptidase E uniquely hydrolyses only Asp-X dipeptides (where X is any amino acid), and one tripeptide Asp-Gly-Gly. 
Peptidase E is believed to be a serine peptidase having a Ser-His-Glu catalytic triad which differs from the Cys-His-Glu catalytic triad typical of GATase1 domains by having a Ser in place of the reactive Cys at the nucleophile elbow. Xenopus PepE is developmentally regulated in response to thyroid hormone, and it is thought to play a role in apoptosis during tail reabsorption. 28882 cd03147: Type 1 glutamine amidotransferase (GATase1)-like domain found in Saccharomyces cerevisiae Ydr533c protein. This group includes proteins similar to S. cerevisiae Ydr533c. Ydr533c is upregulated in response to various stress conditions along with the heat shock family. The catalytic triad typical of GATase1 domains is not conserved in this GATase1-like domain. However, in common with a typical GATase1 domain, a reactive Cys residue is found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. This Cys together with a different His and Glu residue form a different catalytic triad from the typical GATase1 domain. Ydr533c protein is a homodimer. 28883 cd03148: Type 1 glutamine amidotransferase (GATase1)-like domain found in Escherichia coli Hsp31 protein (EcHsp31). This group includes proteins similar to EcHsp31. EcHsp31 has chaperone activity. EcHsp31 coordinates a metal ion using a 2-His-1-carboxylate motif present in various ions that use iron as a cofactor such as Carboxypeptidase A. The catalytic triad typical of GATase1 domains is not conserved in this GATase1-like domain. However, in common with a typical GATase1 domain, a reactive Cys residue is found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. This Cys together with a different His and an Asp (rather than a Glu) residue form a different catalytic triad from the typical GATase1 domain. EcHsp31 is a homodimer. 
28884 cd03169: Type 1 glutamine amidotransferase (GATase1)-like domain found in a subgroup of proteins similar to PfpI from Pyrococcus furiosus. PfpI is an ATP-independent intracellular protease which may hydrolyze small peptides to provide a nutritional source. Only Cys of the catalytic triad typical of GATase1 domains is conserved in this group. This Cys residue is found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. 28885 cd00001: PTS_IIB, PTS system, Mannose/sorbose specific IIB subunit. The bacterial phosphoenolpyruvate:sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. This family is one of four structurally and functionally distinct group IIB PTS system cytoplasmic enzymes, necessary for the uptake of carbohydrates across the cytoplasmic membrane and their phosphorylation. The active site histidine receives a phosphate group from the IIA subunit and transfers it to the substrate. 28886 cd00002: Protein family related to YbaK protein from H. influenzae (HI1434); prokaryotic domain; function unknown; alignment contains insertion domains of prokaryotic prolyl-tRNA synthetases. 28888 cd00004: Sortase domain; transpeptidase of Gram-positive bacteria, cleaves surface proteins at the LPXTG motif between Thr and Gly and catalyzes the formation of an amide bond between the carboxyl group of Thr and the amino group of cell-wall crossbridges. In two different classes of sortases the N-terminus either functions as both a signal peptide for secretion and a stop-transfer signal for membrane anchoring, or it contains a signal peptide only and the C-terminus serves as a membrane anchor. 
28889 cd00005: Family 9 carbohydrate-binding module (CBM), plays a role in microbial degradation of cellulose and hemicellulose found in plants; previously called cellulose-binding domain; the binding sites of the CBMs for which structures have been determined are of two general types: flat surfaces comprising predominantly aromatic residues tryptophan and tyrosine and extended shallow grooves; this domain frequently occurs in tandem. 28890 cd00006: PTS_IIA, PTS system, mannose/sorbose specific IIA subunit. The bacterial phosphoenolpyruvate:sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. This family is one of four structurally and functionally distinct group IIA PTS system cytoplasmic enzymes, necessary for the uptake of carbohydrates across the cytoplasmic membrane and their phosphorylation. IIA subunits receive phosphoryl groups from HPr and transfer them to IIB subunits, which in turn phosphorylate the substrate. 28891 cd00007: 3'-5' exonuclease. The 35EXOc domain is responsible for the 3'-5' exonuclease proofreading activity of prokaryotic DNA polymerase I (pol I) and other enzymes; it catalyses the hydrolysis of unpaired or mismatched nucleotides. This domain consists of the amino-terminal half of the Klenow fragment in E. coli pol I. 35EXOc is also found in the Werner syndrome helicase (WRN), focus forming activity 1 protein (FFA-1) and ribonuclease D (RNase D). 28892 cd00008: 5'-3' exonuclease; T5 type 5'-3' exonuclease domains may co-occur with DNA polymerase I (Pol I) domains, or be part of Pol I containing complexes. They digest dsDNA and ssDNA, releasing mono-, di- and tri-nucleotides, as well as oligonucleotides, and have also been reported to possess RNase H activity. Also called 5' nuclease family, involved in structure-specific cleavage of flaps formed by Pol I activity (similar to mammalian flap endonuclease I, FEN-1). 
A single nucleic acid strand may be threaded through the 5' nuclease enzyme before cleavage occurs. The domain binds two divalent metal ions which are necessary for activity. 28893 cd00009: AAA-superfamily of ATPases associated with a wide variety of cellular activities, including membrane fusion, proteolysis, and DNA replication. 28894 cd00010: AAI_LTSS; Alpha Amylase Inhibitor, Lipid Transfer and Seed Storage proteins. This domain is found in plant trypsin-alpha amylase inhibitors and plant lipid transfer and seed storage proteins. Lipid transfer proteins facilitate the transfer of lipids between natural or artificial membranes. Alpha-amylase inhibitors act on the enzymes responsible for cleavage of alpha-glucoside bonds present in glycogen and starch. 28895 cd00011: Arfaptin domain; arfaptin is a ubiquitously expressed protein implicated in mediating cross-talk between Rac, a member of the Rho family, and Arf small GTPases; Arfaptin binds to GTP-bound Arf1 and Arf6, but binds Rac.GTP and Rac.GDP with similar affinities. Structures of Arfaptin with Rac bound to either GDP or the slowly hydrolysable analogue GMPPNP show that the switch regions adopt similar conformations in both complexes. Arf1 and Arf6 are thought to bind to the same surface as Rac. 28896 cd00012: Actin; A ubiquitous protein involved in the formation of filaments that are a major component of the cytoskeleton. Interaction with myosin provides the basis of muscular contraction and many aspects of cell motility. Each actin protomer binds one molecule of ATP and either calcium or magnesium ions. Actin exists as a monomer in low salt concentrations, but filaments form rapidly as salt concentration rises, with the consequent hydrolysis of ATP. Polymerization is regulated by so-called capping proteins. The ATPase domain of actin shares similarity with ATPase domains of hexokinase and hsp70 proteins. 
28897 cd00013: Actin depolymerisation factor/cofilin-like domains; present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. 28898 cd00014: Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav). 28899 cd00015: Albumin domain, contains five or six internal disulphide bonds; albuminoid superfamily includes alpha-fetoprotein which binds various cations, fatty acids and bilirubin; vitamin D-binding protein which binds to vitamin D, its metabolites, and fatty acids; alpha-albumin which binds water, cations (such as Ca2+, Na+ and K+), fatty acids, hormones, bilirubin and drugs; and afamin of which little is known; these belong to a multigene family with highly conserved intron/exon organization and encoded protein structures; evolutionary comparisons strongly support vitamin D-binding protein as the original gene in this group with subsequent local duplications generating the remaining genes in the cluster. 28901 cd00017: Anaphylatoxin homologous domain; C3a, C4a and C5a anaphylatoxins are protein fragments generated enzymatically in serum during activation of complement molecules C3, C4, and C5. They induce smooth muscle contraction. These fragments are homologous to repeats in fibulins. 28902 cd00018: DNA-binding domain found in transcription regulators in plants such as APETALA2 and EREBP (ethylene responsive element binding protein). In EREBPs the domain specifically binds to the 11 bp GCC box of the ethylene response element (ERE), a promoter element essential for ethylene responsiveness. 
EREBPs and the C-repeat binding factor CBF1, which is involved in stress response, contain a single copy of the AP2 domain. APETALA2-like proteins, which play a role in plant development, contain two copies. 28903 cd00019: AP endonuclease family 2; These endonucleases play a role in DNA repair. Cleave phosphodiester bonds at apurinic or apyrimidinic sites; the alignment also contains hexulose-6-phosphate isomerases, enzymes that catalyze the epimerization of D-arabino-6-hexulose 3-phosphate to D-fructose 6-phosphate, via cleaving the phosphoester bond with the sugar. 28904 cd00020: Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model. 28905 cd00021: B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction. 28906 cd00022: Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger. 
28907 cd00023: Bowman-Birk type proteinase inhibitor (BBI); family of plant serine protease inhibitors that block trypsin or chymotrypsin. They are either single-headed (one reactive site, one inactive site, present mainly in monocotyledonous seeds) or double-headed (two reactive sites, present mainly in dicotyledonous seeds). 28908 cd00024: Chromatin organization modifier (chromo) domain is a conserved region of around 50 amino acids found in a variety of chromosomal proteins, which appear to play a role in the functional organization of the eukaryotic nucleus. Experimental evidence implicates the chromo domain in the binding activity of these proteins to methylated histone tails and maybe RNA. May occur as single instance, in a tandem arrangement or followed by a related "chromo shadow" domain. 28909 cd00027: Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks. 28910 cd00028: Bulb-type mannose-specific lectin. The domain contains a three-fold internal repeat (beta-prism architecture). The consensus sequence motif QXDXNXVXY is involved in alpha-D-mannose recognition. Lectins are carbohydrate-binding proteins which specifically recognize diverse carbohydrates and mediate a wide variety of biological processes, such as cell-cell and host-pathogen interactions, serum glycoprotein turnover, and innate immune responses. 28911 cd00029: Protein kinase C conserved region 1 (C1). Cysteine-rich zinc binding domain. Some members of this domain family bind phorbol esters and diacylglycerol, some are reported to bind RasGTP. May occur in tandem arrangement. Diacylglycerol (DAG) is a second messenger, released by activation of Phospholipase D. 
Phorbol Esters (PE) can act as analogues of DAG and mimic its downstream effects in, for example, tumor promotion. Protein Kinases C are activated by DAG/PE; this activation is mediated by their N-terminal conserved region (C1). DAG/PE binding may be phospholipid dependent. C1 domains may also mediate DAG/PE signals in chimaerins (a family of Rac GTPase activating proteins), RasGRPs (exchange factors for Ras/Rap1), and Munc13 isoforms (scaffolding proteins involved in exocytosis). 28912 cd00030: Protein kinase C conserved region 2 (CalB); Ca2+-binding motif present in phospholipases, protein kinases C, and synaptotagmins (among others). Some do not appear to contain Ca2+-binding sites. Particular C2s appear to bind phospholipids, inositol polyphosphates, and intracellular proteins. Synaptotagmin and PLC C2s are permuted in sequence with respect to N- and C-terminal beta strands. 28913 cd00031: Cadherin repeat domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion; these domains occur as repeats in the extracellular regions which are thought to mediate cell-cell contact when bound to calcium; plays a role in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein; cadherins exist as monomers or dimers (hetero- and homo-); two copies of the repeat are present here. 28914 cd00032: Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs. 
28915 cd00033: Domain abundant in complement control proteins; SUSHI repeat; short complement-like repeat (SCR); The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function. 28916 cd00034: Chromo Shadow Domain, found in association with N-terminal chromo (CHRomatin Organization MOdifier) domain; Chromo domains mediate the interaction of the heterochromatin with other heterochromatin proteins, thereby affecting chromatin structure (e.g. Drosophila and human heterochromatin protein (HP1) and mammalian modifier 1 and modifier 2). 28917 cd00035: Chitin binding domain, involved in recognition or binding of chitin subunits; fold analogous to hevein; occurs in plant and fungal proteins that bind N-acetylglucosamine, plant endochitinases, wound-induced proteins, and K. lactis killer toxin alpha subunit; occurs singly or multiply. 28918 cd00036: Chitin/cellulose binding domain. Putative carbohydrate binding domain found in many different glycosyl hydrolase enzymes. May occur in tandem arrangements. 28920 cd00038: effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen, and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domains similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. 
Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPKs are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPKs are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels. 28921 cd00040: Granulocyte Macrophage Colony Stimulating Factor (GM-CSF) is a member of the large family of polypeptide growth factors called cytokines. It stimulates a wide variety of hematopoietic and nonhematopoietic cell types via binding to members of the cytokine receptor family, mainly the GM-CSF receptor. 28922 cd00041: CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast. 28923 cd00042: Cystatin-like domain; Cystatins are a family of cysteine protease inhibitors that occur mainly as single domain proteins. However some extracellular proteins such as kininogen, His-rich glycoprotein and fetuin also contain these domains. 28924 cd00043: Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB). The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). 
TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain. 28925 cd00044: Calpains, domains IIa, IIb; calcium-dependent cytoplasmic cysteine proteinases, papain-like. Functions in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction. 28926 cd00045: Death effector domain. DED is part of a superfamily of death domains which also includes death-domain (DD) and caspase recruitment domain (CARD). Protein-protein interactions involving these domains occur through homotypic interactions, such as DED-DED. Caspases are the primary executioners of apoptosis via proteolytic cascades, and upstream caspases such as caspase-8 and caspase-9 are activated by signaling complexes such as the death inducing signaling complex (DISC) and the apoptosome. Binding of caspases to specific adaptor molecules via DED or CARD domains leads to autoactivation of caspases. 28927 cd00046: DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 28928 cd00268: DEAD-box helicases. A diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 28929 cd00047: Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. 
Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins; only one copy may be active. 28930 cd00048: Double-stranded RNA binding motif. Binding is not sequence specific but is highly specific for double stranded RNA. Found in a variety of proteins including dsRNA dependent protein kinase PKR, RNA helicases, Drosophila staufen protein, E. coli RNase III, RNases H1, and dsRNA dependent adenosine deaminases. 28931 cd00049: MH1 is a small DNA binding domain, binding in an unusual way involving a beta hairpin structure binding to the major groove. MH1 is present in Smad proteins, an important family of proteins involved in TGF-beta signalling and frequent targets of tumorigenic mutations. Also known as Domain A in dwarfin family proteins. 28932 cd00050: MH2 domain; C terminal domain of SMAD family proteins, responsible for receptor interaction, transactivation, and homo- and heterooligomerisation; also known as Domain B in dwarfin family proteins. 28933 cd00051: EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers. 28934 cd00052: Eps15 homology domain; found in proteins implicated in endocytosis, vesicle transport, and signal transduction. The alignment contains a pair of EF-hand motifs, typically one of them is canonical and binds to Ca2+, while the other may not bind to Ca2+. A hydrophobic binding pocket is formed by residues from both EF-hand motifs. The EH domain binds to proteins containing NPF (class I), [WF]W or SWG (class II), or H[TS]F (class III) sequence motifs. 
28935 cd00053: Epidermal growth factor domain, found in epidermal growth factor (EGF); present in a large number of proteins, mostly animal; the list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied; the functional significance of EGF-like domains in what appear to be unrelated proteins is not yet clear; a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase); the domain includes six cysteine residues which have been shown to be involved in disulfide bonds; the main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet; Subdomains between the conserved cysteines vary in length; the region between the 5th and 6th cysteine contains two conserved glycines of which at least one is present in most EGF-like domains; a subset of these bind calcium. 28936 cd00054: Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements. 
28937 cd00055: Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth, migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d); the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies. 28938 cd00056: endonuclease III; includes endonuclease III (DNA-(apurinic or apyrimidinic site) lyase), alkylbase DNA glycosidases (Alka-family) and other DNA glycosidases. 28939 cd00057: Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes. 28940 cd00058: Acidic and basic fibroblast growth factor family; FGFs are mitogens, which stimulate growth or differentiation of cells of mesodermal or neuroectodermal origin. The family plays essential roles in patterning and differentiation during vertebrate embryogenesis, and has neurotrophic activities. FGFs have a high affinity for heparan sulfate proteoglycans and require heparan sulfate to activate one of four cell surface FGF receptors. Upon binding to FGF, the receptors dimerize and their intracellular tyrosine kinase domains become active. FGFs have internal pseudo-threefold symmetry (beta-trefoil topology). 28941 cd00059: Forkhead (FH), also known as a "winged helix". FH is named for the Drosophila fork head protein, a transcription factor which promotes terminal rather than segmental development. This family of transcription factor domains, which bind to B-DNA as monomers, are also found in the Hepatocyte nuclear factor (HNF) proteins, which provide tissue-specific gene regulation. 
The structure contains 2 flexible loops or "wings" in the C-terminal region, hence the term winged helix. 28942 cd00060: Forkhead associated domain (FHA); found in eukaryotic and prokaryotic proteins. Putative nuclear signalling domain. FHA domains may bind phosphothreonine, phosphoserine and sometimes phosphotyrosine. In eukaryotes, many FHA domain-containing proteins localize to the nucleus, where they participate in establishing or maintaining cell cycle checkpoints, DNA repair, or transcriptional regulation. Members of the FHA family include: Dun1, Rad53, Cds1, Mek1, KAPP (kinase-associated protein phosphatase), and Ki-67 (a human nuclear protein related to cell proliferation). 28943 cd00061: Fibronectin type 1 domain, approximately 40 residue long with two conserved disulfide bridges. FN1 is one of three types of internal repeats which combine to form larger domains within fibronectin. Fibronectin, a plasma protein that binds cell surfaces and various compounds including collagen, fibrin, heparin, DNA, and actin, usually exists as a dimer in plasma and as an insoluble multimer in extracellular matrices. Dimers of nearly identical subunits are linked by a disulfide bond close to their C-terminus. FN1 domains are also found in coagulation factor XII, HGF activator, and tissue-type plasminogen activator. In tissue plasminogen activator, FN1 domains may form functional fibrin-binding units with EGF-like domains C-terminal to FN1. 28944 cd00062: Fibronectin Type II domain: FN2 is one of three types of internal repeats which combine to form larger domains within fibronectin. Fibronectin, a plasma protein that binds cell surfaces and various compounds including collagen, fibrin, heparin, DNA, and actin, usually exists as a dimer in plasma and as an insoluble multimer in extracellular matrices. Dimers of nearly identical subunits are linked by a disulfide bond close to their C-terminus. Fibronectin is composed of 3 types of modules, FN1, FN2 and FN3. 
The collagen binding domain contains four FN1 and two FN2 repeats. 28945 cd00063: Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases. 28946 cd00064: Furin-like repeats. Cysteine rich region. Exact function of the domain is not known. Furin is a serine-kinase dependent proprotein processor. Other members of this family include endoproteases and cell surface receptors. 28947 cd00065: FYVE domain; Zinc-binding domain; targets proteins to membrane lipids via interaction with phosphatidylinositol-3-phosphate, PI3P; present in Fab1, YOTB, Vac1, and EEA1. 28949 cd00067: GAL4-like Zn2Cys6 binuclear cluster DNA-binding domain; found in transcription regulators like GAL4. Domain consists of two helices organized around a Zn(2)Cys(6) motif; Binds to sequences containing 2 DNA half sites comprised of 3-5 C/G combinations. 28950 cd00068: G protein gamma subunit-like motifs, the alpha-helical G-gamma chain dimerizes with the G-beta propeller subunit as part of the heterotrimeric G-protein complex; involved in signal transduction via G-protein-coupled receptors. 28951 cd00069: Glycoprotein hormone beta chain homologues. Gonadotropins; reproductive hormones consisting of two glycosylated chains (alpha and beta) of similar topology with Cysteine-knot motifs. 28952 cd00070: Galectin/galactose-binding lectin. This domain exclusively binds beta-galactosides, such as lactose, and does not require metal ions for activity. GLECT domains occur as homodimers or tandemly repeated domains. 
They are developmentally regulated and may be involved in differentiation, cell-cell interaction and cellular regulation. 28953 cd00072: GYF domain: contains conserved Gly-Tyr-Phe residues; Proline-binding domain in CD2-binding and other proteins. Involved in signaling lymphocyte activity. Also present in other unrelated proteins (mainly unknown) derived from diverse eukaryotic species. 28954 cd00073: linker histone 1 and histone 5 domains; the basic subunit of chromatin is the nucleosome, consisting of an octamer of core histones, two full turns of DNA, a linker histone (H1 or H5) and a variable length of linker DNA; H1/H5 are chromatin-associated proteins that bind to the exterior of nucleosomes and dramatically stabilize the highly condensed states of chromatin fibers; stabilization of higher order folding occurs through electrostatic neutralization of the linker DNA segments, through a highly positively charged carboxy-terminal domain known as the AKP helix (Ala, Lys, Pro); thought to be involved in specific protein-protein and protein-DNA interactions and play a role in suppressing core histone tail domain acetylation in the chromatin fiber. 28955 cd00074: Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins. 28956 cd00075: Histidine kinase-like ATPases; This family includes several ATP-binding proteins for example: histidine kinase, DNA gyrase B, topoisomerases, heat shock protein HSP90, phytochrome-like ATPases and DNA mismatch repair proteins. 
28957 cd00076: Histone H4, one of the four histones, along with H2A, H2B and H3, which forms the eukaryotic nucleosome core; along with H3, it plays a central role in nucleosome formation; histones bind to DNA and wrap the genetic material into "beads on a string" in which DNA (the string) is wrapped around small blobs of histones (the beads) at regular intervals; play a role in the inheritance of specialized chromosome structures and the control of gene activity; defects in the establishment of proper chromosome structure by histones may activate or silence genes aberrantly and thus lead to disease; the sequence of histone H4 has remained almost invariant in more than 2 billion years of evolution. 28958 cd00077: Metal dependent phosphohydrolases with conserved 'HD' motif. 28959 cd00078: HECT domain; C-terminal catalytic domain of a subclass of Ubiquitin-protein ligase (E3). It binds specific ubiquitin-conjugating enzymes (E2), accepts ubiquitin from E2, transfers ubiquitin to substrate lysine side chains, and transfers additional ubiquitin molecules to the end of growing ubiquitin chains. 28960 cd00079: Helicase superfamily C-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process. 28961 cd00080: Helix-hairpin-helix class 2 (Pol1 family) motif. 
HhH2 domains are found in Rad2 family of prokaryotic and eukaryotic replication and repair nucleases, i.e., DNA polymerase I, Taq DNA polymerase, DNA repair protein Rad2 endonuclease, flap endonuclease, exonuclease I and IX, 5'-3' exonuclease and also bacteriophage RNase H. These nucleases degrade RNA-DNA or DNA-DNA duplexes, or both and play essential roles in DNA duplication, repair, and recombination. 28963 cd00082: His Kinase A (dimerization/phosphoacceptor) domain; Histidine Kinase A dimers are formed through parallel association of 2 domains creating 4-helix bundles; usually these domains contain a conserved His residue and are activated via trans-autophosphorylation by the catalytic domain of the histidine kinase; they subsequently transfer the phosphoryl group to the Asp acceptor residue of a response regulator protein. Two-component signalling systems, consisting of a histidine protein kinase that senses a signal input and a response regulator that mediates the output, are ancient and evolutionarily conserved signaling mechanisms in prokaryotes and eukaryotes. 28964 cd00083: Helix-loop-helix domain, found in specific DNA-binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE (5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins. 28965 cd00084: High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenetically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. 
Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions. 28966 cd01388: SOX-TCF_HMG-box, class I member of the HMG-box superfamily of DNA-binding proteins. These proteins contain a single HMG box, and bind the minor groove of DNA in a highly sequence-specific manner. Members include SRY and its homologs in insects and vertebrates, and transcription factor-like proteins, TCF-1, -3, -4, and LEF-1. They appear to bind the minor groove of the A/T C A A A G/C-motif. 28967 cd01389: MATA_HMG-box, class I member of the HMG-box superfamily of DNA-binding proteins. These proteins contain a single HMG box, and bind the minor groove of DNA in a highly sequence-specific manner. Members include the fungal mating type gene products MC, MATA1 and Ste11. 28968 cd01390: HMGB-UBF_HMG-box, class II and III members of the HMG-box superfamily of DNA-binding proteins. These proteins bind the minor groove of DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions. 28969 cd00085: HNH nucleases; HNH endonuclease signature which is found in viral, prokaryotic, and eukaryotic proteins. 
The alignment includes members of the large group of homing endonucleases, yeast intron 1 protein, MutS, as well as bacterial colicins, pyocins, and anaredoxins. 28970 cd00086: Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner. 28971 cd00087: Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation. 28972 cd00088: Histidine Phosphotransfer domain, involved in signalling through two-component systems in which an autophosphorylating histidine protein kinase serves as a phosphoryl donor to a response regulator protein; the response regulator protein is modulated by phosphorylation and dephosphorylation of a conserved aspartic acid residue; two-component proteins are abundant in most eubacteria; In E. 
coli there are 62 two-component proteins involved in a variety of processes such as chemotaxis, osmoregulation, metabolism and transport; also present in both Gram positive and Gram negative pathogenic bacteria where they regulate basic housekeeping functions and control expression of toxins and other proteins important for pathogenesis; in archaea and eukaryotes, two-component pathways constitute a very small number of all signaling systems; in fungi they mediate environmental stress responses and, in pathogenic yeast, hyphal development. In Dictyostelium and in plants, they are involved in important processes such as osmoregulation, cell growth, and differentiation; to date two-component proteins have not been identified in animals; in most prokaryotic systems, the output response is effected directly by the RR, which functions as a transcription factor while in eukaryotic systems, two-component proteins are found at the beginning of signaling pathways where they interface with more conventional eukaryotic signaling strategies such as MAP kinase and cyclic nucleotide cascades. 28973 cd00089: Protein kinase C-related kinase homology region 1 domain; also known as the ACC (antiparallel coiled-coil) finger domain or Rho-binding domain. Found in vertebrate PRK1 and yeast PKC1 protein kinases C; those found in rhophilin bind RhoGTP; those in PRK1 bind RhoA and RhoB. Rho family members function as molecular switches, cycling between inactive and active forms, controlling a variety of cellular processes. HR1 repeats often occur in tandem repeat arrangements, separated by a short linker region. 28974 cd00090: Arsenical Resistance Operon Repressor and similar prokaryotic, metal regulated homodimeric repressors. ARSR subfamily of helix-turn-helix bacterial transcription regulatory proteins (winged helix topology). Includes several proteins that appear to dissociate from DNA in the presence of metal ions. 
28975 cd00091: DNA/RNA non-specific endonuclease; prokaryotic and eukaryotic double- and single-stranded DNA and RNA endonucleases also present in phosphodiesterases. They exist as monomers and homodimers. 28976 cd00092: helix_turn_helix, cAMP Regulatory protein C-terminus; DNA binding domain of prokaryotic regulatory proteins belonging to the catabolite activator protein family. 28977 cd00093: Helix-turn-helix XRE-family like proteins. Prokaryotic DNA binding proteins belonging to the xenobiotic response element family of transcriptional regulators. 28978 cd00094: Hemopexin-like repeats; Hemopexin is a heme-binding protein that transports heme to the liver. Hemopexin-like repeats occur in vitronectin and some members of the matrix metalloproteinase family (matrixins). The HX repeats of some matrixins bind tissue inhibitor of metalloproteinases (TIMPs). This CD contains 4 instances of the repeat. 28979 cd00095: Interferon alpha, beta. Includes also interferon omega and tau. Different from interferon gamma family. Type I interferons (alpha, beta) belong to the larger helical cytokine superfamily, which includes growth hormones, interleukins, several colony-stimulating factors and several other regulatory molecules. All function as regulators of cellular activity by interacting with cell-surface receptors and activating various signalling pathways. Interferons produce antiviral and antiproliferative responses in cells. Receptor specificity determines function of the various members of the family. 28980 cd00096: Immunoglobulin domain family; members are components of immunoglobulins, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting 2 beta-sheets with a Trp packing against the disulfide bond.
28981 cd00098: Immunoglobulin domain constant region subfamily; members of the IGc subfamily are components of immunoglobulins, T-cell receptors, CD1 cell surface glycoproteins, secretory glycoproteins A/C, and Major Histocompatibility Complex (MHC) class I/II molecules. In immunoglobulins, each chain is composed of one variable domain (IGv) and one or more constant domains (IGc); these names reflect the fact that the variability in sequences is higher in the variable domain than in the constant domain. T-cell receptors form heterodimers, pairing two chains (alpha/beta or gamma/delta), each with an IGv and IGc domain. MHCs form heterodimers pairing two chains (alpha/beta or delta/epsilon), each with an MHC and IGc domain. A predominant feature of most Ig domains is a disulfide bridge connecting 2 beta-sheets with a Trp packing against the disulfide bond. 28982 cd00099: Immunoglobulin domain variable region (v) subfamily; members of the IGv subfamily are components of immunoglobulins and T-cell receptors. In immunoglobulins, each chain is composed of one variable domain (IGv) and one or more constant domains (IGc); these names reflect the fact that the variability in sequences is higher in the variable domain than in the constant domain. Within the variable domain, there are regions of even more variability called the hypervariable or complementarity-determining regions (CDRs) which are responsible for antigen binding. A predominant feature of most Ig domains is a disulfide bridge connecting 2 beta-sheets with a Trp packing against the disulfide bond. 28983 cd00931: Immunoglobulin domain cell adhesion molecule (cam) subfamily; members are components of neural cell adhesion molecules (N-CAM L1), Fasciclin II and the insect immune protein Hemolin. The subfamily also includes receptor domains such as the extracellular ligand binding domain of Fibroblast Growth Factor Receptor 2.
Members are phylogenetically diverse, occurring throughout metazoa, and are not components of the adaptive immune system molecules found in jawed vertebrates. A predominant feature of most Ig domains is a disulfide bridge connecting 2 beta-sheets with a Trp packing against the disulfide bond. 28984 cd00100: Interleukin-1 homologues; Cytokines with various biological functions. Interleukin 1 alpha and beta are also known as hematopoietin and catabolin. This family also contains interleukin-1 receptor antagonists (inhibitors). 28986 cd00102: Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated T cells (NFAT) transcription factors which are mainly monomers. 28987 cd00602: IPT domain of eukaryotic transcription factors NF-kappaB/Rel, nuclear factor of activated T cells (NFAT), and recombination signal J-kappa binding protein (RBP-Jkappa). The IPT domains in these proteins are involved in DNA binding. Most NF-kappaB/Rel proteins form homo- and heterodimers, while NFAT proteins are largely monomeric (with TonEBP being an exception). While the majority of sequence-specific DNA binding elements are found in the N-terminal domain, several are found in the IPT domain in loops adjacent to, and including, the linker region. 28988 cd00603: IPT domain of Plexins and Cell Surface Receptors (PCSR) and related proteins.
This subgroup contains IPT domains of plexins, receptors, like the plasminogen-related growth factor receptors, the hepatocyte growth factor-scatter factors, and the macrophage-stimulating receptors and of fibrocystin. Plexins are involved in the regulation of cell proliferation and of cellular adhesion and repulsion receptors. In general, there are three copies of the IPT_PCSR domain present preceded by SEMA (semaphorin) and PSI (plexin, semaphorin, integrin) domains. 28989 cd00604: IPT domain (domain D) of cyclodextrin glycosyltransferase (CGTase) and similar enzymes. These enzymes are involved in the enzymatic hydrolysis of alpha-1,4 linkages of starch polymers and belong to the glycosyl hydrolase family 13. Most consist of three domains (A,B,C) but CGTase is more complex and has two additional domains (D,E). The function of the IPT/D domain is unknown. 28990 cd01175: IPT domain of the COE family (Col/Olf-1/EBF) of non-basic, helix-loop-helix (HLH)-containing transcription factors. COE family proteins are all transcription factors and play an important role in a variety of developmental processes. Mouse EBF is involved in the regulation of the early stages of B-cell differentiation, Drosophila collier is a regulator of the head patterning, and a related protein in Xenopus is involved in primary neurogenesis. All COE family members have a well conserved DNA binding domain that contains an atypical Zn finger motif. The function of the IPT domain is unknown. 28991 cd01176: IPT domain of the recombination signal Jkappa binding protein (RBP-Jkappa). RBP-J kappa was initially considered to be involved in V(D)J recombination because of its DNA binding specificity and structural similarity to site-specific recombinases known as the integrase family. Further studies indicated that RBP-J kappa functions as a repressor of transcription, via destabilization of the general transcription factor IID and recruitment of histone deacetylase complexes.
28992 cd01177: IPT domain of the transcription factor NFkappaB and related transcription factors. NFkappaB is considered a central regulator of stress responses, activated by different stressful conditions, including physical stress, oxidative stress, and exposure to certain chemicals. By blocking apoptosis in several cell types, NFkappaB plays an important role in cell proliferation and differentiation. 28993 cd01178: IPT domain of the NFAT family of transcription factors. NFAT transcription complexes are a target of calcineurin, a calcium dependent phosphatase, and activate genes mainly involved in cell-cell interaction. 28994 cd01179: Second repeat of the IPT domain of Plexins and Cell Surface Receptors (PCSR). Plexins are involved in the regulation of cell proliferation and of cellular adhesion and repulsion receptors. In general, there are three copies of the IPT domain present preceded by SEMA (semaphorin) and PSI (plexin, semaphorin, integrin) domains. 28995 cd01180: First repeat of the IPT domain of Plexins and Cell Surface Receptors (PCSR). Plexins are involved in the regulation of cell proliferation and of cellular adhesion and repulsion receptors. In general, there are three copies of the IPT domain present preceded by SEMA (semaphorin) and PSI (plexin, semaphorin, integrin) domains. 28996 cd01181: Third repeat of the IPT domain of Plexins and Cell Surface Receptors (PCSR). Plexins are involved in the regulation of cell proliferation and of cellular adhesion and repulsion receptors. In general, there are three copies of the IPT domain present preceded by SEMA (semaphorin) and PSI (plexin, semaphorin, integrin) domains. 28997 cd00103: Interferon Regulatory Factor (IRF); also known as tryptophan pentad repeat. The family of IRF transcription factors is important in the regulation of interferons in response to infection by virus and in the regulation of interferon-inducible genes.
The IRF family is characterized by a unique 'tryptophan cluster' DNA-binding region. Viral IRFs bind to cellular IRFs; block type I and II interferons and host IRF-mediated transcriptional activation. 28998 cd00104: Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD. 28999 cd01327: Kazal-type pancreatic secretory trypsin inhibitors (PSTI) and related proteins, including the second domain of the ovomucoid turkey inhibitor and the C-terminal domain of the esophagus cancer-related gene-2 protein (ECRG-2), are members of the superfamily of kazal-type proteinase inhibitors and follistatin-like proteins.
29000 cd01328: Follistatin-like SPARC (secreted protein, acidic, and rich in cysteines) domain; SPARC/BM-40/osteonectin is a multifunctional glycoprotein which modulates cellular interaction with the extracellular matrix by its binding to structural matrix proteins such as collagen and vitronectin. The protein is composed of an N-terminal acidic region, a follistatin (FS) domain and an EF-hand calcium binding domain. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a small hydrophobic core of alpha/beta structure (Kazal domain) and has five disulfide bonds and a conserved N-glycosylation site. The FSL_SPARC domain is a member of the superfamily of kazal-like proteinase inhibitors and follistatin-like proteins. 29001 cd01330: The kazal-type serine protease inhibitor domain has been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The KAZAL_SLC21 domain is a member of the superfamily of kazal-like proteinase inhibitors and follistatin-like proteins. 29002 cd00105: K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include an N-terminal extension and type I KH domains (e.g. hnRNP K) contain a C-terminal extension. 29003 cd02393: Polynucleotide phosphorylase (PNPase) K homology RNA-binding domain (KH). PNPase is a polyribonucleotide nucleotidyl transferase that degrades mRNA in prokaryotes and plant chloroplasts.
The C-terminal region of PNPase contains domains homologous to those in other RNA binding proteins: a KH domain and an S1 domain. KH domains bind single-stranded RNA and are found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. 29004 cd02394: K homology RNA-binding domain_vigilin_like. The vigilin family is a large and extended family of multiple KH-domain proteins, including vigilin, also called high density lipoprotein binding protein (HBP), fungal Scp160 and bicaudal-C. Yeast Scp160p has been shown to bind RNA and to associate with both soluble and membrane-bound polyribosomes as a mRNP component. Bicaudal-C is an RNA-binding molecule believed to function in embryonic development at the post-transcriptional level. In general, KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. 29005 cd02395: Splicing factor 1 (SF1) K homology RNA-binding domain (KH). Splicing factor 1 (SF1) specifically recognizes the intron branch point sequence (BPS) UACUAAC in the pre-mRNA transcripts during spliceosome assembly. The KH-QUA2 region of SF1 defines an enlarged KH (hnRNP K) fold which is necessary and sufficient for BPS binding. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. 29006 cd02396: K homology RNA-binding domain, PCBP_like. Members of this group possess KH domains in a tandem arrangement. Most members, similar to the poly(C) binding proteins (PCBPs) and Nova, contain three KH domains, with the first and second domains, which are represented here, in tandem arrangement, followed by a large spacer region, with the third domain near the C-terminal end of the protein.
The poly(C) binding proteins (PCBPs) can be divided into two groups, hnRNPs K/J and the alphaCPs, which share a triple KH domain configuration and poly(C) binding specificity. They play roles in mRNA stabilization, translational activation, and translational silencing. Nova-1 and Nova-2 are nuclear RNA-binding proteins that regulate splicing. This group also contains plant proteins that seem to have two tandem repeat arrangements, like Hen4, a protein that plays a role in AGAMOUS (AG) pre-mRNA processing, an important step in plant development. In general, KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. 29007 cd00107: The "knottin" fold is a stable cysteine-rich scaffold, in which one disulfide bridge crosses the macrocycle made by two other disulfide bridges and the connecting backbone segments. Members include plant lectins/antimicrobial peptides, plant proteinase/amylase inhibitors, plant gamma-thionins, and arthropod defensins. 29008 cd00108: Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides. 29009 cd00109: BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. 29010 cd00110: Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans.
Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules. 29011 cd00111: P or trefoil or TFF domain; Trefoil factor family domain peptides are mucin-associated molecules, largely found in epithelia of gastrointestinal tissues. Function is not known but it was originally identified from mucosal tissues, where it may have a regulatory or structural role and has also been implicated as a growth factor in other tissues. The domain is found in 1 to 6 copies where it occurs. 29012 cd00112: Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure. 29013 cd00114: NAD+ dependent DNA ligase adenylation domain. DNA ligases catalyze the crucial step of joining the breaks in duplex DNA during DNA replication, repair and recombination, utilizing either ATP or NAD(+) as a cofactor, but using the same basic reaction mechanism. The enzyme reacts with the cofactor to form a phosphoamide-linked AMP with the amino group of a conserved Lysine in the KXDG motif, and subsequently transfers it to the DNA substrate to yield adenylated DNA. This alignment contains members of the NAD+ dependent subfamily only.
29014 cd00115: Low molecular weight phosphatase family. 29015 cd00116: Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1). 29016 cd00117: Ly-6 antigen / uPA receptor-like domain; occurs singly in GPI-linked cell-surface glycoproteins (Ly-6 family, CD59, thymocyte B cell antigen, Sgp-2) or as three-fold repeated domain in urokinase-type plasminogen activator receptor. Topology of these domains is similar to that of snake venom neurotoxins. 29017 cd00118: Lysin domain, found in a variety of enzymes involved in bacterial cell wall degradation. This domain may have a general peptidoglycan binding function. 29018 cd00119: C-type lysozyme (1, 4-beta-N-acetylmuramidase, LYZ) and alpha-lactalbumin (lactose synthase B protein, LA). They have a close evolutionary relationship and similar tertiary structure, however, functionally they are quite different. Lysozymes have primarily bacteriolytic function; hydrolysis of peptidoglycan of prokaryotic cell walls and transglycosylation. LA is a calcium-binding metalloprotein that is expressed exclusively in the mammary gland during lactation. LA is the regulatory subunit of the enzyme lactose synthase. The association of LA with the catalytic component of lactose synthase, galactosyltransferase, alters the acceptor substrate specificity of this glycosyltransferase, facilitating biosynthesis of lactose. 29019 cd00120: MADS: MCM1, Agamous, Deficiens, and SRF (serum response factor) box family of eukaryotic transcriptional regulators.
Binds DNA and exists as hetero- and homo-dimers. Composed of 2 main subgroups: SRF-like/Type I and MEF2-like (myocyte enhancer factor 2)/Type II. These subgroups differ mainly in position of the alpha 2 helix responsible for the dimerization interface. Important in homeotic regulation in plants and in immediate-early development in animals. Also found in fungi. 29020 cd00265: MEF2 (myocyte enhancer factor 2)-like/Type II subfamily of MADS (MCM1, Agamous, Deficiens, and SRF (serum response factor)) box family of eukaryotic transcriptional regulators. Binds DNA and exists as hetero- and homo-dimers. Differs from SRF-like/Type I subgroup mainly in position of the alpha helix responsible for the dimerization interface. Important in homeotic regulation in plants and in immediate-early development in animals. Also found in fungi. 29021 cd00266: SRF-like/Type I subfamily of MADS (MCM1, Agamous, Deficiens, and SRF (serum response factor)) box family of eukaryotic transcriptional regulators. Binds DNA and exists as hetero- and homo-dimers. Differs from the MEF2-like/Type II subgroup mainly in position of the alpha 2 helix responsible for the dimerization interface. Important in homeotic regulation in plants and in immediate-early development in animals. Also found in fungi. 29023 cd00122: MeCP2, MBD1, MBD2, MBD3, MBD4, CLLD8-like, and BAZ2A-like proteins constitute a family of proteins that share the methyl-CpG-binding domain (MBD). The MBD consists of about 70 residues and is defined as the minimal region required for binding to methylated DNA by a methyl-CpG-binding protein which binds specifically to methylated DNA. The MBD can recognize a single symmetrically methylated CpG either as naked DNA or within chromatin. MeCP2, MBD1 and MBD2 (and likely MBD3) form complexes with histone deacetylase and are involved in histone deacetylase-dependent repression of transcription. MBD4 is an endonuclease that forms a complex with the DNA mismatch-repair protein MLH1.
The MBDs present in putative chromatin remodelling subunit, BAZ2A, and putative histone methyltransferase, CLLD8, represent two phylogenetically distinct groups within the MBD protein family. 29024 cd01395: Methyl-CpG binding domains (MBD) present in putative histone methyltransferases (HMT) such as CLLD8 and SETDB1 proteins; CLLD8 contains an MBD, a PreSET and a bifurcated SET domain, suggesting that CLLD8 might be associated with methylation-mediated transcriptional repression. SETDB1 and other proteins in this group have a similar domain architecture. SETDB1 is a novel KAP-1-associated histone H3, lysine 9-specific methyltransferase that contributes to HP1-mediated silencing of euchromatic genes by KRAB zinc-finger proteins. 29025 cd01396: MeCP2, MBD1, MBD2, MBD3, and MBD4 are members of a protein family that share the methyl-CpG-binding domain (MBD). The MBD consists of about 70 residues and is defined as the minimal region required for binding to methylated DNA by a methyl-CpG-binding protein which binds specifically to methylated DNA. The MBD can recognize a single symmetrically methylated CpG either as naked DNA or within chromatin. MeCP2, MBD1 and MBD2 (and likely MBD3) form complexes with histone deacetylase and are involved in histone deacetylase-dependent repression of transcription. MBD4 is an endonuclease that forms a complex with the DNA mismatch-repair protein MLH1. 29026 cd01397: Methyl-CpG binding domains (MBD) present in putative chromatin remodelling factor such as BAZ2A; BAZ2A contains an MBD, DDT, PHD-type zinc finger and Bromo domain, suggesting that BAZ2A might be associated with histone acetyltransferase (HAT) activity. The Drosophila melanogaster toutatis protein, a putative subunit of the chromatin-remodeling complex, and other such proteins in this group share a similar domain architecture with BAZ2A, as does the Caenorhabditis elegans flectin homolog.
29028 cd00126: Pancreatic Hormone domain, a regulator of pancreatic and gastrointestinal functions; neuropeptide Y (NPY), peptide YY (PYY), and pancreatic polypeptide (PP) are closely related; propeptide is enzymatically cleaved to yield the mature active peptide with amidated C-terminal ends; receptor binding and activation functions may reside in the N- and C-termini respectively; occurs in neurons, intestinal endocrine cells, and pancreas; exist as monomers and dimers. 29029 cd00127: Dual specificity phosphatases (DSP); Ser/Thr and Tyr protein phosphatases. Structurally similar to tyrosine-specific phosphatases but with a shallower active site cleft and a distinctive active site signature motif, HCxxGxxR. Characterized as VHR- or Cdc25-like. 29030 cd00128: Xeroderma pigmentosum G N- and I-regions (XPGN, XPGI); contains the HhH2 motif; domain in nucleases. XPG is a eukaryotic enzyme that functions in nucleotide-excision repair and transcription-coupled repair of oxidative DNA damage. Functionally/structurally related to FEN-1; divalent metal ion-dependent exo- and endonuclease, and bacterial and bacteriophage 5'-3' exonucleases. 29031 cd00129: PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions. 29032 cd01098: Plant PAN/APPLE-like domain; present in plant S-receptor protein kinases and secreted glycoproteins. PAN/APPLE domains fulfill diverse biological functions by mediating protein-protein or protein-carbohydrate interactions.
S-receptor protein kinases and S-locus glycoproteins are involved in sporophytic self-incompatibility response in Brassica, one of probably many molecular mechanisms, by which hermaphrodite flowering plants avoid self-fertilization. 29033 cd01099: Subfamily of PAN/APPLE-like domains; present in N-terminal (N) domains of plasminogen/hepatocyte growth factor proteins, and various proteins found in Bilateria, such as leech anti-platelet proteins. PAN/APPLE domains fulfill diverse biological functions by mediating protein-protein or protein-carbohydrate interactions. 29034 cd01100: Subfamily of PAN/APPLE-like domains; present in plasma prekallikrein/coagulation factor XI, microneme antigen proteins, and a few prokaryotic proteins. PAN/APPLE domains fulfill diverse biological functions by mediating protein-protein or protein-carbohydrate interactions. 29035 cd00130: PAS domain; PAS motifs appear in archaea, eubacteria and eukarya. Probably the most surprising identification of a PAS domain was that in EAG-like K+-channels. PAS domains have been found to bind ligands, and to act as sensors for light and oxygen in signal transduction. 29036 cd00131: Paired Box domain. 29037 cd00132: PAK (p21 activated kinase) Binding Domain (PBD), binds Cdc42p- and/or Rho-like small GTPases; also known as the Cdc42/Rac interactive binding (CRIB) motif; has been shown to inhibit transcriptional activation and cell transformation mediated by the Ras-Rac pathway. CRIB-containing effector proteins are functionally diverse and include serine/threonine kinases, tyrosine kinases, actin-binding proteins, and adapter molecules. 29038 cd01093: PAK (p21 activated kinase) Binding Domain (PBD), binds Cdc42p- and/or Rho-like small GTPases; also known as the Cdc42/Rac interactive binding (CRIB) motif; has been shown to inhibit transcriptional activation and cell transformation mediated by the Ras-Rac pathway. 
This subgroup of CRIB/PBD-domains is found N-terminal of Serine/Threonine kinase domains in PAK and PAK-like proteins. 29039 cd00133: PTS system, lactose/cellobiose specific IIB subunit. The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. This family is one of four structurally and functionally distinct group IIB PTS system cytoplasmic enzymes, necessary for the uptake of carbohydrates across the cytoplasmic membrane and their phosphorylation. 29040 cd00134: Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD. 29041 cd00135: Platelet-derived and vascular endothelial growth factors (PDGF, VEGF) family domain; PDGF is a potent activator for cells of mesenchymal origin; PDGF-A and PDGF-B form AA and BB homodimers and an AB heterodimer; VEGF is a potent mitogen in embryonic and somatic angiogenesis with a unique specificity for vascular endothelial cells; VEGF forms homodimers and exists in 4 different isoforms; overall, the VEGF monomer resembles that of PDGF, but its N-terminal segment is helical rather than extended; the cysteine knot motif is a common feature of this domain. 29042 cd00136: PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. 
Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either an N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95 (post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein. 29043 cd00986: PDZ domain of ATP-dependent LON serine proteases. Most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this bacterial subfamily of protease-associated PDZ domains a C-terminal beta-strand is thought to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins. 29044 cd00987: PDZ domain of trypsin-like serine proteases, such as DegP/HtrA, which are oligomeric proteins involved in heat-shock response, chaperone function, and apoptosis. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
29045 cd00988: PDZ domain of C-terminal processing-, tail-specific-, and tricorn proteases, which function in posttranslational protein processing, maturation, and disassembly or degradation, in Bacteria, Archaea, and plant chloroplasts. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins. 29046 cd00989: PDZ domain of bacterial and plant zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins. 29047 cd00990: PDZ domain associated with archaeal and bacterial M61 glycyl-aminopeptidases. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand is presumed to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins. 29048 cd00991: PDZ domain of archaeal zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms.
May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins. 29049 cd00992: PDZ domain found in a variety of Eumetazoan signaling molecules, often in tandem arrangements. May be responsible for specific protein-protein interactions, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of PDZ domains an N-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in proteases. 29051 cd00138: Phospholipase D. Active site motifs; The PLD superfamily includes enzymes involved in signal transduction, lipid biosynthesis, endonucleases and open reading frames in pathogenic viruses and bacteria. PLD hydrolyzes the terminal phosphodiester bond of phospholipids to phosphatidic acid and a hydrophilic constituent. Phosphatidic acid is a compound that is heavily involved in signal transduction. The common features of the family members are that they can bind to a phosphodiester moiety, and that most of these enzymes are active as bi-lobed monomers or dimers. 29053 cd00140: Beta clamp domain. The beta subunit (processivity factor) of DNA polymerase III holoenzyme, referred to as the beta clamp, forms a ring shaped dimer that encircles dsDNA (sliding clamp) in bacteria. The beta-clamp is structurally similar to the trimeric ring formed by PCNA (found in eukaryotes and archaea) and the processivity factor (found in bacteriophages T4 and RB69).
This structural correspondence further substantiates the mechanistic connection between eukaryotic and prokaryotic DNA replication that has been suggested on biochemical grounds. 29054 cd00141: DNA polymerase X family; includes vertebrate DNA polymerase beta and terminal deoxynucleotidyltransferase. An N-terminal 8kD domain and a 31kD C-terminal polymerase domain are connected with a protease-sensitive hinge. The activity of the N-terminal domain seems to be variable; in DNA polymerase beta it has metal dependent nuclease activity and metal independent lyase activity. 29055 cd00142: Phosphoinositide 3-kinase, catalytic domain; Phosphoinositide 3-kinase isoforms participate in a variety of processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, and apoptosis. These homologues may be either lipid kinases and/or protein kinases: the former phosphorylate the 3-position in the inositol ring of inositol phospholipids. The ataxia telangiectasia-mutated gene product, the targets of rapamycin (TOR) and the DNA-dependent kinase have not been found to possess lipid kinase activity. Some of this family possess PI-4 kinase activities. 29056 cd00891: Phosphoinositide 3-kinase (PI3K), catalytic domain; PI3Ks phosphorylate the 3-position in the inositol ring of inositol phospholipids. PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, and apoptosis. They can be divided into 3 main classes, defined by their substrate specificity and domain structure. 29058 cd00893: Phosphoinositide 4-kinase (PI4K), catalytic domain; PI4K phosphorylates the hydroxyl group at position 4 on the inositol ring of phosphoinositide, the first committed step in the phosphatidylinositol cycle.
29059 cd00894: Phosphoinositide 3-kinase (PI3K) class I, catalytic domain; Phosphoinositide 3-kinase isoforms participate in a variety of processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, and apoptosis. They phosphorylate the 3-position in the inositol ring of inositol phospholipids. PI3K class I prefer phosphoinositol (4,5)-bisphosphate as substrate. Mammalian members interact with active Ras. They form heterodimers with adapter molecules linking them to different signalling pathways. 29060 cd00895: Phosphoinositide 3-kinase (PI3K) class II, catalytic domain; Phosphoinositide 3-kinase isoforms participate in a variety of processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, and apoptosis. They phosphorylate the 3-position in the inositol ring of inositol phospholipids. PI3K class II phosphorylate phosphoinositol (PtdIns), PtdIns(4)-phosphate, but not PtdIns(4,5)-bisphosphate. They are larger, having a C2 domain at the C-terminus. 29061 cd00896: Phosphoinositide 3-kinase (PI3K) class III, catalytic domain; Phosphoinositide 3-kinase isoforms participate in a variety of processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, and apoptosis. They phosphorylate the 3-position in the inositol ring of inositol phospholipids. PI3Ks class III phosphorylate phosphoinositol (PtdIns) only. The prototypical PI3K class III, yeast Vps34, is involved in trafficking proteins from Golgi to the vacuole. 29062 cd00143: Serine/threonine phosphatases, family 2C, catalytic domain; The protein architecture and deduced catalytic mechanism of PP2C phosphatases are similar to the PP1, PP2A, PP2B family of protein Ser/Thr phosphatases, with which PP2C shares no sequence similarity. 29063 cd00144: Protein phosphatase 2A homologues, catalytic domain. Large family of serine/threonine phosphatases, including PP1, PP2A and PP2B (calcineurin) family members. 
29064 cd00145: DNA polymerase type-B family; DNA directed DNA polymerase. Possesses DNA binding, polymerase and 3'-5' exonuclease activity. 29065 cd00146: polycystic kidney disease I (PKD) domain; similar to other cell-surface modules, with an IG-like fold; domain probably functions as a ligand binding site in protein-protein or protein-carbohydrate interactions; a single instance of the repeat is presented here. The domain is also found in microbial collagenases and chitinases. 29066 cd00147: Cytoplasmic phospholipase A2, catalytic subunit; cytosolic phospholipases A2 hydrolyse arachidonyl phospholipids; family includes phospholipase B isoforms. 29067 cd00148: Profilin binds actin monomers, membrane polyphosphoinositides such as PI(4,5)P2, and poly-L-proline. Profilin can inhibit actin polymerization into F-actin by binding to monomeric actin (G-actin) and terminal F-actin subunits, but - as a regulator of the cytoskeleton - it may also promote actin polymerization. It plays a role in the assembly of branched actin filament networks, by activating WASP via binding to WASP's proline rich domain. Profilin may link the cytoskeleton with major signalling pathways by interacting with components of the phosphatidylinositol cycle and Ras pathway. 29068 cd00152: Pentraxins are plasma proteins characterized by their pentameric discoid assembly and their Ca2+ dependent ligand binding, such as Serum amyloid P component (SAP) and C-reactive Protein (CRP), which are cytokine-inducible acute-phase proteins implicated in innate immunity. CRP binds to ligands containing phosphocholine, SAP binds to amyloid fibrils, DNA, chromatin, fibronectin, C4-binding proteins and glycosaminoglycans. "Long" pentraxins have N-terminal extensions to the common pentraxin domain; one group, the neuronal pentraxins, may be involved in synapse formation and remodeling, and they may also be able to form heteromultimers. 29070 cd00155: Guanine nucleotide exchange factor for Ras-like small GTPases.
Small GTP-binding proteins of the Ras superfamily function as molecular switches in fundamental events such as signal transduction, cytoskeleton dynamics and intracellular trafficking. Guanine-nucleotide-exchange factors (GEFs) positively regulate these GTP-binding proteins in response to a variety of signals. GEFs catalyze the dissociation of GDP from the inactive GTP-binding proteins. GTP can then bind and induce structural changes that allow interaction with effectors. 29071 cd00156: Signal receiver domain; originally thought to be unique to bacteria (CheY, OmpR, NtrC, and PhoB), now also identified in eukaryotes (ETR1 of Arabidopsis thaliana); this domain receives the signal from the sensor partner in two-component systems; contains a phosphoacceptor site that is phosphorylated by histidine kinase homologs; usually found N-terminal to a DNA binding effector domain; forms homodimers. 29073 cd00158: Rhodanese Homology Domain (RHOD); an alpha beta fold domain found duplicated in the rhodanese protein. The cysteine containing enzymatically active version of the domain is also found in the Cdc25 class of protein phosphatases and a variety of proteins such as sulfide dehydrogenases and certain stress proteins such as senescence specific protein 1 in plants, PspE and GlpE in bacteria and cyanide and arsenate resistance proteins. Inactive versions (no active site cysteine) are also seen in dual specificity phosphatases, ubiquitin hydrolases from yeast and in sulfuryltransferases, where they are believed to play a regulatory role in multidomain proteins. 29074 cd01443: Cdc25 enzymes are members of the Rhodanese Homology Domain (RHOD) superfamily. Also included in this CD are eukaryotic arsenate resistance proteins such as Saccharomyces cerevisiae Acr2p and similar proteins. Cdc25 phosphatases activate the cell division kinases throughout the cell cycle progression.
Cdc25 phosphatases dephosphorylate phosphotyrosine and phosphothreonine residues, in order to activate their Cdk/cyclin substrates. The Cdc25 and Acr2p RHOD domains have the signature motif (H/YCxxxxxR). 29075 cd01444: GlpE sulfurtransferase (ST) and homologs are members of the Rhodanese Homology Domain superfamily. Unlike other rhodanese sulfurtransferases, GlpE is a single domain protein but indications are that it functions as a dimer. The active site contains a catalytically active cysteine. 29076 cd01445: Thiosulfate sulfurtransferases (TST) contain 2 copies of the Rhodanese Homology Domain. Only the second repeat contains the catalytically active Cys residue. The role of the 1st repeat is uncertain, but believed to be involved in protein interaction. This CD aligns the 1st and 2nd repeats. 29077 cd01446: N-terminal regulatory rhodanese domain of dual specificity phosphatases (DSP), such as Mapk Phosphatase. This domain is believed to determine substrate specificity by binding the substrate, such as ERK2, and activating the C-terminal catalytic domain by inducing a conformational change. This domain has homology to the Rhodanese Homology Domain. 29078 cd01447: Polysulfide-sulfurtransferase - Rhodanese Homology Domain. This domain is believed to serve as a polysulfide binding and transferase domain in anaerobic gram-negative bacteria, functioning in oxidative phosphorylation with polysulfide-sulfur as a terminal electron acceptor. The active site contains the same conserved cysteine that is the catalytic residue in other Rhodanese Homology Domain proteins. 29079 cd01448: Thiosulfate sulfurtransferase (TST), N-terminal, inactive domain. TST contains 2 copies of the Rhodanese Homology Domain; this is the 1st repeat, which does not contain the catalytically active Cys residue. The role of the 1st repeat is uncertain, but it is believed to be involved in protein interaction. 29080 cd01449: Thiosulfate sulfurtransferase (TST), C-terminal, catalytic domain.
TST contains 2 copies of the Rhodanese Homology Domain; this is the second repeat. Only the second repeat contains the catalytically active Cys residue. 29081 cd01518: Member of the Rhodanese Homology Domain superfamily. This CD includes Escherichia coli YceA, Bacillus subtilis YbfQ, and similar uncharacterized proteins. 29082 cd01519: Member of the Rhodanese Homology Domain superfamily. This CD includes the heat shock protein 67B2 of Drosophila melanogaster and other similar proteins, many of which are uncharacterized. 29083 cd01520: Member of the Rhodanese Homology Domain superfamily. This CD includes several putative ATP /GTP binding proteins including E. coli YbbB. 29084 cd01521: Member of the Rhodanese Homology Domain superfamily. This CD includes the putative rhodanese-like protein, Psp2, of Yersinia pestis biovar Medievalis and other similar uncharacterized proteins. 29085 cd01522: Member of the Rhodanese Homology Domain superfamily, subgroup 1. This CD includes the putative rhodanese-related sulfurtransferases of several uncharacterized proteins. 29086 cd01523: Member of the Rhodanese Homology Domain superfamily. This CD includes predicted proteins with rhodanese-like domains found N-terminal of the metallo-beta-lactamase domain. 29087 cd01524: Member of the Rhodanese Homology Domain superfamily. Included in this CD are the Lactococcus lactis NADH oxidase, Bacillus cereus NADH dehydrogenase, and Bacteroides thetaiotaomicron pyridine nucleotide-disulphide oxidoreductase, and similar rhodanese-like domains found C-terminal of the pyridine nucleotide-disulphide oxidoreductase (Pyr-redox) domain and the Pyr-redox dimerization domain. 29088 cd01525: Member of the Rhodanese Homology Domain superfamily. Included in this CD are the rhodanese-like domains found C-terminal of the serine/threonine protein kinases catalytic (S_TKc) domain and the Tre-2, BUB2p, Cdc16p (TBC) domain. The putative active site Cys residue is not present in this CD. 
29089 cd01526: Member of the Rhodanese Homology Domain superfamily. This CD includes several putative molybdopterin synthase sulfurylases including the molybdenum cofactor biosynthetic protein (CnxF) of Aspergillus nidulans and the molybdenum cofactor synthesis protein 3 (MOCS3) of Homo sapiens. These rhodanese-like domains are found C-terminal of the ThiF and MoeZ_MoeB domains. 29090 cd01527: Member of the Rhodanese Homology Domain superfamily. This CD includes Escherichia coli YgaP, and similar uncharacterized putative rhodanese-related sulfurtransferases. 29091 cd01528: Member of the Rhodanese Homology Domain superfamily, subgroup 2. Subgroup 2 includes uncharacterized putative rhodanese-related domains. 29092 cd01529: Member of the Rhodanese Homology Domain superfamily. This CD includes putative rhodanese-related sulfurtransferases which contain 4 copies of the Rhodanese Homology Domain. Only the second and most of the fourth repeats contain the putative catalytic Cys residue. This CD aligns the 1st, 2nd, 3rd, and 4th repeats. 29093 cd01530: Cdc25 phosphatases are members of the Rhodanese Homology Domain superfamily. They activate the cell division kinases throughout the cell cycle progression. Cdc25 phosphatases dephosphorylate phosphotyrosine and phosphothreonine residues, in order to activate their Cdk/cyclin substrates. Cdc25A phosphatase functions to regulate S phase entry and Cdc25B is required for G2/M phase transition of the cell cycle. The Cdc25 domain binds oxyanions at the catalytic site and has the signature motif (H/YCxxxxxR). 29094 cd01531: Eukaryotic arsenate resistance proteins are members of the Rhodanese Homology Domain superfamily. Included in this CD is the Saccharomyces cerevisiae arsenate reductase protein, Acr2p, and other yeast and plant homologs. 29095 cd01532: Member of the Rhodanese Homology Domain superfamily, repeat 1.
This CD includes putative rhodanese-related sulfurtransferases which contain 4 copies of the Rhodanese Homology Domain. This CD aligns the 1st repeat which does not contain the putative catalytic Cys residue. 29096 cd01533: Member of the Rhodanese Homology Domain superfamily, repeat 2. This CD includes putative rhodanese-related sulfurtransferases which contain 4 copies of the Rhodanese Homology Domain. This CD aligns the 2nd repeat which does contain the putative catalytic Cys residue. 29097 cd01534: Member of the Rhodanese Homology Domain superfamily, repeat 3. This CD includes putative rhodanese-related sulfurtransferases which contain 4 copies of the Rhodanese Homology Domain. This CD aligns the 3rd repeat which does not contain the putative catalytic Cys residue. 29098 cd01535: Member of the Rhodanese Homology Domain superfamily, repeat 4. This CD includes putative rhodanese-related sulfurtransferases which contain 4 copies of the Rhodanese Homology Domain. This CD aligns the 4th repeat which, in general, contains the putative catalytic Cys residue. 29100 cd00160: Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. 29101 cd00161: Ricin-type beta-trefoil; Carbohydrate-binding domain formed from presumed gene triplication. The domain is found in a variety of molecules serving diverse functions such as enzymatic activity, inhibitory toxicity and signal transduction. Highly specific ligand binding occurs on exposed surfaces of the compact domain structure.
29102 cd00162: RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)-C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H). 29103 cd00163: Pancreatic ribonucleases (RNAse) are pyrimidine-specific endonucleases found in high quantity in the pancreas of certain mammals and of some reptiles. Involved in endonucleolytic cleavage of 3'-phosphomononucleotides and 3'-phosphooligonucleotides ending in C-P or U-P with 2',3'-cyclic phosphate intermediates. Catalytic mechanism is a transphosphorylation of P-O 5' bonds on the 3' side of pyrimidines and subsequent hydrolysis to generate 3' phosphate groups. Other family members include: bovine seminal vesicle and brain ribonucleases; kidney non-secretory ribonucleases; liver-type ribonucleases; angiogenin, which induces vascularization of normal and malignant tissues; eosinophil cationic protein A cytotoxin and helminthotoxin with ribonuclease activity; and frog liver ribonuclease and frog sialic acid-binding lectin. 29104 cd00164: Ribosomal protein S1-like RNA-binding domain. Found in a wide variety of RNA-associated proteins. Originally identified in S1 ribosomal protein. Multiple copies of S1 are arranged in tandem. S1 is structurally similar to Cold Shock Domains (CSD), both are members of the Oligonucleotide/oligosaccharide Binding (OB) superfamily.
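The 'cross-brace' motif quoted for cd00162 above can be read as a spacing pattern over cysteine and histidine residues. The following is a minimal sketch, assuming that X stands for any residue and that X(a-b) means a run of a to b residues; the names RING_MOTIF and find_ring_motif are hypothetical and belong to no database tool. A regular-expression hit only indicates compatible spacing and would still need to be checked against the actual domain model.

```python
import re

# Hypothetical transcription of the cd00162 cross-brace motif:
#   C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)-C-X2-C
# Assumption: X = any residue ('.'), X(a-b) = '.{a,b}', (N/C/H) = '[NCH]'.
RING_MOTIF = re.compile(
    r"C.{2}C.{9,39}C.{1,3}H.{2,3}[NCH].{2}C.{4,48}C.{2}C"
)


def find_ring_motif(sequence: str):
    """Return the (start, end) span of the first motif match, or None."""
    match = RING_MOTIF.search(sequence.upper())
    return match.span() if match else None


# Synthetic toy sequence (not a real protein), built from the shortest
# allowed spacers, with two flanking residues on each side.
toy = (
    "MK"
    + "C" + "AA" + "C" + "A" * 9 + "C" + "A" + "H"
    + "AA" + "H" + "AA" + "C" + "A" * 4 + "C" + "AA" + "C"
    + "GS"
)

if __name__ == "__main__":
    print(find_ring_motif(toy))
    print(find_ring_motif("ACDEFG"))
```

Because the spacer ranges are wide, such a pattern is best treated as a coarse pre-filter rather than as evidence of a RING domain.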
29105 cd00165: S4/Hsp/tRNA synthetase RNA-binding domain; The domain surface is populated by conserved, charged residues that define a likely RNA-binding site; Found in stress proteins, ribosomal proteins and tRNA synthetases; This may imply a hitherto unrecognized functional similarity between these three protein classes. 29106 cd00166: Sterile alpha motif; Widespread domain in signalling and nuclear proteins. In EPH-related tyrosine kinases, appears to mediate cell-cell initiated signal transduction via the binding of SH2-containing proteins to a conserved tyrosine that is phosphorylated. In many cases mediates homodimerization. 29107 cd00167: 'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding domains. Tandem copies of the domain bind telomeric DNA tandem repeats as part of the capping complex. Binding is sequence dependent for repeats which contain the G/C rich motif [C2-3 A (CA)1-6]. The domain is also found in regulatory transcriptional repressor complexes where it also binds DNA. 29108 cd00168: SCP / Tpx-1 / Ag5 / PR-1 / Sc7 family of extracellular domains; Human glioma pathogenesis-related protein GliPR and the plant pathogenesis-related protein represent functional links between plant defense systems and human immune system. This family has no known function.
29109 cd00169: Chemokine: small cytokines, including a number of secreted growth factors and interferons involved in mitogenic, chemotactic, and inflammatory activity; distinguished from other cytokines by their receptors, which are G-protein coupled receptors; divided into 4 subfamilies based on the arrangement of the two N-terminal cysteines; some members can bind multiple receptors and many chemokine receptors can bind more than one chemokine; this redundancy allows precise control in stimulating the immune system and in contributing to the homeostasis of a cell; when expressed inappropriately, chemokines play a role in autoimmune diseases, vascular irregularities, graft rejection, neoplasia, and allergies; exist as monomers, dimers and multimers, but are believed to function as monomers; found only in vertebrates and a few viruses. See CDs: Chemokine_CXC (cd00273), Chemokine_CC (cd00272), Chemokine_C (cd00271), and Chemokine_CX3C (cd00274) for chemokine subgroups. 29110 cd00271: Chemokine_C, C or lymphotactin subgroup, 1 of 4 subgroup designations of chemokines based on the arrangement of two N-terminal, conserved cysteine residues. Most of the known chemokines (cd00169) belong to either the CC (cd00272) or CXC (cd00273) subclass. The two other subclasses each have a single known member: fractalkine for the CX3C (cd00274) class and lymphotactin for the C (cd00271) class. Chemokine_Cs differ structurally since they contain only one of the two disulfide bridges that are conserved in all other chemokines and they possess a unique C-terminal extension, which is required for biological activity and thought to play a role in receptor binding. Lymphotactin, a mediator of mucosal immunity, has been found to chemoattract neutrophils and B cells through the XCR1 receptor and thought to be a factor in acute allograft rejection and inflammatory bowel disease. 
29111 cd00272: Chemokine_CC: 1 of 4 subgroup designations based on the arrangement of the two N-terminal cysteine residues; includes a number of secreted growth factors and interferons involved in mitogenic, chemotactic, and inflammatory activity; some members (e.g. 2HCC) contain an additional disulfide bond which is thought to compensate for the highly conserved Trp missing in these; chemotactic for monocytes, macrophages, eosinophils, basophils, and T cells, but not neutrophils; exist as monomers and dimers, but are believed to be functional as monomers; found only in vertebrates and a few viruses; a subgroup of CC, identified by an N-terminal DCCL motif (Exodus-1, Exodus-2, and Exodus-3), has been shown to inhibit specific types of human cancer cell growth in a mouse model. See CDs: Chemokine (cd00169) for the general alignment of chemokines, or Chemokine_CXC (cd00273), Chemokine_C (cd00271), and Chemokine_CX3C (cd00274) for the additional chemokine subgroups, and Chemokine_CC_DCCL for the DCCL subgroup of this CD. 29112 cd00273: Chemokine_CXC: 1 of 4 subgroup designations based on the arrangement of the two N-terminal cysteine residues; includes a number of secreted growth factors and interferons involved in mitogenic, chemotactic, and inflammatory activity; many members contain an RCxC motif which may be a general requirement for binding to CXC chemokine receptors; those with the ELR motif are chemotactic for neutrophils and have been shown to be angiogenic, while those without the motif act on T and B cells, and are typically angiostatic; exist as monomers and dimers, but are believed to be functional as monomers; found only in vertebrates and a few viruses. See CDs: Chemokine (cd00169) for the general alignment of chemokines, or Chemokine_CC (cd00272), Chemokine_C (cd00271), and Chemokine_CX3C (cd00274) for the additional chemokine subgroups.
29113 cd00274: Chemokine_CX3C: 1 of 4 subgroup designations based on the arrangement of the two N-terminal cysteines; differ structurally from the other subgroups in that they are attached to a membrane-spanning domain via a mucin-like stalk and can be proteolytically cleaved to a freely diffusible form; chemotactic for T cells, monocytes, and natural killer cells; function as monomers and are found only in vertebrates and a few viruses; currently only fractalkine (sometimes called neurotactin) has been identified as a member of this subfamily; the primary source of fractalkine is neurons, and they exhibit cell adhesion and chemoattractive properties in the central nervous system. See CDs: Chemokine (cd00169) for the general alignment of chemokines, or Chemokine_CXC (cd00273), Chemokine_CC (cd00272), and Chemokine_C (cd00271) for the additional chemokine subgroups. 29114 cd01119: Chemokine_CC_DCCL: subgroup of the Chemokine_CC subgroup based on the presence of a DCCL motif involving the two N-terminal cysteine residues; includes a number of small inducible cytokines capable of reversibly inhibiting normal hematopoietic progenitor proliferation by blocking progression through the cell cycle; DCCL subgroup contains Exodus-1 (also known as CCL20, MIP-3alpha, LARC, ST38 (mouse)), Exodus-2 (also known as CCL21, SLC, 6-Ckine, TCA4, CKbeta9), and Exodus-3 (also known as CCL-19, ELC, MIP-3beta, CKbeta11). Exodus-3 was shown to inhibit the growth of human breast cancer cells in vivo in a mouse model; Exodus-1, -2, and -3 were all shown to significantly inhibit chronic myelogenous leukemia progenitor cell proliferation; Exodus-2 and -3 show potent immunotherapeutic activity toward solid tumors; chemotactic for T cells, B cells, dendritic cells, macrophage progenitor cells, and NK cells; exist as monomers and dimers, but are believed to be functional as monomers; found only in vertebrates.
See CDs: Chemokine_CC (cd00272) for the entire CC subgroup, Chemokine (cd00169) for the general alignment of chemokines, or Chemokine_CXC (cd00273), Chemokine_C (cd00271), and Chemokine_CX3C (cd00274) for the additional chemokine subgroups. 29115 cd00170: Sec14p-like lipid-binding domain. Found in secretory proteins, such as S. cerevisiae phosphatidylinositol transfer protein (Sec14p), and in lipid regulated proteins such as RhoGAPs, RhoGEFs and neurofibromin (NF1). SEC14 domain of Dbl is known to associate with G protein beta/gamma subunits. 29116 cd00171: Sec7 domain; Domain named after the S. cerevisiae SEC7 gene product. The Sec7 domain is the central domain of the guanine-nucleotide-exchange factors (GEFs) of the ADP-ribosylation factor family of small GTPases (ARFs) . It carries the exchange factor activity. 29117 cd00172: SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. 29118 cd02043: SERine Proteinase INhibitors (serpins), plant specific subgroup. It has been suggested that plant serpins play a role in defense against insect predators. This subgroup corresponds to clade P of the serpin superfamily. In general, serpins exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. 
29119 cd02044: ovalbumin family of serpins (ov-serpins). Family of closely related proteins, whose members can be secreted (ovalbumin), cytosolic (leukocyte elastase inhibitor, LEI), or targeted to both compartments (plasminogen activator inhibitor 2, PAI-2). This subgroup corresponds to clade B of the serpin superfamily. In general, serpins exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants can cause blood clotting disorders, emphysema, cirrhosis, and dementia. 29120 cd02045: Antithrombin is a serine proteinase inhibitor (serpin) which controls the process of coagulation. It is the most important anticoagulant molecule in mammalian circulation systems, controlled by its interaction with the co-factor, heparin, which accelerates its interaction with target proteases, such as thrombin and factor Xa. This subgroup corresponds to clade C of the serpin superfamily. 29121 cd02046: Heat shock protein 47 (Hsp47), also called colligin, because of its collagen binding ability, is a chaperone specific for procollagen. It has been shown to be essential for collagen biosynthesis, but its exact function is still unclear. Hsp47 is a non-inhibitory member of the SERPIN superfamily and corresponds to clade H. 29122 cd02047: Heparin cofactor II (HCII) inhibits thrombin, the final protease of the coagulation cascade. HCII is allosterically activated by binding to cell surface glycosaminoglycans (GAGs). The specificity of HCII for thrombin is conferred by a highly acidic hirudin-like N-terminal tail, which becomes available after GAG binding for interaction with the anion-binding exosite I of thrombin. 
This subgroup corresponds to clade D of the serpin superfamily. 29123 cd02048: Neuroserpin is an inhibitory member of the SERine Proteinase INhibitor (serpin) family that reacts preferentially with tissue-type plasminogen activator (tPA). It is located in neurons in regions of the brain where tPA is also found, suggesting that neuroserpin is the selective inhibitor of tPA in the central nervous system (CNS). This subgroup corresponds to clade I of the serpin superfamily. 29124 cd02049: SERine Proteinase INhibitors (serpins), prokaryotic subgroup. Little information about specific functions is available for this subgroup; most likely they are inhibitory members of the serpin superfamily. In general, serpins exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors. 29125 cd02050: C1 inhibitor (C1-Inh) is a protease inhibitor of the serpin family. It plays a pivotal role in regulating the activation of the classical complement pathway and of the contact system, via regulating bradykinin formation, inhibiting factor XII and kallikrein of the contact system, and via acting on factor XI in the coagulation cascade. This subgroup corresponds to clade G of the serpin superfamily. 29126 cd02051: Plasminogen activator inhibitor-1_like. Plasminogen activator inhibitor-1 (PAI-1) is the primary, fast-acting inhibitor of plasminogen activators. It is often bound to vitronectin, an abundant component of the extracellular matrix in many tissues. Protease nexin-1 is a potent serpin able to inhibit thrombin, plasmin, and plasminogen activators. PAI-1 and nexin-1 are members of the serpin superfamily and represent clade E. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms.
Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. 29127 cd02052: Pigment epithelium-derived factor (PEDF)_like. PEDF is a non-inhibitory member of the serpin superfamily. It exhibits neurotrophic, neuroprotective and antiangiogenic properties and is widely expressed in the developing and adult nervous systems. This subgroup corresponds to clade F1 of the serpin superfamily. 29128 cd02053: Alpha2-antiplasmin (alpha2AP) is the primary inhibitor of plasmin, a proteinase that digests fibrin, the main component of blood clots. Alpha2-antiplasmin forms an inactive 1:1 stoichiometric complex with plasmin. It also rapidly crosslinks to fibrin during blood clotting by activated coagulation factor XIII, and as a consequence fibrin becomes more resistant to fibrinolysis. Therefore alpha2AP is important in modulating the effectiveness and persistence of fibrin with respect to its susceptibility to digestion and removal by plasmin. This subgroup corresponds to clade F2 of the serpin superfamily. 29129 cd02054: Angiotensinogen is part of the renin-angiotensin system (RAS), which plays an important role in blood pressure regulation, renal haemodynamics, fluid and electrolyte homeostasis. It is also involved in normal and abnormal growth processes. The growth promoting actions of angiotensin have been shown in a variety of cells and tissues. This subgroup represents clade A8 of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. 
Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. 29130 cd02055: Protein Z-dependent protease inhibitor (ZPI) is a member of the serpin superfamily of proteinase inhibitors (clade A10). ZPI inhibits coagulation factor Xa, dependent on protein Z (PZ), a vitamin K-dependent plasma protein. ZPI also inhibits factor XIa in a process that does not require PZ. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. 29131 cd02056: alpha-1-antitrypsin_like. This family contains a variety of different members of clade A of the serpin superfamily. They include the classical serine proteinase inhibitors, alpha-1-antitrypsin and alpha-1-antichymotrypsin, protein C inhibitor, kallistatin, and noninhibitory serpins, like corticosteroid and thyroxin binding globulins. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. 29132 cd02057: Maspin (mammary serine proteinase inhibitor), a member of the serpin superfamily, with a multitude of effects on cells and tissues at an assortment of developmental stages. Maspin has tumor suppressing activity against breast and prostate cancer. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. 
Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. 29133 cd02058: Plasminogen Activator Inhibitor-2 (PAI-2). PAI-2 is a serine protease inhibitor that belongs to the ov-serpin branch of the serpin superfamily. It is an effective inhibitor of urinary plasminogen activator (urokinase or uPA) and is involved in cell differentiation, tissue growth and regeneration. 29134 cd02059: The ovalbumin_like group of serpins contains ovalbumin, the squamous cell carcinoma antigen 1 (SCCA1) and other closely related serpins of clade B of the serpin superfamily. Ovalbumin, the major protein component of avian egg white, is a non-inhibitory member of SERine Proteinase INhibitorS (serpins). In contrast, SCCA1 is a so-called cross-class serpin that inhibits cysteine proteinases such as cathepsins S, K, and L and papain. 29135 cd00173: Src homology 2 domains; Signal transduction, involved in recognition of phosphorylated tyrosine (pTyr). SH2 domains typically bind pTyr-containing ligands via two surface pockets, a pTyr-binding pocket and a hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 29136 cd00174: Src homology 3 domains; SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediating multiprotein complex assemblies. 29137 cd00175: Staphylococcal nuclease homologues. SNase homologues are found in bacteria, archaea, and eukaryotes. They contain no disulfide bonds. 
29138 cd00176: Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here. 29139 cd00177: START (STeroidogenic Acute Regulatory (STAR) related lipid Transfer) Domain. These domains are 200-210 amino acids in length and occur in proteins involved in lipid transport (phosphatidylcholine) and metabolism, signal transduction, and transcriptional regulation. The most striking feature of the START domain structure is a predominantly hydrophobic tunnel extending nearly the entire length of the protein and used to bind a single molecule of large lipophilic compounds, like cholesterol. 29140 cd00178: Soybean trypsin inhibitor (Kunitz) family of protease inhibitors. Inhibit proteases by binding with high affinity to their active sites. Trefoil fold, common to interleukins and fibroblast growth factors. 
29141 cd00179: Syntaxin N-terminus domain; syntaxins are nervous system-specific proteins implicated in the docking of synaptic vesicles with the presynaptic plasma membrane; they are a family of receptors for intracellular transport vesicles; each target membrane may be identified by a specific member of the syntaxin family; syntaxins contain a moderately well conserved amino-terminal domain, called Habc, whose structure is an antiparallel three-helix bundle; a linker of about 30 amino acids connects this to the carboxy-terminal region, designated H3 (t_SNARE), of the syntaxin cytoplasmic domain; the highly conserved H3 region forms a single, long alpha-helix when it is part of the core SNARE complex and anchors the protein on the cytoplasmic surface of cellular membranes; H3 is not included in defining this domain. 29142 cd00180: Serine/Threonine protein kinases, catalytic domain. Phosphotransferases of the serine or threonine-specific kinase subfamily. The enzymatic activity of these protein kinases is controlled by phosphorylation of specific residues in the activation segment of the catalytic domain, sometimes combined with reversible conformational changes in the C-terminal autoregulatory tail. 29143 cd00181: Taxis toward Aspartate and Related amino acids and Homologs (TarH). The Tar chemoreceptor of Escherichia coli mediates attractant responses to aspartate, maltose, and phenol, repellent responses to Ni2+ and Co2+, and thermoresponses. These transmembrane signalers monitor the chemical environment by means of specific ligand-binding sites arrayed on the periplasmic side of the membrane, and in turn control cytoplasmic signals that modulate the flagellar rotational machinery. Aspartate is detected through direct binding to Tar molecules, whereas maltose is detected indirectly when complexed with the periplasmic maltose-binding protein. 29144 cd00182: T-box DNA binding domain of the T-box family of transcriptional regulators. 
The T-box family is an ancient group that appears to play a critical role in development in all animal species. These genes were uncovered on the basis of similarity to the DNA binding domain of murine Brachyury (T) gene product, the defining feature of the family. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development and conserved expression patterns, most of the known genes in all species being expressed in mesoderm or mesoderm precursors. 29145 cd00183: N-terminal domain (domain I) of transcription elongation factor S-II (TFIIS); similar to a domain found in elongin A and CRSP70; likely to be involved in transcription; domain I from TFIIS interacts with RNA polymerase II holoenzyme. 29146 cd00184: Tumor Necrosis Factor; TNF superfamily members include the cytokines: TNF (TNF-alpha), LT (lymphotoxin-alpha, TNF-beta), CD40 ligand, Apo2L (TRAIL), Fas ligand, and osteoprotegerin (OPG) ligand. These proteins generally have an intracellular N-terminal domain, a short transmembrane segment, an extracellular stalk, and a globular TNF-like extracellular domain of about 150 residues. They initiate apoptosis by binding to related receptors, some of which have intracellular death domains. They generally form homo- or hetero-trimeric complexes. TNF cytokines bind one elongated receptor molecule along each of three clefts formed by neighboring monomers of the trimer, with ligand trimerization a requisite for receptor binding. 29147 cd00185: Tumor necrosis factor receptor (TNFR) domain; superfamily of TNF-like receptor domains. When bound to TNF-like cytokines, TNFRs trigger multiple signal transduction pathways; they are involved in the inflammatory response, apoptosis, autoimmunity and organogenesis. TNFR domains are elongated with generally three tandem repeats of cysteine-rich domains (CRDs). They fit in the grooves between protomers within the ligand trimer. 
Some TNFRs, such as NGFR and HveA, bind ligands with no structural similarity to TNF and do not bind ligand trimers. 29149 cd00187: DNA Topoisomerase, subtype IIA; domain A'; bacterial DNA topoisomerase IV (C subunit, ParC), bacterial DNA gyrases (A subunit, GyrA), mammalian DNA topoisomerases II. DNA topoisomerases are essential enzymes that regulate the conformational changes in DNA topology by catalysing the concerted breakage and rejoining of DNA strands during normal cellular growth. 29151 cd00189: Tetratricopeptide repeat domain; typically contains 34 amino acids; [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1, 3-11, 13, 15, 16, 19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here. 29152 cd00190: Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. 
The alignment also contains inactive enzymes that have substitutions of the catalytic triad residues. 29153 cd00191: Thyroglobulin type I repeats; the N-terminal region of human thyroglobulin contains 11 type-1 repeats. TY repeats are proposed to be inhibitors of cysteine proteases. 29154 cd00192: Tyrosine kinase, catalytic domain. Phosphotransferases; tyrosine-specific kinase subfamily. Enzymes with TyrKc domains belong to an extensive family of proteins which share a conserved catalytic core common to both serine/threonine and tyrosine protein kinases. Enzymatic activity of tyrosine protein kinases is controlled by phosphorylation of specific tyrosine residues in the activation segment of the catalytic domain or a C-terminal tyrosine (tail) residue with reversible conformational changes. 29155 cd00193: Soluble NSF (N-ethylmaleimide-sensitive fusion protein)-Attachment protein (SNAP) REceptor domain; these alpha-helical motifs form twisted and parallel heterotetrameric helix bundles; the core complex contains one helix from a protein that is anchored in the vesicle membrane (synaptobrevin), one helix from a protein of the target membrane (syntaxin), and two helices from another protein anchored in the target membrane (SNAP-25); their interaction forms a core which is composed of a polar zero layer; a flanking leucine-zipper layer acts as a water-tight shield to isolate ionic interactions in the zero layer from the surrounding solvent. 29156 cd00194: Ubiquitin Associated domain. The UBA domain is a commonly occurring sequence motif in some members of the ubiquitination pathway, UV excision repair proteins, and certain protein kinases. Although its specific role is so far unknown, it has been suggested that UBA domains are involved in conferring protein target specificity. The domain, a compact three helix bundle, has a conserved GFP-loop and the proline is thought to be critical for binding. 
The UBA domain is distinct from the conserved three helical domain seen in the N-terminus of EF-TS and eukaryotic NAC proteins. 29157 cd00195: Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD. 29158 cd00153: This CD represents the C-terminal Ras-associating (RA) domain of three closely related guanine-nucleotide exchange factors (GEF's), Ral guanine nucleotide dissociation stimulator (RalGDS), RalGDS-like (RGL), and RalGDS-like factor (RLF). The RalGDS proteins are downstream effectors of the Ras-related protein Ral, providing a mechanism for Ral activation by extracellular signals. The RA domain is structurally similar to ubiquitin and exists in a number of other signalling proteins including AF6, rasfadin, SNX27, CYR1, and STE50. 29159 cd00196: Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. 
The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate. 29160 cd00565: ThiS (ThiaminS) is a sulfur carrier protein involved in thiamin biosynthesis in bacteria. The ThiS fold, like those of two closely related proteins MoaD and Urm1, is similar to that of ubiquitin although there is little or no sequence similarity. 29161 cd00754: MoaD family. Members of this family are involved in biosynthesis of the molybdenum cofactor (Moco), an essential cofactor of a diverse group of redox enzymes. Moco biosynthesis is an evolutionarily conserved pathway present in eubacteria, archaea and eukaryotes. Moco contains a tricyclic pyranopterin, termed molybdopterin (MPT), that contains the cis-dithiolene group responsible for molybdenum ligation. This dithiolene group is generated by MPT synthase, the second major step in Moco biosynthesis. MPT synthase consists of a large (MoaE) and small (MoaD) subunit. The small subunit is inserted into the large subunit to form the active site. The small subunit, which is structurally similar to ubiquitin, contains a C-terminal thiocarboxylated glycine residue that serves as a sulfur donor for the synthesis of the MPT dithiolene group. 29162 cd01611: GABARAP (GABA-receptor-associated protein) belongs to a large family of proteins that mediate intracellular membrane trafficking and/or fusion. GABARAP binds not only to the GABA receptor, type A but also to tubulin, gephyrin, and ULK1. Orthologues of GABARAP include Gate-16 (golgi-associated ATPase enhancer), LC3 (microtubule-associated protein light chain 3), and ATG8 (autophagy protein 8). 
ATG8 is a ubiquitin-like protein that is conjugated to the membrane phospholipid phosphatidylethanolamine as part of a ubiquitin-like conjugation system essential for autophagosome formation. 29163 cd01612: APG12_C The carboxy-terminal ubiquitin-like domain of APG12. Autophagy is a process in which cytoplasmic components are delivered to the lysosome/vacuole for degradation. Autophagy requires a ubiquitin-like protein conjugation system, in which APG12 is covalently bound to APG5. 29164 cd01617: DCX The ubiquitin-like DCX domain is present in tandem within the N-terminal half of the doublecortin protein. Doublecortin is expressed in migrating neurons. Mutations in the gene encoding doublecortin cause lissencephaly in males and 'double-cortex syndrome' in females. 29165 cd01760: The ras-binding domain (RBD) of the serine/threonine kinase raf is structurally quite similar to the beta-grasp fold of ubiquitin. A raf-like RBD is also present in RGS12 and other members of a family of GTPase activating proteins and TIAM1, a guanine nucleotide exchange protein. 29166 cd01763: Small ubiquitin-related modifier (SUMO) proteins are conjugated to numerous intracellular targets and serve to modulate protein interaction, localization, activity or stability. SUMO (also known as "Smt3" and "sentrin" in other organisms) is linked to several different pathways, including nucleocytoplasmic transport. Attachment of SUMO to target proteins is stimulated by PIAS (Protein inhibitor of activated STATs) proteins which serve as E3-like ligases. 29167 cd01764: Urm1 (Ubiquitin-Related Modifier1) The Urm1 fold, like those of two closely related proteins MoaD (molybdopterin synthase) and ThiS (sulfur carrier protein), is similar to that of ubiquitin although there is little or no sequence similarity. The C-terminal glycines of Urm1 are conjugated to an E1-like protein Uba4 as part of a novel conjugation system in yeast. The Urm1 fold is found only in eukaryotes. 
29168 cd01766: Ufm1 (ubiquitin-fold modifier 1) is a post-translational UBL (ubiquitin-like) modifier with a tertiary structure similar to that of ubiquitin. Ufm1 is initially expressed as a precursor which undergoes C-terminal cleavage to expose a conserved glycine residue that is required for the conjugation reactions involving Ufm1. 29169 cd01767: The UBX (ubiquitin regulatory X) domain has a beta-grasp fold that is structurally quite similar to ubiquitin although UBX lacks the c-terminal double glycine motif and is thus unlikely to be conjugated to other proteins. Most UBX-containing proteins including p47, FAF1, and SAKS1 (Y33K) also contain a UBA (ubiquitin-associated) domain and are thought to serve as adaptor molecules that shuttle proteins to the proteasome for degradation. 29170 cd01768: The RA (Ras-associating) domain is structurally similar to ubiquitin and is present in one or two copies in a number of signalling molecules that bind and regulate a small GTPase called Ras or the Ras-related GTPases, Ral and Rap. RA-containing proteins include RalGDS, AF6, RIN1, RASSF1, SNX27, CYR1, STE50, and phospholipase C epsilon. 29171 cd01769: UBLs function by remodeling the surface of their target proteins, changing their target's half-life, enzymatic activity, protein-protein interactions, subcellular localization or other properties. At least 10 different ubiquitin-like modifications exist in mammals, and attachment of different ubls to a target leads to different biological consequences. Ubl-conjugation cascades are initiated by activating enzymes, which also coordinate the ubls with their downstream pathways. 29172 cd01770: p47_UBX p47 is an adaptor molecule of the cytosolic AAA ATPase p97. The principal role of the p97-p47 complex is to regulate membrane fusion events. Mono-ubiquitin recognition by p47 is crucial for p97-p47-mediated Golgi membrane fusion events. p47 has carboxy-terminal SEP and UBX domains. 
The UBX domain has a beta-grasp fold similar to that of ubiquitin; however, UBX lacks the c-terminal double glycine motif and is thus unlikely to be conjugated to other proteins. 29173 cd01771: Faf1 (fas-associated factor1) is a nucleolar protein that was first identified as an interaction partner of the death receptor Fas. Faf1 contains N-terminal UAS (ubiquitin-associated) and C-terminal UBX (ubiquitin-like) domains and is closely related to other UBA/UBX-containing proteins like p47, Rep8 and SAKS1. Faf1 is thought to be involved in 18S rRNA synthesis and/or 40S ribosomal subunit assembly. The UBX domain has a beta-grasp fold similar to that of ubiquitin; however, UBX lacks the c-terminal double glycine motif and is thus unlikely to be conjugated to other proteins. 29174 cd01772: SAKS1 (SAPK-substrate-1), also known as Y33K, is a widely expressed protein containing N-terminal UBA (ubiquitin-associated) and C-terminal UBX (ubiquitin-like) domains that was identified as a substrate of stress-activated protein kinases (SAPKs). SAKS1 is related evolutionarily to two other UBA/UBX-containing proteins, p47 and Faf1. The UBA and UBX domains of SAKS1 bind ubiquitin tetramers and valosin-containing protein (VCP), respectively, suggesting a role for SAKS1 as an adaptor that directs VCP to polyubiquitinated proteins, facilitating their destruction by the proteasome. The UBX domain has a beta-grasp fold similar to that of ubiquitin; however, UBX lacks the c-terminal double glycine motif and is thus unlikely to be conjugated to other proteins. 29175 cd01773: Faf1_like1 is a protein of unknown function with a domain architecture that includes the UAS (ubiquitin-associated) and UBX (ubiquitin-like) domains. This protein is related to other UBA/UBX-containing proteins like Faf1, p47, and SAKS1 and may serve as an adaptor molecule that shuttles proteins to the proteasome for degradation. 
The UBX domain has a beta-grasp fold similar to that of ubiquitin; however, UBX lacks the c-terminal double glycine motif and is thus unlikely to be conjugated to other proteins. 29176 cd01774: Faf1_like2 is a protein of unknown function with a domain architecture that includes the UAS (ubiquitin-associated) and UBX (ubiquitin-like) domains. This protein is related to other UBA/UBX-containing proteins like Faf1, p47, and SAKS1 and may serve as an adaptor molecule that shuttles proteins to the proteasome for degradation. The UBX domain has a beta-grasp fold similar to that of ubiquitin; however, UBX lacks the c-terminal double glycine motif and is thus unlikely to be conjugated to other proteins. 29177 cd01775: CYR1 is a fungal adenylate cyclase with at least four domains, including an N-terminal RA (Ras association) domain, a middle leucine-rich repeat domain, and a catalytic domain. The N-terminal RA domain of CYR1 post-translationally modifies a small GTPase called Ras. The Ras-CYR1 pathway has been implicated in the transduction of a glucose-triggered signal to an intracellular environment where a protein phosphorylation cascade is initiated by cyclic AMP. 29178 cd01776: Rin1_RA RIN1 is a RAS effector that binds with specificity and high affinity to activated RAS via its carboxy-terminal RA (RAS-associated) domain. RIN1 competes directly with RAF1 for RAS binding and is thought to divert signaling away from RAF and the MAPK pathway while also shunting RAS signals through alternate pathways. In addition, Rin1 and Rin2 are Rab5-binding proteins, binding preferentially to the GTP-bound form, that enhance the GDP-GTP exchange reaction on Rab5, which regulates the docking and fusion processes of endocytic vesicles. In addition to the RA domain, RIN1 and RIN2 have an SH2 (Src homology 2) domain, a proline-rich SH3 domain, and a Vps9 domain. 
29179 cd01777: SNX27_RA SNX27 (sorting nexin protein 27) belongs to a large family of endosome-localized proteins related to sorting nexin1 which is implicated in regulating membrane traffic. The domain architecture of SNX27 includes an amino-terminal PDZ domain, a PX (PhoX homologous) domain, and a carboxy-terminal RA (RAS-associated) domain. 29180 cd01778: RASSF1 (also known as RASSF3 and NORE1) is a tumour suppressor protein with a C-terminal Ras-associating (RA) domain that binds Ras. RASSF1 also binds the proapoptotic protein kinase MST1 and is thus thought to regulate the proapoptotic signalling pathway. RASSF1 also associates with microtubule-associated proteins like MAP1B and regulates tubulin polymerization. RASSF1 also binds CDC20 and regulates mitosis by inhibiting the anaphase-promoting complex and preventing degradation of cyclin A and cyclin B until the spindle checkpoint becomes fully operational. 29181 cd01779: Myosin_IXb_RA RasGTP binding domain from guanine nucleotide exchange factors. In some proteins the domain acts as a RasGTP effector (AF6, canoe and RalGDS, for example), but in other cases it may not bind to RasGTP at all. 29182 cd01780: PLC_epsilon_RA Phosphatidylinositide-specific phospholipase C (PLC) is a signaling enzyme that hydrolyzes membrane phospholipids to generate inositol triphosphate. PLC-epsilon represents a novel fourth class of PLC that has a PLC catalytic core domain, a CDC25 guanine nucleotide exchange factor domain and two RA (Ras-association) domains of which the second is critical for Ras activation of the enzyme. 29183 cd01781: The AF-6 protein (also known as afadin and canoe) is a multidomain cell junction protein that contains two N-terminal Ras-associating (RA) domains in addition to FHA (forkhead-associated), DIL (class V myosin homology region), and PDZ domains and a proline-rich region. 
AF6 acts downstream of the Egfr (Epidermal Growth Factor-receptor)/Ras signalling pathway and provides a link from Egfr to cytoskeletal elements. 29184 cd01782: The AF-6 protein (also known as afadin and canoe) is a multidomain cell junction protein that contains two N-terminal Ras-associating (RA) domains in addition to FHA (forkhead-associated), DIL (class V myosin homology region), and PDZ domains and a proline-rich region. AF6 acts downstream of the Egfr (Epidermal Growth Factor-receptor)/Ras signalling pathway and provides a link from Egfr to cytoskeletal elements. 29185 cd01783: DAGK_delta_RA Diacylglycerol kinase (DAGK) phosphorylates the second messenger diacylglycerol to phosphatidic acid as part of a protein kinase C pathway. Nine mammalian DAGK isotypes have been identified, which are classified into five subgroups according to their domain architecture; the DAGK-delta and -theta isozymes, which fall into one such group, contain an RA (Ras-associated) domain. DAGKs also contain a conserved catalytic domain (DAGKc), an accessory domain (DAGKa), and an array of conserved motifs that are likely to play a role in lipid-protein and protein-protein interactions in various DAG/PA-dependent signalling pathways. 29186 cd01784: rasfadin_RA Rasfadin (RASSF2) belongs to a family of Ras effectors/tumor suppressors that includes RASSF1 and NORE1. RASSF2 binds directly to K-Ras in a GTP-dependent manner via its RA (RAS-associated) domain. RASSF2 promotes apoptosis and cell cycle arrest and is frequently down-regulated in lung tumor cell lines. 29187 cd01785: PDZ_GEF_RA PDZ-GEF is a guanine nucleotide exchange factor (GEF) characterised by the presence of a PSD-95/DlgA/ZO-1 (PDZ) domain, a Ras-association (RA) domain and a region related to a cyclic nucleotide binding domain (RCBD). RA-GEF exchanges nucleotides of both Rap1 and Rap2, but is also thought to mediate cAMP-induced Ras activation. 
The RA domain interacts with Rap1 and also contributes to the membrane localization of RA-GEF. This domain may function in a positive feedback loop. 29188 cd01786: STE50_RA The fungal adaptor protein STE50 is an essential component of three MAPK-mediated signalling pathways, which control the mating response, invasive/filamentous growth and osmotolerance (HOG pathway), respectively. STE50 functions in cell signalling between the activated G protein and STE11. The domain architecture of STE50 includes an amino-terminal SAM (sterile alpha motif) domain in addition to the carboxy-terminal ubiquitin-like RA (RAS-associated) domain. While the SAM domain interacts with STE11, the RA domain interacts with CDC42 and RAS. Modulation of signal transduction by STE50 specifically affects the pheromone-response pathway in yeast. 29189 cd01787: Grb7_RA The RA (RAS-associated like) domain of Grb7. Grb7 is an adaptor molecule that mediates signal transduction from multiple cell surface receptors to various downstream signaling pathways. Grb7 and its related family members Grb10 and Grb14 share a conserved domain architecture that includes an amino-terminal proline-rich region, a central segment termed the GM region (for Grb and Mig) which includes the RA, PIR, and PH domains, and a carboxyl-terminal SH2 domain. Grb7/10/14 family proteins are phosphorylated on serine/threonine as well as tyrosine residues and are mainly localized to the cytoplasm. 29190 cd01788: Elongin B is part of an E3 ubiquitin ligase complex called VEC that activates ubiquitylation by the E2 ubiquitin-conjugating enzyme Ubc5. VEC is composed of von Hippel-Lindau tumor suppressor protein (pVHL), elongin C, cullin 2, NEDD8, and Rbx1. ElonginB binds elonginC to form the elonginBC complex which is a positive regulator of RNA polymerase II elongation factor Elongin A. The BC complex then binds VHL (von Hippel-Lindau) tumour suppressor protein to form a VCB ternary complex. Elongin B has a ubiquitin-like domain. 
29191 cd01789: Alp11, also known as tubulin-folding cofactor B, is one of at least three proteins required for the proper folding of tubulins prior to their incorporation into microtubules. These cofactors are necessary for the biogenesis of microtubules and for cell viability. Alp11 has three domains including an N-terminal ubiquitin-like domain (represented by this CD) which executes the essential function, a central coiled-coil domain necessary for maintenance of cellular alpha-tubulin levels, and a C-terminal CLIP-170 domain required for efficient binding to alpha-tubulin. 29192 cd01790: Herp (Homocysteine-responsive endoplasmic reticulum-resident ubiquitin-like domain protein) is an integral membrane protein that is induced by the endoplasmic reticulum (ER) stress response pathway and is involved in improving the balance of folding capacity and protein loads in the ER. Herp has an N-terminal ubiquitin-like domain that is involved in Herp degradation, but is not necessary for its enhancement of amyloid beta-protein generation. 29193 cd01791: UBL5 (also known as HUB1) is a ubiquitin-like modifier that is both widely expressed and highly phylogenetically conserved. At the C-terminal end of the ubiquitin-like fold of UBL5 is a di-tyrosine motif followed by a single variable residue instead of the characteristic di-glycine found in all other ubiquitin-like modifiers. UBL5 interacts with a cyclin-like kinase called CLK4 but not with other cyclin-like kinase family members. 29194 cd01792: ISG15 is a ubiquitin-like protein containing two ubiquitin homology domains that becomes conjugated to a variety of proteins when cells are treated with type I interferon or lipopolysaccharide. 
Although ISG15 has properties similar to those of other ubiquitin-like molecules, it is a unique member of the ubiquitin-like superfamily, whose expression and conjugation to target proteins are tightly regulated by specific signaling pathways, indicating it may have specialized functions in the immune system. 29195 cd01793: Fubi is a ubiquitin-like protein encoded by the fau gene which has an N-terminal ubiquitin-like domain (also referred to as FUBI) fused to the ribosomal protein S30. Fubi is thought to be a tumor suppressor protein and the FUBI domain may act as a substitute or an inhibitor of ubiquitin or one of ubiquitin's close relatives UCRP, FAT10, and Nedd8. 29196 cd01794: DC_UbP (dendritic cell derived ubiquitin-like protein) is a ubiquitin-like protein from human dendritic cells that is expressed in the mitochondrion. The ubiquitin-like domain of this protein is found at the C-terminus and lacks the canonical gly-gly motif of ubiquitin required for ubiquitinization. DC_UbP is expressed in tumor cells but not in normal human adult tissue suggesting a role for DC_UbP in tumorigenesis. 29197 cd01795: The USP (ubiquitin-specific protease) family is one of at least seven deubiquitylating enzyme (DUB) families capable of deconjugating ubiquitin and ubiquitin-like adducts. While the USPs have a conserved catalytic core domain, they differ in their domain architectures. This subfamily, which includes USP31 and USP48, has a carboxy-terminal ubiquitin-like domain in addition to a DUSP (domain of ubiquitin-specific proteases) domain. 29198 cd01796: DDI1_N DDI1 (DNA damage inducible protein 1) has an amino-terminal ubiquitin-like domain, a retroviral protease-like (RVP-like) domain, and a UBA (ubiquitin-associated) domain. This CD represents the amino-terminal ubiquitin-like domain of DDI1. 
29199 cd01797: NIRF_N This CD represents the amino-terminal ubiquitin-like domain of a family of nuclear proteins that includes Np95 and NIRF (Np95/ICBP90-like RING finger) protein. Both Np95 and NIRF have a domain architecture consisting of a ubiquitin-like domain, a PHD finger, a YDG/SRA domain, Rb-binding motifs and a RING finger domain. Both Np95 and NIRF are ubiquitin ligases that ubiquitinate PCNP (PEST-containing nuclear proteins). While Np95 is capable of binding histones, NIRF is involved in cell cycle regulation. 29200 cd01798: parkin_N parkin protein is a RING-type E3 ubiquitin ligase with an amino-terminal ubiquitin-like (Ubl) domain and an RBR signature consisting of two RING finger domains separated by an IBR/DRIL domain. Naturally occurring mutations in parkin are thought to cause the disease AR_JP (autosomal-recessive juvenile parkinsonism). Parkin binds the Rpn10 subunit of 26S proteasomes through its Ubl domain. 29201 cd01799: HOIL1_N HOIL-1 (heme-oxidized IRP2 ubiquitin ligase-1) is an E3 ubiquitin-protein ligase that recognizes heme-oxidized IRP2 (iron regulatory protein2) and is thought to affect the turnover of oxidatively damaged proteins. Hoil-1 has an amino-terminal ubiquitin-like domain as well as an RBR signature consisting of two RING finger domains separated by an IBR/DRIL domain. 29202 cd01800: SF3a120_C Mammalian splicing factor SF3a consists of three subunits of 60, 66, and 120 kDa and functions early during pre-mRNA splicing by converting the U2 snRNP to its active form. The 120kDa subunit (SF3a120) has a carboxy-terminal ubiquitin-like domain and two SWAP (suppressor-of-white-apricot) domains, referred to collectively as the SURP module, at its amino-terminus. 29203 cd01801: Tsc13_N N-terminal domain of Tsc13. Tsc13 is an enoyl reductase involved in elongation of long chain fatty acids that localizes to the endoplasmic reticulum and is highly enriched in a novel structure marking nuclear-vacuolar junctions. 
29204 cd01802: AN1 (also known as ANUBL1 and RSD-7) is a ubiquitin-like protein with a testis-specific expression in rats that has an N-terminal ubiquitin-like domain and a C-terminal zinc-binding domain. Unlike ubiquitin polyproteins and most ubiquitin fusion proteins, the N-terminal ubiquitin-like domain of AN1 does not undergo proteolytic processing. The function of AN1 is unknown. 29205 cd01803: Ubiquitin (includes Ubq/RPL40e and Ubq/RPS27a fusions as well as homopolymeric multiubiquitin protein chains). 29206 cd01804: midnolin_N Midnolin (midbrain nucleolar protein) is expressed in the nucleolus and is thought to regulate genes involved in neurogenesis. Midnolin contains an amino-terminal ubiquitin-like domain. 29207 cd01805: RAD23 belongs to a family of adaptor molecules having affinity for both the proteasome and ubiquitinylated proteins and thought to shuttle these ubiquitinylated proteins to the proteasome for destruction. RAD23 interacts with ubiquitin through its C-terminal ubiquitin-associated domains (UBA) and with the proteasome through its N-terminal ubiquitin-like domain (UBL). 29208 cd01806: Nedd8 (also known as Rub1) has a single conserved ubiquitin-like domain that is part of a protein modification pathway similar to that of ubiquitin. Nedd8 modifies a family of molecular scaffold proteins called cullins that are responsible for assembling the ROC1/Rbx1 RING-based E3 ubiquitin ligases, of which several play a direct role in tumorigenesis. 29209 cd01807: GDX contains an N-terminal ubiquitin-like domain as well as an uncharacterized C-terminal domain. The function of GDX is unknown. 29210 cd01808: hPLIC-1 and hPLIC-2 (human homologs of the yeast ubiquitin-like Dsk2 protein) are type 2 UBL (ubiquitin-like) proteins that are thought to serve as adaptors that link the ubiquitination machinery to the proteasome. 
The hPLICs have an N-terminal UBL domain that binds the S5a subunit of the proteasome and a C-terminal UBA (ubiquitin-associated) domain that binds a ubiquitylated protein. 29211 cd01809: Scythe protein (also known as Bat3) is an apoptotic regulator that is highly conserved in eukaryotes and contains a ubiquitin-like domain near its N-terminus. Scythe binds reaper, a potent apoptotic inducer, and Scythe/Reaper are thought to signal apoptosis, in part through regulating the folding and activity of apoptotic signaling molecules. 29212 cd01810: ISG15 is a ubiquitin-like protein containing two ubiquitin homology domains and becomes conjugated to a variety of proteins when cells are treated with type I interferon or lipopolysaccharide. Although ISG15 has properties similar to those of other ubiquitin-like molecules, it is a unique member of the ubiquitin-like superfamily, whose expression and conjugation to target proteins are tightly regulated by specific signaling pathways, indicating it may have specialized functions in the immune system. 29213 cd01811: OASL_repeat1 (2'-5' oligoadenylate synthetase-like protein) belongs to a family of interferon-induced 2'-5' oligoadenylate synthetases which are important for the antiviral activity of interferons. While each member of this family has a conserved N-terminal OAS catalytic domain, only OASL has two tandem ubiquitin-like repeats located at the C-terminus and this CD represents one of those repeats. 29214 cd01812: BAG1_N N-terminal ubiquitin-like (Ubl) domain of the BAG1 protein. This domain occurs together with the BAG domain and is closely related to the Ubl domain of a family of deubiquitinases that includes Rpn11, UBP6 (USP14), USP7 (HAUSP). 29215 cd01813: The UBP (ubiquitin processing protease) domain (also referred to as USP which stands for "ubiquitin-specific protease") is present in a large family of cysteine proteases that specifically cleave ubiquitin conjugates. 
This family includes Rpn11, UBP6 (USP14), USP7 (HAUSP). This domain is closely related to the amino-terminal ubiquitin-like domain of BAG1 (Bcl2-associated athanogene 1) protein and is found only in eukaryotes. 29216 cd01814: NTGP5 and ATGP4 are plant-specific isoprenylated GTP-binding proteins with a single fold that resembles ubiquitin. The function of these proteins is unknown. 29217 cd01815: BMSC_UbP (bone marrow stromal cell-derived ubiquitin-like protein) has an N-terminal ubiquitin-like (UBQ) domain and a C-terminal ubiquitin-associated (UBA) domain, a domain architecture similar to those of the UBIN, Chap1, and ubiquilin proteins. This CD represents the N-terminal ubiquitin-like domain. 29218 cd01816: The Raf serine/threonine kinases are composed of three conserved regions, CR1, CR2 and CR3. CR1 has two Ras binding domains (RBD and CRD), CR2 is a serine/threonine rich domain and CR3 is the catalytic kinase domain. The RBD of Raf is structurally similar to ubiquitin with little or no sequence similarity. The Raf signalling pathway plays an important role in the proliferation and survival of tumor cells. 29219 cd01817: RGS12 (regulator of G signalling 12) and RGS14 are members of a family of GTPase-activating proteins (GAPs) specific for the G-alpha subunit, which act as key inhibitors of G-protein-mediated cell responses in eukaryotes. Their domain architecture includes tandem RBD domains as well as PDZ, PTB, RGS, and GoLoco domains. 29220 cd01818: Tiam1 (T lymphoma invasion and metastasis 1), a guanine nucleotide exchange factor that activates Rac, is an important regulator of Rho GTPase functions in tumor cells including regulation of cell shape and invasiveness in epithelial cells and fibroblasts. TIAM1 has an RBD (Ras-binding domain) similar to that of Raf kinase as well as PH (pleckstrin homology), PDZ, and RhoGEF domains. 
29222 cd00198: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains. 29223 cd01450: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains. 
29224 cd01451: Magnesium chelatase: Mg-chelatase catalyses the insertion of Mg into protoporphyrin IX (Proto). In chlorophyll biosynthesis, insertion of Mg2+ into protoporphyrin IX is catalysed by magnesium chelatase in an ATP-dependent reaction. Magnesium chelatase is a three sub-unit (BchI, BchD and BchH) enzyme with a novel arrangement of domains: the C-terminal helical domain is located behind the nucleotide binding site. The BchD domain contains an AAA domain at its N-terminus and a VWA domain at its C-terminus. The VWA domain has been speculated to be involved in mediating protein-protein interactions. 29225 cd01452: 26S proteasome plays a major role in eukaryotic protein breakdown, especially for ubiquitin-tagged proteins. It is an ATP-dependent protease responsible for the bulk of non-lysosomal proteolysis in eukaryotes, often using covalent modification of proteins by ubiquitylation. It consists of a 20S proteolytic core particle (CP) and a 19S regulatory particle (RP). The CP is an ATP independent peptidase consisting of hydrolyzing activities. One or both ends of CP carry the RP that confers both ubiquitin and ATP dependence to the 26S proteasome. The RP's proposed functions include recognition of substrates and translocation of these to CP for proteolysis. The RP can dissociate into a stable lid and base subcomplexes. The base is composed of three non-ATPase subunits (Rpn 1, 2 and 10). A single residue in the vWA domain of Rpn10 has been implicated to be responsible for stabilizing the lid-base association. 29226 cd01453: Transcription factors IIH type: TFIIH is a multiprotein complex that is one of the five general transcription factors that binds RNA polymerase II holoenzyme. Orthologues of these genes are found in all completed eukaryotic genomes and all these proteins contain a VWA domain. 
The p44 subunit of TFIIH functions as a DNA helicase in RNA polymerase II transcription initiation and DNA repair, and its transcriptional activity is dependent on its C-terminal Zn-binding domains. The function of the vWA domain is unclear, but may be involved in complex assembly. The MIDAS motif is not conserved in this sub-group. 29227 cd01454: norD type: Denitrifying bacteria contain both membrane bound and periplasmic nitrate reductases. Denitrification plays a major role in completing the nitrogen cycle by converting nitrate or nitrite to nitrogen gas. The pathway for microbial denitrification has been established as NO3- -> NO2- -> NO -> N2O -> N2. This reaction generally occurs under oxygen limiting conditions. Genetic and biochemical studies have shown that the first step of the biochemical pathway is catalyzed by periplasmic nitrate reductases. This family is widely present in proteobacteria and firmicutes. This version of the domain is also present in some archaeal members. The function of the vWA domain in this sub-group is not known. Members of this subgroup have a conserved MIDAS motif. 29228 cd01455: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains. Not much is known about the functions of the members of this subgroup. The members of this subgroup are fused to the ancient AAA domain. 29229 cd01456: VWA ywmD type: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains. Not much is known about the function of the members of this subgroup. All members of this subgroup however have a conserved MIDAS motif. 29230 cd01457: VWA ORF176 type: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains. The members of this subgroup are Eubacterial in origin and have a conserved MIDAS motif. Not much is known about the biochemistry of these. 29231 cd01458: Ku70/Ku80 N-terminal domain. The Ku78 heterodimer (composed of Ku70 and Ku80) contributes to genomic integrity through its ability to bind DNA double-strand breaks (DSB) in a preferred orientation. DSBs are repaired by either homologous recombination or non-homologous end joining, and the Ku heterodimer facilitates repair by the non-homologous end-joining (NHEJ) pathway. The Ku heterodimer is required for an accurate process that tends to preserve the sequence at the junction. Ku78 is found in all three kingdoms of life. However, only the eukaryotic proteins have a vWA domain fused to them at their N-termini. The vWA domain is not involved in DNA binding but may very likely mediate Ku78's interactions with other proteins. Members of this subgroup lack the conserved MIDAS motif. 29232 cd01459: VWA Copine: Copines are phospholipid-binding proteins originally identified in Paramecium. They are found in human and orthologues have been found in C. elegans and Arabidopsis thaliana. None have been found in D. melanogaster or S. cerevisiae. Phylogenetic distribution suggests that copines have been lost in some eukaryotes. No functional properties have been assigned to the VWA domains present in copines. 
The members of this subgroup contain a functional MIDAS motif based on their preferential binding to magnesium and manganese. However, the MIDAS motif is not totally conserved; in most cases it consists of the sequence DxTxS instead of the more common DxSxS motif. The C2 domains present in copines mediate phospholipid binding. 29233 cd01460: VWA_Midasin: Midasin is a member of the AAA ATPase family. The proteins of this family are unified by their common architectural organization that is based upon a conserved ATPase domain. The AAA domain of midasin contains six tandem AAA protomers. The AAA domains in midasin are followed by a D/E rich domain, which is followed by a VWA domain. The members of this subgroup have a conserved MIDAS motif. The function of this domain is not exactly known although it has been speculated to play a crucial role in midasin function. 29234 cd01461: vWA_interalpha trypsin inhibitor (ITI): ITI is a glycoprotein composed of three polypeptides: two heavy chains and one light chain (bikunin). Bikunin confers the protease-inhibitor function while the heavy chains are involved in rendering stability to the extracellular matrix by binding to hyaluronic acid. The heavy chains carry the VWA domain with a conserved MIDAS motif. Although the exact role of the VWA domains remains unknown, it has been speculated to be involved in mediating protein-protein interactions with the components of the extracellular matrix. 29235 cd01462: VWA YIEM type: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains. Members of this subgroup have a conserved MIDAS motif; however, their biochemical function is not well characterised. 29236 cd01463: VWA Voltage gated Calcium channel like: Voltage-gated calcium channels are a complex of five proteins: alpha 1, beta 1, gamma, alpha 2 and delta. The alpha 2 and delta subunits result from proteolytic processing of a single gene product and carries at its N-terminus the VWA and cache domains. The alpha 2 delta gene family has orthologues in D. melanogaster and C. elegans but none have been detected in either A. thaliana or yeast. The exact biochemical function of the VWA domain is not known but the alpha 2 delta complex has been shown to regulate various functional properties of the channel complex. 29237 cd01464: VWA subfamily: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains. Members of this subgroup have no assigned function. This subfamily is typified by the presence of a conserved MIDAS motif. 29238 cd01465: VWA subgroup: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains. Not much is known about the function of the VWA domain in these proteins. The members do have a conserved MIDAS motif. The biochemical function however is not known. 
29239 cd01466: VWA C3HC4-type: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains. Members of this subgroup belong to the zinc-finger family as they are found fused to RING finger domains. The MIDAS motif is not conserved in all the members of this family. The function of vWA domains however is not known. 29240 cd01467: VWA BatA type: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. 
There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains. Members of this subgroup are bacterial in origin. They are typified by the presence of a MIDAS motif. 29241 cd01468: trunk domain. COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is known as the trunk domain and has an alpha/beta vWA fold and forms the dimer interface. Some members of this family possess a partial MIDAS motif that is a characteristic feature of most vWA domain proteins. 29242 cd01469: Integrins are a class of adhesion receptors that link the extracellular matrix to the cytoskeleton and cooperate with growth factor receptors to promote cell survival, cell cycle progression and cell migration. Integrins consist of an alpha and a beta sub-unit. Each sub-unit has a large extracellular portion, a single transmembrane segment and a short cytoplasmic domain. The N-terminal domains of the alpha and beta subunits associate to form the integrin headpiece, which contains the ligand binding site, whereas the C-terminal segments traverse the plasma membrane and mediate interaction with the cytoskeleton and with signalling proteins. The VWA domains present in the alpha subunits of integrins seem to be a chordate specific radiation of the gene family being found only in vertebrates. They mediate protein-protein interactions. 29243 cd01470: Complement factors B and C2 are two critical proteases for complement activation. 
They both contain three CCP or Sushi domains, a trypsin-type serine protease domain and a single VWA domain with a conserved metal ion dependent adhesion site commonly referred to as the MIDAS motif. Orthologues of these molecules are found from echinoderms to chordates. During complement activation, the CCP domains are cleaved off, resulting in the formation of an active protease that cleaves and activates complement C3. Complement C2 is in the classical pathway and complement B is in the alternative pathway. The interaction of C2 with C4 and of factor B with C3b are both dependent on Mg2+ binding sites within the VWA domains and the VWA domain of factor B has been shown to mediate the binding of C3. This is consistent with the common inferred function of VWA domains as magnesium-dependent protein interaction domains. 29244 cd01471: Micronemal proteins: The Toxoplasma lytic cycle begins when the parasite actively invades a target cell. In association with invasion, T. gondii sequentially discharges three sets of secretory organelles beginning with the micronemes, which contain adhesive proteins involved in parasite attachment to a host cell. Deployed as protein complexes, several micronemal proteins possess vertebrate-derived adhesive sequences that function in binding receptors. The VWA domain likely mediates the protein-protein interactions of these with their interacting partners. 29245 cd01472: von Willebrand factor (vWF) type A domain; equivalent to the I-domain of integrins. This domain has a variety of functions including: intermolecular adhesion, cell migration, signalling, transcription, and DNA repair. In integrins these domains form heterodimers while in vWF it forms homodimers and multimers. There are different interaction surfaces of this domain as seen by its complexes with collagen with either integrin or human vWFA. 
In integrins collagen binding occurs via the metal ion-dependent adhesion site (MIDAS) and involves three surface loops located on the upper surface of the molecule. In human vWFA, collagen binding is thought to occur on the bottom of the molecule and does not involve the vestigial MIDAS motif. 29246 cd01473: CTRP for CS protein-TRAP-related protein: Adhesion of Plasmodium to host cells is an important phenomenon in parasite invasion and in malaria associated pathology. CTRP encodes a protein containing a putative signal sequence followed by a long extracellular region of 1990 amino acids, a transmembrane domain, and a short cytoplasmic segment. The extracellular region of CTRP contains two separated adhesive domains. The first domain contains six 210-amino acid-long homologous VWA domain repeats. The second domain contains seven repeats of 87-60 amino acids in length, which share similarities with the thrombospondin type 1 domain found in a variety of adhesive molecules. Finally, CTRP also contains consensus motifs found in the superfamily of haematopoietin receptors. The VWA domains in these proteins likely mediate protein-protein interactions. 29247 cd01474: ATR (Anthrax Toxin Receptor): Anthrax toxin is a key virulence factor for Bacillus anthracis, the causative agent of anthrax. ATR is the cellular receptor for the anthrax protective antigen and facilitates entry of the toxin into cells. The VWA domain in ATR contains the toxin binding site and mediates interaction with protective antigen. The binding is mediated by divalent cations that bind to the MIDAS motif. These proteins are a family of vertebrate ECM receptors expressed by endothelial cells. 29248 cd01475: VWA_Matrilin: In cartilaginous plate, extracellular matrix molecules mediate cell-matrix and matrix-matrix interactions thereby providing tissue integrity. Some members of the matrilin family are expressed specifically in developing cartilage rudiments. 
The matrilin family consists of at least four members. All the members of the matrilin family contain VWA domains, EGF-like domains and a heptad repeat coiled-coil domain at the carboxy terminus which is responsible for the oligomerization of the matrilins. The VWA domains have been shown to be essential for matrilin network formation by interacting with matrix ligands. 29249 cd01476: VWA_integrin (invertebrates): Integrins are a family of cell surface receptors that have diverse functions in cell-cell and cell-extracellular matrix interactions. Because of their involvement in many biologically important adhesion processes, integrins are conserved across a wide range of multicellular animals. Integrins from invertebrates have been identified from six phyla. There are no data to date to suggest any immunological functions for the invertebrate integrins. The members of this sub-group have the conserved MIDAS motif that is characteristic of this domain, suggesting the involvement of the integrins in the recognition and binding of multi-ligands. 29250 cd01477: VWA F09G8.8 type: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic alpha/beta para-Rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed the MIDAS motif that is a characteristic feature of most, if not all A domains. The members of this subgroup lack the MIDAS motif. This subgroup is found only in C. elegans and the members identified thus far are always found fused to a C-Lectin type domain. Biochemical function thus far has not been attributed to any of the members of this subgroup. 29251 cd01478: Sec23-like: Protein and membrane traffic in eukaryotes is mediated at least in part by the budding and fusion of intracellular transport vesicles that selectively carry cargo proteins and lipids from donor to acceptor organelles. The two main classes of vesicular carriers within the endocytic and the biosynthetic pathways are COP- and clathrin-coated vesicles. Formation of COPII vesicles requires the ordered assembly of the coat built from several cytosolic components: GTPase Sar1, complexes of Sec23-Sec24 and Sec13-Sec31. The process is initiated by the conversion of GDP to GTP by the GTPase Sar1 which then recruits the heterodimeric complex of Sec23 and Sec24. This heterodimeric complex generates the pre-budding complex. The final step leading to membrane deformation and budding of COPII-coated vesicles is carried out by the heterodimeric complex Sec13-Sec31. The members of this CD belong to the Sec23-like family. Sec23 is very similar to Sec24. The Sec23 and Sec24 polypeptides fold into five distinct domains: a beta-barrel, a zinc finger, a vWA or trunk, an all helical region and a carboxy Gelsolin domain. The members of this subgroup lack the consensus MIDAS motif but have the overall Para-Rossmann type fold that is characteristic of this superfamily. 29252 cd01479: Sec24-like: Protein and membrane traffic in eukaryotes is mediated at least in part by the budding and fusion of intracellular transport vesicles that selectively carry cargo proteins and lipids from donor to acceptor organelles. 
The two main classes of vesicular carriers within the endocytic and the biosynthetic pathways are COP- and clathrin-coated vesicles. Formation of COPII vesicles requires the ordered assembly of the coat built from several cytosolic components: GTPase Sar1, complexes of Sec23-Sec24 and Sec13-Sec31. The process is initiated by the conversion of GDP to GTP by the GTPase Sar1 which then recruits the heterodimeric complex of Sec23 and Sec24. This heterodimeric complex generates the pre-budding complex. The final step leading to membrane deformation and budding of COPII-coated vesicles is carried out by the heterodimeric complex Sec13-Sec31. The members of this CD belong to the Sec23-like family. Sec24 is very similar to Sec23. The Sec23 and Sec24 polypeptides fold into five distinct domains: a beta-barrel, a zinc finger, a vWA or trunk, an all helical region and a carboxy Gelsolin domain. The members of this subgroup carry a partial MIDAS motif and have the overall Para-Rossmann type fold that is characteristic of this superfamily. 29253 cd01480: VWA_collagen alpha(VI) type: The extracellular matrix represents a complex alloy of variable members of diverse protein families defining structural integrity and various physiological functions. The most abundant family is the collagens with more than 20 different collagen types identified thus far. Collagens are centrally involved in the formation of fibrillar and microfibrillar networks of the extracellular matrix, basement membranes as well as other structures of the extracellular matrix. Some collagens have about 15-18 vWA domains in them. The VWA domains present in these collagens mediate protein-protein interactions. 29254 cd01481: VWA_collagen alpha 3(VI) like: The extracellular matrix represents a complex alloy of variable members of diverse protein families defining structural integrity and various physiological functions. 
The most abundant family is the collagens with more than 20 different collagen types identified thus far. Collagens are centrally involved in the formation of fibrillar and microfibrillar networks of the extracellular matrix, basement membranes as well as other structures of the extracellular matrix. Some collagens have about 15-18 vWA domains in them. The VWA domains present in these collagens mediate protein-protein interactions. 29255 cd01482: Collagen: The extracellular matrix represents a complex alloy of variable members of diverse protein families defining structural integrity and various physiological functions. The most abundant family is the collagens with more than 20 different collagen types identified thus far. Collagens are centrally involved in the formation of fibrillar and microfibrillar networks of the extracellular matrix, basement membranes as well as other structures of the extracellular matrix. Some collagens have about 15-18 vWA domains in them. The VWA domains present in these collagens mediate protein-protein interactions. 29256 cd00199: whey acidic protein-type four-disulfide core domains. Members of the family include whey acidic protein, elafin (elastase-specific inhibitor), caltrin-like protein (a calcium transport inhibitor) and other extracellular proteinase inhibitors. A group of proteins containing 8 characteristically-spaced cysteine residues forming disulphide bonds have been termed '4-disulphide core' proteins. Protease inhibition occurs by insertion of the inhibitory loop into the active site pocket and interference with the catalytic residues of the protease. 
29257 cd00200: WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment. 29258 cd00201: Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. 
29259 cd00202: Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] promoter elements; a subset of family members may also bind protein; zinc-finger consensus topology is C-X(2)-C-X(17)-C-X(2)-C. 29261 cd00204: ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin; repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats. 29262 cd00207: 2Fe-2S iron-sulfur cluster binding domain. Iron-sulfur proteins play an important role in electron transfer processes and in various enzymatic reactions. The family includes plant and algal ferredoxins, which act as electron carriers in photosynthesis and ferredoxins, which participate in redox chains (from bacteria to mammals). Fold is similar to thioredoxin. 29263 cd00208: Left-Handed Parallel beta-Helix; Alignment contains 4 turns, each containing three imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X). Proteins containing hexapeptide repeats are often enzymes showing acyl-transferase activity; many are trimeric in their active form. 29264 cd00209: Dihydrofolate reductase (DHFR). Reduces 7,8-dihydrofolate to 5,6,7,8-tetrahydrofolate with NADPH as a cofactor. This is an essential step in the biosynthesis of deoxythymidine phosphate since 5,6,7,8-tetrahydrofolate is required to regenerate 5,10-methylenetetrahydrofolate which is then utilized by thymidylate synthase. Inhibition of DHFR interrupts thymidylate synthesis and DNA replication; inhibitors of DHFR (such as methotrexate) are used in cancer chemotherapy. 5,6,7,8-tetrahydrofolate also is involved in glycine, serine, and threonine metabolism and aminoacyl-tRNA biosynthesis. 
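The sequence patterns quoted in the cd00202 and cd00208 entries above (C-X(2)-C-X(17)-C-X(2)-C, X-[STAV]-X-[LIV]-[GAED]-X and [AT]GATA[AG]) can be checked against a sequence with ordinary regular expressions. The following is a minimal sketch, not part of the database: the helper name pattern_to_regex and the test sequences are made up for illustration, and only the pattern features that appear in those two entries are handled.

```python
import re

def pattern_to_regex(pattern: str) -> str:
    """Translate a dash-separated residue pattern into a Python regex.

    Hypothetical helper. Supports single residues, 'X' wildcards,
    '[..]' residue sets, and a fixed repeat count written as '(n)'.
    """
    parts = []
    for element in pattern.split("-"):
        m = re.fullmatch(r"(\[[A-Z]+\]|[A-Z])(?:\((\d+)\))?", element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        residue, count = m.groups()
        token = "." if residue == "X" else residue  # X means any residue
        if count:
            token += f"{{{count}}}"
        parts.append(token)
    return "".join(parts)

# Patterns copied from the entries above.
zinc_finger = pattern_to_regex("C-X(2)-C-X(17)-C-X(2)-C")
hexapeptide = pattern_to_regex("X-[STAV]-X-[LIV]-[GAED]-X")
gata_site = "[AT]GATA[AG]"  # already valid regex syntax

# Made-up sequences, used only to exercise the patterns.
toy_protein = "C" + "AA" + "C" + "A" * 17 + "C" + "AA" + "C"
toy_dna = "CCAGATAAGG"

print(bool(re.fullmatch(zinc_finger, toy_protein)))
print(bool(re.search(gata_site, toy_dna)))
```

The translation is deliberately narrow: variable-length gaps such as X(10-11) are not supported, because splitting on dashes would break them.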
29265 cd00210: PTS_IIA, PTS system, glucose/sucrose specific IIA subunit. The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. This family is one of four structurally and functionally distinct group IIA PTS system cytoplasmic enzymes, necessary for the uptake of carbohydrates across the cytoplasmic membrane and their phosphorylation. 29266 cd00211: PTS_IIA, PTS system, fructose/mannitol specific IIA subunit. The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. This family is one of four structurally and functionally distinct group IIA PTS system cytoplasmic enzymes, necessary for the uptake of carbohydrates across the cytoplasmic membrane and their phosphorylation. 29267 cd00212: PTS_IIB, PTS system, glucose/sucrose specific IIB subunit. The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. This family is one of four structurally and functionally distinct group IIB PTS system cytoplasmic enzymes, necessary for the uptake of carbohydrates across the cytoplasmic membrane and their phosphorylation. 29268 cd00213: S-100/ICaBP-like domain; S-100/intestinal calcium binding domain (ICaBP); The S-100 domain is a subfamily of the EF-hand CaBPs, expressed exclusively in vertebrates, and implicated in intracellular and extracellular regulatory activities. Structural data suggest that many S100 members exist within cells as antiparallel homodimers capable of functionally crossbridging two homologous or heterologous target proteins in a Ca(2+)-dependent (and, in some instances, Ca(2+)-independent) manner. 
Activities of intracellular S100 proteins involve protein phosphorylation, enzyme activities, cell proliferation (including neoplastic transformation) and differentiation, the dynamics of cytoskeleton constituents, the structural organization of membranes, intracellular Ca2+ homeostasis, inflammation, and protection from oxidative cell damage. Extracellular S100 proteins stimulate neuronal survival and/or differentiation and astrocyte proliferation, cause neuronal death via apoptosis, and stimulate or inhibit the activity of inflammatory cells. 29269 cd00214: Calpain, subdomain III. Calpains are calcium-activated cytoplasmic cysteine proteinases that participate in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction. Catalytic domain and the two calmodulin-like domains are separated by C2-like domain III. Domain III plays an important role in calcium-induced activation of calpain involving electrostatic interactions with subdomain II. Proposed to mediate calpain's interaction with phospholipids and translocation to cytoplasmic/nuclear membranes. CD includes subdomain III of typical and atypical calpains. 29270 cd00215: PTS_IIA, PTS system, lactose/cellobiose specific IIA subunit. The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. This family is one of four structurally and functionally distinct group IIA PTS system cytoplasmic enzymes, necessary for the uptake of carbohydrates across the cytoplasmic membrane and their phosphorylation. This family of proteins normally functions as a homotrimer, stabilized by a centrally located metal ion. Separation into subunits is thought to occur after phosphorylation. 
29273 cd00220: Vitelline membrane outer layer protein I (VMO-I) domain, VMO-I is one of the proteins found in the outer layer of the vitelline membrane of poultry eggs; VMO-I, lysozyme, and VMO-II are tightly bound to ovomucin; this complex forms the backbone of the outer layer; VMO-I has three distinct internal repeats; all three repeats are used to define the domain here; VMO-I has recently been shown to synthesize N-acetylchito-oligosaccharides from N-acetylglucosamine; may be a carbohydrate-binding protein; member of the beta-prism-fold family. 29274 cd00222: Collagen-binding protein B domain, mediates bacterial adherence to collagen; the primary sequence has a non-repetitive, collagen-binding A region, followed by the repetitive B region; the B region has one to four 23 kDa repeat units (B1-B4). The B repeat units have been suggested to serve as a `stalk' that projects the A region from the bacterial surface and thus facilitate bacterial adherence to collagen; each B repeat unit has two domains (D1 and D2) placed side-by-side; D1 and D2 have similar secondary structure and exhibit a unique inverse IgG-like domain fold. 29276 cd00224: homolog to Ran-Binding Protein Mog1p; binds to the small GTPase Ran, which plays an important role in nuclear import. Binding is independent of Ran's nucleotide state (RanGTP/RanGDP). 29277 cd00226: Photosynthetic reaction center (RC) complex, subunit H; RC is an integral membrane protein-pigment complex which catalyzes light-induced reduction of ubiquinone to ubiquinol, generating a transmembrane electrochemical gradient of protons used to produce ATP by ATP synthase. Subunit H is positioned mainly in the cytoplasm with one transmembrane alpha helix. Provides proton transfer pathway (water channels) connecting the terminal quinone electron acceptor of RC, to the aqueous phase. Found in photosynthetic bacteria: alpha, beta, and gamma proteobacteria. 29278 cd00227: Chloramphenicol (Cm) phosphotransferase (CPT). 
Cm-inactivating enzyme; modifies the primary (C-3) hydroxyl of the antibiotic. Related structurally to shikimate kinase II. 29279 cd00228: Eukaryotic Glutathione Synthetase (eu-GS); catalyses the production of glutathione from gamma-glutamylcysteine and glycine in an ATP-dependent manner. Belongs to the ATP-grasp superfamily. 29308 cd00230: Human Replication Protein A: Global Fold Of The N-Terminal RPA-70 Domain Reveals a Basic Cleft and Flexible C-Terminal Linker. Heterotrimeric human single-stranded DNA (ssDNA)-binding protein, replication protein A (RPA), is a central player in DNA replication, recombination, and repair. Two loops on one side of the barrel form a large basic cleft which is a likely site for binding the acidic motifs of transcriptional activators. 29309 cd00231: ZipA C-terminal domain. ZipA, a membrane-anchored protein, is one of at least nine essential gene products necessary for assembly of the septal ring which mediates cell division in E. coli. ZipA and FtsA directly bind FtsZ, a homolog of eukaryotic tubulins, at the prospective division site, followed by the sequential addition of FtsK, FtsQ, FtsL, FtsW, FtsI, and FtsN. ZipA contains three domains: a short N-terminal membrane-anchored domain, a central P/Q domain that is rich in proline and glutamine and a C-terminal domain, which comprises almost half the protein. 29310 cd00232: Heme oxygenase catalyzes the rate limiting step in the degradation of heme to bilirubin; it is essential for recycling of iron from heme. Heme is used as a substrate and cofactor for its own degradation to biliverdin, iron, and carbon monoxide. This family includes bacterial HO, as well as the mammalian isoforms HO-1, and HO-2. Heme oxygenases play key roles in heme homeostasis, oxidative stress response, photosynthetic pigment formation in cyanobacteria, cellular signaling in mammals, and iron acquisition from host heme by bacterial pathogens. 29311 cd00233: VIP2; A family of actin-ADP-ribosylating toxins. 
A member of the Bacillus-produced vegetative insecticidal proteins (VIPs) possesses high specificity against the major insect pest, corn rootworms, and belongs to a class of binary toxins and regulators of biological pathways distinct from classical A-B toxins. A novel family of insecticidal ADP-ribosyltransferases were isolated from Bacillus cereus during vegetative growth, where VIP1 likely targets insect cells and VIP2 ribosylates actin. VIP2 shares significant sequence similarity with enzymatic components of other binary toxins, Clostridium botulinum C2 toxin, C. perfringens iota toxin, C. spiroforme toxin and C. difficile toxin. 29312 cd00234: Replication Protein A (RPA), Subunit RPA14; RPA is a nuclear ssDNA-binding protein in eukaryotes, essential to DNA replication, recombination, and repair. These functions are associated with the two larger subunits of RPA (RPA32 and RPA70). The smaller subunit (RPA14) is believed to have a structural role in assembly of the RPA heterotrimer. 29313 cd00236: FinO bacterial conjugation repressor domain; the basic protein FinO is part of the two component FinOP system which is responsible for repressing bacterial conjugation; the FinOP system represses the transfer (tra) operon of the F-plasmid which encodes the proteins responsible for conjugative transfer of this plasmid from host to recipient Escherichia coli cells; the antisense RNA FinP is thought to interact with traJ mRNA to occlude its ribosome binding site, blocking traJ translation and thereby inhibiting transcription of the tra operon; FinO protects FinP against degradation by binding to FinP and sterically blocking the cellular endonuclease RNase E; FinO also binds to the complementary stem-loop structures in traJ mRNA and promotes duplex formation between FinP and traJ RNA in vitro; this domain contains two independent RNA binding regions. 
29314 cd00237: homolog to co-chaperone p23; p23 binds heat shock protein hsp90 and participates in the folding of a number of proteins, including the progesterone receptor. Alignment also contains second subgroup of SGT1 homologs. Yeast Sgt1 is required for kinetochore assembly and associates with SCF ubiquitin ligase. 29316 cd00240: Transcription initiation factor IIF, alpha subunit, N-terminal region of RAP74. Subunit of transcription initiation complex involved in initiation, elongation and promoter escape. Tetramer of 2 alpha and 2 beta TFIIF subunits interacts directly with RNA polymerase II. TFIIF inhibits non-specific transcription initiation by PolII and recruits the polymerase to the preinitiation complex on promoter DNA for site-specific transcription initiation. The PolII/TFIIF-complex attaches through direct interactions of TFIIF with promoter DNA, TFIIB and the TAF250 subunit of TFIID, and provides scaffolding for addition of TFIIE and TFIIH. Together with TFIIE, TFIIF participates in DNA strand separation (open complex formation). N-terminal domains of RAP30 and RAP74 co-fold to form a single core structure, a triple barrel heterodimer, which has pseudo-2-fold symmetry. 29317 cd00241: Cellobiose dehydrogenase (CellobioseDH), cytochrome domain; This extracellular fungal oxidoreductase degrades both lignin and cellulose. It is a hemoflavoenzyme that is composed of a b-type cytochrome domain linked to a large flavodehydrogenase domain. The 2 domains can be separated proteolytically. The cytochrome domain folds as a beta sandwich and complexes a heme molecule. 29318 cd00242: Protease Inhibitor Ecotin; homodimeric protease inhibitor which binds two chymotrypsin-like serine proteases to form a heterotetramer. Found in bacterial periplasm. Inhibits a broad range of serine proteases including collagenase, trypsin, chymotrypsin, elastase, and factor Xa but not thrombin. 
Inhibition mechanism involves binding at two different protease contact sites: the primary and secondary binding sites. Primary site loops of ecotin bind to the active site of target proteases in a substrate-like manner with the P1 residue in ecotin mimicking the interactions of a canonical P1 substrate residue. 29319 cd00244: Alginate Lyase A1-III; enzymatically depolymerizes alginate, a complex copolymer of beta-D-mannuronate and alpha-L-guluronate, by cleaving the beta-(1,4) glycosidic bond. 29321 cd00246: Nucleotide exchange factor for Rab-like small GTPases (RabGEF), Mss4 type; RabGEF positively regulates the function of Rab GTPase by promoting exchange of GDP for GTP; members of the Rab subfamily of Ras GTPases are important in vesicular transport. 29322 cd00247: Endostatin-like domain; the angiogenesis inhibitor endostatin is a C-terminal fragment of collagen XV/XVIII, a proteoglycan/collagen found in vessel walls and basement membranes; this domain has a compact globular fold similar to that of C-type lectins; endostatin XVIII is monomeric and contains a heparin-binding epitope and zinc binding sites while endostatin XV is trimeric and contains neither of these sites; the generation of endostatin or endostatin-like collagen XV/XVIII fragments is catalyzed by proteolytic enzymes within the protease-sensitive hinge region of the C-terminal domain; endostatin inhibits endothelial cell migration in vitro and appears to be highly effective in murine in vivo studies. 
29323 cd00248: Mth938 domain; Mth938 is a hypothetical protein encoded by the Methanobacterium thermoautotrophicum (Mth) genome; The protein crystallizes as a dimer, although it is monomeric in solution, with one disulfide bond in each monomer; Two larger sheets, one from each monomer, associate as a ten-strand mixed sheet that forms the base of a cleft which could potentially bind double-stranded nucleic acid with interacting elements from helix A and the tip of strand 5 of either subunit of the Mth938 dimer. 29324 cd00249: AGE domain; N-acyl-D-glucosamine 2-epimerase domain; Responsible for intermediate epimerization during biosynthesis of N-acetylneuraminic acid. Catalytic mechanism is believed to be via nucleotide elimination and readdition and is ATP modulated. AGE is structurally and mechanistically distinct from the other four types of epimerases. The AGE domain monomer is composed of an alpha(6)/alpha(6)-barrel, the structure of which is also found in glucoamylase and cellulase. The active form is a homodimer. The alignment also contains subtype III mannose 6-phosphate isomerases. 29325 cd00250: Clavaminic acid synthetase (CAS) -like; CAS is a trifunctional Fe(II)/2-oxoglutarate (2OG) oxygenase carrying out three reactions in the biosynthesis of clavulanic acid, an inhibitor of class A serine beta-lactamases. In general, Fe(II)-2OG oxygenases catalyze a hydroxylation reaction, which leads to the incorporation of an oxygen atom from dioxygen into a hydroxyl group and conversion of 2OG to succinate and CO2. 29326 cd00252: SPARC_EC; extracellular Ca2+ binding domain (containing 2 EF-hand motifs) of SPARC and related proteins (QR1, SC1/hevin, testican and tsc-36/FRP). SPARC (BM-40) is a multifunctional glycoprotein, a matricellular protein, that functions to regulate cell-matrix interactions; binds to such proteins as collagen and vitronectin and binds to endothelial cells thus inhibiting cellular proliferation. 
The EC domain interacts with a follistatin-like (FS) domain which appears to stabilize Ca2+ binding. The two EF-hands interact canonically but their conserved disulfide bonds confer a tight association between the EF-hand pair and an acid/amphiphilic N-terminal helix. Proposed active form involves a Ca2+ dependent symmetric homodimerization of EC-FS modules. 29327 cd00253: Pertactin-like passenger domains (virulence factors) of autotransporter proteins of the type V secretion system. Autotransporters are proteins used by Gram-negative bacteria to transport proteins across their outer membranes. The C-terminal (beta) domain of autotransporters forms a pore in the outer membrane through which the N-terminal passenger domain is transported. Following transport, the passenger domain is generally cleaved by an outer membrane protease with the passenger domain either remaining in contact with the surface via a noncovalent interaction with the beta domain or cleaved to release a soluble protein. These proteins are highly diverse and perform a variety of functions that promote virulence, including catalyzing proteolysis, serving as an adhesin, mediating actin-promoted motility, or serving as a cytotoxin. Proteins in this family share similarity in the C-terminal region of the passenger domain as seen in the pertactin structure P.69, a Bordetella pertussis agglutinogen responsible for human pertussis. The P.69 protein consists of a 16-stranded parallel beta-helix with a V-shaped cross-section, and is one of the largest beta-helices known to date. 29328 cd01343: Pertactin-like passenger domains (virulence factors), C-terminal, subgroup 1, of autotransporter proteins of the type V secretion system of Gram-negative bacteria. This subgroup includes the passenger domains of Neisseria and Haemophilus IgA1 proteases, SPATEs (serine protease autotransporters secreted by Enterobacteriaceae), Bordetella pertactins, and nonprotease autotransporters, TibA and similar AIDA-like proteins. 
29329 cd01344: Pertactin-like passenger domains (virulence factors), C-terminal, subgroup 2, of autotransporter proteins of the type V secretion system of Gram-negative bacteria. This subgroup includes the passenger domains of the nonprotease autotransporters, Ag43, AIDA-1 and IcsA, as well as the less characterized ShdA, MisL, and BapA autotransporters. 29330 cd00255: Nidogen, G2 domain; Nidogen is an important component of the basement membrane, an extracellular sheet-like matrix. Nidogen is a multifunctional protein that interacts with many other basement membrane proteins, like collagen, perlecan, laminin, and has a potential role in the assembly and connection of networks. Nidogen consists of 3 globular domains (G1-G3); G3 is the laminin-binding domain, while G2 binds collagen IV and perlecan. Also found in hemicentin, a protein which functions at various cell-cell and cell-matrix junctions and might assist in refining broad regions of cell contact into oriented, line-shaped junctions. Nidogen G2 consists of an N-terminal EGF-like domain (excluded from this alignment model) and an 11-stranded beta-barrel with a central helix, a topology that exhibits high structural similarity to the green fluorescent proteins of Cnidaria. 29331 cd00256: VATPase_H, regulatory vacuolar ATP synthase subunit H (Vma13p); activation component of the peripheral V1 complex of V-ATPase, a heteromultimeric enzyme which uses ATP to actively transport protons into organelles and extracellular compartments. The topology is that of a superhelical spiral, in part the geometry is similar to superhelices composed of armadillo repeat motifs, as found in importins for example. 
29332 cd00257: Fascin-like domain; members include actin-bundling/crosslinking proteins fascin, histoactophilin and singed; identified in sea urchin, Drosophila, Xenopus, rodents, and humans; The fascin-like domain adopts a beta-trefoil topology and contains an internal threefold repeat; the fascin subgroup contains four copies of the domain; Structurally similar to fibroblast growth factor (FGF). 29333 cd00260: Sialidases or neuraminidases function to bind and hydrolyze terminal sialic acid residues from various glycoconjugates as well as playing roles in pathogenesis, bacterial nutrition and cellular interactions. They have a six-bladed, beta-propeller fold with the non-viral sialidases containing 2-5 Asp-box motifs (most commonly Ser/Thr-X-Asp-[X]-Gly-X-Thr-Trp/Phe). This CD includes eubacterial, eukaryotic, and viral sialidases. 29334 cd00261: Trypsin-alpha amylase inhibitor domain, Alpha Amylase Inhibitor (AAI) subgroup. These cereal-type alpha-amylase inhibitors are composed of 120-160 residues, form 5 disulfide bonds and inhibit amylases from birds, bacilli, insects and mammals. They are related to the other members of the AAI family (plant lipid transfer proteins and seed storage proteins); the disulfide-bonding pattern varies between members. 29336 cd00263: Human Replication Protein A: Global Fold Of The N-Terminal RPA-70 Domain Reveals a Basic Cleft and Flexible C-Terminal Linker. Heterotrimeric human single-stranded DNA (ssDNA)-binding protein, replication protein A (RPA), is a central player in DNA replication, recombination, and repair. Two loops on one side of the barrel form a large basic cleft which is a likely site for binding the acidic motifs of transcriptional activators. Structurally similar to telomere binding protein from Oxytricha nova, and to other domains of RPA, which are included in the core structure alignment presented here. 
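The Asp-box motif quoted in the cd00260 entry above (Ser/Thr-X-Asp-[X]-Gly-X-Thr-Trp/Phe) can be searched for in the same way. This is a rough sketch under one explicit assumption: the bracketed [X] is read here as an optional residue, which the entry itself does not state. The function name count_asp_boxes and the toy sequence are invented for illustration and are not taken from any real sialidase.

```python
import re

# Assumption: "[X]" in the quoted motif marks an optional residue,
# so it is written as ".?" in the regular expression.
ASP_BOX = re.compile(r"[ST].D.?G.T[WF]")

def count_asp_boxes(sequence: str) -> int:
    """Count non-overlapping Asp-box matches in a protein sequence."""
    return len(ASP_BOX.findall(sequence.upper()))

# Made-up toy sequence containing two motif instances.
toy = "MKSADGGATWLLLTPDAGKTFQQ"
print(count_asp_boxes(toy))
```

Under that reading, a count between 2 and 5 would be in line with the range the entry gives for non-viral sialidases, although a regex hit alone does not establish that a protein belongs to the family.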
29337 cd00025: BPI/LBP/CETP N-terminal domain; Bactericidal permeability-increasing protein (BPI) / Lipopolysaccharide-binding protein (LBP) / Cholesteryl ester transfer protein (CETP) N-terminal domain; binds to and neutralizes lipopolysaccharides from the outer membrane of Gram-negative bacteria; apolar pockets on the concave surface bind a molecule of phosphatidylcholine, primarily by interacting with their acyl chains; this suggests that the pockets may also bind the acyl chains of lipopolysaccharide. 29338 cd00026: BPI/LBP/CETP C-terminal domain; Bactericidal permeability-increasing protein (BPI) / Lipopolysaccharide-binding protein (LBP) / Cholesteryl ester transfer protein (CETP) C-terminal domain; binds to and neutralizes lipopolysaccharides from the outer membrane of Gram-negative bacteria; apolar pockets on the concave surface bind a molecule of phosphatidylcholine, primarily by interacting with their acyl chains; this suggests that the pockets may also bind the acyl chains of lipopolysaccharide. 29339 cd00264: BPI/LBP/CETP domain; Bactericidal permeability-increasing protein (BPI) / Lipopolysaccharide-binding protein (LBP) / Cholesteryl ester transfer protein (CETP) domain; binds to and neutralizes lipopolysaccharides from the outer membrane of Gram-negative bacteria; apolar pockets on the concave surface bind a molecule of phosphatidylcholine, primarily by interacting with their acyl chains; this suggests that the pockets may also bind the acyl chains of lipopolysaccharide. 29341 cd00269: DEXH-box helicases. A diverse family of proteins involved in ATP-dependent DNA or RNA unwinding, needed in a variety of cellular processes. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region. 29343 cd00275: Protein kinase C conserved region 2, subgroup 2; C2 Ca2+-binding motif present in phospholipases, protein kinases C, and synaptotagmins (amongst others); some PKCs lack calcium dependence. 
Particular C2s appear to bind phospholipids, inositol polyphosphates, and intracellular proteins. Two distinct C2 topologies generated by permutation of the sequence with respect to the N- and C-terminal beta strands are seen. In this subgroup, containing phospholipases C and D (PLC-1, PLD) and specific protein kinases C (PKC) subtypes, the C-terminal beta strand occupies the position of what is the N-terminal strand in subgroup 1. 29344 cd00276: Protein kinase C conserved region 2, subgroup 1; C2 Ca2+-binding motif present in phospholipases, protein kinases C, and synaptotagmins (amongst others); some PKCs lack calcium dependence. Particular C2s appear to bind phospholipids, inositol polyphosphates, and intracellular proteins. Two distinct C2 topologies generated by permutation of the sequence with respect to the N- and C-terminal beta strands are seen. In this subgroup, containing synaptotagmins, specific protein kinases C (PKC) subtypes and other proteins, the N-terminal beta strand occupies the position of what is the C-terminal strand in subgroup 2. 29345 cd00279: Ylxr homologs; group of conserved hypothetical bacterial proteins of unknown function; structure revealed putative RNA binding cleft; proteins are encoded by an operon that includes other proteins involved in transcription and/or translation. 29346 cd00283: GIYX(10-11)YIG family of class I homing endonucleases C-terminus (GIY-YIG_Cterm). Homing endonucleases promote the mobility of intron or intein by recognizing and cleaving a homologous allele that lacks the sequence. They catalyze a double-strand break in the DNA near the insertion site of that element to facilitate homing at that site. Class I homing endonucleases are sorted into four families based on the presence of these motifs in their respective N-termini: LAGLIDADG, His-Cys box, HNH, and GIY-YIG. This CD contains several but not all members of the GIY-YIG family. 
The C-terminus of GIY-YIG is a DNA-binding domain which is separated from the N-terminus by a long, flexible linker. The DNA-binding domain consists of a minor-groove binding alpha-helix, and a helix-turn-helix. Some also contain a zinc finger (e.g. I-TevI) which is not required for DNA binding or catalysis, but is a component of the linker and directs the catalytic domain to cleave the homing site at a fixed distance from the intron insertion site. 29347 cd00284: Cytochrome b (N-terminus)/b6/petB: Cytochrome b is a subunit of cytochrome bc1, an 11-subunit mitochondrial respiratory enzyme. Cytochrome b spans the mitochondrial membrane with 8 transmembrane helices (A-H) in eukaryotes. In plants and cyanobacteria, cytochrome b6 is analogous to eukaryote cytochrome b, containing two chains: helices A-D are encoded by the petB gene and helices E-H are encoded by the petD gene in these organisms. Cytochrome b/b6 contains two bound hemes and two ubiquinol/ubiquinone binding sites. The C-terminal portion of cytochrome b is described in a separate CD. 29349 cd01164: 1-phosphofructokinase (FruK), minor 6-phosphofructokinase (pfkB) and related sugar kinases. FruK plays an important role in the predominant pathway for fructose utilisation. This group also contains tagatose-6-phosphate kinase, an enzyme of the tagatose 6-phosphate pathway, which is responsible for breakdown of the galactose moiety during lactose metabolism by bacteria such as L. lactis. 29350 cd01166: 2-keto-3-deoxygluconate kinase (KdgK) phosphorylates 2-keto-3-deoxygluconate (KDG) to form 2-keto-3-deoxy-6-phosphogluconate (KDGP). KDG is the common intermediate product that allows organisms to channel D-glucuronate and/or D-galacturonate into glycolysis and therefore use polymers like pectin and xylan as carbon sources. 
29351 cd01167: Fructokinases (FRKs) mainly from bacteria and plants are enzymes with high specificity for fructose, as are all FRKs, but they catalyze the conversion of fructose to fructose-6-phosphate, which is an entry point into glycolysis via conversion into glucose-6-phosphate. This is in contrast to FRKs [or ketohexokinases (KHKs)] from mammals and halophilic archaebacteria, which phosphorylate fructose to fructose-1-phosphate. 29352 cd01168: Adenosine kinase (AK) catalyzes the phosphorylation of ribofuranosyl-containing nucleoside analogues at the 5'-hydroxyl using ATP or GTP as the phosphate donor. The physiological function of AK is associated with the regulation of extracellular adenosine levels and the preservation of intracellular adenylate pools. Adenosine kinase is involved in the purine salvage pathway. 29353 cd01169: 4-amino-5-hydroxymethyl-2-methyl-pyrimidine phosphate kinase (HMPP-kinase) catalyzes two consecutive phosphorylation steps in the thiamine phosphate biosynthesis pathway, leading to the synthesis of vitamin B1. The first step is the phosphorylation of the hydroxyl group of HMP to form 4-amino-5-hydroxymethyl-2-methyl-pyrimidine phosphate (HMP-P) and then the phosphorylation of HMP-P to form 4-amino-5-hydroxymethyl-2-methyl-pyrimidine pyrophosphate (HMP-PP), which is the substrate for the thiamine synthase coupling reaction. 29354 cd01170: 4-methyl-5-beta-hydroxyethylthiazole (Thz) kinase catalyzes the phosphorylation of the hydroxyl group of Thz, a reaction that allows cells to recycle Thz into the thiamine biosynthesis pathway, as an alternative to its synthesis from cysteine, tyrosine and 1-deoxy-D-xylulose-5-phosphate. 29355 cd01171: B.subtilis YXKO protein of unknown function and related proteins. Based on the conservation of the ATP binding site, the substrate binding site and the Mg2+ binding site, and on structural homology, this group is a member of the ribokinase-like superfamily. 
29356 cd01172: RfaE encodes a bifunctional ADP-heptose synthase involved in the biosynthesis of the lipopolysaccharide (LPS) core precursor ADP-L-glycero-D-manno-heptose. LPS plays an important role in maintaining the structural integrity of the bacterial outer membrane of gram-negative bacteria. RfaE consists of two domains, a sugar kinase domain, represented here, and a domain belonging to the cytidylyltransferase superfamily. 29357 cd01173: Pyridoxal kinase plays a key role in the synthesis of the active coenzyme pyridoxal-5'-phosphate (PLP), by catalyzing the phosphorylation of the precursor vitamin B6 in the presence of Zn2+ and ATP. Mammals are unable to synthesize PLP de novo and require its precursors in the form of vitamin B6 (pyridoxal, pyridoxine, and pyridoxamine) from their diet. Pyridoxal kinase encoding genes are also found in many other species including yeast and bacteria. 29358 cd01174: Ribokinase catalyses the phosphorylation of ribose to ribose-5-phosphate using ATP. This reaction is the first step in the ribose metabolism. It traps ribose within the cell after uptake and also prepares the sugar for use in the synthesis of nucleotides and histidine, and for entry into the pentose phosphate pathway. Ribokinase is dimeric in solution. 29359 cd01937: Ribokinase-like subgroup D. Found in bacteria and archaea, this subgroup is part of the ribokinase/pfkB superfamily. Its oligomerization state is unknown at this time. 29360 cd01938: ADP-dependent glucokinase (ADPGK) and phosphofructokinase (ADPPFK). ADPGK and ADPPFK are proteins that rely on ADP rather than ATP to donate a phosphoryl group. They are found in certain hyperthermophilic archaea and in higher eukaryotes. A functional ADPGK has been characterized in mouse and is assumed to be desirable during ischemia/hypoxia. ADPGK and ADPPFK contain a large and a small domain with the binding site located in a groove between the domains. 
Partial domain closing is seen when ADP is bound, and further domain closing is observed when glucose is also bound. The oligomerization state apparently varies depending on the species, with some existing as monomers, some as dimers, and some as tetramers. 29361 cd01939: Ketohexokinase (fructokinase, KHK) catalyzes the phosphorylation of fructose to fructose-1-phosphate (F1P), the first step in the metabolism of dietary fructose. KHK can also phosphorylate several other furanose sugars. It is found in higher eukaryotes where it is believed to function as a dimer and requires K(+) and ATP to be active. In humans, hepatic KHK deficiency causes fructosuria, a benign inborn error of metabolism. 29362 cd01940: Fructoselysine kinase-like. Fructoselysine is a fructoseamine formed by glycation, a non-enzymatic reaction of glucose with a primary amine followed by an Amadori rearrangement, resulting in a protein that is modified at the amino terminus and at the lysine side chains. Fructoseamines are typically metabolized by fructoseamine-3-kinase, especially in higher eukaryotes. In E. coli, fructoselysine kinase has been shown in vitro to catalyze the phosphorylation of fructoselysine. It is proposed that fructoselysine is released from glycated proteins during human digestion and is partly metabolized by bacteria in the hind gut using a protein such as fructoselysine kinase. This family is found only in bacterial sequences, and its oligomeric state is currently unknown. 29363 cd01941: YeiC-like sugar kinase. Found in eukaryotes and bacteria, YeiC-like kinase is part of the ribokinase/pfkB sugar kinase superfamily. Its oligomerization state is unknown at this time. 29364 cd01942: Ribokinase-like subgroup A. Found in bacteria and archaea, this subgroup is part of the ribokinase/pfkB superfamily. Its oligomerization state is unknown at this time. 29365 cd01943: MAK32 kinase. 
MAK32 is a protein found primarily in fungi that is necessary for the structural stability of L-A particles. The L-A virus particle is a specialized compartment for the transcription and replication of double-stranded RNA, known to infect yeast and other fungi. MAK32 is part of the host machinery used by the virus to multiply. 29366 cd01944: YegV-like sugar kinase. Found only in bacteria, YegV-like kinase is part of the ribokinase/pfkB sugar kinase superfamily. Its oligomerization state is unknown at this time. 29367 cd01945: Ribokinase-like subgroup B. Found in bacteria and plants, this subgroup is part of the ribokinase/pfkB superfamily. Its oligomerization state is unknown at this time. 29368 cd01946: Ribokinase-like subgroup C. Found only in bacteria, this subgroup is part of the ribokinase/pfkB superfamily. Its oligomerization state is unknown at this time. 29369 cd01947: Guanosine kinase-like sugar kinases. Found in bacteria and archaea, the guanosine kinase-like group is part of the ribokinase/pfkB sugar kinase superfamily. Its oligomerization state is unknown at this time. 29370 cd00288: Pyruvate kinase (PK): Large allosteric enzyme that regulates glycolysis through binding of the substrate, phosphoenolpyruvate, and one or more allosteric effectors. Like other allosteric enzymes, PK has a high substrate affinity R state and a low affinity T state. PK exists as several different isozymes, depending on organism and tissue type. In mammals, there are four PK isozymes: R, found in red blood cells, L, found in liver, M1, found in skeletal muscle, and M2, found in kidney, adipose tissue, and lung. PK forms a homotetramer, with each subunit containing three domains. The T state to R state transition of PK is more complex than in most allosteric enzymes, involving a concerted rotation of all 3 domains of each monomer in the homotetramer. 
29371 cd00290: Cytochrome b(C-terminus)/b6/petD: Cytochrome b is a subunit of cytochrome bc1, an 11-subunit mitochondrial respiratory enzyme. Cytochrome b spans the mitochondrial membrane with 8 transmembrane helices (A-H) in eukaryotes. In plants and cyanobacteria, cytochrome b6 is analogous to eukaryote cytochrome b, containing two chains: helices A-D are encoded by the petB gene and helices E-H are encoded by the petD gene in these organisms. Cytochrome b/b6 contains two bound hemes and two ubiquinol/ubiquinone binding sites. The C-terminal domain is involved in forming the ubiquinol/ubiquinone binding sites, but not the heme binding sites. The N-terminal portion of cytochrome b, which contains both heme binding sites, is described in a separate CD. 29372 cd00292: Elongation factor 1 beta (EF1B) guanine nucleotide exchange domain. EF1B catalyzes the exchange of GDP bound to the G-protein, EF1A, for GTP, an important step in the elongation cycle of the protein biosynthesis. EF1A binds to and delivers the aminoacyl tRNA to the ribosome. The guanine nucleotide exchange domain of EF1B, which is the alpha subunit in yeast, is responsible for the catalysis of this exchange reaction. 29373 cd00296: SIR2 superfamily of proteins includes silent information regulator 2 (Sir2) enzymes which catalyze NAD+-dependent protein/histone deacetylation, where the acetyl group from the lysine epsilon-amino group is transferred to the ADP-ribose moiety of NAD+, producing nicotinamide and the novel metabolite O-acetyl-ADP-ribose. Sir2 proteins, also known as sirtuins, are found in all eukaryotes and many archaea and prokaryotes and have been shown to regulate gene silencing, DNA repair, metabolic enzymes, and life span. The most-studied function, gene silencing, involves the inactivation of chromosome domains containing key regulatory genes by packaging them into a specialized chromatin structure that is inaccessible to DNA-binding proteins. 
The oligomerization state of Sir2 appears to be organism-dependent, sometimes occurring as a monomer and sometimes as a multimer. Also included in this superfamily is a group of uncharacterized Sir2-like proteins which lack certain key catalytic residues and conserved zinc binding cysteines. 29374 cd01406: Sir2-like: Prokaryotic group of uncharacterized Sir2-like proteins which lack certain key catalytic residues and conserved zinc binding cysteines; and are members of the SIR2 superfamily of proteins, silent information regulator 2 (Sir2) enzymes which catalyze NAD+-dependent protein/histone deacetylation. 29375 cd01407: SIR2 family of proteins includes silent information regulator 2 (Sir2) enzymes which catalyze NAD+-dependent protein/histone deacetylation, where the acetyl group from the lysine epsilon-amino group is transferred to the ADP-ribose moiety of NAD+, producing nicotinamide and the novel metabolite O-acetyl-ADP-ribose. Sir2 proteins, also known as sirtuins, are found in all eukaryotes and many archaea and prokaryotes and have been shown to regulate gene silencing, DNA repair, metabolic enzymes, and life span. The most-studied function, gene silencing, involves the inactivation of chromosome domains containing key regulatory genes by packaging them into a specialized chromatin structure that is inaccessible to DNA-binding proteins. The oligomerization state of Sir2 appears to be organism-dependent, sometimes occurring as a monomer and sometimes as a multimer. 29376 cd01408: SIRT1: Eukaryotic group (class1) which includes human sirtuins SIRT1-3 and yeast Hst1-4; and are members of the SIR2 family of proteins, silent information regulator 2 (Sir2) enzymes which catalyze NAD+-dependent protein/histone deacetylation. Sir2 proteins have been shown to regulate gene silencing, DNA repair, and life span. 
The most-studied function, gene silencing, involves the inactivation of chromosome domains containing key regulatory genes by packaging them into a specialized chromatin structure that is inaccessible to DNA-binding proteins. The nuclear SIRT1 has been shown to target the p53 tumor suppressor protein for deacetylation to suppress DNA damage, and the cytoplasmic SIRT2 homolog has been shown to target alpha-tubulin for deacetylation for the maintenance of cell integrity. 29377 cd01409: SIRT4: Eukaryotic and prokaryotic group (class2) which includes human sirtuin SIRT4 and several bacterial homologs; and are members of the SIR2 family of proteins, silent information regulator 2 (Sir2) enzymes which catalyze NAD+-dependent protein/histone deacetylation. Sir2 proteins have been shown to regulate gene silencing, DNA repair, metabolic enzymes, and life span. 29378 cd01410: SIRT7: Eukaryotic and prokaryotic group (class4) which includes human sirtuin SIRT6, SIRT7, and several bacterial homologs; and are members of the SIR2 family of proteins, silent information regulator 2 (Sir2) enzymes which catalyze NAD+-dependent protein/histone deacetylation. Sir2 proteins have been shown to regulate gene silencing, DNA repair, metabolic enzymes, and life span. 29379 cd01411: SIR2H: Uncharacterized prokaryotic Sir2 homologs from several gram positive bacterial species and Fusobacteria; and are members of the SIR2 family of proteins, silent information regulator 2 (Sir2) enzymes which catalyze NAD+-dependent protein/histone deacetylation. Sir2 proteins have been shown to regulate gene silencing, DNA repair, metabolic enzymes, and life span. 29380 cd01412: SIRT5_Af1_CobB: Eukaryotic, archaeal and prokaryotic group (class3) which includes human sirtuin SIRT5, Archaeoglobus fulgidus Sir2-Af1, and E. coli CobB; and are members of the SIR2 family of proteins, silent information regulator 2 (Sir2) enzymes which catalyze NAD+-dependent protein/histone deacetylation. 
Sir2 proteins have been shown to regulate gene silencing, DNA repair, metabolic enzymes, and life span. CobB is a bacterial sirtuin that deacetylates acetyl-CoA synthetase at an active site lysine to stimulate its enzymatic activity. 29381 cd01413: SIR2_Af2: Archaeal and prokaryotic group which includes Archaeoglobus fulgidus Sir2-Af2, Sulfolobus solfataricus ssSir2, and several bacterial homologs; and are members of the SIR2 family of proteins, silent information regulator 2 (Sir2) enzymes which catalyze NAD+-dependent protein/histone deacetylation. Sir2 proteins have been shown to regulate gene silencing, DNA repair, metabolic enzymes, and life span. The Sir2 homolog from the archaeon Sulfolobus solfataricus deacetylates the non-specific DNA protein Alba to mediate transcription repression. 29382 cd00298: alpha-crystallin-type heat shock proteins (Hsps), family of small stress induced proteins ranging from 12-43 kDa, whose common feature is the alpha-crystallin domain. They are generally active as large oligomers consisting of multiple subunits. They are believed to be ATP-independent chaperones that prevent aggregation, and are important in refolding in combination with other Hsps. They are present in all kingdoms, but not in all organisms. 29383 cd00312: Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine. These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate. 29385 cd00649: Catalase-peroxidases. 
This is a subgroup of heme-dependent peroxidases of the plant superfamily that share a heme prosthetic group and catalyze a multistep oxidative reaction involving hydrogen peroxide as the electron acceptor. Catalase-peroxidases can exhibit both catalase and broad-spectrum peroxidase activities depending on the steady-state concentration of hydrogen peroxide. These enzymes are found in many archaeal and bacterial organisms, where they neutralize potentially lethal hydrogen peroxide molecules generated during photosynthesis or stationary phase. Along with related intracellular fungal and plant peroxidases, catalase-peroxidases belong to class I of the plant peroxidase superfamily. Unlike the eukaryotic enzymes, they are typically comprised of two homologous domains that probably arose via a single gene duplication event. The heme binding motif is present only in the N-terminal domain; the function of the C-terminal domain is not clear. 29386 cd00691: Ascorbate peroxidases. This is a subgroup of heme-dependent peroxidases of the plant superfamily that share a heme prosthetic group and catalyze a multistep oxidative reaction involving hydrogen peroxide as the electron acceptor. Along with related catalase-peroxidases, ascorbate peroxidases belong to class I of the plant superfamily. Ascorbate peroxidases are found in the chloroplasts and/or cytosol of algae and plants, where they have been shown to control the concentration of lethal hydrogen peroxide molecules. The yeast cytochrome c peroxidase is a divergent member of the family. It forms a complex with cytochrome c to catalyze the reduction of hydrogen peroxide to water. 29387 cd00692: Lignin and manganese peroxidases. Ligninases and related extracellular fungal peroxidases belong to class II of the plant heme-dependent peroxidase superfamily. All members of the superfamily share a heme prosthetic group and catalyze a multistep oxidative reaction involving hydrogen peroxide as the electron acceptor. 
Class II peroxidases are fungal glycoproteins that have been implicated in the oxidative breakdown of lignin, the main cell wall component of woody plants. They contain four conserved disulphide bridges and two conserved calcium binding sites. 29388 cd00693: Secretory peroxidases. Horseradish peroxidase and related secretory peroxidases belong to class III of the plant heme-dependent peroxidase superfamily. All members of the superfamily share a heme prosthetic group and catalyze a multistep oxidative reaction involving hydrogen peroxide as the electron acceptor. Class III peroxidases are found in the extracellular space or in the vacuole in plants, where they have been implicated in hydrogen peroxide detoxification, auxin catabolism, lignin biosynthesis and stress response. Class III peroxidases contain four conserved disulphide bridges and two conserved calcium binding sites. 29390 cd00317: cyclophilin: cyclophilin-type peptidylprolyl cis-trans isomerases. This family contains eukaryotic, bacterial and archaeal proteins which exhibit a peptidylprolyl cis-trans isomerase activity (PPIase, Rotamase) and in addition bind the immunosuppressive drug cyclosporin (CsA). Immunosuppression in vertebrates is believed to be the result of the cyclophilin A-cyclosporin protein-drug complex binding to and inhibiting the protein-phosphatase calcineurin. PPIase is an enzyme which accelerates protein folding by catalyzing the cis-trans isomerization of the peptide bonds preceding proline residues. Cyclophilins are a diverse family in terms of function and have been implicated in protein folding processes which depend on catalytic/chaperone-like activities. This group contains human cyclophilin 40, a co-chaperone of the hsp90 chaperone system; human cyclophilin A, a chaperone in the HIV-1 infectious process; and human cyclophilin H, a component of the U4/U6 snRNP, whose isomerization or chaperoning activities may play a role in RNA splicing. 
29391 cd01920: cyclophilin_EcCYP_like: cyclophilin-type A-like peptidylprolyl cis-trans isomerase (PPIase) domain similar to the cytosolic E. coli cyclophilin A and Streptomyces antibioticus SanCyp18. Compared to the archetypal cyclophilin Human cyclophilin A, these have reduced affinity for cyclosporin A. E. coli cyclophilin A has a similar peptidylprolyl cis-trans isomerase activity to the human cyclophilin A. Most members of this subfamily contain a phenylalanine residue at the position equivalent to Human cyclophilin W121, where a tryptophan has been shown to be important for cyclosporin binding. 29392 cd01921: cyclophilin_RRM: cyclophilin-type peptidylprolyl cis-trans isomerase domain occurring with a C-terminal RNA recognition motif domain (RRM). This subfamily of the cyclophilin domain family contains a number of eukaryotic cyclophilins having the RRM domain including the nuclear proteins: human hCyP-57, Arabidopsis thaliana AtCYP59, Caenorhabditis elegans CeCyP-44 and Paramecium tetraurelia Kin241. The Kin241 protein has been shown to have a role in cell morphogenesis. 29393 cd01922: cyclophilin_SpCYP2_like: cyclophilin 2-like peptidylprolyl cis-trans isomerase (PPIase) domain similar to Schizosaccharomyces pombe cyp-2. These proteins bind their respective SNW chromatin binding protein in autologous systems, in a CsA-independent manner, indicating interaction with a surface outside the PPIase active site. SNW proteins play a basic and broad range role in signaling. 29394 cd01923: cyclophilin_RING: cyclophilin-type peptidylprolyl cis-trans isomerases (cyclophilins) having a modified RING finger domain. This group includes the nuclear proteins, Human hCyP-60 and Caenorhabditis elegans MOG-6 which, compared to the archetypal cyclophilin Human cyclophilin A, exhibit reduced peptidylprolyl cis-trans isomerase activity and lack a residue important for cyclosporin binding. 
Human hCyP-60 has been shown to physically interact with the proteinase inhibitor peptide eglin c, and C. elegans MOG-6 to physically interact with MEP-1, a nuclear zinc finger protein. MOG-6 has been shown to function in germline sex determination. 29395 cd01924: cyclophilin_TLP40_like: cyclophilin-type peptidylprolyl cis-trans isomerases (cyclophilins) similar to the Spinach thylakoid lumen protein TLP40. Compared to the archetypal cyclophilin Human cyclophilin A, these proteins have similar peptidylprolyl cis-trans isomerase activity and reduced affinity for cyclosporin A. Spinach TLP40 has been shown to have a dual function as a folding catalyst and regulator of dephosphorylation. 29396 cd01925: cyclophilin_CeCYP16-like: cyclophilin-type peptidylprolyl cis-trans isomerase (PPIase) domain similar to Caenorhabditis elegans cyclophilin 16. C. elegans CeCYP-16, compared to the archetypal cyclophilin Human cyclophilin A, has a reduced peptidylprolyl cis-trans isomerase activity, is cyclosporin insensitive and shows an altered substrate preference favoring hydrophobic, acidic or amide amino acids. Most members of this subfamily have a glutamate residue in the active site at the position equivalent to a tryptophan (W121 in Human cyclophilin A), which has been shown to be important for cyclosporin binding. 29397 cd01926: cyclophilin_ABH_like: Cyclophilin A, B and H-like cyclophilin-type peptidylprolyl cis-trans isomerase (PPIase) domain. This family represents the archetypal cytosolic cyclophilin similar to human cyclophilins A, B and H. PPIase is an enzyme which accelerates protein folding by catalyzing the cis-trans isomerization of the peptide bonds preceding proline residues. These enzymes have been implicated in protein folding processes which depend on catalytic/chaperone-like activities. As cyclophilins, Human hCyP-A, human cyclophilin-B (hCyP-19), S. cerevisiae Cpr1 and C. elegans Cyp-3 are inhibited by the immunosuppressive drug cyclosporin A (CsA). 
CsA binds to the PPIase active site. S. cerevisiae Cpr1 interacts with the Rpd3 - Sin3 complex and in addition is a component of the Set3 complex. S. cerevisiae Cpr1 has also been shown to have a role in Zpr1p nuclear transport. Human cyclophilin H associates with the [U4/U6.U5] tri-snRNP particles of the spliceosome. 29398 cd01927: cyclophilin_WD40: cyclophilin-type peptidylprolyl cis-trans isomerases (cyclophilins) having a WD40 domain. This group consists of several hypothetical and putative eukaryotic and bacterial proteins which have a cyclophilin domain and a WD40 domain. Function of the protein is not known. 29399 cd01928: Cyclophilin_PPIL3_like. Proteins similar to Human cyclophilin-like peptidylprolyl cis-trans isomerase (PPIL3). Members of this family lack a key residue important for cyclosporin binding: the tryptophan residue corresponding to W121 in human hCyP-18a; most members have a histidine at this position. The exact function of the protein is not known. 29400 cd00318: Phosphoglycerate kinase (PGK) is a monomeric enzyme which catalyzes the transfer of the high-energy phosphate group of 1,3-bisphosphoglycerate to ADP, forming ATP and 3-phosphoglycerate. This reaction represents the first of the two substrate-level phosphorylation events in the glycolytic pathway. Substrate-level phosphorylation is defined as production of ATP by a process which is catalyzed by water-soluble enzymes in the cytosol, not involving membranes and ion gradients. 29401 cd00321: Sulfite oxidase (SO) family, molybdopterin binding domain. This molybdopterin cofactor (Moco) binding domain is found in a variety of oxidoreductases; main members of this family are nitrate reductase (NR) and sulfite oxidase (SO). SO catalyzes the terminal reaction in the oxidative degradation of the sulfur-containing amino acids cysteine and methionine. Assimilatory NRs catalyze the reduction of nitrate to nitrite which is subsequently converted to NH4+ by nitrite reductase. 
Common features of all known members of this family are that they contain one single pterin cofactor and part of the coordination of the metal (Mo) is a cysteine ligand of the protein and that they catalyze the transfer of an oxygen to or from a lone pair of electrons on the substrate. 29402 cd02107: YedY_like molybdopterin cofactor (Moco) binding domain, a subgroup of the sulfite oxidase (SO) family of molybdopterin binding domains. Escherichia coli YedY has been proposed to form a heterodimer, consisting of a soluble catalytic subunit termed YedY, which is likely membrane-anchored by a heme-containing trans-membrane subunit YedZ. Preliminary results indicate that YedY may represent a new type of membrane-associated bacterial reductase. Common features of all known members of this family are that they contain one single pterin cofactor and part of the coordination of the metal (Mo) is a cysteine ligand of the protein and that they catalyze the transfer of an oxygen to or from a lone pair of electrons on the substrate. 29403 cd02108: bacterial subgroup of the sulfite oxidase (SO) family of molybdopterin binding domains. This domain is found in a variety of oxidoreductases. Common features of all known members of this family, like sulfite oxidase and nitrite reductase, are that they contain one single pterin cofactor and part of the coordination of the metal (Mo) is a cysteine ligand of the protein and that they catalyze the transfer of an oxygen to or from a lone pair of electrons on the substrate. The specific function of this subgroup is unknown. 29404 cd02109: bacterial and archaeal members of the sulfite oxidase (SO) family of molybdopterin binding domains. This molybdopterin cofactor (Moco) binding domain is found in a variety of oxidoreductases; main members of this family are nitrate reductase (NR) and sulfite oxidase (SO). 
Common features of all known members of this family are that they contain one single pterin cofactor and part of the coordination of the metal (Mo) is a cysteine ligand of the protein and that they catalyze the transfer of an oxygen to or from a lone pair of electrons on the substrate. The specific function of this subgroup is unknown. 29405 cd02110: Subgroup of sulfite oxidase (SO) family molybdopterin binding domains that contains conserved dimerization domain. This molybdopterin cofactor (Moco) binding domain is found in a variety of oxidoreductases; main members of this family are nitrate reductase (NR) and sulfite oxidase (SO). 29406 cd02111: molybdopterin binding domain of sulfite oxidase (SO). SO catalyzes the terminal reaction in the oxidative degradation of the sulfur-containing amino acids cysteine and methionine. Common features of all known members of the sulfite oxidase (SO) family of molybdopterin binding domains are that they contain one single pterin cofactor and part of the coordination of the metal (Mo) is a cysteine ligand of the protein and that they catalyze the transfer of an oxygen to or from a lone pair of electrons on the substrate. 29407 cd02112: molybdopterin binding domain of eukaryotic nitrate reductase (NR). Assimilatory NRs catalyze the reduction of nitrate to nitrite which is subsequently converted to NH4+ by nitrite reductase. Eukaryotic assimilatory nitrate reductases are cytosolic homodimeric enzymes with three prosthetic groups, flavin adenine dinucleotide (FAD), cytochrome b557, and Mo cofactor, which are located in three functional domains. Common features of all known members of the sulfite oxidase (SO) family of molybdopterin binding domains are that they contain one single pterin cofactor and part of the coordination of the metal (Mo) is a cysteine ligand of the protein and that they catalyze the transfer of an oxygen to or from a lone pair of electrons on the substrate. 
29408 cd02113: bacterial SoxC is a member of the sulfite oxidase (SO) family of molybdopterin binding domains. SoxC is involved in oxidation of sulfur compounds during chemolithotrophic growth. Together with SoxD, a small c-type heme containing subunit, it forms a heterotetrameric sulfite dehydrogenase. This molybdopterin cofactor (Moco) binding domain is found in a variety of oxidoreductases; main members of this family are nitrate reductase (NR) and sulfite oxidase (SO). Common features of all known members of this family are that they contain one single pterin cofactor and part of the coordination of the metal (Mo) is a cysteine ligand of the protein and that they catalyze the transfer of an oxygen to or from a lone pair of electrons on the substrate. 29409 cd02114: sulfite:cytochrome c oxidoreductase subunit A (SorA), molybdopterin binding domain. SorA is involved in oxidation of sulfur compounds during chemolithotrophic growth. Together with SorB, a small c-type heme containing subunit, it forms a heterodimer. It is a member of the sulfite oxidase (SO) family of molybdopterin binding domains. This molybdopterin cofactor (Moco) binding domain is found in a variety of oxidoreductases; main members of this family are nitrate reductase (NR) and sulfite oxidase (SO). Common features of all known members of this family are that they contain one single pterin cofactor and part of the coordination of the metal (Mo) is a cysteine ligand of the protein and that they catalyze the transfer of an oxygen to or from a lone pair of electrons on the substrate. 29411 cd00751: Thiolases are ubiquitous enzymes that catalyze the reversible thiolytic cleavage of 3-ketoacyl-CoA into acyl-CoA and acetyl-CoA, a 2-step reaction involving a covalent intermediate formed with a catalytic cysteine. They are found in prokaryotes and eukaryotes (cytosol, microbodies and mitochondria). 
There are 2 functionally different classes: thiolase-I (3-ketoacyl-CoA thiolase) and thiolase-II (acetoacetyl-CoA thiolase). Thiolase-I can cleave longer fatty acid molecules and plays an important role in the beta-oxidative degradation of fatty acids. Thiolase-II has a high substrate specificity. Although it can cleave acetoacetyl-CoA, its main function is the synthesis of acetoacetyl-CoA from two molecules of acetyl-CoA, which gives it importance in several biosynthetic pathways. 29414 cd00827: "initiating" condensing enzymes are a subclass of decarboxylating condensing enzymes, including beta-ketoacyl [ACP] synthase, type III and polyketide synthases, type III, which include chalcone synthase and related enzymes. They are characterized by the utilization of CoA substrate primers, as well as the nature of their active site residues. 29415 cd00828: "elongating" condensing enzymes are a subclass of decarboxylating condensing enzymes, including beta-ketoacyl [ACP] synthase, type I and II and polyketide synthases. They are characterized by the utilization of acyl carrier protein (ACP) thioesters as primer substrates, as well as the nature of their active site residues. 29416 cd00829: Thiolase domain associated with sterol carrier protein (SCP)-x isoform and related proteins; SCP-2 has multiple roles in intracellular lipid circulation and metabolism. The N-terminal presequence in the SCP-x isoform represents a peroxisomal 3-ketoacyl-CoA thiolase specific for branched-chain acyl CoAs, which is proteolytically cleaved from the sterol carrier protein. 29417 cd00830: Ketoacyl-acyl carrier protein synthase III (KASIII) initiates the elongation in type II fatty acid synthase systems. It is found in bacteria and plants. Elongation of fatty acids in the type II systems occurs by Claisen condensation of malonyl-acyl carrier protein (ACP) with acyl-ACP. KASIII initiates this process by specifically using acetyl-CoA over acyl-CoA. 
29418 cd00831: Chalcone and stilbene synthases; plant-specific polyketide synthases (PKS) and related enzymes, also called type III PKSs. PKS generate an array of different products, dependent on the nature of the starter molecule. They share a common chemical strategy: after the starter molecule is loaded onto the active site cysteine, a decarboxylative condensation reaction extends the polyketide chain. Plant-specific PKS are dimeric iterative PKSs, using coenzyme A esters to deliver substrate to the active site, but they differ in the choice of starter molecule and the number of condensation reactions. 29419 cd00832: Chain-length factor (CLF) is a factor required for polyketide chain initiation of aromatic antibiotic-producing polyketide synthases (PKSs) of filamentous bacteria. CLFs have been shown to have decarboxylase activity towards malonyl-acyl carrier protein (ACP). CLFs are similar to other elongation ketosynthase domains, but their active site cysteine is replaced by a conserved glutamine. 29420 cd00833: polyketide synthases (PKSs) polymerize simple fatty acids into a large variety of different products, called polyketides, by successive decarboxylating Claisen condensations. PKSs can be divided into 2 groups, modular type I PKSs consisting of one or more large multifunctional proteins and iterative type II PKSs, complexes of several monofunctional subunits. 29421 cd00834: Beta-ketoacyl-acyl carrier protein (ACP) synthase (KAS), type I and II. KASs are responsible for the elongation steps in fatty acid biosynthesis. KASIII catalyzes the initial condensation and KAS I and II catalyze further elongation steps by Claisen condensation of malonyl-acyl carrier protein (ACP) with acyl-ACP. 29422 cd00328: Catalase. Catalase is a ubiquitous enzyme found in both prokaryotes and eukaryotes involved in the protection of cells from the toxic effects of peroxides. It catalyzes the conversion of hydrogen peroxide to water and molecular oxygen. 
Most catalases exist as tetramers of 65 kDa subunits containing a protohaem IX group buried deep inside the structure. 29423 cd00333: Major intrinsic protein (MIP) superfamily. Members of the MIP superfamily function as membrane channels that selectively transport water, small neutral molecules, and ions out of and between cells. The channel proteins share a common fold: the N-terminal cytosolic portion followed by six transmembrane helices, which might have arisen through gene duplication. On the basis of sequence similarity and functional characteristics, the superfamily can be subdivided into two major groups: water-selective channels called aquaporins (AQPs) and glycerol uptake facilitators (GlpFs). AQPs are found in all three kingdoms of life, while GlpFs have been characterized only within microorganisms. 29424 cd00346: Tryptophan synthase alpha chain; Tryptophan synthase is a bifunctional pyridoxal-5'-phosphate (PLP)-dependent tetrameric (2 alpha and 2 beta subunits) enzyme that catalyzes the last 2 steps of the L-tryptophan synthesis, the condensation of indole-3-glycerol phosphate (IPG) and L-serine to L-tryptophan and water. The first (alpha-) reaction is the cleavage of indole-3-glycerol phosphate to indole and glyceraldehyde-3-phosphate and is catalyzed by the alpha subunit. The indole is then transferred to the reactive center of the beta subunit through a tunnel connecting both subunits. This process of substrate tunneling is connected to tight allosteric regulation of the enzyme. 29426 cd01094: Alkanesulfonate monooxygenase is the monooxygenase of a two-component system that catalyzes the conversion of alkanesulfonates to the corresponding aldehyde and sulfite. Alkanesulfonate monooxygenase (SsuD) has an absolute requirement for reduced flavin mononucleotide (FMNH2), which is provided by the NADPH-dependent FMN oxidoreductase (SsuE). 29428 cd01096: Alkanal monooxygenases are flavin monooxygenases. 
Molecular oxygen is activated by reaction with reduced flavin mononucleotide (FMNH2) and reacts with an aldehyde to yield the carboxylic acid, oxidized flavin (FMN) and blue-green light. Bacterial luciferases are heterodimers made of alpha and beta subunits which are homologous. The single active center is on the alpha subunit. The alpha subunit has a stretch of 30 amino acid residues that is not present in the beta subunit. The beta subunit does not contain the active site and is required for the formation of the fully active heterodimer. The beta subunit does not contribute anything directly to the active site. Its role is probably to stabilize the high quantum yield conformation of the alpha subunit through interactions across the subunit interface. 29430 cd00349: Ribosomal protein L11. Ribosomal protein L11 and an RNA molecule form a complex, which has been termed the GTPase-associated region of the large (50S) ribosomal subunit. Protein L11 is highly conserved among eubacteria, archaea and eukaryotes. The C-terminal domain of ribosomal protein L11 binds to RNA and the N-terminal domain is required for cooperative interaction with antibiotics and for binding to 23S RNA. 29431 cd00350: Rubredoxin_like; nonheme iron binding domain containing a [Fe(SCys)4] center. The family includes rubredoxins, which are small electron transfer proteins, and a slightly smaller modular rubredoxin domain present in rubrerythrin and nigerythrin and detected either N- or C-terminal to such proteins as flavin reductase, NAD(P)H-nitrite reductase, and ferredoxin-thioredoxin reductase. In rubredoxin, the iron atom is coordinated by four cysteine residues (Fe(S-Cys)4), but iron can also be replaced by cobalt, nickel or zinc, and is believed to be involved in electron transfer. Rubrerythrins and nigerythrins are small homodimeric proteins, generally consisting of 2 domains: a rubredoxin domain C-terminal to a non-sulfur, oxo-bridged diiron site in the N-terminal rubrerythrin domain. 
Rubrerythrins and nigerythrins have putative peroxidase activity. 29432 cd00729: Rubredoxin, small modular nonheme iron binding domain containing a [Fe(SCys)4] center, present in rubrerythrin and nigerythrin and detected either N- or C-terminal to such proteins as flavin reductase, NAD(P)H-nitrite reductase, and ferredoxin-thioredoxin reductase. In rubredoxin, the iron atom is coordinated by four cysteine residues (Fe(S-Cys)4), and is believed to be involved in electron transfer. Rubrerythrins and nigerythrins are small homodimeric proteins, generally consisting of 2 domains: a rubredoxin domain C-terminal to a non-sulfur, oxo-bridged diiron site in the N-terminal rubrerythrin domain. Rubrerythrins and nigerythrins have putative peroxidase activity. 29433 cd00730: Rubredoxin; nonheme iron binding domains containing a [Fe(SCys)4] center. Rubredoxins are small nonheme iron proteins. The iron atom is coordinated by four cysteine residues (Fe(S-Cys)4), but iron can also be replaced by cobalt, nickel or zinc. They are believed to be involved in electron transfer. 29434 cd00355: Ribosomal protein L30 (known as L7 in eukaryotes) is one of the smallest ribosomal proteins with a molecular mass of about 7 kDa. L30 binds the 23S rRNA as well as the 5S rRNA and is one of five ribosomal proteins that mediate the interactions 5S rRNA makes with the ribosome. In eukaryotes and archaea, L7 has an additional C-terminal extension that protrudes out from the large subunit toward the subunit interface along with subunit L12 and the 23S rRNA which together form a region referred to as the L7/L12 stalk. 29435 cd01657: The archaeal/eukaryotic ribosomal protein L7 (known as L30 in prokaryotes) binds domain II of the 23S rRNA as well as the 5S rRNA and is one of five ribosomal proteins that mediate the interactions 5S rRNA makes with the ribosome. 
L7 has an additional C-terminal extension, not found in its prokaryotic ortholog L30, that protrudes out from the large subunit toward the subunit interface along with subunit L12 and the 23S rRNA which together form a region referred to as the L7/L12 stalk. 29436 cd01658: Ribosomal protein L30 (known as L7 in eukaryotes) is one of the smallest ribosomal proteins with a molecular mass of about 7 kDa. L30 binds the 23S rRNA as well as the 5S rRNA and is one of five ribosomal proteins that mediate the interactions 5S rRNA makes with the ribosome. In eukaryotes and archaea, L7 has an additional C-terminal extension that protrudes out from the large subunit toward the subunit interface along with subunit L12 and the 23S rRNA which together form a region referred to as the L7/L12 stalk. 29437 cd00363: Phosphofructokinase, a key regulatory enzyme in glycolysis, catalyzes the phosphorylation of fructose-6-phosphate to fructose-1,6-bisphosphate. The members belong to PFK family that includes ATP- and pyrophosphate (PPi)- dependent phosphofructokinases. Some members evolved by gene duplication and thus have a large C-terminal/N-terminal extension comprising a second PFK domain. Generally, ATP-PFKs are allosteric homotetramers, and PPi-PFKs are dimeric and nonallosteric except for plant PPi-PFKs which are allosteric heterotetramers. 29438 cd00763: Phosphofructokinase, a key regulatory enzyme in glycolysis, catalyzes the phosphorylation of fructose-6-phosphate to fructose-1,6-bisphosphate. The members belong to a subfamily of the PFKA family (cd00363) and include bacterial ATP-dependent phosphofructokinases. These are allosterically regulated homotetramers; the subunits are of about 320 amino acids. 29439 cd00764: Phosphofructokinase, a key regulatory enzyme in glycolysis, catalyzes the phosphorylation of fructose-6-phosphate to fructose-1,6-bisphosphate. The members belong to a subfamily of the PFKA family (cd00363) and include eukaryotic ATP-dependent phosphofructokinases. 
These have evolved from the bacterial PFKs by gene duplication and fusion events and exhibit complex allosteric behavior. 29440 cd00765: Phosphofructokinase, a key regulatory enzyme in glycolysis, catalyzes the phosphorylation of fructose-6-phosphate to fructose-1,6-bisphosphate. The members belong to a subfamily of the PFKA family (cd00363) and include pyrophosphate-dependent phosphofructokinases. These are found in bacteria as well as plants. These may be dimeric nonallosteric enzymes as in bacteria or allosteric heterotetramers as in plants. 29441 cd00365: Hydroxymethylglutaryl-coenzyme A (HMG-CoA) reductase (HMGR) is a tightly regulated enzyme, which catalyzes the synthesis of coenzyme A and mevalonate in isoprenoid synthesis. In mammals, this is the rate limiting committed step in cholesterol biosynthesis. Bacteria, such as Pseudomonas mevalonii, which rely solely on mevalonate for their carbon source, catalyze the reverse reaction, using an NAD-dependent HMGR to deacetylate mevalonate into 3-hydroxy-3-methylglutaryl-CoA. There are two classes of HMGR: class I enzymes which are found predominantly in eukaryotes and contain N-terminal membrane regions and class II enzymes which are found primarily in prokaryotes and are soluble as they lack the membrane region. With the exception of Archaeoglobus fulgidus, most archaea are assigned to class I, based on sequence similarity of the active site, even though they lack membrane regions. Yeast and human HMGR are divergent in their N-terminal regions, but are conserved in their active site. In contrast, human and bacterial HMGR differ in their active site architecture. While the prokaryotic enzyme is a homodimer, the eukaryotic enzyme is a homotetramer. 29442 cd00643: Hydroxymethylglutaryl-coenzyme A (HMG-CoA) reductase (HMGR), class I enzyme, homotetramer. Catalyzes the synthesis of coenzyme A and mevalonate in isoprenoid synthesis. In mammals this is the rate limiting committed step in cholesterol biosynthesis. 
Class I enzymes are found predominantly in eukaryotes and contain N-terminal membrane regions. With the exception of Archaeoglobus fulgidus, most archaea are assigned to class I, based on sequence similarity of the active site, even though they lack membrane regions. Yeast and human HMGR are divergent in their N-terminal regions, but are conserved in their active site. In contrast, human and bacterial HMGR differ in their active site architecture. 29443 cd00644: Hydroxymethylglutaryl-coenzyme A (HMG-CoA) reductase (HMGR), class II, prokaryotic enzyme is a homodimer. Class II enzymes are found primarily in prokaryotes and Archaeoglobus fulgidus and are soluble as they lack the membrane region. Enzymes catalyze the synthesis of coenzyme A and mevalonate in isoprenoid synthesis. Bacteria, such as Pseudomonas mevalonii, which rely solely on mevalonate for their carbon source, catalyze the reverse reaction, using an NAD-dependent HMGR to deacetylate mevalonate into 3-hydroxy-3-methylglutaryl-CoA. Human and bacterial HMGR differ in their active site architecture. 29444 cd00367: Histidine-containing phosphocarrier protein (HPr)-like proteins. HPr is a central component of the bacterial phosphoenolpyruvate sugar phosphotransferase system (PTS). The PTS catalyzes the phosphorylation of sugar substrates during their translocation across the cell membrane. The phosphoryl group from phosphoenolpyruvate is transferred to HPr by enzyme I (EI). Phospho-HPr then transfers the phosphoryl group to one of several sugar-specific phosphoprotein intermediates. The conserved histidine in the N-terminus of HPr serves as an acceptor for the phosphoryl group of EI. In addition to the phosphotransferase proteins HPr and EI, this family also includes the closely related Carbon Catabolite Repressor (CCR) proteins which use the same phosphorylation mechanism and interact with transcriptional regulators to control expression of genes coding for utilization of less favored carbon sources. 
29447 cd02751: The MopB_DMSOR-like CD contains dimethylsulfoxide reductase (DMSOR), biotin sulfoxide reductase (BSOR), trimethylamine N-oxide reductase (TMAOR) and other related proteins. DMSOR catalyzes the reduction of DMSO to dimethylsulfide, but its cellular location and oligomerization state are organism-dependent. For example, in Rhodobacter sphaeroides and Rhodobacter capsulatus, it is an 82-kDa monomeric soluble protein found in the periplasmic space; in E. coli, it is membrane-bound and exists as a heterotrimer. BSOR catalyzes the reduction of biotin sulfoxide to biotin, and is unique among Mo enzymes because no additional auxiliary proteins or cofactors are required. TMAOR is similar to DMSOR, but its only natural substrate is TMAO. Also included in this group is the pyrogallol-phloroglucinol transhydroxylase from Pelobacter acidigallici. Members of the MopB_DMSOR-like CD belong to the molybdopterin_binding (MopB) superfamily of proteins. 29450 cd02754: Nitrate reductases, NapA (Nitrate-R-NapA), NasA, and NarB catalyze the reduction of nitrate to nitrite. Monomeric Nas is located in the cytoplasm and participates in nitrogen assimilation. Dimeric Nap is located in the periplasm and is coupled to quinol oxidation via a membrane-anchored tetraheme cytochrome. Members of the MopB_Nitrate-R-NapA CD belong to the molybdopterin_binding (MopB) superfamily of proteins. 29456 cd02760: The MopB_Phenylacetyl-CoA-OR CD contains the phenylacetyl-CoA:acceptor oxidoreductase, large subunit (PadB2), and other related proteins. The phenylacetyl-CoA:acceptor oxidoreductase has been characterized as a membrane-bound molybdenum-iron-sulfur enzyme involved in anaerobic metabolism of phenylalanine in the denitrifying bacterium Thauera aromatica. Members of this CD belong to the molybdopterin_binding (MopB) superfamily of proteins. 
29457 cd02761: The MopB_FmdB-FwdB CD contains the molybdenum/tungsten formylmethanofuran dehydrogenases, subunit B (FmdB/FwdB), and other related proteins. Formylmethanofuran dehydrogenase catalyzes the first step in methane formation from CO2 in methanogenic archaea and some eubacteria. Members of this CD belong to the molybdopterin_binding (MopB) superfamily of proteins. 29460 cd02764: The MopB_PHLH CD includes a group of related uncharacterized putative hydrogenase-like homologs (PHLH) of molybdopterin binding (MopB) proteins. This CD is of the PHLH region homologous to the catalytic molybdopterin-binding subunit of MopB homologs. 29463 cd02767: The MopB_ydeP CD includes a group of related uncharacterized bacterial molybdopterin-binding oxidoreductase-like domains with a putative molybdopterin cofactor binding site. These members belong to the molybdopterin_binding (MopB) superfamily of proteins. 29464 cd02768: MopB_NADH-Q-OR-NuoG2: The NuoG/Nad11/75-kDa subunit (second domain) of the NADH-quinone oxidoreductase (NADH-Q-OR)/respiratory complex I/NADH dehydrogenase-1 (NDH-1). The NADH-Q-OR is the first energy-transducing complex in the respiratory chains of many prokaryotes and eukaryotes. Mitochondrial complex I and its bacterial counterpart, NDH-1, function as a redox pump that uses the redox energy to translocate H+ ions across the membrane, resulting in a significant contribution to energy production. The atomic structure of complex I is not known and the mechanisms of electron transfer and proton pumping are not established. The nad11 gene codes for the largest (75-kDa) subunit of the mitochondrial NADH:ubiquinone oxidoreductase; it constitutes the electron input part of the enzyme, or the so-called NADH dehydrogenase fragment. In Escherichia coli, this subunit is encoded by the nuoG gene, and is part of the 14 distinct subunits constituting the 'minimal' functional enzyme. 
The nad11 gene is nuclear-encoded in animals, plants, and fungi, but is still encoded in the mitochondrial genome of some protists. The Nad11/NuoG subunit is made of two domains: the first contains three binding sites for FeS clusters (the fer2 domain); the second domain (this CD) is of unknown function or, as postulated, has lost an ancestral formate dehydrogenase activity that became redundant during the evolution of the complex I enzyme. Although only vestigial sequence evidence remains of a molybdopterin binding site, this protein domain family belongs to the molybdopterin_binding (MopB) superfamily of proteins. Bacterial type II NADH-quinone oxidoreductases and NQR-type sodium-motive NADH-quinone oxidoreductases are not homologs of this domain family. 29465 cd02769: The MopB_DMSOR-BSOR-TMAOR CD contains dimethylsulfoxide reductase (DMSOR), biotin sulfoxide reductase (BSOR), trimethylamine N-oxide reductase (TMAOR) and other related proteins. DMSOR always catalyzes the reduction of DMSO to dimethylsulfide, but its cellular location and oligomerization state are organism-dependent. For example, in Rhodobacter sphaeroides and Rhodobacter capsulatus, it is an 82-kDa monomeric soluble protein found in the periplasmic space; in E. coli, it is membrane-bound and exists as a heterotrimer. BSOR catalyzes the reduction of biotin sulfoxide to biotin, and is unique among Mo enzymes because no additional auxiliary proteins or cofactors are required. TMAOR is similar to DMSOR, but its only natural substrate is TMAO. Members of this CD belong to the molybdopterin_binding (MopB) superfamily of proteins. 29466 cd02770: This CD (MopB_DmsA-EC) includes the DmsA enzyme of the dmsABC operon encoding the anaerobic dimethylsulfoxide reductase (DMSOR) of Escherichia coli and other related DMSOR-like enzymes. Unlike other DMSOR-like enzymes, this group has a predicted N-terminal iron-sulfur [4Fe-4S] cluster binding site. 
These members belong to the molybdopterin_binding (MopB) superfamily of proteins. 29467 cd02771: MopB_NDH-1_NuoG2-N7: The second domain of the NuoG subunit (with a [4Fe-4S] cluster, N7) of the NADH-quinone oxidoreductase/NADH dehydrogenase-1 (NDH-1) found in various bacteria. The NDH-1 is the first energy-transducing complex in the respiratory chain and functions as a redox pump that uses the redox energy to translocate H+ ions across the membrane, resulting in a significant contribution to energy production. In Escherichia coli NDH-1, the largest subunit is encoded by the nuoG gene, and is part of the 14 distinct subunits constituting the functional enzyme. The NuoG subunit is made of two domains: the first contains three binding sites for FeS clusters (the fer2 domain); the second domain (this CD) is of unknown function or, as postulated, has lost an ancestral formate dehydrogenase activity that became redundant during the evolution of the complex I enzyme. Unique to this group, compared to the other prokaryotic and eukaryotic groups in this domain protein family (NADH-Q-OR-NuoG2), is an N-terminal [4Fe-4S] cluster (N7/N1c) present in the second domain. Although only vestigial sequence evidence remains of a molybdopterin binding site, this protein domain belongs to the molybdopterin_binding (MopB) superfamily of proteins. 29468 cd02772: MopB_NDH-1_NuoG2: The second domain of the NuoG subunit of the NADH-quinone oxidoreductase/NADH dehydrogenase-1 (NDH-1), found in beta- and gammaproteobacteria. The NDH-1 is the first energy-transducing complex in the respiratory chain and functions as a redox pump that uses the redox energy to translocate H+ ions across the membrane, resulting in a significant contribution to energy production. In Escherichia coli NDH-1, the largest subunit is encoded by the nuoG gene, and is part of the 14 distinct subunits constituting the functional enzyme. 
The NuoG subunit is made of two domains: the first contains three binding sites for FeS clusters (the fer2 domain); the second domain (this CD) is of unknown function or, as postulated, has lost an ancestral formate dehydrogenase activity that became redundant during the evolution of the complex I enzyme. Although only vestigial sequence evidence remains of a molybdopterin binding site, this protein domain belongs to the molybdopterin_binding (MopB) superfamily of proteins. 29469 cd02773: MopB_Res_Cmplx1_Nad11: The second domain of the Nad11/75-kDa subunit of the NADH-quinone oxidoreductase/respiratory complex I/NADH dehydrogenase-1 (NDH-1) of eukaryotes and the Nqo3/G subunit of alphaproteobacteria NDH-1. The NADH-quinone oxidoreductase is the first energy-transducing complex in the respiratory chains of many prokaryotes and eukaryotes. Mitochondrial complex I and its bacterial counterpart, NDH-1, function as a redox pump that uses the redox energy to translocate H+ ions across the membrane, resulting in a significant contribution to energy production. The nad11 gene codes for the largest (75 kDa) subunit of the mitochondrial NADH:ubiquinone oxidoreductase; it constitutes the electron input part of the enzyme, or the so-called NADH dehydrogenase fragment. In Paracoccus denitrificans, this subunit is encoded by the nqo3 gene, and is part of the 14 distinct subunits constituting the 'minimal' functional enzyme. The Nad11/Nqo3 subunit is made of two domains: the first contains three binding sites for FeS clusters (the fer2 domain); the second domain (this CD) is of unknown function or, as postulated, has lost an ancestral formate dehydrogenase activity that became redundant during the evolution of the complex I enzyme. Although only vestigial sequence evidence remains of a molybdopterin binding site, this protein domain belongs to the molybdopterin_binding (MopB) superfamily of proteins. 
29470 cd02774: MopB_Res_Cmplx1_Nad11_M: Mitochondrial-encoded NADH-quinone oxidoreductase/respiratory complex I, the second domain of the Nad11/75-kDa subunit of some protists. NADH-quinone oxidoreductase is the first energy-transducing complex in the respiratory chain and functions as a redox pump that uses the redox energy to translocate H+ ions across the membrane, resulting in a significant contribution to energy production. The nad11 gene codes for the largest (75-kDa) subunit of the mitochondrial NADH-quinone oxidoreductase; it constitutes the electron input part of the enzyme, or the so-called NADH dehydrogenase fragment. The Nad11 subunit is made of two domains: the first contains three binding sites for FeS clusters (the fer2 domain); the second domain (this CD) is of unknown function or, as postulated, has lost an ancestral formate dehydrogenase activity that became redundant during the evolution of the complex I enzyme. Although only vestigial sequence evidence remains of a molybdopterin binding site, this protein domain belongs to the molybdopterin_binding (MopB) superfamily of proteins. 29471 cd00371: Heavy-metal-associated domain (HMA) is a conserved domain of approximately 30 amino acid residues found in a number of proteins that transport or detoxify heavy metals, for example, the CPx-type heavy metal ATPases and copper chaperones. The HMA domain contains two cysteine residues that are important in binding and transfer of metal ions, such as copper, cadmium, cobalt and zinc. In the case of copper, stoichiometry of binding is one Cu+ ion per binding domain. Repeats of the HMA domain in copper chaperones have been associated with Menkes/Wilson disease due to binding of multiple copper ions. 29472 cd00374: Ribonuclease T2 (RNase T2) is a widespread family of secreted RNases found in every organism examined thus far. This family includes RNase Rh, RNase MC1, RNase LE, and self-incompatibility RNases (S-RNases). 
Plant T2 RNases are expressed during leaf senescence in order to scavenge phosphate from ribonucleotides. They are also expressed in response to wounding or pathogen invasion. S-RNases are thought to prevent self-fertilization by acting as selective cytotoxins of "self" pollen. 29473 cd01061: Ribonuclease T2 (RNase T2) is a widespread family of secreted RNases found in every organism examined thus far. This family includes RNase Rh, RNase MC1, RNase LE, and self-incompatibility RNases (S-RNases). Plant T2 RNases are expressed during leaf senescence in order to scavenge phosphate from ribonucleotides. They are also expressed in response to wounding or pathogen invasion. S-RNases are thought to prevent self-fertilization by acting as selective cytotoxins of "self" pollen. Generally, RNases have two distinct binding sites: the primary site (B1 site) and the subsite (B2 site), for nucleotides located at the 5'- and 3'- terminal ends of the scissile bond, respectively. This CD includes the eukaryotic RNase T2 family members. 29474 cd01062: Ribonuclease T2 (RNase T2) is a widespread family of secreted RNases found in every organism examined thus far. This family includes RNase Rh, RNase MC1, RNase LE, and self-incompatibility RNases (S-RNases). Plant T2 RNases are expressed during leaf senescence in order to scavenge phosphate from ribonucleotides. They are also expressed in response to wounding or pathogen invasion. S-RNases are thought to prevent self-fertilization by acting as selective cytotoxins of "self" pollen. Generally, RNases have two distinct binding sites: the primary site (B1 site) and the subsite (B2 site), for nucleotides located at the 5'- and 3'- terminal ends of the scissile bond, respectively. This CD includes the prokaryotic RNase T2 family members. 29475 cd00383: Effector domain of response regulator. 
Bacteria and certain eukaryotes like protozoa and higher plants use two-component signal transduction systems to detect and respond to changes in the environment. The system consists of a sensor histidine kinase and a response regulator. The former autophosphorylates on a histidine residue on detecting an external stimulus. The phosphate is then transferred to an invariant aspartate residue in a highly conserved receiver domain of the response regulator. Phosphorylation activates a variable effector domain of the response regulator, which triggers the cellular response. The C-terminal effector domain contains DNA and RNA polymerase binding sites. Several dimers or monomers bind head to tail to small tandem repeats upstream of the genes. The RNA polymerase binding sites interact with the alpha or sigma subunit of RNA polymerase. 29476 cd00385: Isoprenoid Biosynthesis enzymes, Class 1; Superfamily of trans-isoprenyl diphosphate synthases (IPPS) and class I terpene cyclases which either synthesize geranyl/farnesyl diphosphates (GPP/FPP) or longer chained products from isoprene precursors, isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP), or use geranyl (C10)-, farnesyl (C15)-, or geranylgeranyl (C20)-diphosphate as substrate. These enzymes produce a myriad of precursors for such end products as steroids, cholesterol, sesquiterpenes, heme, carotenoids, retinoids, and diterpenes; and are widely distributed among archaea, bacteria, and eukaryota. The enzymes in this superfamily share the same 'isoprenoid synthase fold' and include several subgroups. The head-to-tail (HT) IPPS catalyze the successive 1'-4 condensation of the 5-carbon IPP to the growing isoprene chain to form linear, all-trans, C10-, C15-, C20-, C25-, C30-, C35-, C40-, C45-, or C50-isoprenoid diphosphates. Cyclic monoterpenes, diterpenes, and sesquiterpenes are formed from their respective linear isoprenoid diphosphates by class I terpene cyclases. 
The head-to-head (HH) IPPS catalyze the successive 1'-1 condensation of 2 farnesyl or 2 geranylgeranyl isoprenoid diphosphates. Cyclization of these 30- and 40-carbon linear forms is catalyzed by class II cyclases. Both the isoprenoid chain elongation reactions and the class I terpene cyclization reactions proceed via electrophilic alkylations in which a new carbon-carbon single bond is generated through interaction between a highly reactive electron-deficient allylic carbocation and an electron-rich carbon-carbon double bond. The catalytic site consists of a large central cavity formed by mostly antiparallel alpha helices with two aspartate-rich regions located on opposite walls. These residues mediate binding of prenyl phosphates via bridging Mg2+ ions, inducing proposed conformational changes that close the active site to solvent, stabilizing reactive carbocation intermediates. Generally, the enzymes in this family exhibit an all-trans reaction pathway; an exception is the cis-trans terpene cyclase, trichodiene synthase. Mechanistically and structurally distinct, class II terpene cyclases and cis-IPPS are not included in this CD. 29477 cd00683: Trans-Isoprenyl Diphosphate Synthases (Trans_IPPS), head-to-head (HH) (1'-1) condensation reaction. This CD includes squalene and phytoene synthases which catalyze the 1'-1 condensation of two 15-carbon (farnesyl) and 20-carbon (geranylgeranyl) isoprenyl diphosphates, respectively. The catalytic site consists of a large central cavity formed by mostly antiparallel alpha helices with two aspartate-rich regions (DXXXD) located on opposite walls. These residues mediate binding of prenyl phosphates. 
A two-step reaction has been proposed for squalene synthase (farnesyl-diphosphate farnesyltransferase) in which two molecules of FPP react to form a stable cyclopropylcarbinyl diphosphate intermediate, and then the intermediate undergoes heterolysis, isomerization, and reduction with NADPH to form squalene, a precursor of cholesterol. The carotenoid biosynthesis enzyme, phytoene synthase (CrtB), catalyzes the condensation reaction of two molecules of geranylgeranyl diphosphate to produce phytoene, a precursor of beta-carotene. These enzymes produce the triterpene and tetraterpene precursors for many diverse sterol and carotenoid end products and are widely distributed among eukarya, bacteria, and archaea. 29478 cd00684: Plant Terpene Cyclases, Class 1 (C1). This CD includes a diverse group of monomeric plant terpene cyclases (Tpsa-Tpsf) that convert the acyclic isoprenoid diphosphates, geranyl diphosphate (GPP), farnesyl diphosphate (FPP), or geranylgeranyl diphosphate (GGPP) into cyclic monoterpenes, sesquiterpenes, or diterpenes, respectively; a few form acyclic species. Terpenoid cyclases are soluble enzymes localized to the cytosol (sesquiterpene synthases) or plastids (mono- and diterpene synthases). All monoterpene and diterpene synthases have strict substrate specificity; however, some sesquiterpene synthases can accept both FPP and GPP. The catalytic site consists of a large central cavity formed by mostly antiparallel alpha helices with two aspartate-rich regions located on opposite walls. These residues mediate binding of prenyl diphosphates, via bridging Mg2+ ions (K+ preferred by gymnosperm cyclases), inducing conformational changes such that an N-terminal region forms a cap over the catalytic core. Loss of diphosphate from the enzyme-bound substrate (GPP, FPP, or GGPP) results in an allylic carbocation that electrophilically attacks a double bond further down the terpene chain to effect the first ring closure. 
Unlike monoterpene, sesquiterpene, and macrocyclic diterpene synthases, which undergo substrate ionization by diphosphate ester scission, Tpsc-like diterpene synthases catalyze cyclization reactions by an initial protonation step producing a copalyl diphosphate intermediate. These enzymes lack the aspartate-rich sequences mentioned above. Most diterpene synthases have an N-terminal, internal element (approx 210 aa) whose function is unknown. 29480 cd00686: Cis, Trans, Terpene Cyclases, Class 1. This CD includes the terpenoid cyclase, trichodiene synthase, which catalyzes the cyclization of farnesyl diphosphate (FPP) to trichodiene using a cis-trans pathway, and is the first committed step in the biosynthesis of trichothecene toxins and antibiotics. As with other enzymes with the 'terpenoid synthase fold', this enzyme has two conserved metal binding motifs that coordinate Mg2+ ion-bridged binding of the diphosphate moiety of FPP. Metal-triggered substrate ionization initiates catalysis, and the alpha-barrel active site serves as a template to channel and stabilize the conformations of reactive carbocation intermediates through a complex cyclization cascade. These enzymes function as homodimers and are found in several genera of fungi. 29481 cd00687: NonPlant Terpene Cyclases, Class 1 (C1). This CD includes terpenoid cyclases such as pentalenene synthase and aristolochene synthase which, using an all-trans pathway, catalyze the ionization of farnesyl diphosphate, followed by the formation of a macrocyclic intermediate by bond formation between C1 and either C10 (aristolochene synthase) or C11 (pentalenene synthase), resulting in production of the tricyclic hydrocarbon pentalenene or the bicyclic hydrocarbon aristolochene. As with other enzymes with the 'terpenoid synthase fold', they have two conserved metal binding motifs, proposed to coordinate Mg2+ ion-bridged binding of the diphosphate moiety of FPP to the enzymes. 
Metal-triggered substrate ionization initiates catalysis, and the alpha-barrel active site serves as a template to channel and stabilize the conformations of reactive carbocation intermediates through a complex cyclization cascade. These enzymes function in the monomeric form and are found in fungi, bacteria and Dictyostelium. 29483 cd00868: Terpene cyclases, Class 1 (C1) of the class 1 family of isoprenoid biosynthesis enzymes, which share the 'isoprenoid synthase fold' and convert linear, all-trans, isoprenoids, geranyl (C10)-, farnesyl (C15)-, or geranylgeranyl (C20)-diphosphate into numerous cyclic forms of monoterpenes, diterpenes, and sesquiterpenes. Also included in this CD are the cis-trans terpene cyclases such as trichodiene synthase. The class I terpene cyclization reactions proceed via electrophilic alkylations in which a new carbon-carbon single bond is generated through interaction between a highly reactive electron-deficient allylic carbocation and an electron-rich carbon-carbon double bond. The catalytic site consists of a large central cavity formed by mostly antiparallel alpha helices with two aspartate-rich regions located on opposite walls. These residues mediate binding of prenyl phosphates via bridging Mg2+ ions, inducing proposed conformational changes that close the active site to solvent, stabilizing reactive carbocation intermediates. Mechanistically and structurally distinct, class II terpene cyclases and cis-IPPS are not included in this CD. Taxonomic distribution includes bacteria, fungi and plants. 29484 cd00386: Heme-copper oxidase subunit III. Heme-copper oxidases are transmembrane protein complexes in the respiratory chains of prokaryotes and mitochondria which couple the reduction of molecular oxygen to water to proton pumping across the membrane. The heme-copper oxidase superfamily is diverse in terms of electron donors, subunit composition, and heme types. This superfamily includes cytochrome c and ubiquinol oxidases. 
Bacterial oxidases typically contain 3 or 4 subunits in contrast to the 13 subunit bovine cytochrome c oxidase (CcO). Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Subunits I, II and III of ubiquinol oxidase are homologous to the corresponding subunits in CcO. This group additionally contains proteins which are fusions between subunits I and III, such as Sulfolobus acidocaldarius SoxM, a subunit of the SoxM terminal oxidase complex. It also includes NorE which has been speculated to be a subunit of nitric oxide reductase. Some archaebacterial cytochrome oxidases lack subunit III. Although not required for catalytic activity, subunit III is believed to play a role in assembly of the multimer complex. Rhodobacter CcO subunit III stabilizes the integrity of the binuclear center in subunit I. It has been proposed that archaea acquired heme-copper oxidases through gene transfer from gram-positive bacteria. 29485 cd01665: Cytochrome c oxidase subunit III. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. CcO catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Only subunits I and II are essential for function, but subunit III, which is also conserved, is believed to play a role in assembly of the multimer complex. Rhodobacter CcO subunit III stabilizes the integrity of the binuclear center in subunit I. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. 
Subunit III contains bound phospholipids in several crystal structures and is proposed to contain a "lipid pool." These phospholipids are believed to be intrinsic constituents similar to cofactors of the enzyme. 29486 cd02862: NorE_like subfamily of heme-copper oxidase subunit III. Heme-copper oxidases include cytochrome c and ubiquinol oxidases. Alcaligenes faecalis norE is found in a gene cluster containing norCB. norCB encodes the cytochrome c and cytochrome b subunits of nitric oxide reductase (NOR). Based on this and on its similarity to subunit III of cytochrome c oxidase (CcO) and ubiquinol oxidase, NorE has been speculated to be a subunit of NOR. 29487 cd02863: Ubiquinol oxidase subunit III subfamily. Ubiquinol oxidase, the terminal oxidase in the respiratory chains of aerobic bacteria, is a multi-chain transmembrane protein located in the cell membrane. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. Ubiquinol oxidases feature four subunits in contrast to the 13-subunit bovine cytochrome c oxidase (CcO). Subunits I, II, and III of bovine CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Subunits I, II and III of ubiquinol oxidase are homologous to the corresponding subunits in bovine CcO. Although not required for catalytic activity, subunit III appears to be involved in assembly of the multimer complex. 29488 cd02864: Heme-copper oxidase subunit III subfamily. Heme-copper oxidases are transmembrane protein complexes in the respiratory chains of prokaryotes and mitochondria which couple the reduction of molecular oxygen to water to proton pumping across the membrane. The heme-copper oxidase superfamily is diverse in terms of electron donors, subunit composition, and heme types. This superfamily includes cytochrome c and ubiquinol oxidases. 
Bacterial oxidases typically contain 3 or 4 subunits in contrast to the 13-subunit bovine cytochrome c oxidase (CcO). Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Subunits I, II and III of ubiquinol oxidase are homologous to the corresponding subunits in CcO. Although not required for catalytic activity, subunit III is believed to play a role in assembly of the multimer complex. Rhodobacter CcO subunit III stabilizes the integrity of the binuclear center in subunit I. It has been proposed that Archaea acquired heme-copper oxidases through gene transfer from Gram-positive bacteria. 29489 cd02865: Heme-copper oxidase subunit III subfamily. Heme-copper oxidases are transmembrane protein complexes in the respiratory chains of prokaryotes and mitochondria which couple the reduction of molecular oxygen to water to proton pumping across the membrane. The heme-copper oxidase superfamily is diverse in terms of electron donors, subunit composition, and heme types. This superfamily includes cytochrome c and ubiquinol oxidases. Bacterial oxidases typically contain 3 or 4 subunits in contrast to the 13-subunit bovine cytochrome c oxidase (CcO). Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Subunits I, II and III of ubiquinol oxidase are homologous to the corresponding subunits in CcO. Although not required for catalytic activity, subunit III is believed to play a role in assembly of the multimer complex. Rhodobacter CcO subunit III stabilizes the integrity of the binuclear center in subunit I. It has been proposed that Archaea acquired heme-copper oxidases through gene transfer from Gram-positive bacteria. 29490 cd00389: microbial_RNases. 
Ribonucleases (RNases) cleave phosphodiester bonds in RNA and are essential for both non-specific RNA degradation and for numerous forms of RNA processing. The alignment contains fungal RNases (U2, T1, F1, Th, Pb, N1, and Ms) and bacterial RNases (barnase, binase, RNase Sa), the majority of which are guanyl specific, and fungal ribotoxins. 29491 cd00606: fungal type ribonuclease. Ribonucleases (RNases) cleave phosphodiester bonds in RNA and are essential for both non-specific RNA degradation and for numerous forms of RNA processing. The members of this CD belong to the superfamily of microbial ribonucleases which are predominantly guanyl specific nucleases. Guanyl specific RNases are endonucleases which split RNA phosphodiester bonds at the 3' oxygen end of guanosine residues to yield oligonucleotides with the guanosine-2',3'-cyclophosphate at the 3' end and the hydroxyl group at the 5' end. The terminal guanosine-2',3'-cyclophosphate is hydrolysed by guanyl RNases to give guanosine-3'-phosphate. The alignment also contains ribotoxins, a fungal group of cytotoxins, specifically cleaving the sarcin/ricin loop (SRL) structure of the 23-28S rRNA and therefore being very potent inhibitors of protein synthesis. 29492 cd00607: RNase_Sa. Ribonucleases first isolated from Streptomyces aureofaciens. In general, ribonucleases cleave phosphodiester bonds in RNA and are essential for both non-specific RNA degradation and for numerous forms of RNA processing. RNase Sa is a guanylate specific endoribonuclease which belongs to the superfamily of microbial ribonucleases. Typical of this sub-family, the enzyme hydrolyses the phosphodiester bonds of RNA at the 3' oxygen end of guanosine residues to yield oligonucleotides with the guanosine-2',3'-cyclophosphate at the 3' end and the hydroxyl group at the 5' end. The terminal guanosine-2',3'-cyclophosphate is hydrolysed by guanyl RNases to give guanosine-3'-phosphate. 
29493 cd00933: Barnase, a member of the family of homologous microbial ribonucleases, catalyses the cleavage of single-stranded RNA via a two-step mechanism thought to be similar to that of pancreatic ribonuclease. The mechanism involves a transesterification to give a 2',3'-cyclic phosphate intermediate, followed by hydrolysis to yield a 3' nucleotide. The active site residues His and Glu act as general acid-base groups during catalysis, while the Arg and Lys residues are important in binding the reactive phosphate, the latter probably binding the phosphate in the transition state. Barstar, a small 89-residue intracellular protein, is a natural inhibitor of Barnase. 29494 cd00217: Flp recombinase, C-terminal catalytic domain. Yeast Flp-like recombinases mediate the amplification of the 2 micron circular plasmid copy number by catalyzing the intra-molecular recombination between two inverted repeats during replication. They belong to the DNA breaking-rejoining enzyme superfamily, which also includes prokaryotic tyrosine recombinases and type IB topoisomerases. These enzymes share the same fold in their catalytic domain containing six conserved active site residues and the overall reaction mechanism. Flp-like recombinases are almost exclusively found in yeast and are highly diverged in sequence from the prokaryotic tyrosine recombinases. They cleave their target DNA in trans with a composite active site in which the catalytic tyrosine is provided by a protomer bound to a site other than the one being cleaved. Thus each active site within Flp complexes is assembled by domain swapping and contains catalytic residues from two different monomers. Two DNA segments are synapsed by the tetrameric enzyme, carrying the nucleophilic tyrosine in each active site with only two of the four monomers active at a given time. The catalytic domain is linked through a flexible loop to the N-terminal domain, which is largely responsible for non-specific DNA binding and isomerization. 
Its overall fold is similar to the SAM domain fold also found in the N-terminal domains of lambda integrase and XerD recombinase. 29495 cd00397: DNA breaking-rejoining enzymes, C-terminal catalytic domain. The DNA breaking-rejoining enzyme superfamily includes type IB topoisomerases and tyrosine recombinases that share the same fold in their catalytic domain containing six conserved active site residues. The best-studied members of this diverse superfamily include human topoisomerase I, the bacteriophage lambda integrase, the bacteriophage P1 Cre recombinase, the yeast Flp recombinase and the bacterial XerD/C recombinases. Their overall reaction mechanism is essentially identical and involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. The enzymes differ in that topoisomerases cleave and then rejoin the same 5' and 3' termini, whereas a site-specific recombinase transfers a 5' hydroxyl generated by recombinase cleavage to a new 3' phosphate partner located in a different duplex region. Many DNA breaking-rejoining enzymes also have N-terminal domains, which show little sequence or structure similarity. 29496 cd00659: DNA topoisomerase IB, C-terminal catalytic domain. Topoisomerase I promotes the relaxation of both positive and negative DNA superhelical tension by introducing a transient single-stranded break in duplex DNA. This function is vital for the processes of replication, transcription, and recombination. Unlike Topo IA enzymes, Topo IB enzymes do not require a single-stranded region of DNA or metal ions for their function. The type IB family of DNA topoisomerases includes eukaryotic nuclear topoisomerase I, topoisomerases of poxviruses and bacterial versions of Topo IB. 
They belong to the superfamily of DNA breaking-rejoining enzymes, which share the same fold in their C-terminal catalytic domain and the overall reaction mechanism with tyrosine recombinases. The C-terminal catalytic domain in topoisomerases is linked to a divergent N-terminal domain that shows no sequence or structure similarity to the N-terminal domains of tyrosine recombinases. 29497 cd00796: Rci recombinase, C-terminal catalytic domain. Rci enzymes are found in IncI1 incompatibility group plasmids such as R64. These recombinases belong to the superfamily of DNA breaking-rejoining enzymes, which share the same fold in their catalytic domain and the overall reaction mechanism. The R64 Rci recombinase mediates site specific recombination at the highly mobile DNA segments called shufflon located in the C-terminal region of the pilV gene, which determines the recipient specificity in liquid mating. This gene encodes a thin pilus component that recognizes recipient's receptors required for liquid mating. The recombination occurs between any of the seven inverted repeats that separate four DNA segments of the shufflon. The segments can be inverted independently or in groups, resulting in a complex DNA rearrangement. The catalytic domain of Rci is linked to a variable N-terminal domain, whose function is unknown. 29498 cd00797: Phage HP1 integrase, C-terminal catalytic domain. Bacteriophage HP1 and related integrases are found in eubacteria, plasmids and temperate bacteriophages of the P2 family. They belong to the DNA breaking-rejoining enzyme superfamily, which includes tyrosine recombinases and type IB topoisomerases. These enzymes share the same fold in their C-terminal catalytic domain containing six conserved active site residues and the overall reaction mechanism. The HP1 recombinase controls phage replication by site-specific recombination between the HP1 genome and the chromosomal DNA. 
It is a heterobifunctional DNA-binding protein, which recognizes two different DNA sequence motifs (type I and type II binding sites). The C-terminal catalytic domain of the HP1 integrase binds to the type I site, while the less conserved N-terminal domain is largely responsible for binding to the type II site. 29499 cd00798: XerD and XerC integrases, DNA breaking-rejoining enzymes, N- and C-terminal domains. XerD-like integrases are involved in the site-specific integration and excision of lysogenic bacteriophage genomes, transposition of conjugative transposons, termination of chromosomal replication, and stable plasmid inheritance. They share the same fold in their catalytic domain containing six conserved active site residues and the overall reaction mechanism with the DNA breaking-rejoining enzyme superfamily. In Escherichia coli, the Xer site-specific recombination system acts to convert dimeric chromosomes, which are formed by homologous recombination, to monomers. Two related recombinases, XerC and XerD, bind cooperatively to a recombination site present in the E. coli chromosome. Each recombinase catalyzes the exchange of one pair of DNA strands in a reaction that proceeds through a Holliday junction intermediate. These enzymes can bridge two different and well-separated DNA sequences called arm- and core-sites. The C-terminal domain binds, cleaves and re-ligates DNA strands at the core-sites, while the N-terminal domain is largely responsible for high-affinity binding to the arm-type sites. 29500 cd00799: Cre recombinase, C-terminal catalytic domain. Cre-like recombinases belong to the superfamily of DNA breaking-rejoining enzymes, which share the same fold in their catalytic domain and the overall reaction mechanism. The bacteriophage P1 Cre recombinase maintains the circular phage replicon in a monomeric state by catalyzing a site-specific recombination between two loxP sites. 
The catalytic core domain of Cre recombinase is linked to a more divergent helical N-terminal domain, which interacts primarily with the DNA major groove proximal to the crossover region. 29501 cd00800: Lambda integrase, C-terminal catalytic domain. Lambda-type integrases catalyze site-specific integration and excision of temperate bacteriophages and other mobile genetic elements to and from the bacterial host chromosome. They belong to the superfamily of DNA breaking-rejoining enzymes, which share the same fold in their catalytic domain and the overall reaction mechanism. The phage lambda integrase can bridge two different and well-separated DNA sequences called arm- and core-sites. The C-terminal domain binds, cleaves and re-ligates DNA strands at the core-sites, while the N-terminal domain is largely responsible for high-affinity binding to the arm-type sites. 29502 cd00801: Bacteriophage P4 integrase. P4-like integrases are found in temperate bacteriophages, integrative plasmids, pathogenicity and symbiosis islands, and other mobile genetic elements. They share the same fold in their catalytic domain and the overall reaction mechanism with the superfamily of DNA breaking-rejoining enzymes. The P4 integrase mediates integrative and excisive site-specific recombination between two sites, called attachment sites, located on the phage genome and the bacterial chromosome. The phage attachment site is often found adjacent to the integrase gene, while the host attachment sites are typically situated near tRNA genes. 29503 cd01182: DNA breaking-rejoining enzymes, integrase/recombinases, C-terminal catalytic domain. The tyrosine recombinase/integrase family share the same catalytic domain containing six conserved active site residues. The best-studied members of this diverse family include the bacteriophage lambda integrase, the bacteriophage P1 Cre recombinase, the yeast Flp recombinase and the bacterial XerD/C recombinases. 
Their overall reaction mechanism is essentially identical and involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. Many integrase/recombinases also have N-terminal domains, which show little sequence or structure similarity. 29504 cd01183: INT_SG1, DNA breaking-rejoining enzymes, integrase/recombinases subgroup 1, C-terminal catalytic domain. The CD contains mainly predicted integrase/recombinases and site-specific XerD recombinases. The members of this CD are found predominantly in proteobacteria. These proteins have not yet been biochemically characterised. 29505 cd01184: INT_SG2, DNA breaking-rejoining enzymes, integrase/recombinases subgroup 2, C-terminal catalytic domain. The CD contains mainly predicted integrase/recombinases and phage-related integrases. Some have N-terminal domains, which show little sequence similarity to each other. Members of this subgroup are predominantly found in proteobacteria. 29506 cd01185: Tn4399 and related integrases, DNA breaking-rejoining enzymes, integrase/recombinases, N- and C-terminal domains. This CD includes various bacterial integrases, including cLV25, a Bacteroides fragilis chromosomal transfer factor integrase similar to the Bacteroides mobilizable transposon, Tn4399, integrase. 29507 cd01186: INT_SG3, DNA breaking-rejoining enzymes, integrase/recombinases subgroup 3, catalytic domain. The CD contains various predicted bacterial and phage integrase/recombinase sequences for which not much experimental characterization is available. 29508 cd01187: INT_SG4, DNA breaking-rejoining enzymes, integrase/recombinases subgroup 4, N- and C-terminal domains. The CD contains mainly predicted bacterial integrase/recombinases for which not much biochemical characterization is available. 
29509 cd01188: pAE1 and related integrases, DNA breaking-rejoining enzymes, integrase/recombinases, C-terminal domain. This CD includes various bacterial integrases, including the predicted integrase of the deletion-prone region of plasmid pAE1 of Alcaligenes eutrophus H1. 29510 cd01189: phiLC3 phage and phage-related integrases, site-specific recombinases, DNA breaking-rejoining enzymes, C-terminal catalytic domain. This CD includes various bacterial (mainly gram positive) and phage integrases, including those similar to Lactococcus phage phiLC3, TPW22, Tuc2009, BK5-T, A2, bIL285, bIL286, bIL311, ul36 and phi g1e; Staphylococcus aureus phage phi13 and phi42; Oenococcus oeni phage fOg44; Streptococcus thermophilus phage O1205 and Sfi21; and Streptococcus pyogenes phage T12 and T270. 29511 cd01190: INT_SG5, DNA breaking-rejoining enzymes, integrase/recombinases subgroup 5, N- and C-terminal domains. The CD contains mainly predicted bacterial integrase/recombinases. 29512 cd01191: phiCTX phage and phage-related integrases, site-specific recombinases, DNA breaking-rejoining enzymes, C-terminal catalytic domain. This CD includes various phage and bacterial integrases, including those similar to phage integrases: Bordetella and Pseudomonas phiCTX; E. coli Rac, Qin, and Shiga toxin 2 933W; and Salmonella typhimurium LT2 Gifsy-2 and Fels-1; and a putative pore-forming cytotoxin integrase from Vibrio parahaemolyticus O3:K6. 29513 cd01192: P22-like integrases, site-specific recombinases, DNA breaking-rejoining enzymes, C-terminal catalytic domain. This CD includes various bacterial and phage integrases, including those similar to phage P22-like integrases, DLP12 and APSE-1. 29514 cd01193: IntI (E2) integrases, site-specific tyrosine recombinases, DNA breaking-rejoining enzymes, N- and C-terminal domains. 
This CD includes integrases which are components of multiresistant integrons and mediate recombination between a proximal attI site and a secondary target called the attC (or 59-base element) present on various mobile gene cassettes. Integron-integrases are present in many naturally occurring mobile elements, including transposons and conjugative plasmids. Vibrio, Shewanella, Xanthomonas and Pseudomonas species harbor chromosomal super-integrons. All integron-integrases carry large inserts unlike the TnpF ermF-like proteins also seen in this group. 29515 cd01194: Tn554A and related transposases, DNA breaking-rejoining enzymes, integrase/recombinases, C-terminal catalytic domain. This CD includes various bacterial transposases similar to TnpA from transposon Tn554. 29516 cd01195: Tn554B and related transposases, DNA breaking-rejoining enzymes, integrase/recombinases, catalytic domain. This CD includes various bacterial transposases similar to TnpB from transposon Tn554. 29517 cd01196: VanD integrase, IntD, and related integrases, DNA breaking-rejoining enzymes, integrase/recombinases, N- and C-terminal domains. This CD includes various bacterial integrases including those similar to IntD, a putative integrase-like protein, a component of the vanD glycopeptide resistance cluster in Enterococcus faecium BM4339. Members of this CD are predominantly bacterial in origin. 29518 cd01197: FimB and FimE and related proteins, DNA breaking-rejoining enzymes, integrase/recombinases, catalytic domain. This CD includes those proteins similar to E. coli FimE and FimB regulatory proteins and Proteus mirabilis MrpI. 29519 cd01198: Archaeal site-specific recombinase A (ASSRA), DNA breaking-rejoining enzymes, integrase/recombinases, C-terminal catalytic domain. Members of this CD are archaeal in origin. No biochemical characterization is available for the proteins of this subgroup at this point. 
29520 cd01199: Tn1545-related conjugative transposon integrases, site-specific recombinases, DNA breaking-rejoining enzymes, C-terminal catalytic domain. This CD includes bacterial (gram positive) and phage integrases, including those similar to Tn1545, Tn5252, and Tn5276 conjugative transposon integrases and Lactobacillus phage phi adh integrase. 29521 cd00398: Class II Aldolase and Adducin head (N-terminal) domain. Aldolases are ubiquitous enzymes catalyzing central steps of carbohydrate metabolism. Based on enzymatic mechanisms, this superfamily has been divided into two distinct classes (Class I and II). Class II enzymes are further divided into two sub-classes A and B. This family includes class II A aldolases and adducins, which have not been ascribed any enzymatic function. Members of this class are primarily bacterial and eukaryotic in origin and include L-fuculose-1-phosphate, L-rhamnulose-1-phosphate aldolases and L-ribulose-5-phosphate 4-epimerases. They all share the ability to promote carbon-carbon bond cleavage and stabilize enolate intermediates using divalent cations. 29522 cd00401: S-adenosyl-L-homocysteine hydrolase (AdoHycase) catalyzes the hydrolysis of S-adenosyl-L-homocysteine (AdoHyc) to form adenosine (Ado) and homocysteine (Hcy). The equilibrium lies far on the side of AdoHyc synthesis, but in nature the removal of Ado and Hcy is sufficiently fast, so that the net reaction is in the direction of hydrolysis. Since AdoHyc is a potent inhibitor of S-adenosyl-L-methionine dependent methyltransferases, AdoHycase plays a critical role in the modulation of the activity of various methyltransferases. The enzyme forms homooligomers of 45-50kDa subunits, each binding one molecule of NAD+. 29524 cd01356: Putative Aconitase X swivel domain. It is predicted by comparative genomic analysis. The proteins are mainly found in archaea and proteobacteria. 
They are distantly related to the Aconitase family of proteins by sequence similarity and secondary structure prediction. The functions have not yet been experimentally characterized. Thus, the prediction should be treated with caution. 29525 cd01576: Aconitase B swivel domain. Aconitate hydratase B is involved in energy metabolism as part of the TCA cycle. It catalyses the formation of cis-aconitate from citrate. This is the aconitase swivel domain, which undergoes a swivelling conformational change in the enzyme mechanism. The domain structure of Aconitase B is different from other Aconitases in that the swivel domain that is found at the N-terminus of the B family is normally found at the C-terminus for other Aconitases. In most members of the family, there is also a HEAT domain before domain 4, which is believed to play a role in protein-protein interaction. 29527 cd01578: Mitochondrial aconitase A swivel domain. Aconitase (also known as aconitate hydratase and citrate hydro-lyase) catalyzes the reversible isomerization of citrate and isocitrate as part of the TCA cycle. This is the aconitase swivel domain, which undergoes a swivelling conformational change in the enzyme mechanism. In eukaryotes two isozymes of aconitase are known to exist: one found in the mitochondrial matrix and the other found in the cytoplasm. This is the mitochondrial form. The mitochondrial product is coded by a nuclear gene. Most members of this subfamily are mitochondrial but there are some bacterial members. 29528 cd01579: Bacterial Aconitase-like swivel domain. Aconitase (aconitate hydratase or citrate hydro-lyase) catalyzes the reversible isomerization of citrate and isocitrate as part of the TCA cycle. Cis-aconitate is formed as an intermediate product during the course of the reaction. This is the aconitase-like swivel domain, which is believed to undergo a swivelling conformational change in the enzyme mechanism. This distinct subfamily is found only in bacteria and archaea. 
Its exact characteristics are not known. 29529 cd01580: Aconitase A swivel domain. This is the major form of the TCA cycle enzyme aconitate hydratase, also known as aconitase and citrate hydro-lyase. It includes bacterial and archaeal aconitase A, and the eukaryotic cytosolic form of aconitase. This group also includes sequences that have been shown to act as an iron-responsive element (IRE) binding protein in animals and may have the same role in other eukaryotes. This is the aconitase-like swivel domain, which is believed to undergo swivelling conformational change in the enzyme mechanism. 29530 cd01674: Homoaconitase swivel domain. This family includes homoaconitase and other uncharacterized proteins of the Aconitase family. Homoaconitase is part of an unusual lysine biosynthesis pathway found only in filamentous fungi, in which lysine is synthesized via the alpha-aminoadipate pathway. In this pathway, homoaconitase catalyzes the conversion of cis-homoaconitic acid into homoisocitric acid. The reaction mechanism is believed to be similar to that of other aconitases. This is the swivel domain, which is believed to undergo swivelling conformational change in the enzyme mechanism. 29531 cd00410: Adenylosuccinate synthetase. The enzyme (also known as IMP:L-aspartate ligase (GDP forming)) catalyzes the first committed step in the biosynthesis of AMP. It forms adenylosuccinate from IMP and L-aspartate, converting GTP to GDP and Pi in the process. Adenylosuccinate synthetase along with adenylosuccinate lyase and AMP deaminase form the functional unit of purine nucleotide cycle, which interconverts IMP and AMP via the formation of adenylosuccinate. The enzyme is present in diverse organisms and tissues. In leishmanial and trypanosomal parasites, which lack a de novo pathway for the synthesis of purine nucleotides, adenylosuccinate synthetase still plays a prominent role in nucleotide salvage pathways. 
29532 cd00411: Asparaginase (amidohydrolase): Asparaginases are tetrameric enzymes that catalyze the hydrolysis of asparagine to aspartic acid and ammonia. In bacteria, there are two classes of amidohydrolases, one highly specific for asparagine and localised to the periplasm, and a second (asparaginase-glutaminase) present in the cytosol that hydrolyzes both asparagine and glutamine with similar specificities. 29533 cd00412: Inorganic pyrophosphatase. These enzymes hydrolyze inorganic pyrophosphate (PPi) to two molecules of orthophosphate (Pi). The reaction requires bivalent cations. The enzymes in general exist as homooligomers. 29534 cd00413: The O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A glycosyl hydrolase classification system based on sequence similarity has led to the definition of more than 95 different families including glycoside hydrolase family 16. Family 16 includes lichenase, xyloglucan endotransglycosylase (XET), beta-agarase, kappa-carrageenase, endo-beta-1,3-glucanase, endo-beta-1,3-1,4-glucanase, and endo-beta-galactosidase, all of which have a conserved jelly roll fold with a deep active site channel harboring the catalytic residues. 29535 cd02175: Lichenase, also known as 1,3-1,4-beta-glucanase, is a member of glycosyl hydrolase family 16 that specifically cleaves 1,4-beta-D-glucosidic bonds in mixed-linked beta glucans that also contain 1,3-beta-D-glucosidic linkages. Natural substrates of beta-glucanase are beta-glucans from grain endosperm cell walls or lichenan from the Icelandic moss, Cetraria islandica. This protein is found not only in bacteria but also in anaerobic fungi. This domain includes two seven-stranded antiparallel beta-sheets that are adjacent to one another forming a compact, jellyroll beta-sandwich structure. 
29536 cd02176: Xyloglucan endotransglycosylases (XETs) cleave and religate xyloglucan polymers in plant cell walls via a transglycosylation mechanism. Thus, XET is a key enzyme in all plant processes that require cell wall remodeling. Even though the overall structure of XET is a curved beta-sandwich similar to other enzymes in the glycosyl hydrolase family 16, parts of its substrate binding cleft are more reminiscent of the distantly related glycosyl hydrolase family 7. 29537 cd02177: Kappa-carrageenase degrades kappa-carrageenans, which are the gel-forming, sulfated 1,3-alpha-1,4-beta-galactans that make up the cell walls of marine red algae such as Rhodophyceae. Kappa-carrageenases exist in bacteria belonging to at least three phylogenetically distant branches, including Pseudoalteromonas, Planctomycetes, and Bacteroidetes. This domain adopts a curved beta-sandwich conformation, with a tunnel-shaped active site cavity, referred to as a jellyroll fold. 29538 cd02178: Beta-agarase is a glycosyl hydrolase family 16 (GH16) member that hydrolyzes the internal beta-1,4-linkage of agarose, producing agaro-oligosaccharides. While beta-agarases are also found in two other families derived from the sequence-based classification of glycosyl hydrolases (GH50 and GH86), the GH16 members are most abundant. This domain adopts a curved beta-sandwich conformation, with a tunnel-shaped active site cavity, referred to as a jellyroll fold. 29539 cd02179: Beta-GRP (beta-1,3-glucan recognition protein) is one of several pattern recognition receptors (PRRs), also referred to as biosensor proteins, that complex with pathogen-associated beta-1,3-glucans and then transduce signals necessary for activation of an appropriate immune response. Their structures adopt a jelly roll fold with a deep active site channel harboring the catalytic residues, like those of other glycosyl hydrolase family 16 members. 
29540 cd02180: Laminarinase, also known as glucan endo-1,3-beta-D-glucosidase, is a glycosyl hydrolase family 16 member that hydrolyzes 1,3-beta-D-glucosidic linkages in 1,3-beta-D-glucans such as laminarins, curdlans, paramylons, and pachymans, with very limited action on mixed-link (1,3-1,4-)-beta-D-glucans. 29541 cd02181: The MLG1-like glucanases are a group of fungal beta-1,3-glucanases that are active on 1,3 beta-glucans as well as mixed-linked beta-glucans. MLG1 belongs to a family of glycosyl hydrolases that includes lichenase, xyloglucan endotransglycosylase (XET), beta-agarase, kappa-carrageenase, endo-beta-1,3-glucanase, endo-beta-1,3-1,4-glucanase, and endo-beta-galactosidase, all of which have a conserved jelly roll fold with a deep active site channel harboring the catalytic residues. 29542 cd02182: A beta-1,3-glucanase (laminarinase)-like protein exists in the bacterial genus Streptomyces as well as the fungal class Sordariomycetes. The laminarinases belong to glycosyl hydrolase family 16, all of which have a conserved jelly roll fold with an active site channel. The bacterial members contain an additional C-terminal carbohydrate-binding module (CBM). 29543 cd02183: GPI (glycosylphosphatidylinositol)-glucanosyltransferase is a GPI-anchored membrane protein present in the fungal cell wall that is thought to play an important role in cell wall biosynthesis. GPI-glucanosyltransferase belongs to a family of glycosyl hydrolases that includes lichenase, xyloglucan endotransglycosylase (XET), beta-agarase, kappa-carrageenase, endo-beta-1,3-glucanase, endo-beta-1,3-1,4-glucanase, and endo-beta-galactosidase, all of which have a conserved jelly roll fold with a deep active site channel harboring the catalytic residues. 29544 cd00423: Pterin binding enzymes. 
This family includes dihydropteroate synthase (DHPS) and cobalamin-dependent methyltransferases such as methyltetrahydrofolate:corrinoid iron-sulfur protein methyltransferase (MeTr) and methionine synthase (MetH). DHPS, a functional homodimer, catalyzes the condensation of p-aminobenzoic acid (pABA) in the de novo biosynthesis of folate, which is an essential cofactor in both nucleic acid and protein biosynthesis. Prokaryotes (and some lower eukaryotes) must synthesize folate de novo, while higher eukaryotes are able to utilize dietary folate and therefore lack DHPS. Sulfonamide drugs, which are substrate analogs of pABA, target DHPS. Cobalamin-dependent methyltransferases catalyze the transfer of a methyl group via a methyl-cob(III)amide intermediate. These include MeTr, a functional heterodimer, and the folate binding domain of MetH. 29545 cd00739: DHPS subgroup of Pterin binding enzymes. DHPS (dihydropteroate synthase), a functional homodimer, catalyzes the condensation of p-aminobenzoic acid (pABA) in the de novo biosynthesis of folate, which is an essential cofactor in both nucleic acid and protein biosynthesis. Prokaryotes (and some lower eukaryotes) must synthesize folate de novo, while higher eukaryotes are able to utilize dietary folate and therefore lack DHPS. Sulfonamide drugs, which are substrate analogs of pABA, target DHPS. 29546 cd00740: MeTr subgroup of pterin binding enzymes. This family includes cobalamin-dependent methyltransferases such as methyltetrahydrofolate:corrinoid iron-sulfur protein methyltransferase (MeTr) and methionine synthase (MetH). Cobalamin-dependent methyltransferases catalyze the transfer of a methyl group via a methyl-cob(III)amide intermediate. These include MeTr, a functional heterodimer, and the folate binding domain of MetH. 29547 cd00430: Alanine racemase. This CD corresponds to alanine racemases, the prototype of the alanine racemase superfamily. 
Other proteins in this superfamily, such as eukaryotic ornithine decarboxylases, bacterial diaminopimelate decarboxylases and biosynthetic arginine decarboxylases, are not included in this CD. Alanine racemases have been classified as PyridoxaL 5'-phosphate Dependent Enzymes class III (PLPDE_III) and catalyze the interconversion between L- and D-alanine. Homodimer formation is required for catalytic activity. 29548 cd00431: Cysteine hydrolases. This family contains amidohydrolases, like CSHase (N-carbamoylsarcosine amidohydrolase), involved in creatine metabolism, and nicotinamidase, converting nicotinamide to nicotinic acid and ammonia in the pyridine nucleotide cycle. It also contains isochorismatase, an enzyme that catalyzes the conversion of isochorismate to 2,3-dihydroxybenzoate and pyruvate, via the hydrolysis of the vinyl ether bond, and other related enzymes with unknown function. 29549 cd01011: Nicotinamidase/pyrazinamidase (PZase). Nicotinamidase, a ubiquitous enzyme in prokaryotes, converts nicotinamide to nicotinic acid (niacin) and ammonia, which in turn can be recycled to make nicotinamide adenine dinucleotide (NAD). The same enzyme is also called pyrazinamidase, because it converts the tuberculosis drug pyrazinamide (PZA) into its active form pyrazinoic acid (POA). 29550 cd01012: YcaC related amidohydrolases. E. coli YcaC is a homooctameric hydrolase with unknown specificity. Despite its weak sequence similarity, it is structurally related to other amidohydrolases and shares conserved active site residues with them. The multimerisation interface seems not to be conserved in all members. 29551 cd01013: Isochorismatase, also known as 2,3 dihydro-2,3 dihydroxybenzoate synthase, catalyses the conversion of isochorismate, in the presence of water, to 2,3-dihydroxybenzoate and pyruvate, via the hydrolysis of a vinyl ether, an uncommon reaction in biological systems. Isochorismatase is part of the phenazine biosynthesis pathway. 
Phenazines are antimicrobial compounds that provide a competitive advantage for certain bacteria. 29552 cd01014: Nicotinamidase-related amidohydrolases. Cysteine hydrolases of unknown function that share the catalytic triad with other amidohydrolases, like nicotinamidase, which converts nicotinamide to nicotinic acid and ammonia. 29553 cd01015: N-carbamoylsarcosine amidohydrolase (CSHase) hydrolyzes N-carbamoylsarcosine to sarcosine, carbon dioxide and ammonia. CSHase is involved in one of the two alternative pathways for creatinine degradation to glycine in microorganisms. This CSHase-containing pathway degrades creatinine via N-methylhydantoin, N-carbamoylsarcosine and sarcosine to glycine. Enzymes of this pathway are used in the diagnosis of renal dysfunction, for determining creatinine levels in urine and serum. 29555 cd00435: Acyl CoA binding protein (ACBP) binds thiol esters of long fatty acids and coenzyme A in a one-to-one binding mode with high specificity and affinity. Acyl-CoAs are important intermediates in fatty lipid synthesis and fatty acid degradation and play a role in regulation of intermediary metabolism and gene regulation. The suggested role of ACBP is to act as an intracellular acyl-CoA transporter and pool former. ACBPs are present in a large group of eukaryotic species and several tissue-specific isoforms have been detected. 29556 cd00254: Lytic Transglycosylase (LT) and Goose Egg White Lysozyme (GEWL) domain. Members include the soluble and insoluble membrane-bound LTs in bacteria, the LTs in bacteriophage lambda, as well as the eukaryotic "goose-type" lysozymes (GEWL). LTs catalyze the cleavage of the beta-1,4-glycosidic bond between N-acetylmuramic acid (MurNAc) and N-acetyl-D-glucosamine (GlcNAc), as do "goose-type" lysozymes. However, in addition to this, they also make a new glycosidic bond with the C6 hydroxyl group of the same muramic acid residue. 29557 cd00325: Glycoside hydrolase family 19 chitinase domain. 
Chitinases are enzymes that catalyze the hydrolysis of the beta-1,4-N-acetyl-D-glucosamine linkages in chitin polymers. Family 19 chitinases are found primarily in plants (classes I, III, and IV), but some are found in bacteria. Class I and II chitinases are similar in their catalytic domains. Class I chitinases have an N-terminal cysteine-rich, chitin-binding domain which is separated from the catalytic domain by a proline and glycine-rich hinge region. Class II chitinases lack both the chitin-binding domain and the hinge region. Class IV chitinases are similar to class I chitinases but they are smaller in size due to certain deletions. Despite the lack of any significant sequence homology with lysozymes, structural analysis reveals that family 19 chitinases, together with family 46 chitosanases, are similar to several lysozymes including those from T4-phage and from goose. The structures reveal that the different enzyme groups arose from a common ancestor glycohydrolase antecedent to the procaryotic/eucaryotic divergence. 29558 cd00442: lysozyme_like domain. This contains several members including Soluble Lytic Transglycosylases (SLT), Goose Egg-White Lysozymes (GEWL), Hen Egg-White Lysozymes (HEWL), chitinases, bacteriophage lambda lysozymes, endolysins, autolysins, and chitosanases. All the members are involved in the hydrolysis of beta-1,4-linked polysaccharides. 29559 cd00735: Bacteriophage T4-like lysozymes hydrolyse the beta-1,4-glycosidic bond between N-acetylmuramic acid (MurNAc) and N-acetylglucosamine (GlcNAc) in peptidoglycan heteropolymers of prokaryotic cell walls. Members include a variety of bacteriophages (T4, RB49, RB69, Aeh1) as well as Dictyostelium. 29560 cd00736: The lysozyme from bacteriophage lambda hydrolyses the beta-1,4-glycosidic bond between N-acetylmuramic acid (MurNAc) and N-acetylglucosamine (GlcNAc), as do other lysozymes. 
But unlike other lysozymes, bacteriophage lambda lysozyme does not produce a reducing end upon cleavage of the peptidoglycan but rather uses the 6-OH of the same MurNAc residue to produce a 1,6-anhydromuramic acid terminal residue and is therefore a lytic transglycosylase. An identical 1,6-anhydro bond is formed in bacterial peptidoglycans by the action of the lytic transglycosylases of E. coli. However, they differ structurally. 29561 cd00737: Endolysins and autolysins are found in viruses and bacteria, respectively. The dsDNA phages of eubacteria use endolysins or muralytic enzymes in conjunction with holin, a small membrane protein, to degrade the peptidoglycan found in bacterial cell walls. Similarly, bacteria produce autolysins to facilitate the biosynthesis of their cell wall heteropolymer peptidoglycan and cell division. Both endolysin and autolysin enzymes cleave the glycosidic beta 1,4-bonds between the N-acetylmuramic acid and the N-acetylglucosamine of the peptidoglycan. 29562 cd00978: Glycosyl hydrolase family 46 chitosanase domain. This family is composed of the chitosanase enzymes, which hydrolyze chitosan, a biopolymer of beta-(1,4)-linked D-glucosamine (GlcN) residues produced by partial or full deacetylation of chitin. Chitosanases play a role in defense against pathogens such as fungi and are found in microorganisms, fungi, viruses, and plants. Microbial chitosanases, whose members are the most prevalent, can be divided into 3 subclasses based on the specificity of the cleavage positions for partially acetylated chitosan. Subclass I chitosanases such as N174 can split GlcN-GlcN and GlcNAc-GlcN linkages, whereas subclass II chitosanases such as Bacillus sp. no. 7-M can cleave only GlcN-GlcN linkages. Subclass III chitosanases such as MH-K1 chitosanase are the most versatile and can split both GlcN-GlcN and GlcN-GlcNAc linkages. 29563 cd01021: Goose Egg White Lysozyme domain. Eukaryotic "goose-type" lysozymes (GEWL). 
These enzymes catalyze the cleavage of the beta-1,4-glycosidic bond between N-acetylmuramic acid (MurNAc) and N-acetylglucosamine (GlcNAc). Members include tunicate, Japanese flounder, ostrich, and mouse. 29564 cd00447: RNA binding domain of NusB (N protein-Utilization Substance B) and Sun (also known as RrmB or Fmu) proteins. This family includes two orthologous groups exemplified by the transcription termination factor NusB and the N-terminal domain of the rRNA-specific 5-methylcytidine transferase (m5C-methyltransferase) Sun. The NusB protein plays a key role in the regulation of ribosomal RNA biosynthesis in eubacteria by modulating the efficiency of transcriptional antitermination. NusB along with other Nus factors (NusA, NusE/S10 and NusG) forms the core complex with the boxA element of the nut site of the rRNA operons. These interactions help RNA polymerase to counteract polarity during transcription of rRNA operons and allow stable antitermination. The transcription antitermination system can be appropriated by some bacteriophages such as lambda, which use the system to switch between the lysogenic and lytic modes of phage propagation. The m5C-methyltransferase Sun shares the N-terminal non-catalytic RNA-binding domain with NusB. 29565 cd00619: Transcription termination factor NusB (N protein-Utilization Substance B). NusB plays a key role in the regulation of ribosomal RNA biosynthesis in eubacteria by modulating the efficiency of transcriptional antitermination. NusB along with other Nus factors (NusA, NusE/S10 and NusG) forms the core complex with the boxA element of the nut site of the rRNA operons. These interactions help RNA polymerase to counteract polarity during transcription of rRNA operons and allow stable antitermination. The transcription antitermination system can be appropriated by some bacteriophages such as lambda, which use the system to switch between the lysogenic and lytic modes of phage propagation. 
29566 cd00620: N-terminal RNA binding domain of the methyltransferase Sun. The rRNA-specific 5-methylcytidine transferase Sun, also known as RrmB or Fmu, shares the RNA-binding non-catalytic domain with the transcription termination factor NusB. The precise biological role of this domain in Sun is unknown, although it is likely to be involved in sequence-specific RNA binding. The C-terminal methyltransferase domain of Sun has been shown to catalyze formation of m5C at position 967 of 16S rRNA in Escherichia coli. 29567 cd00449: PyridoxaL 5'-Phosphate Dependent Enzymes class IV (PLPDE_IV). This D-amino acid superfamily, one of five classes of PLPDE, consists of branched-chain amino acid aminotransferases (BCAT), D-amino acid transferases (DAAT), and 4-amino-4-deoxychorismate lyases (ADCL). BCAT catalyzes the reversible transamination reaction between the L-branched-chain amino and alpha-keto acids. DAAT catalyzes the synthesis of D-glutamic acid and D-alanine, and ADCL converts 4-amino-4-deoxychorismate to p-aminobenzoate and pyruvate. Except for a few enzymes, i.e., Escherichia coli and Salmonella BCATs, which are homohexamers arranged as a double trimer, the class IV PLPDEs are homodimers. Homodimer formation is required for catalytic activity. 29568 cd01557: BCAT_beta_family: Branched-chain aminotransferase catalyses the transamination of the branched-chain amino acids leucine, isoleucine and valine to their respective alpha-keto acids, alpha-ketoisocaproate, alpha-keto-beta-methylvalerate and alpha-ketoisovalerate. The enzyme requires pyridoxal 5'-phosphate (PLP) as a cofactor to catalyze the reaction. It has been found that mammals have two forms of the enzyme, mitochondrial and cytosolic, while bacteria contain only one form of the enzyme. The mitochondrial form plays a significant role in skeletal muscle glutamine and alanine synthesis and in interorgan nitrogen metabolism. Members of this subgroup are widely distributed in all three forms of life. 
29569 cd01558: D-Alanine aminotransferase (D-AAT_like): D-amino acid aminotransferase catalyzes transamination between D-amino acids and their respective alpha-keto acids. It plays a major role in the synthesis of bacterial cell wall components like D-alanine and D-glutamate in addition to other D-amino acids. The enzyme, like other members of this superfamily, requires PLP as a cofactor. Members of this subgroup are found in all three forms of life. 29570 cd01559: ADCL_like: 4-Amino-4-deoxychorismate lyase is a member of the fold-type IV of PLP dependent enzymes that converts 4-amino-4-deoxychorismate (ADC) to p-aminobenzoate and pyruvate. Based on the information available from the crystal structure, most members of this subgroup are likely to function as dimers. The enzyme from E. coli, the structure of which is available, is a homodimer that is folded into a small and a larger domain. The coenzyme pyridoxal 5'-phosphate resides at the interface of the two domains, which are linked by a flexible loop. Members of this subgroup are found in eukaryotes and bacteria. 29571 cd00453: Fructose/tagatose-bisphosphate aldolase class II. This family includes fructose-1,6-bisphosphate (FBP) and tagatose 1,6-bisphosphate (TBP) aldolases. FBP-aldolase is homodimeric and used in gluconeogenesis and glycolysis; the enzyme controls the condensation of dihydroxyacetone phosphate with glyceraldehyde-3-phosphate to yield fructose-1,6-bisphosphate. TBP-aldolase is tetrameric and produces tagatose-1,6-bisphosphate. There is an absolute requirement for a divalent metal ion, usually zinc, and in addition the enzymes are activated by monovalent cations such as Na+. Although structurally similar, the class I aldolases use a different mechanism and are believed to have an independent evolutionary origin. 29572 cd00946: Class II Type A, Fructose-1,6-bisphosphate (FBP) aldolases. 
The enzyme catalyses the zinc-dependent, reversible aldol condensation of dihydroxyacetone phosphate with glyceraldehyde-3-phosphate to form fructose-1,6-bisphosphate. FBP aldolase is homodimeric and used in gluconeogenesis and glycolysis. The type A and type B Class II FBPAs differ in the presence or absence of distinct indels in the sequence that result in differing loop lengths in the structures. 29573 cd00947: Tagatose-1,6-bisphosphate (TBP) aldolase and related Type B Class II aldolases. TBP aldolase is a tetrameric class II aldolase that catalyzes the reversible condensation of dihydroxyacetone phosphate with glyceraldehyde 3-phosphate to produce tagatose 1,6-bisphosphate. There is an absolute requirement for a divalent metal ion, usually zinc, and in addition the enzymes are activated by monovalent cations such as Na+. The type A and type B Class II FBPAs differ in the presence or absence of distinct indels in the sequence that result in differing loop lengths in the structures. 29574 cd00455: nuc_hydro: Nucleoside hydrolases. Nucleoside hydrolases cleave the N-glycosidic bond in nucleosides generating ribose and the respective base. These enzymes vary in their substrate specificity. This group contains eukaryotic, bacterial and archaeal proteins similar to the inosine-uridine preferring nucleoside hydrolase from Crithidia fasciculata, the xanthosine-inosine-uridine-adenosine-preferring nucleoside hydrolase RihC from Salmonella enterica serovar Typhimurium, the purine-specific inosine-adenosine-guanosine-preferring nucleoside hydrolase from Trypanosoma vivax and pyrimidine-specific uridine-cytidine preferring nucleoside hydrolases such as URH1 from Saccharomyces cerevisiae, RihA and RihB from Escherichia coli. 
Nucleoside hydrolases are of interest as a target for antiprotozoan drugs because no nucleoside hydrolase activity or genes encoding these enzymes have been detected in humans, and parasitic protozoans lack de novo purine synthesis, relying on nucleoside hydrolase to scavenge purines and/or pyrimidines from the environment. 29575 cd02647: nuc_hydro_TvIAG: Nucleoside hydrolases similar to the inosine-adenosine-guanosine-preferring nucleoside hydrolase from Trypanosoma vivax. Nucleoside hydrolases cleave the N-glycosidic bond in nucleosides generating ribose and the respective base. Nucleoside hydrolases vary in their substrate specificity. This group contains eukaryotic and bacterial proteins similar to the purine-specific inosine-adenosine-guanosine-preferring nucleoside hydrolase (IAG-NH) from T. vivax. T. vivax IAG-NH is of the order of a thousand to ten thousand fold more specific towards the naturally occurring purine nucleosides than towards the pyrimidine nucleosides. 29577 cd02649: nuc_hydro_CeIAG: Nucleoside hydrolases similar to the inosine-adenosine-guanosine-preferring nucleoside hydrolase from Caenorhabditis elegans. Nucleoside hydrolases cleave the N-glycosidic bond in nucleosides generating ribose and the respective base. These enzymes vary in their substrate specificity. This group contains eukaryotic, bacterial and archaeal proteins similar to the purine-preferring nucleoside hydrolase (IAG-NH) from C. elegans and the salivary purine nucleosidase from Aedes aegypti. C. elegans IAG-NH exhibits a high affinity for the substrate analogue p-nitrophenylriboside (p-NPR). 29578 cd02650: NH_hydro_CaPnhB: A subgroup of nucleoside hydrolases similar to Corynebacterium ammoniagenes purine/pyrimidine nucleoside hydrolase (pnhB). Nucleoside hydrolases cleave the N-glycosidic bond in nucleosides generating ribose and the respective base. These enzymes vary in their substrate specificity. 
29579 cd02651: nuc_hydro_IU_UC_XIUA: inosine-uridine preferring, xanthosine-inosine-uridine-adenosine-preferring and uridine-cytidine preferring nucleoside hydrolases. Nucleoside hydrolases cleave the N-glycosidic bond in nucleosides generating ribose and the respective base. These enzymes vary in their substrate specificity. This group contains proteins similar to nucleoside hydrolases which hydrolyze both pyrimidine and purine ribonucleosides: the inosine-uridine preferring nucleoside hydrolase from Crithidia fasciculata, the inosine-uridine-xanthosine preferring nucleoside hydrolase RihC from Escherichia coli and the xanthosine-inosine-uridine-adenosine-preferring nucleoside hydrolase RihC from Salmonella enterica serovar Typhimurium. This group also contains proteins similar to the pyrimidine-specific uridine-cytidine preferring nucleoside hydrolases URH1 from Saccharomyces cerevisiae, E. coli RihA and E. coli RihB. E. coli RihA is equally efficient with uridine and cytidine; E. coli RihB prefers cytidine over uridine. S. cerevisiae URH1 prefers uridine over cytidine. 29580 cd02652: NH_2: A subgroup of nucleoside hydrolases. This group contains eukaryotic and bacterial proteins similar to nucleoside hydrolases. Nucleoside hydrolases cleave the N-glycosidic bond in nucleosides generating ribose and the respective base. These enzymes vary in their substrate specificity. 29581 cd02653: NH_3: A subgroup of nucleoside hydrolases. This group contains eukaryotic and bacterial proteins similar to nucleoside hydrolases. Nucleoside hydrolases cleave the N-glycosidic bond in nucleosides generating ribose and the respective base. These enzymes vary in their substrate specificity. 29582 cd02654: nuc_hydro_CjNH. Nucleoside hydrolases similar to Campylobacter jejuni nucleoside hydrolase. This group contains eukaryotic and bacterial proteins similar to C. jejuni nucleoside hydrolase. 
Nucleoside hydrolases cleave the N-glycosidic bond in nucleosides generating ribose and the respective base. These enzymes vary in their substrate specificity. C. jejuni nucleoside hydrolase is inactive against natural nucleosides or against common nucleoside analogues. 29583 cd00457: bacterial/archaeal PhosphatidylEthanolamine-Binding Protein (PEBP) and its eukaryotic homolog Raf Kinase Inhibitor Protein (RKIP) belong to a highly conserved family of phospholipid-binding proteins represented in all three major phylogenetic divisions (eukaryotes, bacteria, archaea) with no significant sequence homology to other proteins. A number of biological roles for members of the PEBP/RKIP family include serine protease inhibition, membrane biogenesis, and Raf-1 kinase inhibition. In mammals, RKIP is believed to be the precursor of HippoCampal Neurostimulatory Peptide (HCNP), a bioactive peptide comprising the first 12 amino acids of PEBP that plays an important role in development of the hippocampus. PEBP forms a homodimer and a small cavity within each PEBP monomer is thought to serve as a binding site for the polar head group of phosphatidylethanolamine. Although their overall structures are similar, PEBP and RKIP have very different substrate and dimer interaction sites. 29584 cd00865: bacterial/archaeal PhosphatidylEthanolamine-Binding Protein (PEBP) and its eukaryotic homolog Raf Kinase Inhibitor Protein (RKIP) belong to a highly conserved family of phospholipid-binding proteins represented in all three major phylogenetic divisions (eukaryotes, bacteria, archaea) with no significant sequence homology to other proteins. A number of biological roles for members of the PEBP/RKIP family include serine protease inhibition, membrane biogenesis, and Raf-1 kinase inhibition. 
In mammals, RKIP is believed to be the precursor of HippoCampal Neurostimulatory Peptide (HCNP), a bioactive peptide comprising the first 12 amino acids of PEBP that plays an important role in development of the hippocampus. PEBP forms a homodimer and a small cavity within each PEBP monomer is thought to serve as a binding site for the polar head group of phosphatidylethanolamine. Although their overall structures are similar, PEBP and RKIP have very different substrate and dimer interaction sites. 29585 cd00866: eukaryotic Raf Kinase Inhibitor Protein (RKIP) and its bacterial/archaeal homolog PhosphatidylEthanolamine-Binding Protein (PEBP) belong to a highly conserved family of phospholipid-binding proteins represented in all three major phylogenetic divisions (eukaryotes, bacteria, archaea) with no significant sequence homology to other proteins. A number of biological roles for PEBP have been identified including serine protease inhibition, membrane biogenesis, and Raf-1 kinase inhibition. In addition, the mammalian PEBPs are believed to be the precursor of HippoCampal Neurostimulatory Peptide (HCNP), a bioactive peptide comprising the first 12 amino acids of PEBP that plays an important role in hippocampus development. PEBP forms a homodimer and a small cavity within each PEBP monomer is thought to serve as a binding site for the polar head group of phosphatidylethanolamine. 29586 cd00468: HIT family: HIT (Histidine triad) proteins, named for a motif related to the sequence HxHxH/Qxx (x, a hydrophobic amino acid), are a superfamily of nucleotide hydrolases and transferases, which act on the alpha-phosphate of ribonucleotides.
On the basis of sequence, substrate specificity, structure, evolution and mechanism, HIT proteins are classified in the literature into three major branches: the Hint branch, which consists of adenosine 5'-monophosphoramide hydrolases, the Fhit branch, which consists of diadenosine polyphosphate hydrolases, and the GalT branch, which consists of specific nucleoside monophosphate transferases. Further sequence analysis reveals several new closely related, yet uncharacterized subgroups. 29587 cd00608: Galactose-1-phosphate uridyl transferase (GalT): This enzyme plays a key role in galactose metabolism by catalysing the transfer of a uridine 5'-phosphoryl group from UDP-galactose 1-phosphate. The structure of E. coli GalT reveals that the enzyme contains two identical subunits. It also demonstrates that the active site is formed by amino acid residues from both subunits of the dimer. 29588 cd01275: FHIT (fragile histidine family): FHIT proteins, related to the HIT family, carry a motif HxHxH/Qxx (x is a hydrophobic amino acid). On the basis of sequence, substrate specificity, structure, evolution and mechanism, HIT proteins are classified into three branches: the Hint branch, which consists of adenosine 5'-monophosphoramide hydrolases, the Fhit branch, which consists of diadenosine polyphosphate hydrolases, and the GalT branch, which consists of specific nucleoside monophosphate transferases. Fhit plays a very important role in the development of tumours. In fact, Fhit deletions are among the earliest and most frequent genetic alterations in the development of tumours. 29589 cd01276: Protein Kinase C Interacting protein related (PKCI): PKCI and related proteins belong to the ubiquitous HIT family of hydrolases that act on alpha-phosphates of ribonucleotides. The members of this subgroup have a conserved HxHxHxx motif (x is a hydrophobic residue) that is a signature for this family. No enzymatic activity has been reported, however, for PKCI and its related members.
29590 cd01277: HINT (histidine triad nucleotide-binding protein) subgroup: Members of this CD belong to the superfamily of histidine triad hydrolases that act on alpha-phosphate of ribonucleotides. This subgroup includes members from all three forms of cellular life. Although the biochemical function has not been characterised for many of the members of this subgroup, the proteins from yeast have been shown to be involved in secretion, peroxisome formation and gene expression. 29591 cd01278: aprataxin related: Aprataxin, a HINT family hydrolase, is mutated in ataxia oculomotor apraxia syndrome. All the members of this subgroup have the conserved HxHxHxx (where x is a hydrophobic residue) signature motif. Members of this subgroup are predominantly eukaryotic in origin. 29592 cd00474: The SUI1/eIF1 (eukaryotic initiation factor 1) fold is found in eukaryotes, archaea, and some bacteria and is thought to play an important role in accurate initiator codon recognition during translation initiation. This fold, which includes two antiparallel alpha helices packed against the same side of a five-strand beta sheet, is structurally similar to other RNA-binding domains, suggesting that SUI1/eIF1 may bind RNA. Point mutations in the yeast eIF1 implicate the protein in maintaining accurate start-site selection but its mechanism of action is unknown. 29593 cd00475: Cis (Z)-Isoprenyl Diphosphate Synthases (cis-IPPS); homodimers which catalyze the successive 1'-4 condensation of the isopentenyl diphosphate (IPP) molecule to trans,trans-farnesyl diphosphate (FPP) or to cis,trans-FPP to form long-chain polyprenyl diphosphates. A few can also catalyze the condensation of IPP to trans-geranyl diphosphate to form the short-chain cis,trans-FPP. In prokaryotes, the cis-IPPS, undecaprenyl diphosphate synthase (UPP synthase) catalyzes the formation of the carrier lipid UPP in bacterial cell wall peptidoglycan biosynthesis.
Similarly, in eukaryotes, the cis-IPPS, dehydrodolichyl diphosphate (dedol-PP) synthase catalyzes the formation of the polyisoprenoid glycosyl carrier lipid dolichyl monophosphate. cis-IPPS are mechanistically and structurally distinct from trans-IPPS, lacking the DDXXD motifs, yet requiring Mg2+ for activity. 29594 cd00476: SAICAR synthetase catalyzes the seventh step of the de novo biosynthesis of purine nucleotides, the conversion of carboxyaminoimidazole ribonucleotide (CAIR) into succinoaminoimidazolecarboxamide ribonucleotide (SAICAR). CAIR and aspartic acid react in the presence of ATP and magnesium to form SAICAR. 29595 cd01414: Eukaryotic, prokaryotic and archaeal group of SAICAR synthetases represented by the Saccharomyces cerevisiae (Sc) SAICAR synthetase. SAICAR synthetase catalyzes the seventh step of the de novo biosynthesis of purine nucleotides. 29596 cd01415: Prokaryotic and archaeal group of SAICAR synthetases represented by the Thermotoga maritima (Tm) SAICAR synthetase and E. coli PurC. SAICAR synthetase catalyzes the seventh step of the de novo biosynthesis of purine nucleotides. 29597 cd01416: Eukaryotic group of SAICAR synthetases represented by the Drosophila melanogaster, N-terminal, SAICAR synthetase domain of Ade5. The Ade5 gene product (CAIR-SAICARs) catalyzes the sixth and seventh steps of the de novo biosynthesis of purine nucleotides. 29598 cd00481: Ribosomal protein L19e. L19e is found in the large ribosomal subunit of eukaryotes and archaea. L19e is distinct from the ribosomal subunit L19, which is found in prokaryotes. It consists of two small globular domains connected by an extended segment. It is located toward the surface of the large subunit, with one exposed end involved in forming the intersubunit bridge with the small subunit. The other exposed end is involved in forming the translocon binding site, along with L22, L23, L24, L29, and L31e subunits. 29599 cd01417: Ribosomal protein L19e, eukaryotic.
L19e is found in the large ribosomal subunit of eukaryotes and archaea. L19e is distinct from the ribosomal subunit L19, which is found in prokaryotes. It consists of two small globular domains connected by an extended segment. It is located toward the surface of the large subunit, with one exposed end involved in forming the intersubunit bridge with the small subunit. The other exposed end is involved in forming the translocon binding site, along with L22, L23, L24, L29, and L31e subunits. 29600 cd01418: Ribosomal protein L19e, archaeal. L19e is found in the large ribosomal subunit of eukaryotes and archaea. L19e is distinct from the ribosomal subunit L19, which is found in prokaryotes. It consists of two small globular domains connected by an extended segment. It is located toward the surface of the large subunit, with one exposed end involved in forming the intersubunit bridge with the small subunit. The other exposed end is involved in forming the translocon binding site, along with L22, L23, L24, L29, and L31e subunits. 29601 cd00483: 7,8-dihydro-6-hydroxymethylpterin-pyrophosphokinase (HPPK). Folate derivatives are essential cofactors in the biosynthesis of purines, pyrimidines, and amino acids as well as formyl-tRNA. Mammalian cells are able to utilize pre-formed folates after uptake by a carrier-mediated active transport system. Most microbes and plants lack this system and must synthesize folates de novo from guanosine triphosphate. One enzyme from this pathway is HPPK, which catalyzes pyrophosphoryl transfer from ATP to 6-hydroxymethyl-7,8-dihydropterin (HP). The functional enzyme is a monomer. Mammals lack many of the enzymes in the folate pathway, including HPPK. 29602 cd00487: Polypeptide or peptide deformylase; a family of metalloenzymes that catalyzes the removal of the N-terminal formyl group in a growing polypeptide chain following translation initiation during protein synthesis in prokaryotes.
These enzymes utilize Fe(II) as the catalytic metal ion, which can be replaced with a nickel or cobalt ion with no loss of activity. There are two types of peptide deformylases, types I and II, which differ in structure only in the outer surface of the domain. Because these enzymes are essential only in prokaryotes (although eukaryotic gene sequences have been found), they are a target for a new class of antibacterial agents. 29603 cd00491: 4-Oxalocrotonate Tautomerase: Catalyzes the isomerization of unsaturated ketones. The structure is a homohexamer that is arranged as a trimer of dimers. The hexamer contains six active sites, each formed by residues from three monomers, two from one dimer and the third from a neighboring monomer. Each monomer is a beta-alpha-beta fold with two small beta strands at the C-terminus that fold back on themselves. A pair of monomers form a dimer with two-fold symmetry, consisting of a 4-stranded beta sheet with two helices on one side and two additional small beta strands at each end. The dimers are assembled around a 3-fold axis of rotation to form a hexamer, with the short beta strands from each dimer contacting the neighboring dimers. 29604 cd00494: Hydroxymethylbilane synthase (HMBS), also known as porphobilinogen deaminase (PBGD), is an intermediate enzyme in the biosynthetic pathway of tetrapyrrolic ring systems, such as heme, chlorophylls, and vitamin B12. HMBS catalyzes the conversion of porphobilinogen (PBG) into hydroxymethylbilane (HMB). HMBS consists of three domains, and is believed to bind substrate through a hinge-bending motion of domains I and II. HMBS is found in all organisms except viruses. 29605 cd00498: Heat shock protein 33 (Hsp33): Cytosolic protein that acts as a molecular chaperone under oxidative conditions. In normal (reducing) cytosolic conditions, four conserved Cys residues are coordinated by a Zn ion. 
Under oxidative stress (such as heat shock), the Cys are reversibly oxidized to disulfide bonds, which causes the chaperone activity to be turned on. Hsp33 is homodimeric in its functional form. 29606 cd00501: Pyroglutamyl peptidase (PGP) type I, also known as pyrrolidone carboxyl peptidase (pcp) type I: Enzymes responsible for cleaving pyroglutamate (pGlu) from the N-terminal end of specialized proteins. The N-terminal pGlu protects these proteins from proteolysis by other proteases until the pGlu is removed by a PGP. PGPs are cysteine proteases with a Cys-His-Glu/Asp catalytic triad. Type I PGPs are found in a wide variety of prokaryotes and eukaryotes. It is not clear whether the functional form is a monomer, a homodimer, or a homotetramer. 29607 cd00503: Frataxin is a nuclear-encoded mitochondrial protein implicated in Friedreich's ataxia (FRDA), a human autosomal recessive neurodegenerative disease; frataxin is found in eukaryotes and in purple bacteria; lack of frataxin causes iron to accumulate in the mitochondrial matrix, suggesting that frataxin is involved in mitochondrial iron homeostasis and possibly in iron transport; the domain has an alpha-beta fold consisting of two helices flanking an antiparallel beta sheet. 29608 cd00504: GXGXG domain. This domain of unknown function is found at the C-terminus of the large subunit (gltB) of glutamate synthase (GltS), in subunit C of tungsten formylmethanofuran dehydrogenase (FwdC) and in subunit C of molybdenum formylmethanofuran dehydrogenase (FmdC). It is also found in a primarily archaeal group of proteins predicted to encode part of the large subunit of GltS. It is characterized by a repeated GXXGXXXG motif. GltS is a complex iron-sulfur flavoprotein that catalyzes the synthesis of L-glutamate from L-glutamine and 2-oxoglutarate.
It requires the transfer of ammonia and electrons among three distinct active centers that carry out L-Gln hydrolysis, conversion of 2-oxoglutarate into L-Glu, and electron uptake from a donor. These catalytic sites occur in other domains within the protein or are encoded by separate genes, and are not present in the domain in this CD. FwdC and FmdC are reversible ion pumps that catalyze the formylation and deformylation of methanofuran in hyperthermophiles and bacteria. They require the presence of either tungsten (FwdC) or molybdenum (FmdC). The specific function of this domain also remains unidentified in the formylmethanofuran dehydrogenases. 29609 cd00980: FwdC/FmdC. This domain of unknown function is found in the subunit C of formylmethanofuran dehydrogenase, an enzyme that catalyzes the first step in methane formation from CO2 in methanogenic archaea, hyperthermophiles and bacteria. There are two isoenzymes, a tungsten-containing isoenzyme (Fwd) and a molybdenum-containing isoenzyme (Fmd). The subunits C of both isoenzymes (FwdC/FmdC) are characterized by a repeated GXXGXXXG motif. 29610 cd00981: Archaeal-type gltB domain. This domain shares sequence similarity with a region of unknown function found in the large subunit of glutamate synthase, which is encoded by gltB and found in most bacteria and eukaryotes. It is predicted to be homologous to the C-terminal domain of glutamate synthase based upon sequence similarity coupled with genome organization data, showing that this domain is found in a gene cluster with other domains of GltS, which are annotated. This domain is found primarily in archaea, but is also present in a few bacteria, likely as a result of lateral gene transfer. 29611 cd00982: gltb_C. This domain is found at the C-terminus of the large subunit (gltB) of glutamate synthase (GltS). GltS encodes a complex iron-sulfur flavoprotein that catalyzes the synthesis of L-glutamate from L-glutamine and 2-oxoglutarate.
It requires the transfer of ammonia and electrons among three distinct active centers that carry out L-Gln hydrolysis, conversion of 2-oxoglutarate into L-Glu, and electron uptake from a donor. These catalytic sites appear to occur in other domains within the protein, and not the domain in this CD. This particular domain has no known function, but it likely has a structural role as it interacts with the amidotransferase and FMN-binding domains of GltS. 29612 cd00516: Phosphoribosyltransferase (PRTase) type II; This family contains two enzymes that play an important role in NAD production by allowing either quinolinic acid (QA), via quinolinate phosphoribosyl transferase (QAPRTase), or nicotinic acid (NA), via nicotinate phosphoribosyltransferase (NAPRTase), to be used in the synthesis of NAD. QAPRTase catalyses the reaction of quinolinic acid (QA) with 5-phosphoribosyl-1-pyrophosphate (PRPP) in the presence of Mg2+ to produce nicotinic acid mononucleotide (NAMN), pyrophosphate and carbon dioxide, an important step in the de novo synthesis of NAD. NAPRTase catalyses a similar reaction in the NAD salvage pathway, leading to NAMN and pyrophosphate and using nicotinic acid and PRPP as substrates. 29613 cd01401: Nicotinate phosphoribosyltransferase (NAPRTase), related to PncB. Nicotinate phosphoribosyltransferase catalyses the formation of NAMN and PPi from 5-phosphoribosyl-1-pyrophosphate (PRPP) and nicotinic acid; this is the first, and also rate-limiting, reaction in the NAD salvage synthesis. This salvage pathway serves to recycle NAD degradation products. This subgroup is present in bacteria, archaea and fungi. 29614 cd01567: Nicotinate phosphoribosyltransferase (NAPRTase) family. Nicotinate phosphoribosyltransferase catalyses the formation of NAMN and PPi from 5-phosphoribosyl-1-pyrophosphate (PRPP) and nicotinic acid; this is the first, and also rate-limiting, reaction in the NAD salvage synthesis. This salvage pathway serves to recycle NAD degradation products.
29615 cd01568: Quinolinate phosphoribosyl transferase (QAPRTase or QPRTase), also called nicotinate-nucleotide pyrophosphorylase, is involved in the de novo synthesis of NAD in both prokaryotes and eukaryotes. It catalyses the reaction of quinolinic acid (QA) with 5-phosphoribosyl-1-pyrophosphate (PRPP) in the presence of Mg2+ to produce nicotinic acid mononucleotide (NAMN), pyrophosphate and carbon dioxide. QPRTase functions as a homodimer with two active sites, each formed by the C-terminal region of one subunit and the N-terminal region of the other. 29616 cd01569: pre-B-cell colony-enhancing factor (PBEF)-like. The mammalian members of this group of nicotinate phosphoribosyltransferases (NAPRTases) were originally identified as genes whose expression is upregulated upon activation in lymphoid cells. In general, nicotinate phosphoribosyltransferase catalyses the formation of NAMN and PPi from 5-phosphoribosyl-1-pyrophosphate (PRPP) and nicotinic acid; this is the first, and also rate-limiting, reaction in the NAD salvage synthesis. 29617 cd01570: Nicotinate phosphoribosyltransferase (NAPRTase), subgroup A. Nicotinate phosphoribosyltransferase catalyses the formation of NAMN and PPi from 5-phosphoribosyl-1-pyrophosphate (PRPP) and nicotinic acid; this is the first, and also rate-limiting, reaction in the NAD salvage synthesis. This salvage pathway serves to recycle NAD degradation products. This subgroup is present in bacteria and eukaryota (except fungi). 29618 cd01571: Nicotinate phosphoribosyltransferase (NAPRTase), subgroup B. Nicotinate phosphoribosyltransferase catalyses the formation of NAMN and PPi from 5-phosphoribosyl-1-pyrophosphate (PRPP) and nicotinic acid; this is the first, and also rate-limiting, reaction in the NAD salvage synthesis. This salvage pathway serves to recycle NAD degradation products.
29619 cd01572: Quinolinate phosphoribosyl transferase (QAPRTase or QPRTase), also called nicotinate-nucleotide pyrophosphorylase, is involved in the de novo synthesis of NAD in both prokaryotes and eukaryotes. It catalyses the reaction of quinolinic acid (QA) with 5-phosphoribosyl-1-pyrophosphate (PRPP) in the presence of Mg2+ to produce nicotinic acid mononucleotide (NAMN), pyrophosphate and carbon dioxide. QPRTase functions as a homodimer with two active sites, each formed by the C-terminal region of one subunit and the N-terminal region of the other. 29620 cd01573: ModD; Quinolinate phosphoribosyl transferase (QAPRTase or QPRTase) present in some modABC operons in bacteria, which are involved in molybdate transport. In general, QPRTases are part of the de novo synthesis pathway of NAD in both prokaryotes and eukaryotes. They catalyse the reaction of quinolinic acid (QA) with 5-phosphoribosyl-1-pyrophosphate (PRPP) in the presence of Mg2+ to produce nicotinic acid mononucleotide (NAMN), pyrophosphate and carbon dioxide. 29621 cd00520: Ribosome recycling factor (RRF). Ribosome recycling factor dissociates the posttermination complex, composed of the ribosome, deacylated tRNA, and mRNA, after termination of translation. Thus ribosomes are "recycled" and ready for another round of protein synthesis. RRF is believed to bind the ribosome at the A-site in a manner that mimics tRNA, but the specific mechanisms remain unclear. RRF is essential for bacterial growth. It is not necessary for cell growth in archaea or eukaryotes, but is found in mitochondria or chloroplasts of some eukaryotic species. 29622 cd00522: Hemerythrin (Hr) is a non-heme diiron oxygen transport protein found in four marine invertebrate phyla including priapulida, brachiopoda, sipunculida, and annelida, as well as in protozoa. Myohemerythrin (Mhr), a hemerythrin homolog, is found in the muscle tissue of sipunculids as well as in polychaete and oligochaete annelids.
In addition to oxygen transport, Mhr proteins are involved in cadmium fixation and host anti-bacterial defense. Hr and Mhr proteins have the same "four alpha helix bundle" motif and active site structure. Hr forms oligomers, the octameric form being most prevalent, while Mhr is monomeric. 29623 cd00527: Ribosome anti-association factor IF6 binds the large ribosomal subunit and prevents the two subunits from associating during translation initiation. IF6 comprises a family of translation factors that includes both eukaryotic (eIF6) and archaeal (aIF6) members. All members of this family have a conserved pentameric fold referred to as a beta/alpha propeller. The eukaryotic IF6 members have a moderately conserved C-terminal extension which is not required for ribosomal binding, and may have an alternative function. 29624 cd00528: MoaC family. Members of this family are involved in molybdenum cofactor (Moco) biosynthesis, an essential cofactor of a diverse group of redox enzymes. MoaC, a small hexameric protein, converts, together with MoaA, a guanosine derivative to the precursor Z by inserting the carbon-8 of the purine between the 2' and 3' ribose carbon atoms, which is the first of three phases of Moco biosynthesis. 29625 cd01419: MoaC family, archaeal. Members of this family are involved in molybdenum cofactor (Moco) biosynthesis, an essential cofactor of a diverse group of redox enzymes. MoaC, a small hexameric protein, converts, together with MoaA, a guanosine derivative to the precursor Z by inserting the carbon-8 of the purine between the 2' and 3' ribose carbon atoms, which is the first of three phases of Moco biosynthesis. 29626 cd01420: MoaC family, prokaryotic and eukaryotic. Members of this family are involved in molybdenum cofactor (Moco) biosynthesis, an essential cofactor of a diverse group of redox enzymes.
MoaC, a small hexameric protein, converts, together with MoaA, a guanosine derivative to the precursor Z by inserting the carbon-8 of the purine between the 2' and 3' ribose carbon atoms, which is the first of three phases of Moco biosynthesis. 29627 cd00529: Holliday junction resolvases (HJRs) are endonucleases that specifically resolve Holliday junction DNA intermediates during homologous recombination. HJRs occur in archaea, bacteria, and in the mitochondria of certain fungi; however, this CD includes only the bacterial and mitochondrial HJRs. These are referred to as the RuvC family of Holliday junction resolvases, RuvC being the E. coli HJR. RuvC and its orthologs are homodimers and are structurally similar to RNase H and Hsp70. 29628 cd00531: Nuclear transport factor 2 (NTF2-like) superfamily. This family includes members of the NTF2 family, Delta-5-3-ketosteroid isomerases, Scytalone Dehydratases, and the beta subunit of Ring hydroxylating dioxygenases. This family is a classic example of divergent evolution wherein the proteins have many common structural details but diverge greatly in their function. For example, nuclear transport factor 2 (NTF2) mediates the nuclear import of RanGDP and binds to both RanGDP and FxFG repeat-containing nucleoporins, while Ketosteroid isomerases catalyze the isomerization of delta-5-3-ketosteroid to delta-4-3-ketosteroid by intramolecular transfer of the C4-beta proton to the C6-beta position. While the function of the beta sub-unit of the Ring hydroxylating dioxygenases is not known, Scytalone Dehydratases catalyze two reactions in the biosynthetic pathway that produces fungal melanin. Members of the NTF2-like superfamily are widely distributed among bacteria, archaea and eukaryotes. 29629 cd00667: Ring hydroxylating dioxygenase beta subunit.
This subunit has a similar structure to NTF-2, Ketosteroid isomerase and scytalone dehydratase. The degradation of aromatic compounds by aerobic bacteria frequently begins with the dihydroxylation of the substrate by nonheme iron-containing dioxygenases. These enzymes consist of two or three soluble proteins that interact to form an electron-transport chain that transfers electrons from reduced nucleotides (NADH) via flavin and [2Fe-2S] redox centers to a terminal dioxygenase. Aromatic-ring-hydroxylating dioxygenases oxidize aromatic hydrocarbons and related compounds to cis-arene diols. These enzymes utilize a mononuclear non-heme iron center to catalyze the addition of dioxygen to their respective substrates. The active site of these enzymes, however, is in the alpha sub-unit. No functional role has been attributed to the beta sub-unit except for a structural role. 29630 cd00780: Nuclear transport factor 2 (NTF2) domain plays an important role in the trafficking of macromolecules, ions and small molecules between the cytoplasm and nucleus. This bi-directional transport of macromolecules across the nuclear envelope requires many soluble factors, including the GDP-binding protein Ran (RanGDP). RanGDP is required for both import and export of proteins and poly(A) RNA. RanGDP also has been implicated in cell cycle control, specifically in mitotic spindle assembly. In interphase cells, RanGDP is predominately nuclear and thought to be GTP bound, but it is also present in the cytoplasm, probably in the GDP-bound state. NTF2 mediates the nuclear import of RanGDP. NTF2 binds to both RanGDP and FxFG repeat-containing nucleoporins. 29631 cd00781: ketosteroid isomerase: Many biological reactions proceed by enzymatic cleavage of a C-H bond adjacent to a carbonyl or a carboxyl group, leading to an enol or an enolate intermediate that is subsequently re-protonated at the same or an adjacent carbon.
Ketosteroid isomerases are important members of this class of enzymes, which are the most proficient of all enzymes known and have served as a paradigm for enzymatic enolizations since their discovery in 1954. This CD includes members of this class that catalyze the isomerization of various beta,gamma-unsaturated isomers at nearly a diffusion-controlled rate. These enzymes are widely distributed in bacteria. 29632 cd00532: MGS-like domain. This domain composes the whole protein of methylglyoxal synthetase, which catalyzes the enolization of dihydroxyacetone phosphate (DHAP) to produce methylglyoxal. The family also includes the C-terminal domain in carbamoyl phosphate synthetase (CPS) where it catalyzes the last phosphorylation of a carboxyphosphate intermediate to form the product carbamoyl phosphate and may also play a regulatory role. This family also includes inosine monophosphate cyclohydrolase. The known structures in this family show a common phosphate binding site. 29633 cd01421: Inosine monophosphate cyclohydrolase domain. This is the N-terminal domain in the purine biosynthesis pathway protein ATIC (purH). The bifunctional ATIC protein contains a C-terminal ATIC formylase domain that formylates 5-aminoimidazole-4-carboxamide-ribonucleotide. The IMPCH domain then converts the formyl-5-aminoimidazole-4-carboxamide-ribonucleotide to inosine monophosphate. This is the final step in de novo purine production. 29634 cd01422: Methylglyoxal synthase catalyzes the enolization of dihydroxyacetone phosphate (DHAP) to produce methylglyoxal. The first part of the catalytic mechanism is believed to be similar to TIM (triosephosphate isomerase) in that both enzymes utilize DHAP to form an ene-diolate phosphate intermediate. In MGS, the second catalytic step is characterized by the elimination of phosphate and collapse of the enediolate to form methylglyoxal instead of reprotonation to form the isomer glyceraldehyde 3-phosphate, as in TIM.
This is the first reaction in the methylglyoxal bypass of the Embden-Meyerhof glycolytic pathway and is believed to provide physiological benefits under non-ideal growth conditions in bacteria. 29635 cd01423: Methylglyoxal synthase-like domain found in pyr1 and URA1-like carbamoyl phosphate synthetases (CPS), including ammonia-dependent CPS Type I, and glutamine-dependent CPS Type III. These are multidomain proteins, in which MGS is the C-terminal domain. 29636 cd01424: Methylglyoxal synthase-like domain from type II glutamine-dependent carbamoyl phosphate synthetase (CPS). CPS, a CarA and CarB heterodimer, catalyzes the production of carbamoyl phosphate which is subsequently employed in the metabolic pathways responsible for the synthesis of pyrimidine nucleotides or arginine. The MGS-like domain is the C-terminal domain of CarB and appears to play a regulatory role in CPS function by binding allosteric effector molecules, including UMP and ornithine. 29637 cd00537: Methylenetetrahydrofolate reductase (MTHFR). 5,10-Methylenetetrahydrofolate is reduced to 5-methyltetrahydrofolate by methylenetetrahydrofolate reductase, a cytoplasmic, NAD(P)-dependent enzyme. 5-methyltetrahydrofolate is utilized by methionine synthase to convert homocysteine to methionine. The enzymatic mechanism is a ping-pong bi-bi mechanism, in which NAD(P)+ release precedes the binding of methylenetetrahydrofolate and the acceptor is free FAD. The family includes the 5,10-methylenetetrahydrofolate reductase EC:1.7.99.5 from prokaryotes and methylenetetrahydrofolate reductase EC:1.5.1.20 from eukaryotes. The bacterial enzyme is a homotetramer and NADH is the preferred reductant, while the eukaryotic enzyme is a homodimer and NADPH is the preferred reductant. In humans, there are several clinically significant mutations in MTHFR that result in hyperhomocysteinemia, which is a risk factor for the development of cardiovascular disease. 29638 cd00539: Methyl-coenzyme M reductase (MCR) gamma subunit.
MCR catalyzes the terminal step of methane formation in the energy metabolism of all methanogenic archaea, in which methyl-coenzyme M and coenzyme B are converted to methane and the heterodisulfide of coenzyme M and coenzyme B (CoM-S-S-CoB). MCR is a dimer of trimers, each of which consists of one alpha, one beta, and one gamma subunit, with two identical active sites containing nickel porphinoid factor 430 (F430). 29639 cd00541: The outer membrane phospholipase A (OMPLA) is an integral membrane enzyme that catalyses the hydrolysis of acylester bonds in phospholipids using calcium as a cofactor. The enzyme has a fold of transmembrane beta-barrels and is widespread among Gram-negative bacteria, both in pathogens and nonpathogens. In pathogenic bacteria such as Campylobacter coli and Helicobacter pylori, OMPLA is involved in pathogenesis and virulence. In nonpathogenic bacteria the physiological function of OMPLA is less clear. The Escherichia coli enzyme is involved in the secretion of bacteriocins, antibacterial peptides that are produced in order to survive under starvation conditions. The enzyme activity of OMPLA is strictly regulated to prevent uncontrolled breakdown of the surrounding phospholipids. The activity of OMPLA can be induced by membrane perturbation and concurs with dimerization of the enzyme. 29640 cd00545: Methenyltetrahydromethanopterin (methenyl-H4MPT) cyclohydrolase (MCH). MCH is a cytoplasmic enzyme that has been identified in methanogenic archaea, sulfate-reducing archaea, and methylotrophic bacteria. It catalyzes the reversible formation of N(5),N(10)-methenyltetrahydromethanopterin (methenyl-H4MPT+) from N(5)-formyltetrahydromethanopterin (formyl-H4MPT), in the third step of the reaction to reduce CO2 to CH4. The protein functions as a homodimer or homotrimer, depending on the organism.
29642 cd00552: RaiA ("ribosome-associated inhibitor A"), also known as Protein Y (PY), YfiA, and SpotY, is a stress-response protein that binds the ribosomal subunit interface and arrests translation by interfering with aminoacyl-tRNA binding to the ribosomal A site. RaiA is also thought to counteract miscoding at the A site thus reducing translation errors. The RaiA fold structurally resembles the double-stranded RNA-binding domain (dsRBD). 29643 cd00557: Preprotein translocase subunit SecB. SecB is a cytoplasmic component of the multisubunit membrane-bound enzyme termed Sec protein translocase, which is the main constituent of the General Secretory (type II) Pathway involved in translocation of nascent polypeptides across the cytoplasmic membrane. SecB has been shown to function as an export-specific molecular chaperone that selectively binds preproteins, maintains them in a translocation competent state and delivers them to SecA, the membrane-bound ATPase, that drives the translocation reaction. In solution, SecB exists as a homotetramer, which is organized as a dimer of dimers. 29644 cd00562: This CD represents a family of iron-molybdenum cluster-binding proteins that includes NifB, NifX, and NifY, all of which are involved in the synthesis of an iron-molybdenum cofactor (FeMo-co) that binds the active site of the dinitrogenase enzyme. This domain is a predicted small-molecule-binding domain (SMBD) with an alpha/beta fold that is present either as a stand-alone domain (e.g. NifX and NifY) or fused to another conserved domain (e.g. NifB); however, its function is still undetermined. The SCOP database suggests that this domain is most similar to structures within the ribonuclease H superfamily. This conserved domain is represented in two of the three major divisions of life (bacteria and archaea).
29645 cd00851: This uncharacterized conserved protein belongs to a family of iron-molybdenum cluster-binding proteins that includes NifX, NifB, and NifY, all of which are involved in the synthesis of an iron-molybdenum cofactor (FeMo-co) that binds the active site of the dinitrogenase enzyme. This domain is a predicted small-molecule-binding domain (SMBD) with an alpha/beta fold that is present either as a stand-alone domain (e.g. NifX and NifY) or fused to another conserved domain (e.g. NifB); however, its function is still undetermined. The SCOP database suggests that this domain is most similar to structures within the ribonuclease H superfamily. This conserved domain is represented in two of the three major divisions of life (bacteria and archaea). 29646 cd00852: NifB belongs to a family of iron-molybdenum cluster-binding proteins that includes NifX and NifY, all of which are involved in the synthesis of an iron-molybdenum cofactor (FeMo-co) that binds the active site of the dinitrogenase enzyme as part of nitrogen fixation in bacteria. This domain is sometimes found fused to an N-terminal domain (the Radical SAM domain) in nifB-like proteins. 29647 cd00853: NifX belongs to a family of iron-molybdenum cluster-binding proteins that includes NifB and NifY, all of which are involved in the synthesis of an iron-molybdenum cofactor (FeMo-co) that binds the active site of the dinitrogenase enzyme. The protein is part of the nitrogen fixation gene cluster in nitrogen-fixing bacteria and has sequence similarity to other members of the cluster. 29648 cd00563: D-Tyrosyl-tRNAtyr deacylases; a class of tRNA-dependent hydrolases which are capable of hydrolyzing the ester bond of D-Tyrosyl-tRNA reducing the level of cellular D-Tyrosine while recycling the peptidyl-tRNA; found in bacteria and in eukaryotes but not in archaea; beta barrel-like fold structure; forms homodimers in which two surface cavities serve as the active site for tRNA binding.
29651 cd01151: Glutaryl-CoA dehydrogenase (GCD). GCD is an acyl-CoA dehydrogenase, which catalyzes the oxidative decarboxylation of glutaryl-CoA to crotonyl-CoA and carbon dioxide in the catabolism of lysine, hydroxylysine, and tryptophan. It uses electron transfer flavoprotein (ETF) as an electron acceptor. GCD is a homotetramer. GCD deficiency leads to a severe neurological disorder in humans. 29652 cd01152: Putative acyl-CoA dehydrogenase (ACAD). Mitochondrial acyl-CoA dehydrogenases (ACAD) catalyze the alpha,beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. The reduced form of ACAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. The ACAD family includes the eukaryotic beta-oxidation, as well as amino acid catabolism enzymes. These enzymes share high sequence similarity, but differ in their substrate specificities. The mitochondrial ACADs are generally homotetramers and have an active site glutamate at a conserved position. 29653 cd01153: Putative acyl-CoA dehydrogenase (ACAD). Mitochondrial acyl-CoA dehydrogenases (ACAD) catalyze the alpha,beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. The reduced form of ACAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. The ACAD family includes the eukaryotic beta-oxidation, as well as amino acid catabolism enzymes. These enzymes share high sequence similarity, but differ in their substrate specificities. The mitochondrial ACADs are generally homotetramers and have an active site glutamate at a conserved position. 29654 cd01154: AidB.
AidB is one of several genes involved in the SOS adaptive response to DNA alkylation damage, whose expression is activated by the Ada protein. Its function has not been entirely elucidated; however, it is similar in sequence and function to acyl-CoA dehydrogenases. It has been proposed that aidB directly destroys DNA alkylating agents such as nitrosoguanidines (nitrosated amides) or their reaction intermediates. 29655 cd01155: FadE2-like Acyl-CoA dehydrogenase (ACAD). Acyl-CoA dehydrogenases (ACAD) catalyze the alpha,beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. The reduced form of ACAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. The ACAD family includes the eukaryotic beta-oxidation, as well as amino acid catabolism enzymes. These enzymes share high sequence similarity, but differ in their substrate specificities. ACADs are generally homotetramers and have an active site glutamate at a conserved position. 29656 cd01156: Isovaleryl-CoA dehydrogenase (IVD) is an acyl-CoA dehydrogenase, which catalyzes the third step in leucine catabolism, the conversion of isovaleryl-CoA (3-methylbutyryl-CoA) into 3-methylcrotonyl-CoA. IVD is a homotetramer and has the greatest affinity for small branched chain substrates. 29657 cd01157: Medium chain acyl-CoA dehydrogenase (MCAD). MCADs are mitochondrial beta-oxidation enzymes, which catalyze the alpha,beta dehydrogenation of the corresponding medium chain acyl-CoA by FAD, which becomes reduced. The reduced form of MCAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. MCAD is a homotetramer. 29658 cd01158: Short chain acyl-CoA dehydrogenase (SCAD).
SCAD is a mitochondrial beta-oxidation enzyme. It catalyzes the alpha,beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. The reduced form of SCAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. This subgroup also contains the eukaryotic short/branched chain acyl-CoA dehydrogenase (SBCAD), the bacterial butyryl-CoA dehydrogenase (BCAD) and 2-methylbutyryl-CoA dehydrogenase, which is involved in isoleucine catabolism. These enzymes are homotetramers. 29659 cd01159: Naphthocyclinone hydroxylase (NcnH). Naphthocyclinone is an aromatic polyketide and an antibiotic, which is active against Gram-positive bacteria. Polyketides are secondary metabolites, which have important biological functions such as antitumor, immunosuppressive or antibiotic activities. NcnH is a hydroxylase involved in the biosynthesis of naphthocyclinone and possibly other polyketides. 29660 cd01160: Long chain acyl-CoA dehydrogenase (LCAD) is an acyl-CoA dehydrogenase (ACAD), which is found in the mitochondria of eukaryotes and in some prokaryotes. It catalyzes the alpha,beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. The reduced form of LCAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. LCAD is a homodimer. 29661 cd01161: Very long chain acyl-CoA dehydrogenase (VLCAD). VLCAD is an acyl-CoA dehydrogenase (ACAD), which is found in the mitochondria of eukaryotes and in some bacteria. It catalyzes the alpha,beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced.
The reduced form of ACAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. VLCAD is a homodimer. 29662 cd01162: Isobutyryl-CoA dehydrogenase (IBD) catalyzes the alpha,beta-dehydrogenation of short branched chain acyl-CoA intermediates in valine catabolism. It is predicted to be a homotetramer. 29663 cd01163: Dibenzothiophene (DBT) desulfurization enzyme C (DszC). DszC is a flavin reductase dependent enzyme, which catalyzes the first two steps of DBT desulfurization in mesophilic bacteria. DszC converts DBT to DBT-sulfoxide, which is then converted to DBT-sulfone. Bacteria with this enzyme are candidates for the removal of organic sulfur compounds from fossil fuels, which pollute the environment. An equivalent enzyme, tdsC, is found in thermophilic bacteria. This alignment also contains a closely related uncharacterized subgroup. 29664 cd00571: UreE urease accessory protein. UreE is a metallochaperone assisting the insertion of a Ni2+ ion in the active site of urease, an important step in the in vivo assembly of urease, an enzyme that hydrolyses urea into ammonia and carbamic acid. The C-terminal region of UreE contains a histidine rich nickel binding site. 29665 cd00575: Nitric oxide synthase (NOS) produces nitric oxide (NO) by catalyzing a five-electron heme-based oxidation of a guanidine nitrogen of L-arginine to L-citrulline via two successive monooxygenation reactions producing N(omega)-hydroxy-L-arginine (NHA) as an intermediate. In mammals, there are three distinct NOS isozymes: neuronal (nNOS or NOS-1), cytokine-inducible (iNOS or NOS-2) and endothelial (eNOS or NOS-3). Nitric oxide synthases are homodimers. In eukaryotes, each monomer has an N-terminal oxygenase domain which binds to the substrate L-Arg, zinc, and to the cofactors heme and (6R)-5,6,7,8-tetrahydrobiopterin (BH4).
Eukaryotic NOSs also have a C-terminal electron supplying reductase region, which is homologous to cytochrome P450 reductase and binds NADPH, FAD and FMN. While prokaryotes can produce NO as a byproduct of denitrification, using a completely different set of enzymes than NOS, a few prokaryotes also have a NOS which consists solely of the NOS oxygenase domain. Prokaryotic NOS binds to the substrate L-Arg, zinc, and to the cofactors heme and tetrahydrofolate. 29666 cd00794: Nitric oxide synthase (NOS) prokaryotic oxygenase domain. NOS produces nitric oxide (NO) by catalyzing a five-electron heme-based oxidation of a guanidine nitrogen of L-arginine to L-citrulline via two successive monooxygenation reactions producing N(omega)-hydroxy-L-arginine (NHA) as an intermediate. Nitric oxide synthases are homodimers. Most prokaryotes produce NO as a byproduct of denitrification, using a completely different set of enzymes than NOS. However, a few prokaryotes also have a NOS, consisting solely of the NOS oxygenase domain. Prokaryotic NOS binds to the substrate L-Arg, zinc, and to the cofactors heme and tetrahydrofolate. 29667 cd00795: Nitric oxide synthase (NOS) eukaryotic oxygenase domain. NOS produces nitric oxide (NO) by catalyzing a five-electron heme-based oxidation of a guanidine nitrogen of L-arginine to L-citrulline via two successive monooxygenation reactions producing N(omega)-hydroxy-L-arginine (NHA) as an intermediate. In mammals, there are three distinct NOS isozymes: neuronal (nNOS or NOS-1), cytokine-inducible (iNOS or NOS-2) and endothelial (eNOS or NOS-3). Nitric oxide synthases are homodimers. In eukaryotes, each monomer has an N-terminal oxygenase domain, which binds to the substrate L-Arg, zinc, and to the cofactors heme and (6R)-5,6,7,8-tetrahydrobiopterin (BH4). Eukaryotic NOSs also have a C-terminal electron supplying reductase region, which is homologous to cytochrome P450 reductase and binds NADPH, FAD and FMN. 29669 cd01675: RNR, class III.
Ribonucleotide reductase (RNR) catalyzes the reductive synthesis of deoxyribonucleotides from their corresponding ribonucleotides. It provides the precursors necessary for DNA synthesis. RNRs are separated into three classes based on their metallocofactor usage. Class I RNRs, found in eukaryotes, bacteria, and bacteriophage, use a diiron-tyrosyl radical. Class II RNRs, found in bacteria, bacteriophage, algae and archaea, use coenzyme B12 (adenosylcobalamin, AdoCbl). Class III RNRs, found in strict or facultative anaerobic bacteria, bacteriophage, and archaea, use an FeS cluster and S-adenosylmethionine to generate a glycyl radical. Many organisms have more than one class of RNR present in their genomes. All three RNRs have a ten-stranded alpha-beta barrel domain that is structurally similar to the domain of PFL (pyruvate formate lyase). The class III enzyme from phage T4 consists of two subunits; this model covers the larger subunit, which contains the active and allosteric sites. 29670 cd01676: RNR, class II. Ribonucleotide reductase (RNR) catalyzes the reductive synthesis of deoxyribonucleotides from their corresponding ribonucleotides. It provides the precursors necessary for DNA synthesis. RNRs are separated into three classes based on their metallocofactor usage. Class I RNRs, found in eukaryotes, bacteria, and bacteriophage, use a diiron-tyrosyl radical. Class II RNRs, found in bacteria and bacteriophage, use coenzyme B12 (adenosylcobalamin, AdoCbl). Class III RNRs, found in anaerobic bacteria, bacteriophage, and archaea, use an FeS cluster and S-adenosylmethionine to generate a glycyl radical. Many organisms have more than one class of RNR present in their genomes. All three RNRs have a ten-stranded alpha-beta barrel domain that is structurally similar to the domain of PFL (pyruvate formate lyase). Class II RNRs are found in bacteria that can live under both aerobic and anaerobic conditions. Many, but not all, members of this class are found to be homodimers.
Adenosylcobalamin interacts directly with an active site cysteine to form the reactive cysteine radical. 29671 cd01677: PFL2_DhaB_BssA. This CD includes pyruvate formate lyase 2 (PFL2), B12-independent glycerol dehydratase (DhaB) and the alpha subunit of benzylsuccinate synthase (BssA), all of which have a highly conserved ten-stranded alpha/beta barrel domain. DhaB catalyzes the first step in the conversion of glycerol to 1,3-propanediol while BssA catalyzes the first step in the anaerobic mineralization of both toluene and m-xylene. This domain is similar to those of PFL1 (pyruvate formate lyase 1) and RNR (ribonucleotide reductase). 29672 cd01678: PFL1. Pyruvate formate lyase catalyzes a key step in anaerobic glycolysis, the conversion of pyruvate and coenzyme A to formate and acetyl-CoA. The PFL mechanism involves an unusual radical cleavage of pyruvate in which two cysteines and one glycine form radicals that are required for catalysis. PFL has a ten-stranded alpha/beta barrel domain that is structurally similar to those of all three ribonucleotide reductase (RNR) classes as well as benzylsuccinate synthase and B12-independent glycerol dehydratase. 29677 cd01915: Carbon monoxide dehydrogenase (CODH) is found in acetogenic and methanogenic organisms and is responsible for the synthesis and breakdown of acetyl-CoA, respectively. CODH has two types of metal clusters, a cubane [Fe4-S4] center (B-cluster) similar to that of hybrid cluster protein (HCP) and a Ni-Fe-S center (C-cluster) where carbon monoxide oxidation occurs. Bifunctional CODH forms a heterotetramer with acetyl-CoA synthase (ACS) consisting of two CODH and two ACS subunits while monofunctional CODH forms a homodimer. Bifunctional CODH reduces carbon dioxide to carbon monoxide and ACS then synthesizes acetyl-CoA from carbon monoxide, CoA, and a methyl group donated by another protein (CoFeSP), while monofunctional CODH oxidizes carbon monoxide to carbon dioxide.
CODH and ACS each have a metal cluster referred to as the C- and A-clusters, respectively. 29678 cd01916: Acetyl-CoA synthase (ACS), also known as acetyl-CoA decarbonylase, is found in acetogenic and methanogenic organisms and is responsible for the synthesis and breakdown of acetyl-CoA. ACS forms a heterotetramer with carbon monoxide dehydrogenase (CODH) consisting of two ACS and two CODH subunits. CODH reduces carbon dioxide to carbon monoxide and ACS then synthesizes acetyl-CoA from carbon monoxide, CoA, and a methyl group donated by another protein (CoFeSP). ACS has three structural domains, an N-terminal Rossmann fold domain with a helical region at its N-terminus which interacts with CODH, and two alpha + beta fold domains. A Ni-Fe-S center referred to as the A-cluster is located in the C-terminal domain. A large cavity exists between the three domains which may bind CoA. 29679 cd01917: Acetyl-CoA synthase (ACS), also known as acetyl-CoA decarbonylase, is found in acetogenic and methanogenic organisms and is responsible for the synthesis and breakdown of acetyl-CoA. ACS forms a heterotetramer with carbon monoxide dehydrogenase (CODH) consisting of two ACS and two CODH subunits. CODH reduces carbon dioxide to carbon monoxide and ACS then synthesizes acetyl-CoA from carbon monoxide, CoA, and a methyl group donated by another protein (CoFeSP). ACS has three structural domains, an N-terminal Rossmann fold domain with a helical region at its N-terminus which interacts with CODH, and two alpha + beta fold domains. A Ni-Fe-S center referred to as the A-cluster is located in the C-terminal domain. A large cavity exists between the three domains which may bind CoA. 29680 cd00588: CheW-like domain. CheW proteins are part of the chemotaxis signalling mechanism in bacteria. CheW interacts with the methyl accepting chemotaxis proteins (MCPs) and relays signals to CheY, which affects flagellar rotation.
This family includes CheW and other related proteins that are involved in chemotaxis. The CheW-like regulatory domain in the chemotaxis associated histidine kinase CheA binds to CheW, suggesting that these domains can interact with each other. 29681 cd00731: CheA regulatory domain; CheA is a histidine protein kinase present in bacteria and archaea. When activated by the chemotaxis receptor, CheA passes a histidine phosphoryl group directly to an aspartate in the response regulator CheY. This signalling mechanism is modulated by the methyl accepting chemotaxis proteins (MCPs). MCPs form a highly interconnected, tightly packed array within the membrane that is organized, at least in part, through interactions with CheW and CheA. The CheA regulatory domain belongs to the family of CheW-like proteins and has been proposed to mediate interaction with the kinase regulator CheW. 29682 cd00732: CheW, a small regulator protein, unique to the chemotaxis signalling in prokaryotes and archaea. CheW interacts with the histidine kinase CheA, most likely with the related regulatory domain of CheA. CheW is proposed to form signalling arrays together with CheA and the methyl-accepting chemotaxis proteins (MCPs), which are involved in response modulation. 29683 cd00591: Integration host factor (IHF) and HU are small heterodimeric members of the DNABII protein family that bind and bend DNA, functioning as architectural factors in many cellular processes including transcription, site-specific recombination, and higher-order nucleoprotein complex assembly. The dimer subunits associate to form a compact globular core from which two beta ribbon arms (one from each subunit) protrude. The beta arms track and bind the DNA minor groove. Despite sequence and structural similarity, IHF and HU can be distinguished by their different DNA substrate preferences. 29684 cd00592: Helix-turn-helix transcription regulator MERR, N-terminal domain.
The MERR family transcription regulators have been shown to mediate responses to stress including exposure to heavy metals, drugs, or oxygen radicals in eubacterial and some archaeal species. They regulate transcription of multidrug/metal ion transporter genes and oxidative stress regulons by reconfiguring the spacer between the -35 and -10 promoter elements. A typical MERR regulator is comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their N-terminal domains are homologous and contain a DNA-binding winged HTH (helix-turn-helix) motif, while the C-terminal domains are often dissimilar and bind specific coactivator molecules such as metal ions, drugs, and organic substrates. 29685 cd01104: Helix-turn-helix transcription regulator MlrA (merR-like regulator A). The MlrA protein, also known as YehV, has been shown to control cell-cell aggregation by co-regulating the expression of curli and extracellular matrix production in Escherichia coli and Salmonella typhimurium. Its close homolog, CarA from Myxococcus xanthus, is involved in activation of the carotenoid biosynthesis genes by light. These proteins belong to the MERR superfamily of transcription regulators that promote expression of several stress regulon genes by reconfiguring the spacer between the -35 and -10 promoter elements. Their conserved N-terminal domains contain predicted HTH (helix-turn-helix) motifs that mediate DNA binding, while the dissimilar C-terminal domains bind specific coactivator molecules. 29686 cd01105: Helix-turn-helix transcription regulator GlnR. The GlnR and TnrA (also known as ScgR) proteins have been shown to regulate expression of glutamine synthetase as well as several genes involved in nitrogen metabolism.
These proteins share the N-terminal DNA binding domain with other transcription regulators of the MERR superfamily that promote transcription by reconfiguring the spacer between the -35 and -10 promoter elements. A typical MERR regulator is comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their conserved N-terminal domains contain predicted HTH (helix-turn-helix) motifs that mediate DNA binding, while the dissimilar C-terminal domains bind specific coactivator molecules. 29687 cd01106: Helix-turn-helix transcription regulator TipA. The Mta, SkgA and TipA proteins have been shown to regulate expression of specific regulons in response to various antibiotics or oxygen radicals in Streptomyces, Bacillus subtilis, and Caulobacter crescentus. Also, the NolA protein has been shown to regulate nodulation in Bradyrhizobium. They are comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their conserved N-terminal domains contain predicted HTH (helix-turn-helix) motifs that mediate DNA binding, while the C-terminal domains are often unrelated and bind specific coactivator molecules. These proteins share the N-terminal DNA binding domain with other transcription regulators of the MERR superfamily that promote transcription by reconfiguring the spacer between the -35 and -10 promoter elements. 29688 cd01107: Helix-turn-helix transcription regulator BmrR. The Bacillus subtilis BmrR and BltR proteins have been shown to promote gene expression of their cognate drug transporters, Bmr and Blt, respectively, by directly sensing the presence of the drug. These proteins are comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site.
Their conserved N-terminal domains contain predicted HTH (helix-turn-helix) motifs that mediate DNA binding, while the C-terminal domains are often unrelated and bind specific coactivator molecules. They share the N-terminal DNA binding domain with other transcription regulators of the MERR superfamily that promote transcription by reconfiguring the spacer between the -35 and -10 promoter elements. 29689 cd01108: Helix-turn-helix transcription regulator CueR. The CueR-like proteins have been shown to activate metal ion efflux and resistance regulons in many eubacterial species. Besides the transcriptional regulator of the copper efflux operon, CueR from Escherichia coli, the family includes Pseudomonas aeruginosa CadR, E. coli MerR, Ralstonia metallidurans PbrR, Rhizobium leguminosarum ActP and E. coli ZntR that regulate expression of the cadmium, mercury, lead, copper and zinc resistance operons, respectively. These proteins are comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their conserved N-terminal domains contain predicted HTH (helix-turn-helix) motifs that mediate DNA binding, while the dissimilar C-terminal domains bind specific metal ions. These proteins share the N-terminal DNA binding domain with other transcription regulators of the MERR superfamily that promote transcription by reconfiguring the spacer between the -35 and -10 promoter elements. 29690 cd01109: Helix-turn-helix transcription regulator YyaN. Based on sequence similarity, these proteins were predicted to function as transcription regulators that mediate responses to stress in eubacteria. They belong to the MERR superfamily of transcription regulators that promote transcription of various stress regulons by reconfiguring the operator sequence located between the -35 and -10 promoter elements.
A typical MERR regulator is comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their N-terminal domains are homologous and contain a DNA-binding winged HTH (helix-turn-helix) motif, while the C-terminal domains are often dissimilar and bind specific coactivator molecules such as metal ions, drugs, and organic substrates. 29691 cd01110: Helix-turn-helix transcriptional regulator SoxR. The global regulator SoxR has been shown to up-regulate gene expression of another transcription activator, SoxS, which directly stimulates the oxidative stress regulon genes in E. coli. The soxRS response renders the bacterial cell resistant to superoxide-generating agents, macrophage-generated nitric oxide, organic solvents, and antibiotics. The SoxR proteins share the N-terminal DNA binding domain with other transcription regulators of the MERR superfamily that promote transcription by reconfiguring the unusually long spacer between the -35 and -10 promoter elements. They also harbor a regulatory C-terminal domain containing an iron-sulfur center. 29692 cd01111: Helix-turn-helix transcription regulator MerD. The putative secondary regulator of mercury resistance (mer) operons, MerD, has been shown to down-regulate the expression of this operon in Gram-negative bacteria. It binds to the same operator DNA as MerR, which activates transcription of the operon in the presence of mercury ions. The MerD protein shares the N-terminal DNA binding domain with other transcription regulators of the MERR superfamily, which promote transcription by reconfiguring the spacer between the -35 and -10 promoter elements. A typical MERR regulator is comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site.
Their N-terminal domains are conserved and contain predicted HTH (helix-turn-helix) motifs that mediate DNA binding, while the dissimilar C-terminal domains bind specific coactivator molecules such as metal ions, drugs, and organic substrates. 29693 cd01279: Helix-turn-helix transcription regulator HspR. Heat shock protein regulators (HspR) have been shown to regulate expression of specific regulons in response to high temperature or high osmolarity in Streptomyces and Helicobacter, respectively. These proteins share the N-terminal DNA binding domain with other transcription regulators of the MERR superfamily that promote transcription by reconfiguring the spacer between the -35 and -10 promoter elements. A typical MERR regulator is comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their conserved N-terminal domains contain predicted HTH (helix-turn-helix) motifs that mediate DNA binding, while the dissimilar C-terminal domains bind specific coactivator molecules. 29694 cd01280: Helix-turn-helix MERR transcription regulator, subgroup 1. Based on sequence similarity, these proteins were predicted to function as transcription regulators that mediate responses to stress in eubacteria. They belong to the MERR superfamily of transcription regulators that promote transcription of various stress regulons by reconfiguring the operator sequence located between the -35 and -10 promoter elements. A typical MERR regulator is comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their N-terminal domains are homologous and contain a DNA-binding winged HTH (helix-turn-helix) motif, while the C-terminal domains are often dissimilar and bind specific coactivator molecules such as metal ions, drugs, and organic substrates. 29695 cd01281: Helix-turn-helix MERR transcription regulator, subgroup 2.
Based on sequence similarity, these proteins were predicted to function as transcription regulators that mediate responses to stress in eubacteria. They belong to the MERR superfamily of transcription regulators that promote transcription of various stress regulons by reconfiguring the operator sequence located between the -35 and -10 promoter elements. A typical MERR regulator is comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their N-terminal domains are homologous and contain a DNA-binding winged HTH (helix-turn-helix) motif, while the C-terminal domains are often dissimilar and bind specific coactivator molecules such as metal ions, drugs, and organic substrates. 29696 cd01282: Helix-turn-helix MERR transcription regulator, subgroup 3. Based on sequence similarity, these proteins were predicted to function as transcription regulators that mediate responses to stress in eubacteria. They belong to the MERR superfamily of transcription regulators that promote transcription of various stress regulons by reconfiguring the operator sequence located between the -35 and -10 promoter elements. A typical MERR regulator is comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their N-terminal domains are homologous and contain a DNA-binding winged HTH (helix-turn-helix) motif, while the C-terminal domains are often dissimilar and bind specific coactivator molecules such as metal ions, drugs, and organic substrates. 29697 cd00593: RIBOc. Ribonuclease III C terminal domain. This group consists of eukaryotic, bacterial and archaeal ribonuclease III (RNase III) proteins. RNase III is a double stranded RNA-specific endonuclease. Prokaryotic RNase III is important in post-transcriptional control of mRNA stability and translational efficiency. It is involved in the processing of ribosomal RNA precursors.
Prokaryotic RNAse III also plays a role in the maturation of tRNA precursors and in the processing of phage and plasmid transcripts. Eukaryotic RNase IIIs participate (through direct cleavage) in rRNA processing and in the processing of small nucleolar RNAs (snoRNAs) and snRNAs (components of the spliceosome). In eukaryotes, RNase III or RNase III-like enzymes such as Dicer are involved in RNAi (RNA interference) and miRNA (micro-RNA) gene silencing. 29702 cd00600: The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes. 29703 cd01716: Hfq, an abundant, ubiquitous RNA-binding protein, functions as a pleiotropic regulator of RNA metabolism in prokaryotes, required for transcription of some transcripts and degradation of others. Hfq binds small RNA molecules called riboregulators that modulate the stability or translation efficiency of RNA transcripts. Hfq binds preferentially to unstructured A/U-rich RNA sequences and is similar to the eukaryotic Sm proteins in both sequence and structure. Hfq forms a homo-hexameric ring similar to the heptameric ring of the Sm proteins. 29704 cd01717: The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. 
Sm subunit B heterodimerizes with subunit D3 and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits. The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes. 29705 cd01718: The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit E binds subunits F and G to form a trimer which then assembles onto snRNA along with the D1/D2 and D3/B heterodimers forming a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes. 29706 cd01719: The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit G binds subunits E and F to form a trimer which then assembles onto snRNA along with the D1/D2 and D3/B heterodimers forming a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes. 
29707 cd01720: The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D2 heterodimerizes with subunit D1 and three such heterodimers form a hexameric ring structure with alternating D1 and D2 subunits. The D1 - D2 heterodimer also assembles into a heptameric ring containing B, D3, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes. 29708 cd01721: The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D3 heterodimerizes with subunit B and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits. The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes. 29709 cd01722: The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. 
Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit F is capable of forming both homo- and hetero-heptamer ring structures. To form the hetero-heptamer, Sm subunit F initially binds subunits E and G to form a trimer which then assembles onto snRNA along with the D3/B and D1/D2 heterodimers. 29710 cd01723: The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm4 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes. 29711 cd01724: The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D1 heterodimerizes with subunit D2 and three such heterodimers form a hexameric ring structure with alternating D1 and D2 subunits. The D1 - D2 heterodimer also assembles into a heptameric ring containing B, D3, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes. 
29712 cd01725: The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm2 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes. 29713 cd01726: The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm6 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes. 29714 cd01727: The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm8 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes. 
29715 cd01728: The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm1 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes. 29716 cd01729: The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm7 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes. 29717 cd01730: The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm3 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes. 
29718 cd01731: 29719 cd01732: The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm4 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes. 29720 cd01733: The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm10 is an SmD1-like protein which is thought to bind U7 snRNA along with LSm11 and five other Sm subunits to form a 7-member ring structure. LSm10 and the U7 snRNP of which it is a part are thought to play an important role in histone mRNA 3' processing. 29721 cd01734: YxlS is a Bacillus subtilis gene of unknown function with two domains that each have an alpha/beta fold. The N-terminal domain is composed of two alpha-helices and a three-stranded beta-sheet, while the C-terminal domain is composed of one alpha-helix and a five-stranded beta-sheet. This CD represents the C-terminal domain which has a fold similar to the Sm fold of proteins like Sm-D3. 
29722 cd01735: LSm12 belongs to a family of Sm-like proteins that associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet that associates with other Sm proteins to form hexameric and heptameric ring structures. In addition to the N-terminal Sm-like domain, LSm12 has a novel methyltransferase domain. 29723 cd01736: LSm14 (also known as RAP55) belongs to a family of Sm-like proteins that associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold, containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet, that associates with other Sm proteins to form hexameric and heptameric ring structures. In addition to the N-terminal Sm-like domain, LSm14 has an uncharacterized C-terminal domain containing a conserved DFDF box. In Xenopus laevis, LSm14 is an oocyte-specific constituent of ribonucleoprotein particles. 29724 cd01737: LSm16 belongs to a family of Sm-like proteins that associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold, containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet, that associates with other Sm proteins to form hexameric and heptameric ring structures. LSm16 has, in addition to its N-terminal Sm-like domain, a C-terminal Yjef_N-type Rossmann fold domain of unknown function. 
29725 cd01739: The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm11 is an SmD2-like subunit which binds U7 snRNA along with LSm10 and five other Sm subunits to form a 7-member ring structure. LSm11 and the U7 snRNP of which it is a part are thought to play an important role in histone mRNA 3' processing. 29726 cd00125: Phospholipase A2; Cleaves the sn-2 position of the glycerol backbone of phospholipids that have arachidonic acid at the sn-2 position. This reaction is metal dependent. The resulting products are either dietary or used in synthetic pathways for leukotrienes and prostaglandins. As a toxin, it is a potent presynaptic neurotoxin that acts by blocking release of neurotransmitters by competitive inhibition, since the key catalytic residue is missing. May form dimers or oligomers and appears to recognize specific receptors on the cell membrane. Alignment does not include group III (bee venom related) phospholipase As. 29728 cd00625: Anion permease ArsB/NhaD. These permeases have been shown to translocate sodium, arsenate, antimonite, sulfate and organic anions across biological membranes in all three kingdoms of life. A typical anion permease contains 8-13 transmembrane helices and can function either independently as a chemiosmotic transporter or as a channel-forming subunit of an ATP-driven anion pump. 29730 cd01116: Permease P (pink-eyed dilution). Mutations in the human melanosomal P gene were responsible for the classic phenotype of oculocutaneous albinism type 2 (OCA2). 
Although the precise function of the P protein is unknown, it was predicted to regulate the intraorganelle pH, together with the ATP-driven proton pump. It shows significant sequence similarity to the Na+/H+ antiporter NhaD from Vibrio parahaemolyticus. Both proteins belong to the ArsB/NhaD superfamily of permeases that translocate sodium, arsenate, sulfate, and organic anions across biological membranes in all three kingdoms of life. A typical ArsB/NhaD permease contains 8-13 transmembrane domains. 29731 cd01117: Putative anion permease YbiR. Based on sequence similarity, YbiR proteins are predicted to function as anion translocating permeases in eubacteria, archaea and plants. They belong to the ArsB/NhaD superfamily of permeases that have been shown to translocate sodium, sulfate, arsenite and organic anions. A typical ArsB/NhaD permease is composed of 8-13 transmembrane domains. 29732 cd01118: Anion permease ArsB. These permeases have been shown to export arsenate and antimonite in eubacteria and archaea. A typical ArsB permease contains 8-13 transmembrane helices and can function either independently as a chemiosmotic transporter or as a channel-forming subunit of an ATP-driven anion pump (ArsAB). The ArsAB complex is similar in many ways to ATP-binding cassette transporters, which have two groups of six transmembrane-spanning helical segments and two nucleotide-binding domains. The ArsB proteins belong to the ArsB/NhaD superfamily of permeases that translocate sodium, arsenate, sulfate, and organic anions across biological membranes in all three kingdoms of life. 29733 cd00635: PLP dependent enzymes class III (PLPDE_III). The prototype of this CD, yeast hypothetical protein YBL036c, is homologous to a P. aeruginosa gene that is co-transcribed with a known proline biosynthetic gene. These proteins are classified as Pyridoxal 5'-phosphate Dependent Enzymes class III. This version is widely distributed among all three forms of life. 
This group has not been characterized extensively. 29734 cd00636: Helical backbone metal receptor (TroA-like domain). These proteins have been shown to function in the ABC transport of ferric siderophores and metal ions such as Mn2+, Fe3+, Cu2+ and/or Zn2+. Their ligand binding site is formed in the interface between two globular domains linked by a single helix. Many of these proteins also possess a low complexity region containing a metal-binding histidine-rich motif (repetitive HDH sequence). The TroA-like proteins differ in their fold and ligand-binding mechanism from the PBPI and PBPII proteins, but are structurally similar to the beta-subunit of the nitrogenase molybdenum-iron protein MoFe. Most TroA-like proteins are encoded by ABC-type operons and appear to function as periplasmic components of ABC transporters in metal ion uptake. 29735 cd01016: Metal binding protein TroA. These proteins have been shown to function as initial receptors in ABC transport of Zn2+ and possibly Fe3+ in many eubacterial species. The TroA proteins belong to the TroA superfamily of periplasmic metal binding proteins that share a distinct fold and ligand binding mechanism. A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind the metal ion in the cleft between these domains. In addition, these proteins sometimes have a low complexity region containing a metal-binding histidine-rich motif (repetitive HDH sequence). 29736 cd01017: Metal binding protein AcdA. These proteins have been shown to function in the ABC uptake of Zn2+ and Mn2+ and in competence for genetic transformation and adhesion. The AcdA proteins belong to the TroA superfamily of helical backbone metal receptor proteins that share a distinct fold and ligand binding mechanism. They are comprised of two globular subdomains connected by a long alpha helix and they bind their ligand in the cleft between these domains. 
In addition, many of these proteins have a low complexity region containing a metal-binding histidine-rich motif (repetitive HDH sequence). 29737 cd01018: Metal binding protein ZntC. These proteins are predicted to function as initial receptors in ABC transport of metal ions. They belong to the TroA superfamily of helical backbone metal receptor proteins that share a distinct fold and ligand binding mechanism. They are comprised of two globular subdomains connected by a long alpha helix and bind their specific ligands in the cleft between these domains. In addition, many of these proteins possess a metal-binding histidine-rich motif (repetitive HDH sequence). 29738 cd01019: Zinc binding protein ZnuA. These proteins have been shown to function as initial receptors in the ABC uptake of Zn2+. They belong to the TroA superfamily of periplasmic metal binding proteins that share a distinct fold and ligand binding mechanism. They are comprised of two globular subdomains connected by a single helix and bind their specific ligands in the cleft between these domains. In addition, these proteins sometimes have a low complexity region containing a metal-binding histidine-rich motif (repetitive HDH sequence). 29739 cd01020: Metal binding protein TroA_b. These proteins are predicted to function as initial receptors in ABC transport of metal ions. They belong to the TroA superfamily of helical backbone metal receptor proteins that share a distinct fold and ligand binding mechanism. A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind the metal ion in the cleft between these domains. In addition, these proteins sometimes have a low complexity region containing a metal-binding histidine-rich motif (repetitive HDH sequence). 29740 cd01137: Metal binding protein PsaA. 
These proteins have been shown to function as initial receptors in ABC transport of Mn2+ and as surface adhesins in some eubacterial species. They belong to the TroA superfamily of periplasmic metal binding proteins that share a distinct fold and ligand binding mechanism. A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind the metal ion in the cleft between these domains. In addition, these proteins sometimes have a low complexity region containing a metal-binding histidine-rich motif (repetitive HDH sequence). 29741 cd01138: Periplasmic binding protein FeuA. These proteins are predicted to function as initial receptors in ABC transport of metal ions in some eubacterial species. They belong to the TroA superfamily of periplasmic metal binding proteins that share a distinct fold and ligand binding mechanism. A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind their ligands in the cleft between these domains. 29742 cd01139: Periplasmic binding protein TroA_f. These proteins are predicted to function as initial receptors in the ABC metal ion uptake in eubacteria and archaea. They belong to the TroA superfamily of helical backbone metal receptor proteins that share a distinct fold and ligand binding mechanism. A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind their ligands in the cleft between these domains. 29743 cd01140: Siderophore binding protein FatB. These proteins have been shown to function as ABC-type initial receptors in the siderophore-mediated iron uptake in some eubacterial species. They belong to the TroA superfamily of periplasmic metal binding proteins that share a distinct fold and ligand binding mechanism. A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind their ligands in the cleft between these domains. 
29744 cd01141: Periplasmic binding protein TroA_d. These proteins are predicted to function as initial receptors in the ABC metal ion uptake in eubacteria and archaea. They belong to the TroA superfamily of helical backbone metal receptor proteins that share a distinct fold and ligand binding mechanism. A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind their ligands in the cleft between these domains. 29745 cd01142: Periplasmic binding protein TroA_e. These proteins are predicted to function as initial receptors in the ABC metal ion uptake in eubacteria and archaea. They belong to the TroA superfamily of helical backbone metal receptor proteins that share a distinct fold and ligand binding mechanism. A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind their ligands in the cleft between these domains. 29746 cd01143: Periplasmic binding protein YvrC. These proteins are predicted to function as initial receptors in ABC transport of metal ions in eubacteria and archaea. They belong to the TroA superfamily of periplasmic metal binding proteins that share a distinct fold and ligand binding mechanism. A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind the metal ion in the cleft between these domains. 29747 cd01144: Cobalamin binding protein BtuF. These proteins have been shown to function as initial receptors in ABC transport of vitamin B12 (cobalamin) in eubacterial and some archaeal species. They belong to the TroA superfamily of helical backbone metal receptor proteins that share a distinct fold and ligand binding mechanism. A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind the metal ion in the cleft between these domains. 
In addition, these proteins sometimes have a low complexity region containing a metal-binding histidine-rich motif (repetitive HDH sequence). 29748 cd01145: Periplasmic binding protein TroA_c. These proteins are predicted to function as initial receptors in the ABC metal ion uptake in eubacteria and archaea. They belong to the TroA superfamily of helical backbone metal receptor proteins that share a distinct fold and ligand binding mechanism. A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind their ligands in the cleft between these domains. 29749 cd01146: Fe3+-siderophore binding domain FhuD. These proteins have been shown to function as initial receptors in ABC transport of Fe3+-siderophores in many eubacterial species. They belong to the TroA-like superfamily of helical backbone metal receptor proteins that share a distinct fold and ligand binding mechanism. A typical TroA-like protein is comprised of two globular subdomains connected by a long alpha helix and binds its specific ligands in the cleft between these domains. 29750 cd01147: Metal binding protein HemV-2. These proteins are predicted to function as initial receptors in ABC transport of metal ions. They belong to the TroA superfamily of helical backbone metal receptor proteins that share a distinct fold and ligand binding mechanism. A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind the metal ion in the cleft between these domains. In addition, these proteins sometimes have a low complexity region containing a metal-binding histidine-rich motif (repetitive HDH sequence). 29751 cd01148: Metal binding protein TroA_a. These proteins are predicted to function as initial receptors in ABC transport of metal ions in eubacteria. They belong to the TroA superfamily of helical backbone metal receptor proteins that share a distinct fold and ligand binding mechanism. 
A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind the metal ion in the cleft between these domains. 29752 cd01149: Hemin binding protein HutB. These proteins have been shown to function as initial receptors in ABC transport of hemin and hemoproteins in many eubacterial species. They belong to the TroA superfamily of periplasmic metal binding proteins that share a distinct fold and ligand binding mechanism. A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind the metal ion in the cleft between these domains. 29760 cd00445: Urate oxidase (UO, uricase) is a peroxisomal enzyme that catalyzes the oxidation of uric acid to allantoin in most fish, amphibian, and mammalian species. The enzymatic process involves catalyzing the oxidative opening of the purine ring during the purine degradation pathway. In humans and certain other primates, however, the enzyme has been lost by some unknown mechanism. Each monomer contains two instances of this domain. Its functional form is a homotetramer for most species, though there are reports that some may form heterotetramers based on a few biochemical studies. 29761 cd00470: 6-pyruvoyl tetrahydropterin synthase (PTPS). Folate derivatives are essential cofactors in the biosynthesis of purines, pyrimidines, and amino acids, as well as formyl-tRNA. Mammalian cells are able to utilize pre-formed folates after uptake by a carrier-mediated active transport system. Most microbes and plants lack this system and must synthesize folates de novo from guanosine triphosphate. One enzyme from this pathway is PTPS which catalyzes the conversion of dihydroneopterin triphosphate to 6-pyruvoyl tetrahydropterin. The functional enzyme is a hexamer of identical subunits. 29762 cd00534: Dihydroneopterin aldolase (DHNA) and 7,8-dihydroneopterin triphosphate epimerase domain (DHNTPE); these enzymes have been designated folB and folX, respectively. 
Folate derivatives are essential cofactors in the biosynthesis of purines, pyrimidines, and amino acids, as well as formyl-tRNA. Mammalian cells are able to utilize pre-formed folates after uptake by a carrier-mediated active transport system. Most microbes and plants lack this system and must synthesize folates de novo from guanosine triphosphate. One enzyme from this pathway is DHNA which catalyses the conversion of 7,8-dihydroneopterin to 6-hydroxymethyl-7,8-dihydropterin in the biosynthetic pathway of tetrahydrofolate. Though it is known that DHNTPE catalyzes the epimerization of dihydroneopterin triphosphate to dihydromonapterin triphosphate, the biological role of this enzyme is still unclear. It is hypothesized that it is not an essential protein, since a folX knockout in E. coli has a normal phenotype and folX is not present in H. influenzae. In addition, both enzymes have been shown to be able to compensate for the other's activity, albeit at slower reaction rates. The functional enzyme for both is an octamer of identical subunits. Mammals lack many of the enzymes in the folate pathway, including DHNA and DHNTPE. 29763 cd00642: GTP cyclohydrolase I (GTP-CH-I) catalyzes the conversion of GTP into dihydroneopterin triphosphate. The enzyme product is the precursor of tetrahydrofolate in eubacteria, fungi, and plants and of the folate analogs in methanogenic bacteria. In vertebrates and insects it is the biosynthetic precursor of tetrahydrobiopterin (BH4) which is involved in the formation of catecholamines, nitric oxide, and the stimulation of T lymphocytes. The biosynthetic reaction of BH4 is controlled by a regulatory protein GFRP which mediates feedback inhibition of GTP-CH-I by BH4. This inhibition is reversed by phenylalanine. The decameric GTP-CH-I forms a complex with two pentameric GFRP in the presence of phenylalanine or a combination of GTP and BH4, respectively. 29764 cd00651: Tunnelling fold (T-fold). 
The five known T-folds are found in five different enzymes with different functions: dihydroneopterin-triphosphate epimerase (DHNTPE), dihydroneopterin aldolase (DHNA), GTP cyclohydrolase I (GTPCH-1), 6-pyruvoyl tetrahydropterin synthetase (PTPS), and uricase (UO, uroate/urate oxidase). They bind to substrates belonging to the purine or pterin families, and share a fold-related binding site with a glutamate or glutamine residue anchoring the substrate and many conserved interactions. They also share a similar oligomerization mode: several T-folds join together to form a beta(2n)alpha(n) barrel, then two barrels join together in a head-to-head fashion to make up the native enzymes. The functional enzyme is a tetramer for UO, a hexamer for PTPS, an octamer for DHNA/DHNTPE and a decamer for GTPCH-1. The substrate is located in a deep and narrow pocket at the interface between monomers. In PTPS, the active site is located at the interface of three monomers, two from one trimer and one from the other trimer. In GTPCH-1, it is also located at the interface of three subunits, two from one pentamer and one from the other pentamer. There are four equivalent active sites in UO, six in PTPS, eight in DHNA/DHNTPE and ten in GTPCH-1. Each globular multimeric enzyme encloses a tunnel which is lined with charged residues for DHNA and UO, and with basic residues in PTPS. The N and C-terminal ends are located on one side of the T-fold while the residues involved in the catalytic activity are located at the opposite side. In PTPS, UO and DHNA/DHNTPE, the N and C-terminal extremities of the enzyme are located on the exterior side of the functional multimeric enzyme. In GTPCH-1, the extra C-terminal helix places the extremity inside the tunnel. 29765 cd00657: Ferritin-like, diiron-carboxylate proteins participate in a range of functions including iron regulation, mono-oxygenation, and reactive radical production. 
These proteins are characterized by the fact that they catalyze dioxygen-dependent oxidation-hydroxylation reactions within diiron centers; one exception is manganese catalase, which catalyzes peroxide-dependent oxidation-reduction within a dimanganese center. Diiron-carboxylate proteins are further characterized by the presence of duplicate metal ligands, glutamates and histidines (ExxH) and two additional glutamates within a four-helix bundle. Outside of these conserved residues there is little obvious homology. Members include bacterioferritin, ferritin, rubrerythrin, aromatic and alkene monooxygenase hydroxylases (AAMH), ribonucleotide reductase R2 (RNRR2), acyl-ACP-desaturases (Acyl_ACP_Desat), manganese (Mn) catalases, demethoxyubiquinone hydroxylases (DMQH), DNA protecting proteins (DPS), and ubiquinol oxidases (AOX). Additional members include the Fe-containing subunit of the aerobic cyclase system (ACSF), the ferritin-like domain present at the N-terminus of the iron transport protein CCC1 (Ferritin-like CCC1), and the uncharacterized archaeal and bacterial ferritin-like domains (Ferritin-like AB/AB2). 29766 cd00904: Ferritins are the primary iron storage proteins of most living organisms and members of a broad superfamily of ferritin-like diiron-carboxylate proteins. The iron-free (apoferritin) ferritin molecule is a protein shell composed of 24 protein chains arranged in 432 symmetry. Iron storage involves the uptake of iron (II) at the protein shell, its oxidation by molecular oxygen at the dinuclear ferroxidase centers, and the movement of iron (III) into the cavity for deposition as ferrihydrite; the protein shell can hold up to 4500 iron atoms. In vertebrates, two types of chains (subunits) have been characterized, H or M (fast) and L (slow), which differ in rates of iron uptake and mineralization. Bacterial non-heme ferritins are composed only of H chains. 
Fe(II) oxidation in the H/M subunits takes place initially at the ferroxidase center, a carboxylate-bridged diiron center, located within the subunit four-helix bundle. In a complementary role, negatively charged residues on the protein shell inner surface of the L subunits promote ferrihydrite nucleation. Most plant ferritins combine both oxidase and nucleation functions in one chain: they have four interior glutamate residues as well as seven ferroxidase center residues. 29767 cd00907: Bacterioferritins, also known as cytochrome b1, are members of a broad superfamily of ferritin-like diiron-carboxylate proteins. Similar to ferritin in architecture, Bfr forms an oligomer of 24 subunits that assembles to form a hollow sphere with 432 symmetry. Up to 12 heme cofactor groups (iron protoporphyrin IX or coproporphyrin III) are bound between dimer pairs. The role of the heme is unknown, although it may be involved in mediating iron-core reduction and iron release. Each subunit is composed of a four-helix bundle which carries a diiron ferroxidase center; it is here that initial oxidation of ferrous iron by molecular oxygen occurs, facilitating the detoxification of iron, protection against dioxygen and radical products, and storage of ferric-hydroxyphosphate at the core. Some bacterioferritins are composed of two subunit types, one conferring heme-binding ability (alpha) and the other (beta) bestowing ferroxidase activity. 29768 cd01041: Rubrerythrin domain is a nonheme iron binding domain found in many air-sensitive bacteria and archaea and member of a broad superfamily of ferritin-like diiron-carboxylate proteins. The homodimeric rubrerythrin protein contains a binuclear metal center located within a four helix bundle. Many, but not all, rubrerythrin proteins have a second domain with a rubredoxin-like hexacoordinated iron center. 
Rubrerythrin is thought to reduce hydrogen peroxide as part of an oxidative stress protection system but its function is still poorly understood. 29769 cd01042: Demethoxyubiquinone hydroxylases (DMQH) are members of the ferritin-like, diiron-carboxylate family which are present in eukaryotes (the CLK-1/CAT5 family) and prokaryotes (the Coq7 family). DMQH participates in one of the last steps of ubiquinone biosynthesis and is responsible for DMQ hydroxylation, resulting in the formation of hydroxyubiquinone, a precursor of ubiquinone. CLK-1 is a mitochondrial inner membrane protein and Coq7 is a proposed interfacial integral membrane protein. Mutations in the Caenorhabditis elegans gene clk-1 affect biological timing and extend longevity. The conserved residues of a diiron center are present in this domain. 29770 cd01043: DPS (DNA Protecting protein under Starved conditions) domain is a member of a broad superfamily of ferritin-like diiron-carboxylate proteins. Some DPS proteins nonspecifically bind DNA, protecting it from cleavage caused by reactive oxygen species such as the hydroxyl radicals produced during oxidation of Fe(II) by hydrogen peroxide. These proteins assemble into dodecameric structures, some form DPS-DNA co-crystalline complexes, and they possess iron and H2O2 detoxification capabilities. Expression of DPS is induced by oxidative or nutritional stress, including metal ion starvation. Members of the DPS family are homopolymers formed by 12 four-helix bundle subunits that assemble with 23 symmetry into a hollow shell. The DPS ferroxidase site is unusual in that it is not located in a four-helix bundle as it is in ferritin, but is shared by 2-fold symmetry-related subunits providing the iron ligands. Many DPS sequences (e.g., E. 
coli) display an N-terminal extension of variable length that contains two or three positively charged lysine residues, extends into the solvent, and is thought to play an important role in the stabilization of the complex with DNA. The DPS proteins Listeria Flp, Bacillus anthracis Dlp-1 and Dlp-2, and Helicobacter pylori HP-NAP, which lack the N-terminal extension, do not bind DNA. DPS proteins from Helicobacter pylori, Treponema pallidum, and Borrelia burgdorferi are highly immunogenic. 29771 cd01044: Ferritin-like domain present at the N-terminus of the iron transport protein CCC1 of some eubacteria and archaebacteria, and member of a broad superfamily of ferritin-like diiron-carboxylate proteins. This uncharacterized domain has the conserved residues of a diiron center. 29772 cd01045: Ferritin-like domain found in Archaea and Bacteria (Ferritin_like_AB) is a member of a broad superfamily of ferritin-like diiron-carboxylate proteins. The CD contains unknown or hypothetical proteins which were sequenced from mostly anaerobic or microaerophilic metal-metabolizing and/or nitrogen-fixing microbes. The CD includes sequences from ferric-, sulfate-, and arsenic-reducing bacteria, Geobacter, Magnetospirillum, Desulfovibrio, and Desulfitobacterium. Also included are several nitrogen-fixing endosymbiotic bacteria, Rhizobium, Mesorhizobium, and Bradyrhizobium; phototrophic purple nonsulfur bacteria, Rhodobacter and Rhodopseudomonas; and obligate thermophiles, Thermotoga, Thermoanaerobacter, and Pyrococcus. The conserved residues of a diiron center are present in this uncharacterized domain. 29773 cd01046: Rubrerythrin-like domain, similar to rubrerythrin, a nonheme iron binding domain found in many air-sensitive bacteria and archaea, and member of a broad superfamily of ferritin-like diiron-carboxylate proteins. Rubrerythrin is thought to reduce hydrogen peroxide as part of an oxidative stress protection system. 
The rubrerythrin protein has two domains, a binuclear metal center located within a four-helix bundle of the rubrerythrin domain, and a rubredoxin domain. The Rubrerythrin-like domains in this CD are singular domains (no C-terminus rubredoxin domain) and are phylogenetically distinct from rubrerythrin domains of rubrerythrin-rubredoxin proteins. 29774 cd01047: Aerobic Cyclase System, Fe-containing subunit (ACSF) is a member of a broad superfamily of ferritin-like diiron-carboxylate proteins. Rubrivivax gelatinosus acsF codes for a conserved, putative binuclear iron-cluster-containing protein involved in aerobic oxidative cyclization of Mg-protoporphyrin IX monomethylester. AcsF and homologs have a leucine zipper and two copies of the conserved glutamate and histidine residues predicted to act as ligands for iron in the Ex(29-35)DExRH motifs. Several homologs of AcsF are found in a wide range of photosynthetic organisms, including Chlamydomonas reinhardtii Crd1 and Pharbitis nil PNZIP, suggesting that this aerobic oxidative cyclization mechanism is conserved from bacteria to plants. 29775 cd01048: Ferritin-like domain found in Archaea and Bacteria subgroup 2 (Ferritin_like_AB2), an uncharacterized domain, is a member of a broad superfamily of ferritin-like diiron-carboxylate proteins whose function is unknown. The conserved residues of a diiron center are present in this uncharacterized domain. 29776 cd01049: Ribonucleotide Reductase, R2/beta subunit (RNRR2) is a member of a broad superfamily of ferritin-like diiron-carboxylate proteins. The RNR protein catalyzes the conversion of ribonucleotides to deoxyribonucleotides and is found in all eukaryotes, many prokaryotes, several viruses, and few archaea. The catalytically active form of RNR is a proposed alpha2-beta2 tetramer. The homodimeric alpha subunit (R1) contains the active site and redox active cysteines as well as the allosteric binding sites. 
The beta subunit (R2) contains a diiron cluster that, in its reduced state, reacts with dioxygen to form a stable tyrosyl radical and a diiron(III) cluster. This essential tyrosyl radical is proposed to generate a thiyl radical, located on a cysteine residue in the R1 active site, that initiates ribonucleotide reduction. The beta subunit is composed of 10-13 helices, of which the 8 longest form an alpha-helical bundle; some have 2 additional beta strands. Yeast is unique in that it assembles both homodimers and heterodimers of RNRR2. The yeast heterodimer, Y2Y4, contains R2 (Y2) and a R2 homolog (Y4) that lacks the diiron center and is proposed to only assist in cofactor assembly, and perhaps stabilize R1 (Y1) in its active conformation. 29777 cd01050: Acyl-Acyl Carrier Protein Desaturase (Acyl_ACP_Desat) is a mu-oxo-bridged diiron-carboxylate enzyme which belongs to a broad superfamily of ferritin-like proteins and catalyzes the NADPH and O2-dependent formation of a cis-double bond in acyl-ACPs. Acyl-ACP desaturases are found in higher plants and a few bacterial species (Mycobacterium tuberculosis, M. leprae, M. avium and Streptomyces avermitilis, S. coelicolor). In plants, Acyl-ACP desaturase is a plastid-localized, covalently ACP linked, soluble desaturase that introduces the first double bond into saturated fatty acids, resulting in the corresponding monounsaturated fatty acid. Members of this class of soluble desaturases are specific for a particular substrate chain length and introduce the double bond between specific carbon atoms. For example, delta 9 stearoyl-ACP desaturase is specific for stearic acid and introduces a double bond between carbon 9 and 10 to yield oleic acid in the ACP-bound form. The enzymatic reaction requires molecular oxygen, NAD(P)H, NAD(P)H ferredoxin oxido-reductase and ferredoxin. The enzyme is active in the homodimeric form; the monomer consists mainly of alpha-helices with the catalytic diiron center buried within a four-helix bundle. 
Integral membrane fatty acid desaturases that introduce double bonds into fatty acid chains, acyl-CoA desaturases of animals, yeasts, and fungi, and acyl-lipid desaturases of cyanobacteria and higher plants, are distinct from soluble acyl-ACP desaturases, lack diiron centers, and are not included in this CD. 29778 cd01051: Manganese (Mn) catalase domain is a member of a broad superfamily of ferritin-like diiron enzymes. While many diiron enzymes catalyze dioxygen-dependent reactions, manganese catalase performs peroxide-dependent oxidation-reduction. Catalases are important antioxidant metalloenzymes that catalyze disproportionation of hydrogen peroxide, forming dioxygen and water. Manganese catalase, a nonheme type II catalase, contains a binuclear manganese cluster that catalyzes the redox dismutation of hydrogen peroxide, interconverting between dimanganese(II) [(2,2)] and dimanganese(III) [(3,3)] oxidation states during turnover. Mn catalases are found in a broad range of microorganisms in microaerophilic environments, including the mesophilic lactic acid bacteria (e.g., Lactobacillus plantarum) and bacterial and archaeal thermophiles (e.g., Thermus thermophilus and Pyrobaculum calidifontis). L. plantarum and T. thermophilus holoenzymes are homohexameric structures; each subunit contains a dimanganese active site. The manganese ions are linked by a mu 1,3-bridging glutamate carboxylate and two mu-bridging solvent oxygens that electronically couple the metal centers. Several members of this CD lack the C-terminal strands that pack against the neighboring catalytic domains as seen in L. plantarum. One such sequence, Bacillus subtilis CotJC, is known to be a component of the inner spore coat that interacts with spore coat protein, CotJA. It has been suggested that CotJC could modulate the degree of Mn SodA-dependent cross-linking of an outer coat component, or the two enzymes could serve to protect specific cellular structures during the developmental process. 
29779 cd01052: Bacterioferritin-like domain is a member of a broad superfamily of ferritin-like diiron-carboxylate proteins. These proteins are distantly related to bacterial ferritins, which assemble from 24 monomers, each of which has a four-helix bundle with a fifth shorter helix at the C terminus and a diiron (ferroxidase) center. It is at this center that oxidation of ferrous iron by molecular oxygen occurs, facilitating the detoxification of iron, protection against dioxygen and radical products, and storage of iron in the ferric form. Many of the conserved residues of a diiron center are present in this domain. 29780 cd01053: Alternative oxidase (AOX) is a mitochondrial ubiquinol oxidase found in plants and some fungi and protists. AOX is a member of the ferritin-like diiron-carboxylate superfamily. The plant mitochondrial protein alternative oxidase catalyses dioxygen dependent ubiquinol oxidation to yield ubiquinone and water. AOX is a cyanide-resistant, salicylhydroxamic acid-sensitive oxidase that transfers electrons from ubiquinol to oxygen, bypassing the cytochrome chain. AOX has been proposed to contain a hydroxo-bridged diiron center within a four-helix bundle and a proximal redox-active tyrosine residue. AOX is proposed to be peripherally associated with the matrix side of the inner mitochondrial membrane. Fungal and protozoan AOXs generally exist as monomers. In plants, AOX is dimeric. Pyruvate is an allosteric activator of plant AOX involved in the reversible inactivation of the enzyme through the formation of an intermolecular disulfide bridge between monomeric subunits. The enzyme is non-protonmotive and does not contribute to the conservation of energy. The heat that dissipates from AOX activity is used in thermogenic plants to volatilize primary amines to attract pollinating insects. 
Other functions have been proposed: i) that the alternative oxidase allows Krebs-cycle turnover when the energy charge of the cell is high, and ii) that the enzyme protects against oxidative stress. The expression of AOX is induced when plants are exposed to a variety of stresses including chilling, pathogen attack, senescence and fruit ripening. 29781 cd01054: Aromatic and Alkene Monooxygenase Hydroxylases, alpha- and beta-subunits (AAMH_AB). Hydroxylase subunits (alpha and beta) of soluble, multicomponent, aromatic and alkene monooxygenases are members of a broad superfamily of ferritin-like diiron-carboxylate proteins. AAMH exists as a hexamer (an alpha2-beta2-gamma2 homodimer) with each alpha-subunit housing one nonheme diiron center embedded in a four-helix bundle. The N-terminal domains of the alpha- and beta-subunits possess nearly identical folds; however, the noncatalytic beta-subunit lacks critical diiron ligands and a C-terminal domain found in the alpha-subunit. Methane monooxygenase is a multicomponent enzyme found in methanotrophic bacteria that catalyzes the hydroxylation of methane and higher alkanes (as large as octane). Phenol monooxygenase, found in a diverse group of bacteria, catalyses the hydroxylation of phenol, chloro- and methyl-phenol and naphthol. Both enzyme systems consist of three components: a hydroxylase, a coupling protein and a reductase. In the MMO hydroxylase, dioxygen and substrate interact with the diiron center in a hydrophobic cavity at the active site. The reductase component and protein coupling factor provide electrons from NADH for reducing the oxidized binuclear iron-oxo cluster to its reduced form. Reaction with dioxygen produces a peroxy-bridged complex and dehydration leads to the formation of complex Q, which is thought to be the oxygenating species that carries out the insertion of an oxygen atom into a C-H bond of the substrate. 
The toluene monooxygenase systems, toluene 2-, 3-, and 4-monooxygenase, are similar to MMO but with an additional component, a Rieske-type ferredoxin. The alkene monooxygenase from Xanthobacter strain Py2 is closely related to aromatic monooxygenases and catalyzes aromatic monohydroxylation of benzene, toluene, and phenol. Alkane omega-hydroxylase (AlkB) and xylene monooxygenase are members of a distinct class of integral membrane diiron proteins and are not included in this CD. 29782 cd01055: Nonheme Ferritin domain, found in Archaea and Bacteria, is a member of a broad superfamily of ferritin-like diiron-carboxylate proteins. The ferritin protein shell is composed of 24 protein subunits arranged in 432 symmetry. Each protein subunit, a four-helix bundle with a fifth short terminal helix, contains a dinuclear ferroxidase center (H type). Unique to this group of proteins is a third metal site in the ferroxidase center. Iron storage involves the uptake of iron (II) at the protein shell, its oxidation by molecular oxygen at the ferroxidase centers, and the movement of iron (III) into the cavity for deposition as ferrihydrite. 29783 cd01056: Eukaryotic Ferritin (Euk_Ferritin) domain. Ferritins are the primary iron storage proteins of most living organisms and members of a broad superfamily of ferritin-like diiron-carboxylate proteins. The iron-free (apoferritin) ferritin molecule is a protein shell composed of 24 protein chains arranged in 432 symmetry. Iron storage involves the uptake of iron (II) at the protein shell, its oxidation by molecular oxygen at the dinuclear ferroxidase centers, and the movement of iron (III) into the cavity for deposition as ferrihydrite; the protein shell can hold up to 4500 iron atoms. In vertebrates, two types of chains (subunits) have been characterized, H or M (fast) and L (slow), which differ in rates of iron uptake and mineralization. 
Fe(II) oxidation in the H/M subunits takes place initially at the ferroxidase center, a carboxylate-bridged diiron center, located within the subunit four-helix bundle. In a complementary role, negatively charged residues on the protein shell inner surface of the L subunits promote ferrihydrite nucleation. Most plant ferritins combine both oxidase and nucleation functions in one chain: they have four interior glutamate residues as well as seven ferroxidase center residues. 29784 cd01057: Aromatic and Alkene Monooxygenase Hydroxylases, subunit A (AAMH_A). Subunit A of the soluble hydroxylase of multicomponent, aromatic and alkene monooxygenases is a member of a superfamily of ferritin-like iron-storage proteins. AAMH exists as a hexamer (an alpha2-beta2-gamma2 homodimer) with each alpha-subunit housing one nonheme diiron center embedded in a four-helix bundle. The N-terminal domains of the alpha- and noncatalytic beta-subunits possess nearly identical folds; however, the beta-subunit lacks critical diiron ligands and a C-terminal domain found in the alpha-subunit. Methane monooxygenase is a multicomponent enzyme found in methanotrophic bacteria that catalyzes the hydroxylation of methane and higher alkanes (as large as octane). Phenol monooxygenase, found in a diverse group of bacteria, catalyses the hydroxylation of phenol, chloro- and methyl-phenol and naphthol. Both enzyme systems consist of three components: the hydroxylase, a coupling protein and a reductase. In the MMO hydroxylase, dioxygen and substrate interact with the diiron center in a hydrophobic cavity at the active site. The reductase component and protein coupling factor provide electrons from NADH for reducing the oxidized binuclear iron-oxo cluster to its reduced form. 
Reaction with dioxygen produces a peroxy-bridged complex and dehydration leads to the formation of complex Q, which is thought to be the oxygenating species that carries out the insertion of an oxygen atom into a C-H bond of the substrate. The toluene monooxygenase systems, toluene 2-, 3-, and 4-monooxygenase, are similar to MMO but with an additional component, a Rieske-type ferredoxin. The alkene monooxygenase from Xanthobacter strain Py2 is closely related to aromatic monooxygenases and catalyzes aromatic monohydroxylation of benzene, toluene, and phenol. Alkane omega-hydroxylase (AlkB) and xylene monooxygenase are members of a distinct class of integral membrane diiron proteins and are not included in this CD. 29785 cd01058: Aromatic and Alkene Monooxygenase Hydroxylases, subunit B (AAMH_B). Subunit B (beta) of the soluble hydroxylase of multicomponent, aromatic and alkene monooxygenases is a member of a superfamily of ferritin-like iron-storage proteins. AAMH exists as a hexamer (an alpha2-beta2-gamma2 homodimer) with each alpha-subunit housing one nonheme diiron center embedded in a four-helix bundle. The N-terminal domains of the alpha- and noncatalytic beta-subunits possess nearly identical folds; the beta-subunit lacks the C-terminal domain found in the alpha-subunit. Methane monooxygenase is a multicomponent enzyme found in methanotrophic bacteria that catalyzes the hydroxylation of methane and higher alkanes (as large as octane). Phenol monooxygenase, found in a diverse group of bacteria, catalyses the hydroxylation of phenol, chloro- and methyl-phenol and naphthol. Both enzyme systems consist of three components: the hydroxylase, a coupling protein and a reductase. In the MMO hydroxylase, dioxygen and substrate interact with the diiron center in a hydrophobic cavity at the active site. The reductase component and protein coupling factor provide electrons from NADH for reducing the oxidized binuclear iron-oxo cluster to its reduced form. 
Reaction with dioxygen produces a peroxy-bridged complex and dehydration leads to the formation of complex Q, which is thought to be the oxygenating species that carries out the insertion of an oxygen atom into a C-H bond of the substrate. The toluene monooxygenase systems, toluene 2-, 3-, and 4-monooxygenase, are similar to MMO but with an additional component, a Rieske-type ferredoxin. The alkene monooxygenase from Xanthobacter strain Py2 is closely related to aromatic monooxygenases and catalyzes aromatic monohydroxylation of benzene, toluene, and phenol. Alkane omega-hydroxylase (AlkB) and xylene monooxygenase are members of a distinct class of integral membrane diiron proteins and are not included in this CD. 29786 cd00680: Ring hydroxylating alpha subunit (catalytic domain). This CD includes the catalytic domain (alpha) of aromatic-ring-hydroxylating dioxygenase systems of eubacteria. Eubacterial ring hydroxylating dioxygenases are multicomponent 1,2-dioxygenase complexes that convert closed-ring structures to non-aromatic cis-diols. The complex has both hydroxylase and electron transfer components. The hydroxylase component is itself composed of two subunits: an alpha subunit of about 50 kDa, and a beta subunit of about 20 kDa. The electron transfer component is composed either of two subunits, a ferredoxin and a ferredoxin reductase, or of a single bifunctional ferredoxin/reductase subunit. Sequence analysis of hydroxylase subunits of ring hydroxylating systems (including toluene, benzene and naphthalene 1,2-dioxygenases) suggests they are derived from a common ancestor. The alpha subunit binds both a Rieske-like 2Fe-2S cluster and an iron atom: conserved Cys and His residues in the N-terminal region may provide 2Fe-2S ligands, while conserved His and Tyr residues may coordinate the iron. The active site contains a non-heme ferrous ion coordinated by three ligands. 
29787 cd00688: This group contains class II terpene cyclases, the protein prenyltransferase beta subunit, two broadly specific proteinase inhibitors, alpha2-macroglobulin (alpha (2)-M) and pregnancy zone protein (PZP), and the C3, C4 and C5 components of vertebrate complement. Class II terpene cyclases include squalene cyclase (SQCY) and 2,3-oxidosqualene cyclase (OSQCY); these integral membrane proteins catalyze a cationic cyclization cascade converting linear triterpenes to fused ring compounds. The protein prenyltransferases include protein farnesyltransferase (FTase) and geranylgeranyltransferase types I and II (GGTase-I and GGTase-II) which catalyze the carboxyl-terminal lipidation of Ras, Rab, and several other cellular signal transduction proteins, facilitating membrane associations and specific protein-protein interactions. Alpha (2)-M is a major carrier protein in serum and involved in the immobilization and entrapment of proteases. PZP is a pregnancy associated protein. Alpha (2)-M and PZP are known to bind to, and may modulate the activity of, placental protein-14 in T-cell growth and cytokine production, thereby protecting the allogeneic fetus from attack by the maternal immune system. 29789 cd02890: Protein prenyltransferase (PTase) domain, beta subunit (alpha 6 - alpha 6 barrel fold). The protein prenyltransferase family of lipid-modifying enzymes includes protein farnesyltransferase (FTase) and geranylgeranyltransferase types I and II (GGTase-I and GGTase-II). They catalyze the carboxyl-terminal lipidation of Ras, Rab, and several other cellular signal transduction proteins, facilitating membrane associations and specific protein-protein interactions. Prenyltransferases employ a Zn2+ ion to alkylate a thiol group, catalyzing the formation of thioether linkages between the C1 atom of farnesyl (15-carbon by FTase) or geranylgeranyl (20-carbon by GGTase-I, II) isoprenoid lipids and cysteine residues at or near the C-terminus of protein acceptors. 
FTase and GGTase-I prenylate the cysteine in the terminal sequence, "CAAX"; and GGTase-II prenylates both cysteines in the "CC" (or "CXC") terminal sequence. These enzymes are heterodimeric with both alpha and beta subunits required for catalytic activity. In contrast to other prenyltransferases, GGTase-II does not recognize its protein acceptor directly but requires Rab to complex with REP (Rab escort protein) before prenylation can occur. These enzymes are found exclusively in eukaryotes. 29790 cd02891: Proteins similar to alpha2-macroglobulin (alpha (2)-M). Alpha (2)-M is a major carrier protein in serum. It is a broadly specific proteinase inhibitor. The structural thioester of alpha (2)-M is involved in the immobilization and entrapment of proteases. This group contains another broadly specific proteinase inhibitor: pregnancy zone protein (PZP). PZP is a trace protein in the plasma of non-pregnant females and males which is elevated in pregnancy. Alpha (2)-M and PZP bind to placental protein-14 and may modulate its activity in T-cell growth and cytokine production, thereby protecting the allogeneic fetus from attack by the maternal immune system. This group also contains C3, C4 and C5 of vertebrate complement. The vertebrate complement is an effector of both the acquired and innate immune systems. The point of convergence of the classical, alternative and lectin pathways of the complement system is the proteolytic activation of C3. C4 plays a key role in propagating the classical and lectin pathways. C5 participates in the classical and alternative pathways. The thioester bond located within the structure of C3 and C4 is central to the function of complement. C5 does not contain an active thioester bond. 29791 cd02892: Squalene cyclase (SQCY) domain subgroup 1; found in class II terpene cyclases that have an alpha 6 - alpha 6 barrel fold. 
Squalene cyclase (SQCY) and 2,3-oxidosqualene cyclase (OSQCY) are integral membrane proteins that catalyze a cationic cyclization cascade converting linear triterpenes to fused ring compounds. This group contains bacterial SQCY which catalyzes the conversion of squalene to hopene or diplopterol and eukaryotic OSQCY which transforms the 2,3-epoxide of squalene to compounds such as lanosterol in mammals and fungi or cycloartenol in plants. Deletion of a single glycine residue of Alicyclobacillus acidocaldarius SQCY alters its substrate specificity into that of eukaryotic OSQCY. Both enzymes have a second minor domain, which forms an alpha-alpha barrel that is inserted into the major domain. 29792 cd02893: Protein farnesyltransferase (FTase)_like proteins containing the protein prenyltransferase (PTase) domain, beta subunit (alpha 6 - alpha 6 barrel fold). FTases are a subgroup of the PTase family of lipid-modifying enzymes. PTases catalyze the carboxyl-terminal lipidation of Ras, Rab, and several other cellular signal transduction proteins, facilitating membrane associations and specific protein-protein interactions. These proteins are heterodimers of alpha and beta subunits. Both subunits are required for catalytic activity. Prenyltransferases employ a Zn2+ ion to alkylate a thiol group, catalyzing the formation of thioether linkages between cysteine residues at or near the C-terminus of protein acceptors and the C1 atom of isoprenoid lipids. FTase attaches a 15-carbon farnesyl group to the cysteine within the C-terminal CaaX motif of substrate proteins when X is Ala, Met, Ser, Cys or Gln. Protein farnesylation has been shown to play critical roles in a variety of cellular processes including Ras/mitogen activated protein kinase signaling pathways in mammals and abscisic acid signal transduction in Arabidopsis. 
29793 cd02894: Geranylgeranyltransferase type II (GGTase-II)_like proteins containing the protein prenyltransferase (PTase) domain, beta subunit (alpha 6 - alpha 6 barrel fold). GGTase-IIs are a subgroup of the protein prenyltransferase family of lipid-modifying enzymes. PTases catalyze the carboxyl-terminal lipidation of Ras, Rab, and several other cellular signal transduction proteins, facilitating membrane associations and specific protein-protein interactions. Prenyltransferases employ a Zn2+ ion to alkylate a thiol group, catalyzing the formation of thioether linkages between cysteine residues at or near the C-terminus of protein acceptors and the C1 atom of isoprenoid lipids (geranylgeranyl (20-carbon) in the case of GGTase-II). GGTase-II catalyzes alkylation of both cysteine residues in Rab proteins containing carboxy-terminal "CC", "CXCX" or "CXC" motifs. PTases are heterodimeric with both alpha and beta subunits required for catalytic activity. In contrast to other prenyltransferases, GGTase-II requires an escort protein to bring the substrate protein to the catalytic heterodimer and to escort the geranylgeranylated product to the membrane. 29794 cd02895: Geranylgeranyltransferase type I (GGTase-I)-like proteins containing the protein prenyltransferase (PTase) domain, beta subunit (alpha 6 - alpha 6 barrel fold). GGTase-Is are a subgroup of the protein prenyltransferase family of lipid-modifying enzymes. PTases catalyze the carboxyl-terminal lipidation of Ras, Rab, and several other cellular signal transduction proteins, facilitating membrane associations and specific protein-protein interactions. Prenyltransferases employ a Zn2+ ion to alkylate a thiol group, catalyzing the formation of thioether linkages between cysteine residues at or near the C-terminus of protein acceptors and the C1 atom of isoprenoid lipids (geranylgeranyl (20-carbon) in the case of GGTase-I). 
GGTase-I prenylates the cysteine in the terminal sequence "CAAX" when X is Leu or Phe. Substrates for GGTase-I include the gamma subunit of neural G-proteins and several Ras-related G-proteins. PTases are heterodimeric with both alpha and beta subunits required for catalytic activity. 29795 cd02896: Proteins similar to C3, C4 and C5 of vertebrate complement. The vertebrate complement system, comprised of a large number of distinct plasma proteins, is an effector of both the acquired and innate immune systems. The point of convergence of the classical, alternative and lectin pathways of the complement system is the proteolytic activation of C3. C4 plays a key role in propagating the classical and lectin pathways. C5 participates in the classical and alternative pathways. The thioester bond located within the structure of C3 and C4 is central to the function of complement. C5 does not contain an active thioester bond. 29796 cd02897: Proteins similar to alpha2-macroglobulin (alpha (2)-M). This group also contains the pregnancy zone protein (PZP). Alpha (2)-M and PZP are broadly specific proteinase inhibitors. Alpha (2)-M is a major carrier protein in serum. The structural thioester of alpha (2)-M is involved in the immobilization and entrapment of proteases. PZP is a trace protein in the plasma of non-pregnant females and males which is elevated in pregnancy. Alpha (2)-M and PZP bind to placental protein-14 and may modulate its activity in T-cell growth and cytokine production, contributing to fetal survival. It has been suggested that thioester bond cleavage promotes the binding of PZP and alpha (2)-M to the CD91 receptor, clearing them from circulation. 29797 cd00738: HGTP anticodon binding domain, as found at the C-terminus of histidyl, glycyl, threonyl and prolyl tRNA synthetases, which are classified as a group of class II aminoacyl-tRNA synthetases (aaRS). 
In aaRSs, the anticodon binding domain is responsible for specificity in tRNA-binding, so that the activated amino acid is transferred to a ribose 3' OH group of the appropriate tRNA only. This domain is also found in the accessory subunit of mitochondrial polymerase gamma (Pol gamma b). 29798 cd00858: GlyRS Glycyl-anticodon binding domain. GlyRS belongs to class II aminoacyl-tRNA synthetases (aaRS). This alignment contains the anticodon binding domain, which is responsible for specificity in tRNA-binding, so that the activated amino acid is transferred to a ribose 3' OH group of the appropriate tRNA only. 29799 cd00859: HisRS Histidyl-anticodon binding domain. HisRS belongs to class II aminoacyl-tRNA synthetases (aaRS). This alignment contains the anticodon binding domain, which is responsible for specificity in tRNA-binding, so that the activated amino acid is transferred to a ribose 3' OH group of the appropriate tRNA only. 29800 cd00860: ThrRS Threonyl-anticodon binding domain. ThrRS belongs to class II aminoacyl-tRNA synthetases (aaRS). This alignment contains the anticodon binding domain, which is responsible for specificity in tRNA-binding, so that the activated amino acid is transferred to a ribose 3' OH group of the appropriate tRNA only. 29801 cd00861: ProRS Prolyl-anticodon binding domain, short version found predominantly in bacteria. ProRS belongs to class II aminoacyl-tRNA synthetases (aaRS). This alignment contains the anticodon binding domain, which is responsible for specificity in tRNA-binding, so that the activated amino acid is transferred to a ribose 3' OH group of the appropriate tRNA only. 29802 cd00862: ProRS Prolyl-anticodon binding domain, long version found predominantly in eukaryotes and archaea. ProRS belongs to class II aminoacyl-tRNA synthetases (aaRS). 
This alignment contains the anticodon binding domain, which is responsible for specificity in tRNA-binding, so that the activated amino acid is transferred to a ribose 3' OH group of the appropriate tRNA only, and an additional C-terminal zinc-binding domain specific to this subfamily of aaRSs. 29803 cd02426: C-terminal domain of mitochondrial DNA polymerase gamma B subunit, which is required for processivity. Polymerase gamma replicates and repairs mitochondrial DNA. The C-terminal domain of its B subunit is strikingly similar to the anticodon-binding domain of glycyl tRNA synthetase. 29805 cd00707: Pancreatic lipase-like enzymes. Lipases are esterases that can hydrolyze long-chain acyl-triglycerides into di- and monoglycerides, glycerol, and free fatty acids at a water/lipid interface. A typical feature of lipases is "interfacial activation," the process of becoming active at the lipid/water interface, although several examples of lipases have been identified that do not undergo interfacial activation. The active site of a lipase contains a catalytic triad consisting of Ser - His - Asp/Glu, but unlike most serine proteases, the active site is buried inside the structure. A "lid" or "flap" covers the active site, making it inaccessible to solvent and substrates. The lid opens during the process of interfacial activation, allowing the lipid substrate access to the active site. 29806 cd00741: Lipase. Lipases are esterases that can hydrolyze long-chain acyl-triglycerides into di- and monoglycerides, glycerol, and free fatty acids at a water/lipid interface. A typical feature of lipases is "interfacial activation", the process of becoming active at the lipid/water interface, although several examples of lipases have been identified that do not undergo interfacial activation. The active site of a lipase contains a catalytic triad consisting of Ser - His - Asp/Glu, but unlike most serine proteases, the active site is buried inside the structure. 
A "lid" or "flap" covers the active site, making it inaccessible to solvent and substrates. The lid opens during the process of interfacial activation, allowing the lipid substrate access to the active site. 29807 cd00496: Phenylalanyl-tRNA synthetase (PheRS) alpha chain catalytic core domain. PheRS belongs to class II aminoacyl-tRNA synthetases (aaRS) based upon its structure and the presence of three characteristic sequence motifs. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. While class II aaRSs generally aminoacylate the 3'-OH ribose of the appropriate tRNA, PheRS is an exception in that it attaches the amino acid at the 2'-OH group, like class I aaRSs. PheRS is an alpha-2/beta-2 tetramer. 29808 cd00645: Asparagine synthetase (aspartate-ammonia ligase) (AsnA) catalyses the conversion of L-aspartate to L-asparagine in the presence of ATP and ammonia. AsnA is a homodimeric enzyme which is structurally similar to the catalytic core domain of class II aminoacyl-tRNA synthetases. Ammonia-dependent AsnA is not homologous to the glutamine-dependent asparagine synthetase AsnB. 29810 cd00670: Gly_His_Pro_Ser_Thr_tRNA synthetase class II core domain. This domain is the core catalytic domain of tRNA synthetases of the subgroup containing glycyl, histidyl, prolyl, seryl and threonyl tRNA synthetases. It is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. These enzymes belong to class II aminoacyl-tRNA synthetases (aaRS) based upon their structure and the presence of three characteristic sequence motifs in the core domain. This domain is also found at the C-terminus of eukaryotic GCN2 protein kinase, at the N-terminus of the ATP phosphoribosyltransferase accessory subunit HisZ, and in the accessory subunit of mitochondrial polymerase gamma (Pol gamma b). Most class II tRNA synthetases are dimers, with this subgroup consisting of mostly homodimers. 
These enzymes attach a specific amino acid to the 3' OH group of ribose of the appropriate tRNA. 29811 cd00673: Alanyl-tRNA synthetase (AlaRS) class II core catalytic domain. AlaRS is a homodimer. It is responsible for the attachment of alanine to the 3' OH group of ribose of the appropriate tRNA. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. Class II assignment is based upon its predicted structure and the presence of three characteristic sequence motifs. 29812 cd00733: Class II Glycyl-tRNA synthetase (GlyRS) alpha subunit core catalytic domain. GlyRS functions as a homodimer in eukaryotes, archaea and some bacteria and as a heterotetramer in the remainder of prokaryotes and in Arabidopsis. It is responsible for the attachment of glycine to the 3' OH group of ribose of the appropriate tRNA. This domain is primarily responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. This alignment contains only sequences from the GlyRS form which heterotetramerizes. The homodimer form of GlyRS is in a different family of class II aaRS. Class II assignment is based upon structure and the presence of three characteristic sequence motifs. 29813 cd00768: Class II tRNA amino-acyl synthetase-like catalytic core domain. Class II amino acyl-tRNA synthetases (aaRS) share a common fold and generally attach an amino acid to the 3' OH of ribose of the appropriate tRNA. PheRS is an exception in that it attaches the amino acid at the 2'-OH group, like class I aaRSs. These enzymes are usually homodimers. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. The substrate specificity of this reaction is further determined by additional domains. Interestingly, this domain is also found in asparagine synthase A (AsnA), in the accessory subunit of mitochondrial polymerase gamma and in the bacterial ATP phosphoribosyltransferase regulatory subunit HisZ. 
29814 cd00769: Phenylalanyl-tRNA synthetase (PheRS) beta chain core domain. PheRS belongs to class II aminoacyl-tRNA synthetases (aaRS) based upon its structure. While class II aaRSs generally aminoacylate the 3'-OH ribose of the appropriate tRNA, PheRS is an exception in that it attaches the amino acid at the 2'-OH group, like class I aaRSs. PheRS is an alpha-2/beta-2 tetramer. While the alpha chain contains a catalytic core domain, the beta chain has a non-catalytic core domain. 29815 cd00770: Seryl-tRNA synthetase (SerRS) class II core catalytic domain. SerRS is responsible for the attachment of serine to the 3' OH group of ribose of the appropriate tRNA. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. Class II assignment is based upon its structure and the presence of three characteristic sequence motifs in the core domain. SerRS is a homodimer. 29816 cd00771: Threonyl-tRNA synthetase (ThrRS) class II core catalytic domain. ThrRS is a homodimer. It is responsible for the attachment of threonine to the 3' OH group of ribose of the appropriate tRNA. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. Class II assignment is based upon its structure and the presence of three characteristic sequence motifs in the core domain. 29820 cd00775: Lys_tRNA synthetase (LysRS) class II core domain. Class II LysRS is a dimer which attaches a lysine to the 3' OH group of ribose of the appropriate tRNA. Its assignment to class II aaRS is based upon its structure and the presence of three characteristic sequence motifs in the core domain. It is found in eukaryotes as well as some prokaryotes and archaea. However, LysRS belongs to class I aaRSs in some prokaryotes and archaea. The catalytic core domain is primarily responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. 
29821 cd00776: Asx tRNA synthetase (AspRS/AsnRS) class II core domain. Assignment to class II aminoacyl-tRNA synthetases (aaRS) is based upon its structure and the presence of three characteristic sequence motifs in the core domain. This family includes AsnRS as well as a subgroup of AspRS. AsnRS and AspRS are homodimers, which attach either asparagine or aspartate to the 3' OH group of ribose of the appropriate tRNA. While archaea lack AsnRS, they possess a non-discriminating AspRS, which can mischarge tRNA(Asn) with aspartate. Subsequently, a tRNA-dependent aspartate amidotransferase converts the bound aspartate to asparagine. The catalytic core domain is primarily responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. 29823 cd00778: Prolyl-tRNA synthetase (ProRS) class II core catalytic domain. ProRS is a homodimer. It is responsible for the attachment of proline to the 3' OH group of ribose of the appropriate tRNA. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. Class II assignment is based upon its structure and the presence of three characteristic sequence motifs in the core domain. This subfamily contains the core domain of ProRS from archaea, the cytoplasm of eukaryotes and some bacteria. 29825 cd00786: Cytidine and deoxycytidylate deaminase zinc-binding region. The family contains cytidine deaminases, nucleoside deaminases, deoxycytidylate deaminases and riboflavin deaminases. Also included are the apoBec family of mRNA editing enzymes. All members are Zn dependent. The zinc ion in the active site plays a central role in the proposed catalytic mechanism, activating a water molecule to form a hydroxide ion that performs a nucleophilic attack on the substrate. 29826 cd01283: Cytidine deaminase zinc-binding domain. These enzymes are Zn dependent. 
The zinc ion in the active site plays a central role in the proposed catalytic mechanism, activating a water molecule to form a hydroxide ion that performs a nucleophilic attack on the substrate. Cytidine deaminases catalyze the deamination of cytidine to uridine and are important in the pyrimidine salvage pathway in many cell types, from bacteria to humans. This family also includes the apoBec proteins, which are a mammal specific expansion of RNA editing enzymes, the closely related phorbolins, and the AID (activation-induced) enzymes. 29827 cd01284: Riboflavin-specific deaminase. Riboflavin biosynthesis protein RibD (Diaminohydroxyphosphoribosylaminopyrimidine deaminase) catalyzes the deamination of 2,5-diamino-6-ribosylamino-4(3H)-pyrimidinone 5'-phosphate, which is an intermediate step in the biosynthesis of riboflavin. The ribG gene of Bacillus subtilis and the ribD gene of E. coli are bifunctional and contain this deaminase domain and a reductase domain which catalyzes the subsequent reduction of the ribosyl side chain. 29828 cd01285: Nucleoside deaminases include adenosine, guanine and cytosine deaminases. These enzymes are Zn dependent and catalyze the deamination of nucleosides. The zinc ion in the active site plays a central role in the proposed catalytic mechanism, activating a water molecule to form a hydroxide ion that performs a nucleophilic attack on the substrate. The functional enzyme is a homodimer. Cytosine deaminase catalyzes the deamination of cytosine to uracil and ammonia and is a member of the pyrimidine salvage pathway. Cytosine deaminase is found in bacteria and fungi but is not present in mammals; for this reason, the enzyme is currently of interest for antimicrobial drug design and gene therapy applications against tumors. 
Some members of this family are tRNA-specific adenosine deaminases that generate inosine at the first position of the anticodon (position 34) of specific tRNAs; this modification is thought to enlarge the codon recognition capacity during protein synthesis. Other members of the family are guanine deaminases which deaminate guanine to xanthine as part of the utilization of guanine as a nitrogen source. 29829 cd01286: Deoxycytidylate deaminase domain. Deoxycytidylate deaminase catalyzes the deamination of dCMP to dUMP, providing the nucleotide substrate for thymidylate synthase. The enzyme binds Zn++, which is required for catalytic activity. The activity of the enzyme is allosterically regulated by the ratio of dCTP to dTTP not only in eukaryotic cells but also in T-even phage-infected Escherichia coli, with dCTP acting as an activator and dTTP as an inhibitor. 29830 cd00484: Phosphoenolpyruvate carboxykinase (PEPCK), a critical gluconeogenic enzyme, catalyzes the first committed step in the diversion of tricarboxylic acid cycle intermediates toward gluconeogenesis. It catalyzes the reversible decarboxylation and phosphorylation of oxaloacetate to yield phosphoenolpyruvate and carbon dioxide, using a nucleotide molecule (ATP) for the phosphoryl transfer, and has a strict requirement for divalent metal ions for activity. PEPCKs separate into two phylogenetic groups based on their nucleotide substrate specificity; this model describes the ATP-dependent group. 29831 cd00819: Phosphoenolpyruvate carboxykinase (PEPCK), a critical gluconeogenic enzyme, catalyzes the first committed step in the diversion of tricarboxylic acid cycle intermediates toward gluconeogenesis. It catalyzes the reversible decarboxylation and phosphorylation of oxaloacetate to yield phosphoenolpyruvate and carbon dioxide, using a nucleotide molecule (GTP) for the phosphoryl transfer, and has a strict requirement for divalent metal ions for activity. 
PEPCKs separate into two phylogenetic groups based on their nucleotide substrate specificity; this model describes the GTP-dependent group. 29833 cd01918: HprK/P, the bifunctional histidine-containing protein kinase/phosphatase, controls the phosphorylation state of the phosphocarrier protein HPr and regulates the utilization of carbon sources by gram-positive bacteria. It catalyzes both the ATP-dependent phosphorylation of Ser-46 of HPr and its dephosphorylation by phosphorolysis. The latter reaction uses inorganic phosphate as substrate and produces pyrophosphate. Phosphoenolpyruvate carboxykinase (PEPCK) and the C-terminal catalytic domain of HprK/P are structurally similar with conserved active site residues, suggesting these two phosphotransferases have related functions. The HprK/P N-terminal domain is structurally similar to the N-terminal domains of the MurE and MurF amino acid ligases. 29834 cd01919: Phosphoenolpyruvate carboxykinase (PEPCK), a critical gluconeogenic enzyme, catalyzes the first committed step in the diversion of tricarboxylic acid cycle intermediates toward gluconeogenesis. It catalyzes the reversible decarboxylation and phosphorylation of oxaloacetate to yield phosphoenolpyruvate and carbon dioxide, using a nucleotide molecule (ATP or GTP) for the phosphoryl transfer, and has a strict requirement for divalent metal ions for activity. PEPCKs separate into two phylogenetic groups based on their nucleotide substrate specificity (the ATP- and GTP-dependent groups). 29835 cd00864: Phosphoinositide 3-kinase family, accessory domain (PIK domain); PIK domain is conserved in PI3 and PI4-kinases. Its role is unclear, but it has been suggested to be involved in substrate presentation. Phosphoinositide 3-kinases play an important role in a variety of fundamental cellular processes and can be divided into three main classes, defined by their substrate specificity and domain architecture. 
29836 cd00869: Phosphoinositide 3-kinase (PI3K) class II, accessory domain (PIK domain); PIK domain is conserved in all PI3 and PI4-kinases. Its role is unclear but it has been suggested to be involved in substrate presentation. In general, class II PI3-kinases phosphorylate phosphoinositol (PtdIns) and PtdIns(4)-phosphate, but not PtdIns(4,5)-bisphosphate. They are larger, having a C2 domain at the C-terminus. 29837 cd00870: Phosphoinositide 3-kinase (PI3K) class III, accessory domain (PIK domain); PIK domain is conserved in all PI3 and PI4-kinases. Its role is unclear but it has been suggested to be involved in substrate presentation. In general, class III PI3Ks phosphorylate phosphoinositol (PtdIns) only. The prototypical PI3K class III, yeast Vps34, is involved in trafficking proteins from Golgi to the vacuole. 29838 cd00871: Phosphoinositide 4-kinase (PI4K), accessory domain (PIK domain); PIK domain is conserved in PI3 and PI4-kinases. Its role is unclear but it has been suggested to be involved in substrate presentation. PI4K phosphorylates the hydroxyl group at position 4 on the inositol ring of phosphoinositide, the first committed step in the phosphatidylinositol cycle. 29839 cd00872: Phosphoinositide 3-kinase (PI3K) class I, accessory domain; PIK domain is conserved in all PI3 and PI4-kinases. Its role is unclear but it has been suggested to be involved in substrate presentation. In general, class I PI3Ks prefer phosphoinositol (4,5)-bisphosphate as a substrate. Mammalian members interact with active Ras. They form heterodimers with adapter molecules linking them to different signaling pathways. 29840 cd00584: Prefoldin alpha subunit; Prefoldin is a hexameric molecular chaperone complex, found in both eukaryotes and archaea, that binds and stabilizes newly synthesized polypeptides allowing them to fold correctly. The complex contains two alpha and four beta subunits, the two subunits being evolutionarily related. 
In archaea, there is usually only one gene for each subunit, while in eukaryotes there are two or more paralogous genes encoding each subunit, adding heterogeneity to the structure of the hexamer. The structure of the complex consists of a double beta barrel assembly with six protruding coiled-coils. 29841 cd00632: Prefoldin beta; Prefoldin is a hexameric molecular chaperone complex, composed of two evolutionarily related subunits (alpha and beta), which are found in both eukaryotes and archaea. Prefoldin binds and stabilizes newly synthesized polypeptides allowing them to fold correctly. The hexameric structure consists of a double beta barrel assembly with six protruding coiled-coils. The alpha prefoldin subunits have two beta hairpin structures while the beta prefoldin subunits (this CD) have only one hairpin that is most similar to the second hairpin of the alpha subunit. The prefoldin hexamer consists of two alpha and four beta subunits and is assembled from the beta hairpins of all six subunits. The alpha subunits initially dimerize, providing a structural nucleus for the assembly of the beta subunits. In archaea, there is usually only one gene for each subunit, while in eukaryotes there are two or more paralogous genes encoding each subunit, adding heterogeneity to the structure of the hexamer. 29842 cd00890: Prefoldin is a hexameric molecular chaperone complex, found in both eukaryotes and archaea, that binds and stabilizes newly synthesized polypeptides allowing them to fold correctly. The complex contains two alpha and four beta subunits, the two subunits being evolutionarily related. In archaea, there is usually only one gene for each subunit, while in eukaryotes there are two or more paralogous genes encoding each subunit, adding heterogeneity to the structure of the hexamer. The structure of the complex consists of a double beta barrel assembly with six protruding coiled-coils. 29843 cd00821: Pleckstrin homology (PH) domain. PH domains are only found in eukaryotes. 
They share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinases, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 29844 cd00824: IRS-like phosphotyrosine-binding domain (PTBi); This domain has a PH-like fold and is found in insulin receptor substrate molecules and in other eukaryotic signaling molecules such as FRS2 and Dok. IRS and Dok molecules have an N-terminal PH domain, which is followed by an IRS-like PTB domain. FRS2 just has an N-terminal PTBi domain. This PTBi domain is shorter than the PTB domain which is found in SHC, Numb and other proteins. The PTBi domain binds to phosphotyrosines which are in NPXpY motifs. 29845 cd00835: Ran-binding domain; This domain of approximately 150 residues shares structural similarity to the PH domain, but lacks detectable sequence similarity. Ran is a Ras-like nuclear small GTPase, which regulates receptor-mediated transport between the nucleus and the cytoplasm. RanGTP hydrolysis is stimulated by RanGAP together with the Ran-binding domain containing accessory proteins RanBP1 and RanBP2. These accessory proteins stabilize the active GTP-bound form of Ran. The Ran-binding domain is found in multiple copies in nuclear pore complex proteins. 29846 cd00836: The FERM_C domain is the third structural domain within the FERM domain. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. 
These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM_C domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites. 29847 cd00837: EVH1 (Enabled, Vasp-Homology) or WASP Homology (WH1) domain. The EVH1 domain binds to other proteins at proline rich sequences in either FPPPP or PPXXF motifs. It is found in the cytoskeletal reorganization proteins Enabled, VASP, and WASP, and in the synaptic scaffolding protein Homer. It has a PH-like fold, despite having minimal sequence similarity to PH or PTB domains. 29848 cd00900: Pleckstrin homology-like domain. This family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. The PH domain is commonly found in eukaryotic signaling proteins. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins. 29849 cd00934: Phosphotyrosine-binding (PTB) domain; PTB domains have a PH-like fold and are found in various eukaryotic signaling molecules. They were initially identified based upon their ability to recognize phosphorylated tyrosine residues. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. The PTB domain of SHC binds to a NPXpY sequence. 
More recent studies have found that some types of PTB domains, such as those in the neuronal protein X11 and in the cell-fate determinant protein Numb, can bind to peptides which are not tyrosine phosphorylated, whereas other PTB domains can bind motifs lacking tyrosine residues altogether. 29850 cd01201: Neurobeachin Pleckstrin homology-like domain. This domain is found in the large multi-domain eukaryotic protein Neurobeachin, N-terminal to the BEACH domain. This PH-like domain interacts with the BEACH domain in the same manner used by other PH-like domains to bind peptides. 29851 cd01202: Fibroblast growth factor receptor substrate 2 (FRS2/SNT1) Phosphotyrosine-binding domain (IRS1-like). FRS2 mediates signaling downstream of the FGF receptor. It has an N-terminal PTBi domain, which has a PH-like fold and is similar to the PTB domain that is found in insulin receptor substrate molecules. This PTBi domain is shorter than the PTB domain which is found in SHC, Numb and other proteins. The PTBi domain binds to phosphotyrosines which are in NPXpY motifs. 29852 cd01203: Downstream of tyrosine kinase (DOK) Phosphotyrosine-binding domain. This domain has a PH-like fold and is similar to the PTB domain that is found in insulin receptor substrate molecules. The DOK family of eukaryotic signaling molecules have an N-terminal PH domain, followed by an IRS-like PTB domain. This PTBi domain is shorter than the PTB domain which is found in SHC, Numb and other proteins. The PTBi domain binds to phosphotyrosines which are in NPXpY motifs. 29853 cd01204: Insulin receptor substrate (IRS) Phosphotyrosine-binding domain (PTB). This domain has a PH-like fold and is found in insulin receptor substrate molecules. IRS molecules have an N-terminal PH domain, which is followed by an IRS-like PTB domain. This PTBi domain is shorter than the PTB domain which is found in SHC, Numb and other proteins. 
The PTBi domain binds to phosphotyrosines which are in NPXpY motifs in the insulin receptor, IGF-I receptor and the IL-4 receptor. 29854 cd01205: WASP-type EVH1 domain. Wiskott-Aldrich syndrome (WAS) is an X-linked recessive disease, characterized by eczema, immunodeficiency, and thrombocytopenia. The majority of patients with WAS, or a milder version of the disorder, X-linked thrombocytopenia (XLT), have point mutations in the EVH1 domain of WASP (Wiskott-Aldrich syndrome protein). WASP is an actin regulatory protein consisting of an N-terminal EVH1 domain, a basic region, a GTP binding domain, a proline rich region and a WH2 acidic region. Yeast members lack the GTP binding domain. WASP binds a 25 residue proline rich motif from the WASP Interacting Protein (WIP) via its N-terminal EVH1 domain. 29855 cd01206: Homer type EVH1 domain. Homer is a synaptic scaffolding protein, involved in neuronal signaling. It contains an EVH1 domain, which binds to both neurotransmitter receptors, such as the metabotropic glutamate receptor (mGluR) and to other scaffolding proteins via PPXXF motifs, in order to target them to the synaptic junction. It has a PH-like fold, despite having minimal sequence similarity to PH or PTB domains. 29856 cd01207: Enabled-VASP-type homology (EVH1) domain. The EVH1 domain binds to other proteins at proline rich sequences. It is found in proteins involved in cytoskeletal reorganization such as Enabled and VASP. Ena-VASP type EVH1 domains specifically recognize FPPPP motifs in the focal adhesion proteins zyxin and vinculin, and the ActA surface protein of Listeria monocytogenes. It has a PH-like fold, despite having minimal sequence similarity to PH or PTB domains. 29857 cd01208: X11 Phosphotyrosine-binding (PTB) domain. The neuronal protein X11 has a PTB domain followed by two PDZ domains. PTB domains have a PH-like fold and are found in various eukaryotic signaling molecules. 
They were initially identified based upon their ability to recognize phosphorylated tyrosine residues. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. More recent studies have found that some types of PTB domains can bind to peptides which are not tyrosine phosphorylated or lack tyrosine residues altogether. X11 binds to the cytoplasmic domain of the beta-amyloid precursor protein (beta-APP) and does not require the substrate to be tyrosine-phosphorylated for binding. 29858 cd01209: SHC phosphotyrosine-binding (PTB) domain. SHC is a substrate for receptor tyrosine kinases, which can interact with phosphoproteins at NPXY motifs. SHC contains a PTB domain followed by an SH2 domain. PTB domains have a PH-like fold and are found in various eukaryotic signaling molecules. They were initially identified based upon their ability to recognize phosphorylated tyrosine residues. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. More recent studies have found that some types of PTB domains can bind to peptides which are not tyrosine phosphorylated or lack tyrosine residues altogether. 29859 cd01210: Epidermal growth factor receptor kinase substrate (EPS8) Phosphotyrosine-binding (PTB) domain. EPS8 is a regulator of Rac signaling. It consists of a PTB and an SH3 domain. PTB domains have a PH-like fold and are found in various eukaryotic signaling molecules. They were initially identified based upon their ability to recognize phosphorylated tyrosine residues. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. 
More recent studies have found that some types of PTB domains can bind to peptides which are not tyrosine phosphorylated or lack tyrosine residues altogether. 29860 cd01211: GAPCenA Phosphotyrosine-binding (PTB) domain. GAPCenA is a centrosome-associated GTPase activating protein (GAP) for Rab6. It consists of an N-terminal PTB domain and a C-terminal TBC domain. PTB domains have a PH-like fold and are found in various eukaryotic signaling molecules. They were initially identified based upon their ability to recognize phosphorylated tyrosine residues. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. More recent studies have found that some types of PTB domains can bind to peptides which are not tyrosine phosphorylated or lack tyrosine residues altogether. 29861 cd01212: JNK-interacting protein (JIP) Phosphotyrosine-binding (PTB) domain. JIP is a mitogen-activated protein kinase scaffold protein. JIP consists of a C-terminal SH3 domain, followed by a PTB domain. PTB domains have a PH-like fold and are found in various eukaryotic signaling molecules. They were initially identified based upon their ability to recognize phosphorylated tyrosine residues. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. More recent studies have found that some types of PTB domains can bind to peptides which are not tyrosine phosphorylated or lack tyrosine residues altogether. 29862 cd01213: Tensin Phosphotyrosine-binding (PTB) domain. Tensin is a focal adhesion protein, which contains a C-terminal SH2 domain followed by a PTB domain. PTB domains have a PH-like fold and are found in various eukaryotic signaling molecules. 
They were initially identified based upon their ability to recognize phosphorylated tyrosine residues. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. More recent studies have found that some types of PTB domains can bind to peptides which are not tyrosine phosphorylated or lack tyrosine residues altogether. 29863 cd01214: CG8312 Phosphotyrosine-binding (PTB) domain. PTB domains have a PH-like fold and are found in various eukaryotic signaling molecules. They were initially identified based upon their ability to recognize phosphorylated tyrosine residues. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. More recent studies have found that some types of PTB domains can bind to peptides which are not tyrosine phosphorylated or lack tyrosine residues altogether. 29864 cd01215: Disabled (Dab) Phosphotyrosine-binding domain. Dab is a cytosolic adaptor protein, which binds to the cytoplasmic tails of lipoprotein receptors, such as ApoER2 and VLDLR, via its PTB domain. The Dab PTB domain has a preference for unphosphorylated tyrosine within an NPxY motif. Additionally, the Dab PTB domain, which is structurally similar to PH domains, binds to phosphatidylinositol 4,5-bisphosphate in a manner characteristic of phosphoinositide binding PH domains. 29865 cd01216: Fe65 Phosphotyrosine-binding (PTB) domain, phosphotyrosine-interaction (PI) domain. Fe65 is an amyloid beta A4 precursor protein (APP)-binding protein. It contains an N-terminal WW domain followed by two PTB domains. The C-terminal PTB domain is responsible for APP binding. PTB domains have a PH-like fold and are found in various eukaryotic signaling molecules. 
They were initially identified based upon their ability to recognize phosphorylated tyrosine residues. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. More recent studies have found that some types of PTB domains can bind to peptides which are not tyrosine phosphorylated or lack tyrosine residues altogether. 29866 cd01217: CG12581 Phosphotyrosine-binding (PTB) domain. PTB domains have a PH-like fold and are found in various eukaryotic signaling molecules. They were initially identified based upon their ability to recognize phosphorylated tyrosine residues. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. More recent studies have found that some types of PTB domains can bind to peptides which are not tyrosine phosphorylated or lack tyrosine residues altogether. 29867 cd01218: Phafin2 Pleckstrin Homology (PH) domain. Phafin contains a PH domain and a FYVE domain. PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinases, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 29868 cd01219: FGD (faciogenital dysplasia protein) pleckstrin homology (PH) domain. 
FGD has a RhoGEF (DH) domain, followed by a PH domain, a FYVE domain and a C-terminal PH domain. FGD is a guanine nucleotide exchange factor that activates the Rho GTPase Cdc42. PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinases, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 29869 cd01220: Chondrocyte-derived ezrin-like domain containing protein (CDEP) Pleckstrin homology (PH) domain. CDEP consists of a FERM domain and a RhoGEF (DH) domain, followed by two PH domains. PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinases, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 29870 cd01221: Ephexin Pleckstrin homology (PH) domain. Ephexin contains a RhoGEF (DH) domain followed by a PH domain and an SH3 domain. The ephexin PH domain is believed to act with the DH domain in mediating protein-protein interactions with the Eph receptor. 
PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 29871 cd01222: Clg (common-site lymphoma/leukemia guanine nucleotide exchange factor) pleckstrin homology (PH) domain. Clg contains a RhoGEF (DH) domain and a PH domain. PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinases, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 29872 cd01223: Vav pleckstrin homology (PH) domain. Vav acts as a guanine nucleotide exchange factor (GEF) for Rho/Rac proteins. Mammalian Vav proteins consist of a calponin homology (CH) domain, an acidic region, a RhoGEF (DH) domain, a PH domain, a zinc finger region and an SH2 domain, flanked by two SH3 domains. In invertebrates such as Drosophila and C. elegans, Vav is missing the N-terminal SH3 domain. PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. 
Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 29873 cd01224: Collybistin pleckstrin homology (PH) domain. Collybistin is a GEF that induces submembrane clustering of the receptor-associated peripheral membrane protein gephyrin. It consists of an SH3 domain, followed by a RhoGEF (DH) domain and a PH domain. PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 29874 cd01225: Cool (cloned out of library)/Pix (PAK-interactive exchange factor) pleckstrin homology (PH) domain. Cool/Pix contains an N-terminal SH3 domain followed by a RhoGEF (DH) domain and a PH domain. PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 29875 cd01226: Exocyst complex 84-kDa subunit Pleckstrin Homology (PH) domain. Exo84 is a subunit of the exocyst complex, which is important in intracellular trafficking. In metazoa, Exo84 has a PH domain towards its N-terminus. PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. 
Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinases, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 29876 cd01227: Dbs (DBL's big sister) pleckstrin homology (PH) domain. Dbs is a guanine nucleotide exchange factor (GEF), which contains spectrin repeats, a RhoGEF (DH) domain and a PH domain. The Dbs PH domain participates in binding to both the Cdc42 and RhoA GTPases. PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 29878 cd01229: Epithelial cell transforming 2 (ECT2) pleckstrin homology (PH) domain. PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinases, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 29879 cd01230: EFA6 Pleckstrin Homology (PH) domain. EFA6 is a guanine nucleotide exchange factor for ARF6, which is involved in membrane recycling. 
It consists of a SEC7 domain followed by a PH domain. The EFA6 PH domain regulates its association with the plasma membrane. PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 29880 cd01231: LNK-family Pleckstrin homology (PH) domain. The Lnk family of proteins consists of Lnk, APS and SH2B. They are adaptor proteins consisting of a PH domain and an SH2 domain, which mediates signaling through growth factor receptors. PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. The Lnk family PH domain is likely involved in targeting of the adaptor proteins to the plasma membrane. 29881 cd01232: Trio pleckstrin homology (PH) domain. Trio is a multidomain signaling protein that contains two RhoGEF (DH)-PH domains in tandem. PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 29882 cd01233: Unc-104 pleckstrin homology (PH) domain. Unc-104 is a kinesin-like protein containing an N-terminal kinesin catalytic domain, followed by a forkhead associated domain with a C-terminal PH domain. These proteins are responsible for the transport of membrane vesicles along microtubules. 
The mechanism involves the binding of the PH domain to phosphatidylinositol (4,5)P2-containing liposomes. 29883 cd01234: CADPS (Ca2+-dependent activator protein) Pleckstrin homology (PH) domain. CADPS is a calcium-dependent activator involved in secretion. It contains a central PH domain that binds to phosphoinositide 4,5-bisphosphate-containing liposomes. However, membrane association may also be mediated by binding to phosphatidylserine via general electrostatic interactions. PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 29884 cd01235: Set binding factor Pleckstrin Homology (PH) domain. Set binding factor is a myotubularin-related pseudo-phosphatase consisting of a DENN domain, a GRAM domain, an inactive phosphatase domain, a SID motif and a C-terminal PH domain. Its PH domain is predicted to bind lipids based upon its ability to respond to phosphatidylinositol 3-kinase. 29885 cd01236: Outspread Pleckstrin homology (PH) domain. Outspread contains two PH domains and a C-terminal coiled-coil region. PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinases, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 29886 cd01237: Unc-112 pleckstrin homology (PH) domain. Unc-112 and related proteins contain two FERM domains with a PH domain between them. Both the PH and FERM domains have a PH-like fold. The FERM domains are likely responsible for the role of Unc-112 in organizing beta-integrin. The specific role of the Unc-112 PH domain is not known, but it is predicted to be involved in mediating membrane interactions. PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 29887 cd01238: Tec pleckstrin homology (PH) domain. The Tec family of cytoplasmic protein tyrosine kinases includes Bruton's tyrosine kinase (BTK), BMX, IL2-inducible T-cell kinase (Itk) and Tec. These proteins generally have an N-terminal PH domain, followed by a Tec homology (TH) domain, an SH3 domain, an SH2 domain and a kinase domain. Tec PH domains tether these proteins to membranes following the activation of PI3K and its subsequent phosphorylation of phosphoinositides. The importance of PH domain membrane anchoring is confirmed by the discovery of a mutation of a critical arginine residue in the BTK PH domain, which causes X-linked agammaglobulinemia (XLA) in humans and a related disorder in mice. PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. 
They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 29888 cd01239: Protein kinase D (PKD/PKCmu) pleckstrin homology (PH) domain. PKD consists of 2 C1 domains, followed by a PH domain and a kinase domain. While the PKD PH domain has not been shown to bind phosphorylated inositol lipids and is not required for membrane translocation, it is required for nuclear export. PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 29889 cd01240: Beta adrenergic receptor kinase 1(beta ARK1)(GRK2) pleckstrin homology (PH) domain. Beta ARK1 is a G protein-coupled receptor kinase (GRK). It phosphorylates activated G-protein coupled receptors leading to the release of the previously bound heterotrimeric G protein agonist and thus signal termination. It consists of a domain found in regulators of G-protein signaling (RGS)(RH), a serine/threonine kinase domain and a C-terminal PH domain. The Beta-Ark 1 PH domain has an extended C-terminal helix, which mediates interactions with G beta gamma subunits. PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. 
Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 29890 cd01241: Akt pleckstrin homology (PH) domain. Akt (Protein Kinase B (PKB)) is a phosphatidylinositol 3'-kinase (PI3K)-dependent Ser/Thr kinase. The PH domain recruits Akt to the plasma membrane by binding to phosphoinositides (PtdIns-3,4-P2) and is required for activation. PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 29892 cd01243: MRCK (myotonic dystrophy-related Cdc42-binding kinase) pleckstrin homology (PH) domain. MRCK consists of a serine/threonine kinase domain, a cysteine rich (C1) region, a PH domain and a p21 binding motif. It has been shown to promote cytoskeletal reorganization, which affects many biological processes. The MRCK PH domain is responsible for its targeting to cell to cell junctions. PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 29893 cd01244: RAS_GTPase activating protein (GAP)_CG9209 pleckstrin homology (PH) domain. This protein consists of two C2 domains, followed by a RasGAP domain, a PH domain and a BTK domain. PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. 
PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinases, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 29894 cd01245: RAS GTPase-activating protein (GAP) CG5898 Pleckstrin homology (PH) domain. This protein has a domain architecture of SH2-SH3-SH2-PH-C2-Ras_GAP. PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinases, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 29895 cd01246: Oxysterol binding protein (OSBP) Pleckstrin homology (PH) domain. Oxysterol binding proteins are a multigene family that is conserved in yeast, flies, worms, mammals and plants. They all contain a C-terminal oxysterol binding domain, and most contain an N-terminal PH domain. OSBP PH domains bind to membrane phosphoinositides and thus likely play an important role in intracellular targeting. PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. 
They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 29896 cd01247: Goodpasture antigen binding protein (GPBP) Pleckstrin homology (PH) domain. The GPBP protein is a kinase that phosphorylates an N-terminal region of the alpha 3 chain of type IV collagen, which is commonly known as the Goodpasture antigen. It has an N-terminal PH domain and a C-terminal START domain. PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinases, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 29897 cd01248: Phospholipase C (PLC) pleckstrin homology (PH) domain. There are several isozymes of PLC (beta, gamma, delta, epsilon, zeta). While PLC beta, gamma and delta all have N-terminal PH domains, lipid binding specificity is not conserved between them. PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
29898 cd01249: Oligophrenin Pleckstrin homology (PH) domain. Oligophrenin is composed of a PH domain, a RhoGAP domain and a proline rich region. Closely related proteins have a C-terminal SH3 domain. PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinases, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 29899 cd01250: Centaurin Pleckstrin homology (PH) domain. Centaurin beta and gamma consist of a PH domain, an ArfGAP domain and three ankyrin repeats. Centaurin gamma also has an N-terminal Ras homology domain. Centaurin alpha has a different domain architecture and its PH domain is in a different subfamily. Centaurin can bind to phosphatidylinositol (3,4,5)P3. PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 29900 cd01251: Centaurin alpha Pleckstrin homology (PH) domain. Centaurin alpha is a phosphatidylinositide binding protein consisting of an N-terminal ArfGAP domain and two PH domains. In response to growth factor activation, PI3K phosphorylates phosphatidylinositol 4,5-bisphosphate to phosphatidylinositol 3,4,5-trisphosphate. 
Centaurin alpha 1 is recruited to the plasma membrane following growth factor stimulation by specific binding of its PH domain to phosphatidylinositol 3,4,5-trisphosphate. Centaurin alpha 2 is constitutively bound to the plasma membrane since it binds phosphatidylinositol 4,5-bisphosphate and phosphatidylinositol 3,4,5-trisphosphate with equal affinity. PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 29901 cd01252: Cytohesin Pleckstrin homology (PH) domain. Cytohesin is an ARF guanine nucleotide exchange factor (GEF), which has a Sec7-type Arf-GEF domain and a pleckstrin homology domain. It specifically binds phosphatidylinositol-3,4,5-trisphosphate (PtdIns(3,4,5)P3) via its PH domain and it acts as a PI 3-kinase effector mediating biological responses such as cell adhesion and membrane trafficking. PH domains are only found in eukaryotes. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 29902 cd01253: Beta-spectrin pleckstrin homology (PH) domain. Beta spectrin binds actin and functions as a major component of the cytoskeleton underlying cellular membranes. Beta spectrin consists of multiple spectrin repeats followed by a PH domain, which binds to inositol-1,4,5-trisphosphate. 
PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. PH domains are often involved in targeting proteins to the plasma membrane via lipid binding. PH domains are found in cellular signaling proteins such as serine/threonine kinases, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 29903 cd01254: Phospholipase D (PLD) pleckstrin homology (PH) domain. PLD hydrolyzes phosphatidylcholine to phosphatidic acid (PtdOH), which can bind target proteins. PLD contains a PH domain, a PX domain and four conserved PLD signature domains. The PLD PH domain is specific for bisphosphorylated inositides. PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 29904 cd01255: TIAM Pleckstrin homology (PH) domain. TIAM (T-cell invasion and metastasis) is a guanine nucleotide exchange factor specific for RAC1. It consists of an N-terminal PH domain followed by a Raf-like Ras-binding domain (RBD), a PDZ domain, a RhoGEF (DH) domain and a PH domain. PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains.
This subfamily contains the alignment of the PH domain that follows the DH domain. 29905 cd01256: Dynamin pleckstrin homology (PH) domain. Dynamin is a GTPase that regulates endocytic vesicle formation. It has an N-terminal GTPase domain, followed by a PH domain, a GTPase effector domain and a C-terminal proline arginine rich domain. Dynamin-like proteins, which are found in metazoa, plants and yeast have the same domain architecture as dynamin, but lack the PH domain. PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 29906 cd01257: Insulin receptor substrate (IRS) pleckstrin homology (PH) domain. PH domains are only found in eukaryotes, and are often involved in targeting proteins to the plasma membrane via lipid binding. PH domains are found in cellular signaling proteins such as serine/threonine kinases, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. The IRS PH domain targets IRS molecules to the plasma membrane, usually in response to insulin stimulation. 29907 cd01258: Syntrophin pleckstrin homology (PH) domain. Syntrophins are peripheral membrane proteins, which associate with the Duchenne muscular dystrophy protein dystrophin and other proteins to form the dystrophin glycoprotein complex (DGC). There are five syntrophin isoforms, alpha1, beta1, beta2, gamma1, and gamma2. They all contain two PH domains, with the N-terminal PH domain interrupted by a PDZ domain. The N-terminal PH domain of alpha1-syntrophin binds phosphatidylinositol 4,5-bisphosphate.
PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 29908 cd01259: Apbb1ip (Amyloid beta (A4) Precursor protein-Binding, family B, member 1 Interacting Protein) pleckstrin homology (PH) domain. Apbb1ip consists of a Ras-associated domain and a PH domain. PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 29909 cd01260: Connector enhancer of KSR (Kinase suppressor of ras) (CNK) pleckstrin homology (PH) domain. CNK is believed to regulate the activity and the subcellular localization of RAS activated RAF. CNK is composed of N-terminal SAM and PDZ domains along with a central or C-terminal PH domain. PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinases, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 29910 cd01261: Son of Sevenless (SOS) Pleckstrin homology (PH) domain. SOS is a Ras guanine nucleotide exchange factor. It has a RhoGEF (DbH) domain, a PH domain, and a RasGEF domain. The SOS PH domain can bind to inositol 1,4,5-trisphosphate. PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 29911 cd01262: 3-Phosphoinositide dependent protein kinase 1 (PDK1) pleckstrin homology (PH) domain. PDK1 contains an N-terminal serine/threonine kinase domain followed by a PH domain. Following binding of the PH domain to PtdIns(3,4,5)P3 and PtdIns(3,4)P2, PDK1 activates kinases such as Akt (PKB). PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 29912 cd01263: Anillin Pleckstrin homology (PH) domain. Anillin is an actin binding protein involved in cytokinesis. It has a C-terminal PH domain, which has been shown to be necessary, but not sufficient, for targeting of anillin to ectopic septin containing foci.
PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinases, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 29913 cd01264: Melted pleckstrin homology (PH) domain. The melted protein has a C-terminal PH domain. PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinases, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 29914 cd01265: PARIS-1 pleckstrin homology (PH) domain. PARIS-1 contains a PH domain and a TBC-type GTPase catalytic domain. PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains.
29915 cd01266: Gab (Grb2-associated binder) pleckstrin homology (PH) domain. The Gab subfamily includes several Gab proteins, Drosophila DOS and C. elegans SOC-1. They are scaffolding adaptor proteins, which possess N-terminal PH domains and a C-terminus with proline-rich regions and multiple phosphorylation sites. Following activation of growth factor receptors, Gab proteins are tyrosine phosphorylated and activate PI3K, which generates 3-phosphoinositide lipids. By binding to these lipids via the PH domain, Gab proteins remain in proximity to the receptor, leading to further signaling. While not all Gab proteins depend on the PH domain for recruitment, it is required for Gab activity. PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 29916 cd01267: Phosphotyrosine-binding (PTB) domain, phosphotyrosine-interaction (PI) domain. PTB domains have a PH-like fold and are found in various eukaryotic signaling molecules. They were initially identified based upon their ability to recognize phosphorylated tyrosine residues. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. More recent studies have found that some types of PTB domains can bind to peptides which are not tyrosine phosphorylated or lack tyrosine residues altogether. 29917 cd01268: Numb Phosphotyrosine-binding (PTB) domain. Numb is a membrane associated adaptor protein, which is a determinant of asymmetric cell division. Numb has an N-terminal PTB domain. 
PTB domains have a PH-like fold and are found in various eukaryotic signaling molecules. They were initially identified based upon their ability to recognize phosphorylated tyrosine residues. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. More recent studies have found that some types of PTB domains can bind to peptides which are not tyrosine phosphorylated or lack tyrosine residues altogether. 29919 cd01270: DYC-1 (DYB-1 binding and Capon related) Phosphotyrosine-binding (PTB) domain. DYC-1 contains an N-terminal PTB domain. PTB domains have a PH-like fold and are found in various eukaryotic signaling molecules. They were initially identified based upon their ability to recognize phosphorylated tyrosine residues. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. The PTB domains of both SHC and IRS-1, for example, bind to an NPXpY sequence. More recent studies have found that some types of PTB domains can bind to peptides which are not tyrosine phosphorylated, whereas other PTB domains can bind motifs lacking tyrosine residues altogether. 29920 cd01271: Fe65 C-terminal Phosphotyrosine-binding (PTB) domain. Fe65 is an amyloid beta A4 precursor protein (APP)-binding protein. It contains an N-terminal WW domain followed by two PTB domains. The C-terminal PTB domain is responsible for APP binding. PTB domains have a PH-like fold and are found in various eukaryotic signaling molecules. They were initially identified based upon their ability to recognize phosphorylated tyrosine residues. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine.
More recent studies have found that some types of PTB domains can bind to peptides which are not tyrosine phosphorylated or lack tyrosine residues altogether. 29921 cd01272: Fe65 Phosphotyrosine-binding (PTB) domain. Fe65 is an amyloid beta A4 precursor protein-binding protein. It contains an N-terminal WW domain followed by two PTB domains. PTB domains have a PH-like fold and are found in various eukaryotic signaling molecules. They were initially identified based upon their ability to recognize phosphorylated tyrosine residues. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. More recent studies have found that some types of PTB domains can bind to peptides which are not tyrosine phosphorylated or lack tyrosine residues altogether. 29922 cd01273: CED-6 Phosphotyrosine-binding (PTB) domain. CED-6 is an adaptor protein involved in the engulfment of apoptotic cells. It has a C-terminal PTB domain, which can bind to NPXY motifs. PTB domains have a PH-like fold and are found in various eukaryotic signaling molecules. They were initially identified based upon their ability to recognize phosphorylated tyrosine residues. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. More recent studies have found that some types of PTB domains can bind to peptides which are not tyrosine phosphorylated or lack tyrosine residues altogether. 29923 cd01274: AIDA-1b Phosphotyrosine-binding (PTB) domain. AIDA-1b is an amyloid-beta precursor protein interacting protein. It consists of ankyrin repeats, a SAM domain and a C-terminal PTB domain. PTB domains have a PH-like fold and are found in various eukaryotic signaling molecules.
They were initially identified based upon their ability to recognize phosphorylated tyrosine residues. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. More recent studies have found that some types of PTB domains can bind to peptides which are not tyrosine phosphorylated or lack tyrosine residues altogether. 29924 cd00258: GM2 activator protein (GM2-AP) is a non-enzymatic lysosomal protein that acts as cofactor in the sequential degradation of gangliosides. GM2A is an essential cofactor for beta-hexosaminidase A (Hex A) in the enzymatic hydrolysis of GM2 ganglioside to GM3. Mutation of the gene results in the AB variant of Tay-Sachs disease. GM2-AP and similar proteins belong to the ML domain family. 29925 cd00912: The ML (MD-2-related lipid-recognition) domain is present in MD-1, MD-2, GM2 activator protein, Niemann-Pick type C2 (Npc2) protein, phosphatidylinositol/phosphatidylglycerol transfer protein (PG/PI-TP), mite allergen Der p 2 and several proteins of unknown function in plants, animals and fungi. These single-domain proteins form two anti-parallel beta-pleated sheets stabilized by three disulfide bonds and with an accessible central hydrophobic cavity, and are predicted to mediate diverse biological functions through interaction with specific lipids. 29926 cd00915: MD-1 and MD-2 are cofactors required for LPS signaling through cell surface receptors. MD-2 and its binding partner, Toll-like receptor 4 (TLR4), are essential for the innate immune responses of mammalian cells to bacterial lipopolysaccharide (LPS); MD-2 directly binds the lipid A moiety of LPS. The TLR4-like receptor, RP105, which mediates LPS-induced lymphocyte proliferation, interacts with MD-1; MD-1 enhances RP105-mediated LPS-induced growth of B cells. These proteins belong to the ML domain family.
29927 cd00916: Niemann-Pick type C2 (Npc2) is a lysosomal protein in which a mutation in the gene causes a rare form of Niemann-Pick type C disease, an autosomal recessive lipid storage disorder characterized by accumulation of low-density lipoprotein-derived cholesterol in lysosomes. Although Npc2 is known to bind cholesterol, the function of this protein is unknown. These proteins belong to the ML domain family. 29928 cd00917: The phosphatidylinositol/phosphatidylglycerol transfer protein (PG/PI-TP) has been shown to bind phosphatidylglycerol and phosphatidylinositol, but the biological significance of this is still obscure. These proteins belong to the ML domain family. 29929 cd00918: Several group 2 allergen proteins belong to the ML domain family. They include Dermatophagoides pteronyssinus, group 2 (Der p 2) and D. farinae, group 2 (Der f 2) allergens. These house dust mites cause severe atopic diseases such as asthma and dermatitis. Although the allergenic properties of these proteins have been well characterized, their biological function in mites is unknown. 29930 cd00919: Heme-copper oxidase subunit I. Heme-copper oxidases are transmembrane protein complexes in the respiratory chains of prokaryotes and mitochondria which catalyze the reduction of O2 and simultaneously pump protons across the membrane. The superfamily is diverse in terms of electron donors, subunit composition, and heme types. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. It has been proposed that Archaea acquired heme-copper oxidases through gene transfer from Gram-positive bacteria. Membership in the superfamily is defined by subunit I, which contains a heme-copper binuclear center (the active site where O2 is reduced to water) formed by a high-spin heme and a copper ion. It also contains a low-spin heme, believed to participate in the transfer of electrons to the binuclear center. Only subunit I is common to the entire superfamily.
For every reduction of an O2 molecule, eight protons are taken from the inside aqueous compartment and four electrons are taken from the electron donor on the opposite side of the membrane. The four electrons and four of the protons are used in the reduction of O2; the four remaining protons are pumped across the membrane. This charge separation of four charges contributes to the electrochemical gradient used for ATP synthesis. Two proton channels, the D-pathway and K-pathway, leading to the binuclear center have been identified in subunit I of cytochrome c oxidase (CcO) and ubiquinol oxidase. A well-defined pathway for the transfer of pumped protons beyond the binuclear center has not been identified. Electron transfer occurs in two segments: from the electron donor to the low-spin heme, and from the low-spin heme to the binuclear center. The first segment can be a multi-step process and varies among the different families, while the second segment, a direct transfer, is consistent throughout the superfamily. 29931 cd01660: ba3-like heme-copper oxidase subunit I. The ba3 family of heme-copper oxidases are transmembrane protein complexes in the respiratory chains of prokaryotes and some archaea which catalyze the reduction of O2 and simultaneously pump protons across the membrane. It has been proposed that Archaea acquired heme-copper oxidases through gene transfer from Gram-positive bacteria. The ba3 family contains oxidases that lack the conserved residues that form the D- and K-pathways in CcO and ubiquinol oxidase. Instead they contain a potential alternative K-pathway. Additional proton channels have been proposed for this family of oxidases but none have been identified definitively. For general information on the heme-copper oxidase superfamily, please see cd00919. 29932 cd01661: Cytochrome cbb3 oxidase subunit I. 
Cytochrome cbb3 oxidase, the terminal oxidase in the respiratory chains of proteobacteria, is a multi-chain transmembrane protein located in the cell membrane. Like other cytochrome oxidases, it catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. Found mainly in proteobacteria, cbb3 is believed to be a modern enzyme that has evolved independently to perform a specialized function in microaerobic energy metabolism. Subunit I contains a heme-copper binuclear center (the active site where O2 is reduced to water) formed by a high-spin heme and a copper ion. It also contains a low-spin heme, believed to participate in the transfer of electrons to the binuclear center. The cbb3 operon contains four genes (ccoNOQP or fixNOQP), with ccoN coding for subunit I. Instead of a CuA-containing subunit II analogous to other cytochrome oxidases, cbb3 utilizes subunits ccoO and ccoP, which contain one and two hemes, respectively, to transfer electrons to the binuclear center. The fourth subunit (ccoQ) has been shown to protect the core complex from proteolytic degradation by serine proteases. For every reduction of an O2 molecule, eight protons are taken from the inside aqueous compartment and four electrons are taken from cytochrome c on the opposite side of the membrane. The four electrons and four of the protons are used in the reduction of O2; the four remaining protons are pumped across the membrane. This charge separation of four charges contributes to the electrochemical gradient used for ATP synthesis. The polar residues that form the D- and K-pathways in subunit I of other cytochrome c and ubiquinol oxidases are absent in cbb3. The proton pathways remain undefined. A pathway for the transfer of pumped protons beyond the binuclear center also remains undefined. 
It is believed that electrons are passed from cytochrome c (the electron donor) to the low-spin heme via ccoP and ccoO, respectively, and directly from the low-spin heme to the binuclear center. 29933 cd01662: Ubiquinol oxidase subunit I. Ubiquinol oxidase, the terminal oxidase in the respiratory chains of aerobic bacteria, is a multi-chain transmembrane protein located in the cell membrane. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits in ubiquinol oxidase varies from two to five. Subunit I contains a heme-copper binuclear center (the active site where O2 is reduced to water) formed by a high-spin heme and a copper ion. It also contains a low-spin heme, believed to participate in the transfer of electrons from ubiquinol to the binuclear center. For every reduction of an O2 molecule, eight protons are taken from the inside aqueous compartment and four electrons are taken from ubiquinol on the opposite side of the membrane. The four electrons and four of the protons are used in the reduction of O2; the four remaining protons are pumped across the membrane. This charge separation of four charges contributes to the electrochemical gradient used for ATP synthesis. Two proton channels, the D-pathway and K-pathway, leading to the binuclear center have been identified in subunit I. It is generally believed that the channels contain water molecules that act as 'proton wires' to transfer the protons. A well-defined pathway for the transfer of pumped protons beyond the binuclear center has not been identified. Electrons are believed to be transferred directly from ubiquinol (the electron donor) to the low-spin heme, and directly from the low-spin heme to the binuclear center. 29934 cd01663: Cytochrome C oxidase subunit I. 
Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Only subunits I and II are essential for function, but subunit III, which is also conserved, may play a role in assembly or oxygen delivery to the active site. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Subunit I contains a heme-copper binuclear center (the active site where O2 is reduced to water) formed by a high-spin heme (heme a3) and a copper ion (CuB). It also contains a low-spin heme (heme a), believed to participate in the transfer of electrons to the binuclear center. For every reduction of an O2 molecule, eight protons are taken from the inside aqueous compartment and four electrons are taken from cytochrome c on the opposite side of the membrane. The four electrons and four of the protons are used in the reduction of O2; the four remaining protons are pumped across the membrane. This charge separation of four charges contributes to the electrochemical gradient used for ATP synthesis. Two proton channels, the D-pathway and K-pathway, leading to the binuclear center have been identified in subunit I. A well-defined pathway for the transfer of pumped protons beyond the binuclear center has not been identified. Electrons are transferred from cytochrome c (the electron donor) to heme a via the CuA binuclear site in subunit II, and directly from heme a to the binuclear center. 29935 cd00344: Fructose-bisphosphate aldolase class I. 
Fructose-1,6-bisphosphate aldolase is an enzyme of the glycolytic and gluconeogenic pathways found in vertebrates, plants, and bacteria. The enzyme catalyzes the cleavage of fructose 1,6-bisphosphate to glyceraldehyde 3-phosphate and dihydroxyacetone phosphate (DHAP). Mutations in the aldolase genes in humans cause hemolytic anemia and hereditary fructose intolerance. The enzyme is a member of the class I aldolase family, which utilizes covalent catalysis through a Schiff base formed between a lysine residue of the enzyme and ketose substrates. Although structurally similar, the class II aldolases use a different mechanism and are believed to have an independent evolutionary origin. 29936 cd00408: Dihydrodipicolinate synthase family. A member of the class I aldolases, which use an active-site lysine that stabilizes a reaction intermediate via Schiff base formation and have a TIM beta/alpha barrel fold. The dihydrodipicolinate synthase family comprises several pyruvate-dependent class I aldolases that use the same catalytic step to catalyze different reactions in different pathways and includes such proteins as N-acetylneuraminate lyase, MosA protein, 5-keto-4-deoxy-glucarate dehydratase, trans-o-hydroxybenzylidenepyruvate hydratase-aldolase, trans-2'-carboxybenzalpyruvate hydratase-aldolase, and 2-keto-3-deoxygluconate aldolase. The family is also referred to as the N-acetylneuraminate lyase (NAL) family. 29938 cd00452: KDPG and KHG aldolase. This family belongs to the class I aldolases whose reaction mechanism involves Schiff base formation between a substrate carbonyl and lysine residue in the active site. 2-keto-3-deoxy-6-phosphogluconate (KDPG) aldolase is best known for its role in the Entner-Doudoroff pathway of bacteria, where it catalyzes the reversible cleavage of KDPG to pyruvate and glyceraldehyde-3-phosphate.
2-keto-4-hydroxyglutarate (KHG) aldolase has enzymatic specificity toward glyoxylate, forming KHG in the presence of pyruvate, and is capable of regulating glyoxylate levels in the glyoxylate bypass, an alternate pathway when bacteria are grown on acetate carbon sources. 29939 cd00502: Type I 3-dehydroquinase (3-dehydroquinate dehydratase or DHQase). Catalyzes the cis-dehydration of 3-dehydroquinate via a covalent imine intermediate to produce dehydroshikimate. Dehydroquinase is the third enzyme in the shikimate pathway, which is involved in the biosynthesis of aromatic amino acids. Type I DHQase exists as a homodimer. Type II 3-dehydroquinase also catalyzes the same overall reaction, but is unrelated in terms of sequence and structure, and utilizes a completely different reaction mechanism. 29940 cd00945: Class I aldolases. The class I aldolases use an active-site lysine that stabilizes reaction intermediates via Schiff base formation, and have a TIM beta/alpha barrel fold. The members of this family include 2-keto-3-deoxy-6-phosphogluconate (KDPG) and 2-keto-4-hydroxyglutarate (KHG) aldolases, transaldolase, dihydrodipicolinate synthase sub-family, Type I 3-dehydroquinate dehydratase, DeoC and DhnA proteins, and metal-independent fructose-1,6-bisphosphate aldolase. Although structurally similar, the class II aldolases use a different mechanism and are believed to have an independent evolutionary origin. 29941 cd00948: Fructose-1,6-bisphosphate aldolase. The enzyme catalyzes the cleavage of fructose 1,6-bisphosphate to glyceraldehyde 3-phosphate and dihydroxyacetone phosphate (DHAP). This family includes proteins found in vertebrates, plants, and bacterial plant pathogens. Mutations in the aldolase genes in humans cause hemolytic anemia and hereditary fructose intolerance. The enzyme is a member of the class I aldolase family, which utilizes covalent catalysis through a Schiff base formed between a lysine residue of the enzyme and ketose substrates.
29942 cd00949: Fructose-1,6-bisphosphate aldolase found in Gram-positive and Gram-negative bacteria. The enzyme catalyzes the cleavage of fructose 1,6-bisphosphate to glyceraldehyde 3-phosphate and dihydroxyacetone phosphate (DHAP). The enzyme is a member of the class I aldolase family, which utilizes covalent catalysis through a Schiff base formed between a lysine residue of the enzyme and ketose substrates. 29943 cd00950: Dihydrodipicolinate synthase (DHDPS) is a key enzyme in lysine biosynthesis. It catalyzes the aldol condensation of L-aspartate-beta-semialdehyde and pyruvate to dihydrodipicolinic acid via a Schiff base formation between pyruvate and a lysine residue. The functional enzyme is a homotetramer consisting of a dimer of dimers. DHDPS is a member of the dihydrodipicolinate synthase family, which comprises several pyruvate-dependent class I aldolases that use the same catalytic step to catalyze different reactions in different pathways. 29944 cd00951: 5-dehydro-4-deoxyglucarate dehydratase, also called 5-keto-4-deoxy-glucarate dehydratase (KDGDH), which is a member of the dihydrodipicolinate synthase (DHDPS) family that comprises several pyruvate-dependent class I aldolases. The enzyme is involved in glucarate metabolism, and its mechanism presumably involves a Schiff-base intermediate similar to members of the DHDPS family. In Pseudomonas sp., 5-dehydro-4-deoxy-D-glucarate is degraded by KDGDH to 2,5-dioxopentanoate, whereas in certain species of Enterobacteriaceae it is degraded instead to pyruvate and glycerate. 29945 cd00952: Trans-o-hydroxybenzylidenepyruvate hydratase-aldolase (HBPHA) and trans-2'-carboxybenzalpyruvate hydratase-aldolase (CBPHA). HBPHA catalyzes the conversion of HBP to salicylaldehyde and pyruvate. This reaction is part of the degradative pathways for naphthalene and naphthalenesulfonates by bacteria. CBPHA is homologous to HBPHA and catalyzes the cleavage of CBP to 2-carboxybenzaldehyde and pyruvate during the degradation of phenanthrene.
They are members of the DHDPS family of Schiff-base-dependent class I aldolases. 29946 cd00953: KDG (2-keto-3-deoxygluconate) aldolases found in archaea. This subfamily of enzymes is adapted for high thermostability and shows specificity for non-phosphorylated substrates. The enzyme catalyses the reversible aldol cleavage of 2-keto-3-deoxygluconate to pyruvate and glyceraldehyde, the third step of a modified non-phosphorylated Entner-Doudoroff pathway of glucose oxidation. KDG aldolase shows no significant sequence similarity to microbial 2-keto-3-deoxyphosphogluconate (KDPG) aldolases, and the enzyme shows no activity with glyceraldehyde 3-phosphate as substrate. The enzyme is a tetramer and a member of the DHDPS family of Schiff-base-dependent class I aldolases. 29947 cd00954: N-Acetylneuraminic acid aldolase, also called N-acetylneuraminate lyase (NAL), which catalyses the reversible aldol reaction of N-acetyl-D-mannosamine and pyruvate to give N-acetyl-D-neuraminic acid (D-sialic acid). It is widely applied as a biocatalyst for the synthesis of sialic acid and its derivatives. This enzyme has been shown to be quite specific for pyruvate as the donor, but flexible to a variety of D- and, to some extent, L-hexoses and pentoses as acceptor substrates. NAL is a member of the dihydrodipicolinate synthase family, which comprises several pyruvate-dependent class I aldolases. 29948 cd00955: Transaldolase-like proteins from plants and bacteria. Transaldolase is found in the non-oxidative branch of the pentose phosphate pathway, where it catalyzes the reversible transfer of a dihydroxyacetone group from fructose-6-phosphate to erythrose-4-phosphate, yielding sedoheptulose-7-phosphate and glyceraldehyde-3-phosphate. They are members of the class I aldolases, which are characterized by using a Schiff-base mechanism for stabilization of the reaction intermediates. 
29949 cd00956: Transaldolase-like fructose-6-phosphate aldolases (FSA) found in bacteria and archaea, which are members of the MipB/TalC subfamily of class I aldolases. FSAs catalyze an aldol cleavage of fructose 6-phosphate and do not utilize fructose, fructose 1-phosphate, fructose 1,6-bisphosphate, or dihydroxyacetone phosphate. The enzymes belong to the transaldolase family that serves in transfer reactions in the pentose phosphate cycle, and are more distantly related to fructose 1,6-bisphosphate aldolase. 29950 cd00957: Transaldolases including both TalA and TalB. The enzyme catalyses the reversible transfer of a dihydroxyacetone moiety, derived from fructose-6-phosphate, to erythrose-4-phosphate, yielding sedoheptulose-7-phosphate and glyceraldehyde-3-phosphate. The catalytic mechanism is similar to other class I aldolases. The enzyme is found in the non-oxidative branch of the pentose phosphate pathway and forms a dimer in solution. 29951 cd00958: Class I fructose-1,6-bisphosphate (FBP) aldolases of the archaeal type (DhnA homologs) found in bacteria and archaea. Catalysis of the enzymes proceeds via a Schiff-base mechanism like other class I aldolases, although this subfamily is clearly divergent based on sequence similarity to other class I and class II (metal dependent) aldolase subfamilies. 29952 cd00959: 2-deoxyribose-5-phosphate aldolase (DERA) of the DeoC family. DERA belongs to the class I aldolases and catalyzes a reversible aldol reaction between acetaldehyde and glyceraldehyde 3-phosphate to generate 2-deoxyribose 5-phosphate. DERA is unique in catalyzing the aldol reaction between two aldehydes, and its broad substrate specificity confers considerable utility as a biocatalyst, offering an environmentally benign alternative to chiral transition metal catalysis of the asymmetric aldol reaction. 29953 cd00515: NTPase/HAM1. This family consists of the HAM1 protein and pyrophosphate-releasing xanthosine/inosine triphosphatase. 
HAM1 protects the cell against mutagenesis by the base analog 6-N-hydroxylaminopurine (HAP) in E. coli and S. cerevisiae. A Ham1-related protein from Methanococcus jannaschii is a novel NTPase that has been shown to hydrolyze nonstandard nucleotides such as XTP to XMP and ITP to IMP, but not the standard nucleotides, in the presence of Mg or Mn ions. The enzyme exists as a homodimer. The HAM1 protein may be acting as an NTPase by hydrolyzing the HAP triphosphate. 29954 cd00555: Nucleotide binding protein Maf. Maf has been implicated in inhibition of septum formation in eukaryotes, bacteria and archaea, but homologs in B. subtilis and S. cerevisiae are nonessential for cell division. Maf has been predicted to be a nucleotide- or nucleic acid-binding protein with structural similarity to the hypoxanthine/xanthine NTP pyrophosphatase Ham1 from Methanococcus jannaschii, RNase H from Escherichia coli, and some other nucleotide or RNA-binding proteins. 29956 cd00221: Very Short Patch Repair (Vsr) Endonuclease. Endonucleases in DNA repair that recognize damaged DNA and cleave the phosphodiester backbone. Vsr endonucleases have a common endonuclease topology that has been tailored for recognition of TG mismatches. 29957 cd00523: Holliday junction resolvases (HJRs) are endonucleases that specifically resolve Holliday junction DNA intermediates during homologous recombination. HJRs occur in archaea, bacteria, and the mitochondria of certain eukaryotes; however, this CD includes only the archaeal HJRs. The bacterial and archaeal HJRs perform a similar function but differ in both sequence and structure. Structural similarity does, however, exist between the archaeal HJRs and type II restriction endonucleases, such as EcoRV, BglII, and FokI, and this similarity includes their active site configurations. 29958 cd00583: MutH is a 28kD endonuclease involved in methyl-directed DNA mismatch repair in gram negative bacteria. 
MutH is both sequence-specific and methylation-specific, introducing a nick in the unmethylated strand of a hemi-methylated d(GATC) DNA duplex. MutH is homologous to the type II restriction endonuclease Sau3AI, which also recognizes the d(GATC) sequence; however, Sau3AI cleaves both strands regardless of their methylation state. The active form of MutH is monomeric while that of Sau3AI is homodimeric. In addition to MutH, MutS, involved in mismatch recognition, and MutL, involved in mediating the interactions between MutH and MutS, are essential in initiating mismatch repair in Escherichia coli. 29959 cd01037: Superfamily of nucleases including Very Short Patch Repair (Vsr) Endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. 29960 cd01038: Domain of unknown function, appears to be related to a diverse group of endonucleases. 29962 cd02431: Ferritin_CCC1_like_C: The proteins of this family contain two domains. This is the C-terminal domain that is closely related to CCC1, a vacuole transmembrane protein functioning as an iron and manganese transporter. The N-terminal domain is similar to ferritin-like diiron-carboxylate proteins, which are involved in a variety of iron ion related functions, such as iron storage and regulation, mono-oxygenation, and reactive radical production. This family may be unique to certain bacteria and archaea. 29963 cd02432: Nodulin-21_like_1: This is a family of proteins closely related to nodulin-21, a plant nodule-specific protein that may be involved in symbiotic nitrogen fixation. This family is also related to CCC1, a yeast vacuole transmembrane protein that functions as an iron and manganese transporter. 29964 cd02433: Nodulin-21_like_2: This is a family of proteins closely related to nodulin-21, a plant nodule-specific protein that may be involved in symbiotic nitrogen fixation. 
This family is also related to CCC1, a yeast vacuole transmembrane protein that functions as an iron and manganese transporter. 29965 cd02434: Nodulin-21_like_3: This is a family of proteins closely related to nodulin-21, a plant nodule-specific protein that may be involved in symbiotic nitrogen fixation. This family is also related to CCC1, a yeast vacuole transmembrane protein that functions as an iron and manganese transporter. 29966 cd02435: CCC1: This domain is present in CCC1, an iron and manganese transporter of Saccharomyces cerevisiae. CCC1 is a transmembrane protein that is located in the vacuole and transfers iron and manganese ions from the cytosol to the vacuole. This domain may be unique to certain fungi and plants. 29967 cd02436: Nodulin-21: This is a family of proteins that may be unique to certain plants. The family member in soybean is found to be nodule-specific and is abundant during nodule development. The proteins of this family thus may play a role in symbiotic nitrogen fixation. 29968 cd02437: CCC1_like_1: This is a protein family closely related to CCC1, a family of proteins involved in iron and manganese transport. Yeast CCC1 is a vacuole transmembrane protein responsible for iron and manganese accumulation in the vacuole. 29969 cd01066: A family including aminopeptidase P, aminopeptidase M, and prolidase. Also known as metallopeptidase family M24. This family of enzymes is able to cleave amido-, imido- and amidino-containing bonds. Members exhibit relatively narrow substrate specificity compared to other metallo-aminopeptidases, suggesting they play roles in regulation of biological processes rather than general protein degradation. 29970 cd01085: X-Prolyl Aminopeptidase 2. E.C. 3.4.11.9. Also known as X-Pro aminopeptidase, proline aminopeptidase, aminopeptidase P, and aminoacylproline aminopeptidase. 
Catalyses release of any N-terminal amino acid, including proline, that is linked with proline, even from a dipeptide or tripeptide. 29971 cd01086: Methionine Aminopeptidase 1. E.C. 3.4.11.18. Also known as methionyl aminopeptidase and Peptidase M. Catalyzes release of N-terminal amino acids, preferentially methionine, from peptides and arylamides. 29972 cd01087: Prolidase. E.C. 3.4.13.9. Also known as Xaa-Pro dipeptidase, X-Pro dipeptidase, proline dipeptidase, imidodipeptidase, peptidase D, gamma-peptidase. Catalyses hydrolysis of Xaa-Pro dipeptides; also acts on aminoacyl-hydroxyproline analogs. No action on Pro-Pro. 29973 cd01088: Methionine Aminopeptidase 2. E.C. 3.4.11.18. Also known as methionyl aminopeptidase and peptidase M. Catalyzes release of N-terminal amino acids, preferentially methionine, from peptides and arylamides. 29975 cd01090: Creatine amidinohydrolase. E.C. 3.5.3.3. Hydrolyzes creatine to sarcosine and urea. 29976 cd01091: Related to aminopeptidase P and aminopeptidase M, a member of this domain family is present in cell division control protein 68, a transcription factor. 29977 cd01092: Similar to Prolidase and Aminopeptidase P. The members of this subfamily presumably catalyse hydrolysis of Xaa-Pro dipeptides and/or release of any N-terminal amino acid, including proline, that is linked with proline. 29978 cd00454: Truncated hemoglobins (trHbs) are a family of oxygen-binding heme proteins found in cyanobacteria, eubacteria, unicellular eukaryotes, and plants. The truncated hemoglobins have a characteristic two-over-two alpha helical folding pattern that is distinct from the three-over-three pattern found in other globins. A subset of these have been demonstrated to form homodimers. 29979 cd01040: Globins are heme proteins, which bind and transport oxygen. 
This family summarizes a diverse set of homologous protein domains, including: (1) tetrameric vertebrate hemoglobins, which are the major protein component of erythrocytes and transport oxygen in the bloodstream, (2) microorganismal flavohemoglobins, which are linked to C-terminal FAD-dependent reductase domains, (3) homodimeric bacterial hemoglobins, such as from Vitreoscilla, (4) plant leghemoglobins (symbiotic hemoglobins, involved in nitrogen metabolism in plant rhizomes), (5) plant non-symbiotic hexacoordinate globins and hexacoordinate globins from bacteria and animals, such as neuroglobin, (6) invertebrate hemoglobins, which may occur in tandem-repeat arrangements, and (7) monomeric myoglobins found in animal muscle tissue. 29980 cd01067: Superfamily containing globins and truncated hemoglobins. 29981 cd01068: Globin domain present in Globin-Coupled-Sensors (GCS). These domains detect changes in intracellular concentrations of oxygen, carbon monoxide, or nitrous oxide, which result in aerotaxis and/or gene regulation. One subgroup, the HemATs, are aerotactic heme sensors combining a globin with an MCP signaling domain; others function as gene regulators, by direct combination with DNA-binding domains, with domains modulating 2nd messengers, or with domains interacting with transcription factors or regulators. 29982 cd00544: Adenosylcobinamide kinase / adenosylcobinamide phosphate guanyltransferase (CobU). CobU is a bifunctional cobalamin biosynthesis enzyme which displays adenosylcobinamide kinase and adenosylcobinamide phosphate guanyltransferase activity. This enzyme is a homotrimer with a propeller-like shape. 29983 cd00561: ATP:corrinoid adenosyltransferase BtuR/CobO/CobP. This family consists of the BtuR, CobO, CobP proteins, all of which are Cob(I)alamin (vitamin B12) adenosyltransferases, which are involved in cobalamin (vitamin B12) biosynthesis. 
This enzyme is a homodimer, which catalyzes the adenosylation reaction: ATP + cob(I)alamin + H2O <=> phosphate + diphosphate + adenosylcobalamin. 29984 cd00983: RecA is a bacterial enzyme which has roles in homologous recombination, DNA repair, and the induction of the SOS response. RecA couples ATP hydrolysis to DNA strand exchange. 29985 cd00984: DnaB helicase C terminal domain. The hexameric helicase DnaB unwinds the DNA duplex at the chromosome replication fork. Although the mechanism by which DnaB both couples ATP hydrolysis to translocation along DNA and denatures the duplex is unknown, a change in the quaternary structure of the protein involving dimerization of the N-terminal domain has been observed and may occur during the enzymatic cycle. This C-terminal domain contains an ATP-binding site and is therefore probably the site of ATP hydrolysis. 29986 cd01120: RecA-like NTPases. This family includes the NTP binding domain of F1 and V1 H+ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. This group also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion. 29987 cd01121: Sms (bacterial radA) DNA repair protein. This protein is not related to archaeal radA any more than it is to other RecA-like NTPases. Sms has a role in recombination and recombinational repair and is responsible for the stabilization or processing of branched DNA molecules. 29988 cd01122: GP4d_helicase is a homohexameric 5'-3' helicase. Helicases couple NTP hydrolysis to the unwinding of nucleic acid duplexes into their component strands. 29989 cd01123: Rad51_DMC1_radA,B. This group of recombinases includes the eukaryotic proteins RAD51, RAD55/57 and the meiosis-specific protein DMC1, and the archaeal proteins radA and radB. They are closely related to the bacterial RecA group. 
Rad51 proteins catalyze a recombination reaction similar to that of RecA, using ATP-dependent DNA binding activity and a DNA-dependent ATPase. However, this reaction is less efficient and requires accessory proteins such as RAD55/57. 29990 cd01124: KaiC is a circadian clock protein primarily found in cyanobacteria. KaiC is a RecA-like ATPase, having both Walker A and Walker B motifs. A related protein is found in archaea. 29991 cd01125: Hexameric Replicative Helicase RepA. RepA is encoded by a plasmid, which is found in most Gram negative bacteria. RepA is a 5'-3' DNA helicase which can utilize ATP, GTP and CTP to a lesser extent. 29992 cd01126: The TraG/TraD/VirD4 family are bacterial conjugation proteins involved in type IV secretion. These proteins aid the transfer of DNA from the plasmid into the host bacterial chromosome. They contain an ATP binding domain. VirD4 is involved in DNA transfer to plant cells and is required for virulence. 29993 cd01127: Bacterial conjugation protein TrwB, ATP binding domain. TrwB is a homohexamer encoded by conjugative plasmids in Gram-negative bacteria. TrwB also has an all alpha domain which has been hypothesized to be responsible for DNA binding. TrwB is a component of Type IV secretion and is responsible for the horizontal transfer of DNA between bacteria. 29994 cd01128: Transcription termination factor rho is a bacterial ATP-dependent RNA/DNA helicase. It is a homohexamer. Each monomer consists of an N-terminal domain of the OB fold, which is responsible for binding to cytosine-rich nucleotides. This alignment is of the C-terminal ATP binding domain. 29995 cd01129: PulE/GspE. The type II secretory pathway is the main terminal branch of the general secretory pathway (GSP). It is responsible for the export of the majority of Gram-negative bacterial exoenzymes and toxins. PulE is a cytoplasmic protein of the GSP, which contains an ATP binding site and a tetracysteine motif. This subgroup also includes PilB and HofB. 
29996 cd01130: Type IV secretory pathway component VirB11, and related ATPases. The homohexamer VirB11 is one of eleven Vir proteins, which are required for T-pilus biogenesis and virulence in the transfer of T-DNA from the Ti (tumor-inducing) plasmid of bacteria to plant cells. The pilus is a fibrous cell surface organelle, which mediates adhesion between bacteria during conjugative transfer or between bacteria and host eukaryotic cells during infection. VirB11-related ATPases include the archaeal flagella biosynthesis protein and the pilus assembly proteins CpaF/TadA and TrbB. This alignment contains the C-terminal domain, which is the ATPase. 29997 cd01131: Pilus retraction ATPase PilT. PilT is a nucleotide binding protein responsible for the retraction of type IV pili, likely by pili disassembly. This retraction provides the force required for travel of bacteria in low water environments by a mechanism known as twitching motility. 29998 cd01132: F1 ATP synthase alpha, central domain. The F-ATPase is found in bacterial plasma membranes, mitochondrial inner membranes and in chloroplast thylakoid membranes. It has also been found in the archaeon Methanosarcina barkeri. It uses a proton gradient to drive ATP synthesis and hydrolyzes ATP to build the proton gradient. The extrinsic membrane domain, F1, is composed of alpha, beta, gamma, delta and epsilon subunits with a stoichiometry of 3:3:1:1:1. The alpha subunit of the F1 ATP synthase can bind nucleotides, but is non-catalytic. 29999 cd01133: F1 ATP synthase beta subunit, nucleotide-binding domain. The F-ATPase is found in bacterial plasma membranes, mitochondrial inner membranes and in chloroplast thylakoid membranes. It has also been found in the archaeon Methanosarcina barkeri. It uses a proton gradient to drive ATP synthesis and hydrolyzes ATP to build the proton gradient. The extrinsic membrane domain, F1, is composed of alpha, beta, gamma, delta and epsilon subunits with a stoichiometry of 3:3:1:1:1. 
The beta subunit of ATP synthase is catalytic. 30000 cd01134: V/A-type ATP synthase catalytic subunit A. These ATPases couple ATP hydrolysis to the build up of a H+ gradient, but V-type ATPases do not catalyze the reverse reaction. The Vacuolar (V-type) ATPase is found in the membranes of vacuoles, the Golgi apparatus and in other coated vesicles in eukaryotes. Archaea have a protein which is similar in sequence to V-ATPases, but functions like an F-ATPase (called A-ATPase). A similar protein is also found in a few bacteria. 30001 cd01135: V/A-type ATP synthase (non-catalytic) subunit B. These ATPases couple ATP hydrolysis to the build up of a H+ gradient, but V-type ATPases do not catalyze the reverse reaction. The Vacuolar (V-type) ATPase is found in the membranes of vacuoles, the Golgi apparatus and in other coated vesicles in eukaryotes. Archaea have a protein which is similar in sequence to V-ATPases, but functions like an F-ATPase (called A-ATPase). A similar protein is also found in a few bacteria. This subfamily consists of the non-catalytic beta subunit. 30002 cd01136: Flagellum-specific ATPase/type III secretory pathway virulence-related protein. This group of ATPases is responsible for the export of flagellum and virulence-related proteins. The bacterial flagellar motor is similar to the F0F1-ATPase, in that they both are proton driven rotary molecular devices. However, the main function of the bacterial flagellar motor is to rotate the flagellar filament for cell motility. Intracellular pathogens such as Salmonella and Chlamydia also have proteins which are similar to the flagellar-specific ATPase, but function in the secretion of virulence-related proteins via the type III secretory pathway. 30003 cd01393: RecA is a bacterial enzyme which has roles in homologous recombination, DNA repair, and the induction of the SOS response. RecA couples ATP hydrolysis to DNA strand exchange. 
While prokaryotes have a single RecA protein, eukaryotes have multiple RecA homologs such as Rad51, DMC1 and Rad55/57. Archaea have the RecA-like homologs radA and radB. 30004 cd01394: RadB. The archaeal protein radB shares similarity with radA, the archaeal functional homologue to the bacterial RecA. The precise function of radB is unclear. 30005 cd00497: PseudoU_synth_TruA: Pseudouridine synthase, TruA family. This group consists of eukaryotic, bacterial and archaeal pseudouridine synthases similar to Escherichia coli TruA, Saccharomyces cerevisiae Pus1p, S. cerevisiae Pus3p, Caenorhabditis elegans Pus1p and human PUS1. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). No cofactors are required. S. cerevisiae PUS1 catalyzes the formation of psi34 and psi36 in the intron containing tRNAIle, psi35 in the intron containing tRNATyr, psi27 and/or psi28 in several yeast cytoplasmic tRNAs, and psi44 in U2 small nuclear RNA (U2 snRNA). The presence of the intron is required for the formation of psi 34, 35 and 36. In addition S. cerevisiae PUS1 makes psi 26, 65 and 67. C. elegans Pus1p does not modify psi44 in U2 snRNA. S. cerevisiae Pus3p makes psi38 and psi39 in tRNAs. Psi44 in U2 snRNA and psi38 and psi39 in tRNAs are highly phylogenetically conserved. Psi 26, 27, 28, 34, 35, 36, 65 and 67 in tRNAs are less highly conserved. Mouse Pus1p regulates nuclear receptor activity through pseudouridylation of Steroid Receptor RNA Activator. Missense mutation in human PUS1 causes mitochondrial myopathy and sideroblastic anemia (MLASA). 30006 cd00506: PseudoU_synth_TruB: Pseudouridine synthase, TruB family. This group consists of eukaryotic, bacterial and archaeal pseudouridine synthases similar to Escherichia coli TruB, Saccharomyces cerevisiae Pus4, M. tuberculosis TruB, S. cerevisiae Cbf5 and human dyskerin. 
Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). No cofactors are required. E. coli TruB, M. tuberculosis TruB and S. cerevisiae Pus4 make psi55 in the T loop of tRNAs. Pus4 catalyses the formation of psi55 in both cytoplasmic and mitochondrial tRNAs. Psi55 is almost universally conserved. S. cerevisiae Cbf5 and human dyskerin are nucleolar proteins that, with the help of guide RNAs, make the hundreds of pseudouridines present in rRNA and small nuclear RNAs (snRNAs). Cbf5/Dyskerin is the catalytic subunit of eukaryotic box H/ACA small nucleolar ribonucleoprotein (snoRNP) particles. Mutations in human dyskerin cause X-linked dyskeratosis congenita. 30008 cd02550: PseudoU_synth_Rsu_Rlu: Pseudouridine synthase, Rsu/Rlu family. This group is comprised of eukaryotic, bacterial and archaeal proteins similar to eight site-specific Escherichia coli pseudouridine synthases: RsuA, RluA, RluB, RluC, RluD, RluE, RluF and TruA. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi), requiring no cofactors. E. coli RluC, for example, makes psi955, 2504 and 2580 in 23S RNA. Some psi sites, such as psi1917 in 23S RNA made by RluD, are universally conserved. Other psi sites occur in a more restricted fashion; for example, psi2819 in 21S mitochondrial ribosomal RNA made by S. cerevisiae Pus5p is only found in mitochondrial large subunit rRNAs from some other species and in gram negative bacteria. The E. coli counterpart of this psi residue is psi2580 in 23S rRNA. psi2604 in 23S RNA made by RluF has only been detected in E. coli. 30010 cd02553: PseudoU_synth_RsuA: Pseudouridine synthase, Escherichia coli RsuA like. This group is comprised of eukaryotic and bacterial proteins similar to Escherichia coli RsuA. 
Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). No cofactors are required. E. coli RsuA makes psi516 in 16S RNA. Psi at this position is not generally conserved in other organisms. 30011 cd02554: PseudoU_synth_RluF_like: Pseudouridine synthase, Escherichia coli RluF like. This group is comprised of bacterial proteins similar to Escherichia coli RluF. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). No cofactors are required. E. coli RluF makes psi2604 in 23S RNA. psi2604 has only been detected in E. coli. It is absent from other eubacteria despite a precursor U at that site, and from eukarya and archaea, which lack a precursor U at that site. 30012 cd02555: PSSA_1: Pseudouridine synthase, a subgroup of the RsuA family. This group is comprised of bacterial proteins assigned to the RsuA family of pseudouridine synthases. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). No cofactors are required. The RsuA family is comprised of proteins related to Escherichia coli RsuA. 30013 cd02556: PseudoU_synth_RluB: Pseudouridine synthase, Escherichia coli RluB like. This group is comprised of bacterial and eukaryotic proteins similar to E. coli RluB. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). No cofactors are required. E. coli RluB makes psi2605 in 23S RNA. psi2605 has been detected in eubacteria but not in eukarya and archaea, despite the presence of a precursor U at that site. 30015 cd02558: PSRA_1: Pseudouridine synthase, a subgroup of the RluA family. This group is comprised of bacterial proteins assigned to the RluA family of pseudouridine synthases. 
Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). No cofactors are required. The RluA family is comprised of proteins related to Escherichia coli RluA. 30016 cd02563: tRNA pseudouridine isomerase C: Pseudouridine synthases catalyze the isomerization of specific uridines in a tRNA molecule to pseudouridines (5-ribosyluracil, psi). No cofactors are required. TruC makes psi65 in tRNAs. This psi residue is not universally conserved. 30017 cd02566: PseudoU_synth_RluE: Pseudouridine synthase, Escherichia coli RluE. This group is comprised of bacterial proteins similar to E. coli RluE. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). No cofactors are required. Escherichia coli RluE makes psi2457 in 23S RNA. psi2457 is not universally conserved. 30019 cd02569: PseudoU_synth_ScPus3-like: Pseudouridine synthase, Saccharomyces cerevisiae Pus3 like. This group consists of eukaryotic pseudouridine synthases similar to S. cerevisiae Pus3p, mouse Pus3p and human PUS2. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). No cofactors are required. S. cerevisiae Pus3p makes psi38 and psi39 in tRNAs. Mouse Pus3p has been shown to make psi38, and possibly also psi39, in tRNAs. Psi38 and psi39 are highly conserved in tRNAs from eubacteria, archaea and eukarya. 30020 cd02570: PseudoU_synth_EcTruA: Pseudouridine synthase, Escherichia coli TruA like. This group consists of eukaryotic and bacterial pseudouridine synthases similar to E. coli TruA, Pseudomonas aeruginosa truA and human pseudouridine synthase-like 1 (PUSL1). Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). No cofactors are required. E. coli TruA makes psi38/39 and/or 40 in tRNA. 
psi38 and psi39 in tRNAs are highly phylogenetically conserved. P. aeruginosa truA is required for induction of type III secretory genes and may act through modifying tRNAs critical for the expression of type III genes or their regulators. 30021 cd02572: PseudoU_synth_hDyskerin_Like: Pseudouridine synthase, human dyskerin like. This group consists of eukaryotic and archaeal pseudouridine synthases similar to human dyskerin, Saccharomyces cerevisiae Cbf5, and Drosophila melanogaster Mfl (minifly protein). Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). No cofactor is required. S. cerevisiae Cbf5 and human dyskerin are nucleolar proteins that, with the help of guide RNAs, make the hundreds of pseudouridines present in rRNA and small nuclear RNAs (snRNAs). Cbf5/Dyskerin is the catalytic subunit of eukaryotic box H/ACA small nucleolar ribonucleoprotein (snoRNP) particles. D. melanogaster mfl hosts, in its fourth intron, a box H/ACA snoRNA gene. In addition, dyskerin is likely to have a structural role in the telomerase complex. Mutations in human dyskerin cause X-linked dyskeratosis congenita. Mutations in Drosophila Mfl result in miniflies that suffer abnormalities. 30022 cd02573: PseudoU_synth_EcTruB: Pseudouridine synthase, Escherichia coli TruB like. This group consists of bacterial pseudouridine synthases similar to E. coli TruB and Mycobacterium tuberculosis TruB. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). E. coli TruB and M. tuberculosis TruB make psi55 in the T loop of tRNAs. Psi55 is nearly universally conserved. E. coli TruB is not inhibited by RNA containing 5-fluorouridine. 30026 cd02866: PseudoU_synth_archea: Pseudouridine synthase. 
This group consists of archaeal pseudouridine synthases. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). No cofactors are required. This group of proteins makes pseudouridine in tRNAs. 30027 cd02867: PseudoU_synth_TruB_4: Pseudouridine synthase homolog 4. This group consists of eukaryotic TruB proteins similar to Saccharomyces cerevisiae Pus4. S. cerevisiae Pus4 makes psi55 in the T loop of both cytoplasmic and mitochondrial tRNAs. Psi55 is almost universally conserved. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). 30029 cd02869: PseudoU_synth_RsuA/RluD: Pseudouridine synthase, RsuA/RluD family. This group is comprised of eukaryotic, bacterial and archaeal proteins similar to eight site-specific Escherichia coli pseudouridine synthases: RsuA, RluA, RluB, RluC, RluD, RluE, RluF and TruA. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi), requiring no cofactors. E. coli RluC, for example, makes psi955, 2504 and 2580 in 23S RNA. Some psi sites, such as psi1917 in 23S RNA made by RluD, are universally conserved. Other psi sites occur in a more restricted fashion; for example, psi2819 in 21S mitochondrial ribosomal RNA made by S. cerevisiae Pus5p is only found in mitochondrial large subunit rRNAs from some other species and in gram negative bacteria. The E. coli counterpart of this psi residue is psi2580 in 23S rRNA. psi2604 in 23S RNA made by RluF has only been detected in E. coli. 30030 cd02870: Pseudouridine synthases are responsible for the synthesis of pseudouridine from uracil in ribosomal RNA. The RsuA subfamily includes Pseudouridine Synthase similar to Ribosomal small subunit pseudouridine 516 synthase. Most of the proteins in this family are bacterial proteins. 
30031 cd00375: Urease alpha-subunit; Urease is a nickel-dependent metalloenzyme that catalyzes the hydrolysis of urea to form ammonia and carbon dioxide. Nickel-dependent ureases are found in bacteria, fungi and plants. Their primary role is to allow the use of external and internally generated urea as a nitrogen source. The enzyme consists of 3 subunits, alpha, beta and gamma, which can be fused and present on a single protein chain and which in turn forms multimers, mainly trimers. The large alpha subunit is the catalytic domain containing an active site with a bi-nickel center complexed by a carbamylated lysine. The beta and gamma subunits play a role in subunit association to form the higher order trimers. 30033 cd00530: Phosphotriesterase (PTE) catalyzes the hydrolysis of organophosphate nerve agents, including the chemical warfare agents VX, soman, and sarin, as well as the insecticide paraoxon. PTE exists as a homodimer with one active site per monomer. The active site is located next to a binuclear metal center, at the C-terminal end of a TIM alpha-beta barrel motif. The native enzyme contains two zinc ions at the active site; however, these can be replaced with other metals such as cobalt, cadmium, nickel or manganese and the enzyme remains active. 30034 cd00854: N-acetylglucosamine-6-phosphate deacetylase, NagA, catalyzes the hydrolysis of the N-acetyl group of N-acetyl-glucosamine-6-phosphate (GlcNAc-6-P) to glucosamine 6-phosphate and acetate. This is the first committed step in the biosynthetic pathway to amino-sugar-nucleotides, which are needed for cell wall peptidoglycan and teichoic acid biosynthesis. Deacetylation of N-acetylglucosamine is also important in lipopolysaccharide synthesis and cell wall recycling. 30035 cd01292: Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. 
The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase, dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others. 30036 cd01293: Bacterial cytosine deaminase and related metal-dependent hydrolases. Cytosine deaminases (CDs) catalyze the deamination of cytosine, producing uracil and ammonia. They play an important role in pyrimidine salvage. CDs are present in prokaryotes and fungi, but not mammalian cells. The bacterial enzymes, but not the fungal enzymes, are related to the adenosine deaminases (ADA). The bacterial enzymes are iron dependent and hexameric. 30037 cd01294: Dihydroorotase (DHOase) catalyzes the reversible interconversion of carbamoyl aspartate to dihydroorotate, a key reaction in pyrimidine biosynthesis. In contrast to the large polyfunctional CAD proteins of higher organisms, this group of DHOases is monofunctional and mainly dimeric. 30039 cd01296: Imidazolonepropionase/imidazolone-5-propionate hydrolase (Imidazolone-5PH) catalyzes the third step in the histidine degradation pathway, the hydrolysis of (S)-3-(5-oxo-4,5-dihydro-3H-imidazol-4-yl)propanoate to N-formimidoyl-L-glutamate. In bacteria, the enzyme is part of the histidine utilization (hut) operon. 30041 cd01298: TRZ/ATZ family contains enzymes from the atrazine degradation pathway and related hydrolases. Atrazine, a chlorinated herbicide, can be catabolized by a variety of different bacteria. 
The first three steps of the atrazine dehalogenation pathway are catalyzed by atrazine chlorohydrolase (AtzA), hydroxyatrazine ethylaminohydrolase (AtzB), and N-isopropylammelide N-isopropylaminohydrolase (AtzC). All three enzymes belong to the superfamily of metal dependent hydrolases. AtzA and AtzB, besides other related enzymes, are represented in this CD. 30042 cd01299: Metallo-dependent hydrolases, subgroup A is part of the superfamily of metallo-dependent hydrolases, a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The function of this subgroup is unknown. 30043 cd01300: YtcJ_like metal dependent amidohydrolases. YtcJ is a Bacillus subtilis ORF of unknown function. The Arabidopsis homolog LAF3 has been identified as a factor required for phytochrome A signalling. 30044 cd01301: Renal dipeptidase (rDP), best studied in mammals and also called membrane or microsomal dipeptidase, is a membrane-bound glycoprotein that hydrolyzes dipeptides and is involved in hydrolytic metabolism of penem and carbapenem beta-lactam antibiotics. Although the biological function of the enzyme is still unknown, it has been suggested to play a role in renal glutathione metabolism. 30046 cd01303: Guanine deaminase (GDEase). Guanine deaminase is an aminohydrolase responsible for the conversion of guanine to xanthine and ammonia, the first step to utilize guanine as a nitrogen source. This reaction also removes the guanine base from the pool and therefore can play a role in the regulation of cellular GTP and the guanylate nucleotide pool. 
30047 cd01304: Formylmethanofuran dehydrogenase (FMDH) subunit A; Methanogenic bacteria and archaea derive the energy for autotrophic growth from methanogenesis, the reduction of CO2 with molecular hydrogen as the electron donor. FMDH catalyzes the first step in methanogenesis, the formyl-methanofuran synthesis. In this step, CO2 is bound to methanofuran and subsequently reduced to the formyl state with electrons derived from hydrogen. 30048 cd01305: Predicted chlorohydrolases. These metallo-dependent hydrolases from archaea are part of the superfamily of metallo-dependent hydrolases, a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. They have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. Some members of this subgroup are predicted to be chlorohydrolases. 30049 cd01306: PhnM is believed to be a subunit of the membrane associated C-P lyase complex. C-P lyase is thought to catalyze the direct cleavage of unactivated C-P bonds to yield inorganic phosphate and the corresponding hydrocarbons. It is responsible for cleavage of alkylphosphonates, which are utilized as sole phosphorus sources by many bacteria. 30050 cd01307: Metallo-dependent hydrolases, subgroup B is part of the superfamily of metallo-dependent hydrolases, a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The function of this subgroup is unknown. 
30051 cd01308: Isoaspartyl dipeptidase hydrolyzes the beta-L-isoaspartyl linkages in dipeptides, as part of the degradative pathway to eliminate proteins with beta-L-isoaspartyl peptide bonds, bonds whereby the beta-group of an aspartate forms the peptide link with the amino group of the following amino acid. Formation of this bond is a spontaneous nonenzymatic reaction in nature and can profoundly affect the function of the protein. Isoaspartyl dipeptidase is an octameric enzyme that contains a binuclear zinc center in the active site of each subunit and shows a strong preference for hydrolyzing Asp-Leu dipeptides. 30052 cd01309: Metallo-dependent hydrolases, subgroup C is part of the superfamily of metallo-dependent hydrolases, a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The function of this subgroup is unknown. 30053 cd01310: TatD like proteins; E. coli TatD is a cytoplasmic protein, shown to have magnesium dependent DNase activity. 30054 cd01311: 2-pyrone-4,6-dicarboxylic acid (PDC) hydrolase hydrolyzes PDC to yield 4-oxalomesaconic acid (OMA) or its tautomer, 4-carboxy-2-hydroxymuconic acid (CHM). This reaction is part of the protocatechuate (PCA) 4,5-cleavage pathway. PCA is one of the most important intermediate metabolites in the bacterial pathways for various phenolic compounds, including lignin, which is the most abundant aromatic material in nature. 30055 cd01312: Metallo-dependent hydrolases, subgroup D is part of the superfamily of metallo-dependent hydrolases, a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. 
The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The function of this subgroup is unknown. 30056 cd01313: Metallo-dependent hydrolases, subgroup D is part of the superfamily of metallo-dependent hydrolases, a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The function of this subgroup is unknown. 30057 cd01314: D-hydantoinases (D-HYD), also called dihydropyrimidases (DHPase), and related proteins; DHPases are a family of enzymes that catalyze the reversible hydrolytic ring opening of the amide bond in five- or six-membered cyclic diamides, like dihydropyrimidine or hydantoin. The hydrolysis of dihydropyrimidines is the second step of reductive catabolism of pyrimidines in humans. The hydrolysis of 5-substituted hydantoins in microorganisms leads to enantiomerically pure N-carbamyl amino acids, which are used for the production of antibiotics, peptide hormones, pyrethroids, and pesticides. HYDs are classified depending on their stereoselectivity. This family also includes collapsin response mediator proteins (CRMPs), cytosolic proteins involved in neuronal differentiation and axonal guidance which have strong homology to DHPases, but lack most of the active site residues. 30058 cd01315: L-Hydantoinases (L-HYDs) and Allantoinase (ALN); L-Hydantoinases are members of the dihydropyrimidinase family, which catalyzes the reversible hydrolytic ring opening of dihydropyrimidines and hydantoins (five-membered cyclic diamides used in biotechnology). 
But L-HYDs differ by having L-enantiospecificity and by lacking activity on possible natural substrates such as dihydropyrimidines. Allantoinase catalyzes the hydrolytic cleavage of the five-member ring of allantoin (5-ureidohydantoin) to form allantoic acid. 30059 cd01316: The eukaryotic CAD protein is a trifunctional enzyme of carbamoylphosphate synthetase-aspartate transcarbamoylase-dihydroorotase, which catalyzes the first three steps of de novo pyrimidine nucleotide biosynthesis. Dihydroorotase (DHOase) catalyzes the third step, the reversible interconversion of carbamoyl aspartate to dihydroorotate. 30060 cd01317: Dihydroorotase (DHOase), subgroup IIa; DHOases catalyze the reversible interconversion of carbamoyl aspartate to dihydroorotate, a key reaction in pyrimidine biosynthesis. This subgroup also contains proteins that lack the active site, like unc-33, a C. elegans protein involved in axon growth. 30061 cd01318: Dihydroorotase (DHOase), subgroup IIb; DHOases catalyze the reversible interconversion of carbamoyl aspartate to dihydroorotate, a key reaction in pyrimidine biosynthesis. This group contains the archaeal members of the DHOase family. 30062 cd01319: AMP deaminase (AMPD) catalyzes the hydrolytic deamination of adenosine monophosphate (AMP) at position 6 of the adenine nucleotide ring. AMPD is a diverse and highly regulated eukaryotic key enzyme of the adenylate catabolic pathway. 30063 cd01320: Adenosine deaminase (ADA) is a monomeric zinc dependent enzyme which catalyzes the irreversible hydrolytic deamination of both adenosine and deoxyadenosine to ammonia and inosine or deoxyinosine, respectively. ADA plays an important role in the purine pathway. Low as well as high levels of ADA activity have been linked to several diseases. 30065 cd01324: Cytochrome cbb oxidase CcoQ. Cytochrome cbb3 oxidase, the terminal oxidase in the respiratory chains of proteobacteria, is a multi-chain transmembrane protein located in the cell membrane. 
Like other cytochrome oxidases, it catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. Found exclusively in proteobacteria, cbb3 is believed to be a modern enzyme that has evolved independently to perform a specialized function in microaerobic energy metabolism. The cbb3 operon contains four genes (ccoNOQP or fixNOQP), with ccoN coding for subunit I. Instead of a CuA-containing subunit II analogous to other cytochrome oxidases, cbb3 utilizes subunits ccoO and ccoP, which contain one and two hemes, respectively, to transfer electrons to the binuclear center. ccoQ, the fourth subunit, is a single transmembrane helix protein. It has been shown to protect the core complex from proteolytic degradation by serine proteases. See cd00919, cd01322, or cd01323 for more information on cbb3 oxidase. 30066 cd01341: ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPs 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active. 
30067 cd01436: Mono-ADP-ribosylating toxins catalyze the transfer of ADP_ribose from NAD+ to eukaryotic Elongation Factor 2, halting protein synthesis. A single molecule of delivered toxin is sufficient to kill a cell. These toxins share mono-ADP-ribosylating activity with a variety of bacterial toxins, such as cholera toxin and pertussis toxin. The structural core is homologous to the poly-ADP ribosylating enzymes such as the PARP enzymes and Tankyrase. Diphtheria toxin is encoded by a lysogenic bacteriophage. Both diphtheria toxin and Pseudomonas aeruginosa exotoxin A are multi-domain proteins. These domains provide EF2 ADP_ribosylating, receptor-binding, and intracellular trafficking/transmembrane functions. 30068 cd01437: Poly(ADP-ribose) polymerase (parp) catalytic domain catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active. Poly(ADP-ribose)-like polymerases (PARPs 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length through interactions with telomere repeat binding factor 1. 30069 cd01438: Tankyrases interact with the telomere reverse transcriptase complex (TERT). Tankyrase 1 poly-ADP-ribosylates Telomere Repeat Binding Factor 1 (TRF1) while Tankyrase 2 can poly-ADP-ribosylate itself or TRF1. The tankyrases also contain multiple ankyrin repeats that mediate protein-protein interaction (binding TRF1 and insulin-responsive aminopeptidase) and may function as a complex. 
Overexpression of Tank1 promotes increased telomere length, while overexpressed Tank2 has been shown to promote PARP cleavage-independent cell death (necrosis). 30070 cd01439: Poly(ADP-ribose) polymerases catalyse the covalent attachment of ADP-ribose units from NAD+ to themselves and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. 2,3,7,8-Tetrachlorodibenzo-p-dioxin (TCDD) causes pleiotropic effects in mammalian species through modulating gene expression. TCDD-inducible PARP (TiPARP) is a target of TCDD that may contribute to multiple responses to TCDD by modulating protein function through poly ADP-ribosylation. 30071 cd00342: Porins form aqueous channels for the diffusion of small hydrophilic molecules across the outer membrane. Individual 16-strand anti-parallel beta-barrels form a central pore and trimerize through mainly hydrophobic interactions at the interface. Trimers are stabilized by hydrophilic clamping of Loop L2. Loop 3 bends into the pore, creating an elliptical constriction of about 7 x 11 Angstroms, large enough to allow passage of a glucose molecule without steric hindrance. Removal of the C-terminal residue (usually F) destabilizes the trimer and removal of the 16th beta-sheet abolishes trimerization. Unlike typical membrane proteins, porins lack long hydrophobic stretches. Short turns are found at the smooth, periplasmic end; longer irregular loops are found at the rough, extracellular end. The C-terminal residue forms a salt bridge with the N-terminus. 30073 cd01346: The Maltoporin-like channels (LamB porin) form a trimeric structure which facilitates the diffusion of maltodextrins and other sugars across the outer membrane of Gram-negative bacteria. The membrane channel is formed by an 18-strand antiparallel beta-barrel (18,22). 
Loop 3 folds into the core to constrict pore size. Long irregular loops are found on the extracellular side, while short turns are in the periplasm. Tightly-bound water molecules are found in the eyelet of the passage, and only substrates that can displace and replace the broken hydrogen bonds are likely to enter the pore. In the MPR structure, loops 4, 6, and 9 have the greatest mobility and are highly variable; these are postulated to attract maltodextrins. 30075 cd01351: Aconitase catalytic domain. Aconitase (aconitate hydratase) catalyzes the reversible isomerization of citrate and isocitrate as part of the TCA cycle. Cis-aconitate is formed as an intermediate product during the course of the reaction. In eukaryotes two isozymes of aconitase are known to exist: one found in the mitochondrial matrix and the other found in the cytoplasm. Aconitase, in its active form, contains a 4Fe-4S iron-sulfur cluster; three cysteine residues have been shown to be ligands of the 4Fe-4S cluster. This is the Aconitase core domain, including structural domains 1, 2 and 3, which binds the Fe-S cluster. The aconitase family also contains the following proteins: - Iron-responsive element binding protein (IRE-BP), a cytosolic protein that binds to iron-responsive elements (IREs). IREs are stem-loop structures found in the 5'UTR of ferritin and delta aminolevulinic acid synthase mRNAs, and in the 3'UTR of transferrin receptor mRNA. IRE-BP also expresses aconitase activity. - 3-isopropylmalate dehydratase (isopropylmalate isomerase), the enzyme that catalyzes the second step in the biosynthesis of leucine. - Homoaconitase (homoaconitate hydratase), an enzyme that participates in the alpha-aminoadipate pathway of lysine biosynthesis and that converts cis-homoaconitate into homoisocitric acid. 30076 cd01355: Putative Aconitase X catalytic domain. It is predicted by comparative genomic analysis. The proteins are mainly found in archaea and proteobacteria. 
They are distantly related to the Aconitase family of proteins by sequence similarity and secondary structure prediction. The functions have not yet been experimentally characterized. Thus, the prediction should be treated with caution. 30077 cd01581: Aconitase B catalytic domain. Aconitate hydratase B catalyses the formation of cis-aconitate from citrate as part of the TCA cycle. Aconitase has an active (4FE-4S) and an inactive (3FE-4S) form. The active cluster is part of the catalytic site that interconverts citrate, cis-aconitate and isocitrate. The domain architecture of aconitase B is different from other aconitases in that the catalytic domain is normally found at the C-terminus for other aconitases, but it is at the N-terminus for the B family. It also has a HEAT domain before domain 4 which plays a role in protein-protein interaction. This alignment is the core domain including domains 1, 2 and 3. 30079 cd01583: Aconitase-like catalytic domain of 3-isopropylmalate dehydratase and related uncharacterized proteins. 3-isopropylmalate dehydratase catalyzes the isomerization between 2-isopropylmalate and 3-isopropylmalate, via the formation of 2-isopropylmaleate 3-isopropylmalate. IPMI is involved in fungal and bacterial leucine biosynthesis and is also found in eukaryotes. 30080 cd01584: Mitochondrial aconitase A catalytic domain. Aconitase (also known as aconitate hydratase and citrate hydro-lyase) catalyzes the reversible isomerization of citrate and isocitrate as part of the TCA cycle. Cis-aconitate is formed as an intermediary product during the course of the reaction. In eukaryotes two isozymes of aconitase are known to exist: one found in the mitochondrial matrix and the other found in the cytoplasm. This is the mitochondrial form. The mitochondrial product is coded by a nuclear gene. Most members of this subfamily are mitochondrial but there are some bacterial members. 30081 cd01585: Bacterial Aconitase-like catalytic domain. 
Aconitase (aconitate hydratase or citrate hydrolyase) catalyzes the reversible isomerization of citrate and isocitrate as part of the TCA cycle. Cis-aconitate is formed as an intermediate product during the course of the reaction. This distinct subfamily is found only in bacteria and archaea. Its exact characteristics are not known. 30083 cd00106: Kinesin motor domain. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. In most kinesins, the motor domain is found at the N-terminus (N-type); in some it is found in the middle (M-type) or at the C-terminus (C-type). N-type and M-type kinesins are (+) end-directed motors, while C-type kinesins are (-) end-directed motors, i.e. they transport cargo towards the (-) end of the microtubule. Kinesin motor domains hydrolyze ATP at a rate of about 80 per second, and move along the microtubule at a speed of about 6400 Angstroms per second. To achieve that, kinesin head groups work in pairs. Upon replacing ADP with ATP, a kinesin motor domain increases its affinity for microtubule binding and locks in place. Also, the neck linker binds to the motor domain, which repositions the other head domain through the coiled-coil domain close to a second tubulin dimer, about 80 Angstroms along the microtubule. Meanwhile, ATP hydrolysis takes place, and when the second head domain binds to the microtubule, the first domain again replaces ADP with ATP, triggering a conformational change that pulls the first domain forward. 30084 cd00124: Myosin motor domain. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. 
The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometers per second for myosin I to 4.5 micrometers per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. 30086 cd01364: Kinesin motor domain, BimC/Eg5 spindle pole proteins, which participate in spindle assembly and chromosome segregation during cell division. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. In most kinesins, the motor domain is found at the N-terminus (N-type). N-type kinesins are (+) end-directed motors, i.e. they transport cargo towards the (+) end of the microtubule. Kinesin motor domains hydrolyze ATP at a rate of about 80 per second, and move along the microtubule at a speed of about 6400 Angstroms per second. To achieve that, kinesin head groups work in pairs. Upon replacing ADP with ATP, a kinesin motor domain increases its affinity for microtubule binding and locks in place. Also, the neck linker binds to the motor domain, which repositions the other head domain through the coiled-coil domain close to a second tubulin dimer, about 80 Angstroms along the microtubule. 
Meanwhile, ATP hydrolysis takes place, and when the second head domain binds to the microtubule, the first domain again replaces ADP with ATP, triggering a conformational change that pulls the first domain forward. 30087 cd01365: Kinesin motor domain, KIF1_like proteins. KIF1A (Unc104) transports synaptic vesicles to the nerve terminal; KIF1B has been implicated in transport of mitochondria. Both proteins are expressed in neurons. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. In most kinesins, the motor domain is found at the N-terminus (N-type). N-type kinesins are (+) end-directed motors, i.e. they transport cargo towards the (+) end of the microtubule. In contrast to the majority of dimeric kinesins, most KIF1A/Unc104 kinesins are monomeric motors. A lysine-rich loop in KIF1A binds to the negatively charged C-terminus of tubulin and compensates for the lack of a second motor domain, allowing KIF1A to move processively. 30088 cd01366: Kinesin motor domain, KIFC2/KIFC3/ncd-like carboxy-terminal kinesins. Ncd is a spindle motor protein necessary for chromosome segregation in meiosis. KIFC2/KIFC3-like kinesins have been implicated in motility of the Golgi apparatus as well as dendritic and axonal transport in neurons. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. In this subgroup the motor domain is found at the C-terminus (C-type). C-type kinesins are (-) end-directed motors, i.e. they transport cargo towards the (-) end of the microtubule. Kinesin motor domains hydrolyze ATP at a rate of about 80 per second, and move along the microtubule at a speed of about 6400 Angstroms per second. 
To achieve that, kinesin head groups work in pairs. Upon replacing ADP with ATP, a kinesin motor domain increases its affinity for microtubule binding and locks in place. Also, the neck linker binds to the motor domain, which repositions the other head domain through the coiled-coil domain close to a second tubulin dimer, about 80 Angstroms along the microtubule. Meanwhile, ATP hydrolysis takes place, and when the second head domain binds to the microtubule, the first domain again replaces ADP with ATP, triggering a conformational change that pulls the first domain forward. 30089 cd01367: Kinesin motor domain, KIF2-like group. KIF2 is a protein expressed in neurons, which has been associated with axonal transport and neuron development; alternative splice forms have been implicated in lysosomal translocation. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. In this subgroup the motor domain is found in the middle (M-type) of the protein chain. M-type kinesins are (+) end-directed motors, i.e. they transport cargo towards the (+) end of the microtubule. Kinesin motor domains hydrolyze ATP at a rate of about 80 per second, and move along the microtubule at a speed of about 6400 Angstroms per second (KIF2 may be slower). To achieve that, kinesin head groups work in pairs. Upon replacing ADP with ATP, a kinesin motor domain increases its affinity for microtubule binding and locks in place. Also, the neck linker binds to the motor domain, which repositions the other head domain through the coiled-coil domain close to a second tubulin dimer, about 80 Angstroms along the microtubule. Meanwhile, ATP hydrolysis takes place, and when the second head domain binds to the microtubule, the first domain again replaces ADP with ATP, triggering a conformational change that pulls the first domain forward. 
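The three approximate figures repeated in the kinesin entries (about 80 ATP hydrolyzed per second, about 6400 Angstroms per second, and about 80 Angstroms between successive tubulin dimers) can be checked against each other with simple arithmetic. The Python sketch below is only an illustration of that consistency check; the constant and function names are hypothetical and not part of the database, and the one-ATP-per-step result follows from the rounded values quoted in the text, not from any measurement.

```python
# Rounded figures quoted in the kinesin motor domain descriptions.
ATP_PER_SECOND = 80.0           # approximate ATP hydrolysis rate
SPEED_ANGSTROM_PER_S = 6400.0   # approximate speed along the microtubule
STEP_ANGSTROM = 80.0            # approximate distance to the next tubulin dimer


def steps_per_second(speed: float, step: float) -> float:
    """Return how many steps of length `step` are taken per second at `speed`."""
    return speed / step


def atp_per_step(atp_rate: float, step_rate: float) -> float:
    """Return the number of ATP molecules hydrolyzed per step."""
    return atp_rate / step_rate


if __name__ == "__main__":
    rate = steps_per_second(SPEED_ANGSTROM_PER_S, STEP_ANGSTROM)
    print(f"steps per second: {rate}")
    print(f"ATP per step: {atp_per_step(ATP_PER_SECOND, rate)}")
```

With these rounded inputs the quoted speed corresponds to 80 steps per second, which matches the quoted hydrolysis rate at one ATP per 80 Angstrom step.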
30091 cd01369: Kinesin motor domain, kinesin heavy chain (KHC) or KIF5-like subgroup. Members of this group have been associated with organelle transport. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. In most kinesins, the motor domain is found at the N-terminus (N-type). N-type kinesins are (+) end-directed motors, i.e. they transport cargo towards the (+) end of the microtubule. Kinesin motor domains hydrolyze ATP at a rate of about 80 per second, and move along the microtubule at a speed of about 6400 Angstroms per second. To achieve that, kinesin head groups work in pairs. Upon replacing ADP with ATP, a kinesin motor domain increases its affinity for microtubule binding and locks in place. Also, the neck linker binds to the motor domain, which repositions the other head domain through the coiled-coil domain close to a second tubulin dimer, about 80 Angstroms along the microtubule. Meanwhile, ATP hydrolysis takes place, and when the second head domain binds to the microtubule, the first domain again replaces ADP with ATP, triggering a conformational change that pulls the first domain forward. 30092 cd01370: Kinesin motor domain, KIP3-like subgroup. The yeast kinesin KIP3 plays a role in positioning the mitotic spindle. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. In most kinesins, the motor domain is found at the N-terminus (N-type). N-type kinesins are (+) end-directed motors, i.e. they transport cargo towards the (+) end of the microtubule. Kinesin motor domains hydrolyze ATP at a rate of about 80 per second, and move along the microtubule at a speed of about 6400 Angstroms per second. 
To achieve that, kinesin head groups work in pairs. Upon replacing ADP with ATP, a kinesin motor domain increases its affinity for microtubule binding and locks in place. Also, the neck linker binds to the motor domain, which repositions the other head domain through the coiled-coil domain close to a second tubulin dimer, about 80 Angstroms along the microtubule. Meanwhile, ATP hydrolysis takes place, and when the second head domain binds to the microtubule, the first domain again replaces ADP with ATP, triggering a conformational change that pulls the first domain forward. 30093 cd01371: Kinesin motor domain, kinesins II or KIF3_like proteins. Subgroup of kinesins, which form heterotrimers composed of 2 kinesins and one non-motor accessory subunit. Kinesins II play important roles in ciliary transport, and have been implicated in neuronal transport, melanosome transport, the secretory pathway, and mitosis. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. In this group the motor domain is found at the N-terminus (N-type). N-type kinesins are (+) end-directed motors, i.e. they transport cargo towards the (+) end of the microtubule. Kinesin motor domains hydrolyze ATP at a rate of about 80 per second, and move along the microtubule at a speed of about 6400 Angstroms per second. To achieve that, kinesin head groups work in pairs. Upon replacing ADP with ATP, a kinesin motor domain increases its affinity for microtubule binding and locks in place. Also, the neck linker binds to the motor domain, which repositions the other head domain through the coiled-coil domain close to a second tubulin dimer, about 80 Angstroms along the microtubule. 
Meanwhile, ATP hydrolysis takes place, and when the second head domain binds to the microtubule, the first domain again replaces ADP with ATP, triggering a conformational change that pulls the first domain forward. 30094 cd01372: Kinesin motor domain, KIF4-like subfamily. Members of this group seem to perform a variety of functions, and have been implicated in neuronal organelle transport and chromosome segregation during mitosis. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. In most kinesins, the motor domain is found at the N-terminus (N-type). N-type kinesins are (+) end-directed motors, i.e. they transport cargo towards the (+) end of the microtubule. Kinesin motor domains hydrolyze ATP at a rate of about 80 per second, and move along the microtubule at a speed of about 6400 Angstroms per second. To achieve that, kinesin head groups work in pairs. Upon replacing ADP with ATP, a kinesin motor domain increases its affinity for microtubule binding and locks in place. Also, the neck linker binds to the motor domain, which repositions the other head domain through the coiled-coil domain close to a second tubulin dimer, about 80 Angstroms along the microtubule. Meanwhile, ATP hydrolysis takes place, and when the second head domain binds to the microtubule, the first domain again replaces ADP with ATP, triggering a conformational change that pulls the first domain forward. 30095 cd01373: Kinesin motor domain, KLP2-like subgroup. Members of this subgroup seem to play a role in mitosis and meiosis. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. In most kinesins, the motor domain is found at the N-terminus (N-type). 
N-type kinesins are (+) end-directed motors, i.e. they transport cargo towards the (+) end of the microtubule. Kinesin motor domains hydrolyze ATP at a rate of about 80 per second, and move along the microtubule at a speed of about 6400 Angstroms per second. To achieve that, kinesin head groups work in pairs. Upon replacing ADP with ATP, a kinesin motor domain increases its affinity for microtubule binding and locks in place. Also, the neck linker binds to the motor domain, which repositions the other head domain through the coiled-coil domain close to a second tubulin dimer, about 80 Angstroms along the microtubule. Meanwhile, ATP hydrolysis takes place, and when the second head domain binds to the microtubule, the first domain again replaces ADP with ATP, triggering a conformational change that pulls the first domain forward. 30096 cd01374: Kinesin motor domain, CENP-E/KIP2-like subgroup, involved in chromosome movement and/or spindle elongation during mitosis. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. In most kinesins, the motor domain is found at the N-terminus (N-type). N-type kinesins are (+) end-directed motors, i.e. they transport cargo towards the (+) end of the microtubule. Kinesin motor domains hydrolyze ATP at a rate of about 80 per second, and move along the microtubule at a speed of about 6400 Angstroms per second. To achieve that, kinesin head groups work in pairs. Upon replacing ADP with ATP, a kinesin motor domain increases its affinity for microtubule binding and locks in place. Also, the neck linker binds to the motor domain, which repositions the other head domain through the coiled-coil domain close to a second tubulin dimer, about 80 Angstroms along the microtubule. 
Meanwhile, ATP hydrolysis takes place, and when the second head domain binds to the microtubule, the first domain again replaces ADP with ATP, triggering a conformational change that pulls the first domain forward. 30097 cd01375: Kinesin motor domain, KIF9-like subgroup; might play a role in cell shape remodeling. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. In most kinesins, the motor domain is found at the N-terminus (N-type). N-type kinesins are (+) end-directed motors, i.e. they transport cargo towards the (+) end of the microtubule. Kinesin motor domains hydrolyze ATP at a rate of about 80 per second, and move along the microtubule at a speed of about 6400 Angstroms per second. To achieve that, kinesin head groups work in pairs. Upon replacing ADP with ATP, a kinesin motor domain increases its affinity for microtubule binding and locks in place. Also, the neck linker binds to the motor domain, which repositions the other head domain through the coiled-coil domain close to a second tubulin dimer, about 80 Angstroms along the microtubule. Meanwhile, ATP hydrolysis takes place, and when the second head domain binds to the microtubule, the first domain again replaces ADP with ATP, triggering a conformational change that pulls the first domain forward. 30098 cd01376: Kinesin motor domain, KIF22/Kid-like subgroup. Members of this group might play a role in regulating chromosomal movement along microtubules in mitosis. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. In most kinesins, the motor domain is found at the N-terminus (N-type). N-type kinesins are (+) end-directed motors, i.e. 
they transport cargo towards the (+) end of the microtubule. Kinesin motor domains hydrolyze ATP at a rate of about 80 per second, and move along the microtubule at a speed of about 6400 Angstroms per second. To achieve that, kinesin head groups work in pairs. Upon replacing ADP with ATP, a kinesin motor domain increases its affinity for microtubule binding and locks in place. Also, the neck linker binds to the motor domain, which repositions the other head domain through the coiled-coil domain close to a second tubulin dimer, about 80 Angstroms along the microtubule. Meanwhile, ATP hydrolysis takes place, and when the second head domain binds to the microtubule, the first domain again replaces ADP with ATP, triggering a conformational change that pulls the first domain forward. 30099 cd01377: Myosin motor domain, type II myosins. Myosin II mediates cortical contraction in cell motility, and is the motor in smooth and skeletal muscle. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. 
30100 cd01378: Myosin motor domain, type I myosins. Myosin I generates movement at the leading edge in cell motility, and class I myosins have been implicated in phagocytosis and vesicle transport. Myosin I, an unconventional myosin, does not form dimers. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. 30102 cd01380: Myosin motor domain, type V myosins. Myosins V transport a variety of intracellular cargo processively along actin filaments, such as membranous organelles and mRNA. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. 
Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. 30103 cd01381: Myosin motor domain, type VII myosins. Myosins in this group have been associated with functions in sensory systems such as vision and hearing. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. 30104 cd01382: Myosin motor domain, type VI myosins. Myosin VI is a monomeric myosin, which moves towards the minus-end of actin filaments, in contrast to most other myosins. It has been implicated in endocytosis, secretion, and cell migration. 
This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the minus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. 30105 cd01383: Myosin motor domain, plant-specific type VIII myosins, a subgroup which has been associated with endocytosis, cytokinesis, cell-to-cell coupling and gating at plasmodesmata. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. 
Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. 30106 cd01384: Myosin motor domain, plant-specific type XI myosin, involved in organelle transport. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. 30108 cd01386: Myosin motor domain, type XVIII myosins. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. 
Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. 30109 cd01387: Myosin motor domain, type XV myosins. In vertebrates, myosin XV appears to be expressed in sensory tissue and play a role in hearing. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. 30110 cd00755: Family of activating enzymes (E1) of ubiquitin-like proteins related to the E.coli hypothetical protein ygdL. 
The common reaction mechanism catalyzed by E1-like enzymes begins with a nucleophilic attack of the C-terminal carboxylate of the ubiquitin-like substrate on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of the substrate. The exact function of this family is unknown. 30111 cd00757: ThiF_MoeB_HesA. Family of E1-like enzymes involved in molybdopterin and thiamine biosynthesis. The common reaction mechanism catalyzed by MoeB and ThiF, like other E1 enzymes, begins with a nucleophilic attack of the C-terminal carboxylate of MoaD and ThiS, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS. MoeB, as the MPT synthase (MoaE/MoaD complex) sulfurase, is involved in the biosynthesis of the molybdenum cofactor, a derivative of the tricyclic pterin, molybdopterin (MPT). ThiF catalyzes the adenylation of ThiS, as part of the biosynthesis pathway of thiamin pyrophosphate (vitamin B1). 30115 cd01486: Apg7 is an E1-like protein that activates two different ubiquitin-like proteins, Apg12 and Apg8, and assigns them to specific E2 enzymes, Apg10 and Apg3, respectively. This leads to the covalent conjugation of Apg8 with phosphatidylethanolamine, an important step in autophagy. Autophagy is a dynamic membrane phenomenon for bulk protein degradation in the lysosome/vacuole. 30116 cd01487: E1_ThiF_like. Member of the superfamily of activating enzymes (E1) of the ubiquitin-like proteins. 
The common reaction mechanism catalyzed by E1-like enzymes begins with a nucleophilic attack of the C-terminal carboxylate of the ubiquitin-like substrate on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of the substrate. The exact function of this family is unknown. 30117 cd01488: Ubiquitin activating enzyme (E1) subunit UBA3. UBA3 is part of the heterodimeric activating enzyme (E1), specific for the Rub family of ubiquitin-like proteins (Ubls). E1 enzymes are part of a conjugation cascade, consisting of activating (E1), conjugating (E2), and/or ligating (E3) enzymes, that attaches Ub or Ubls covalently to substrate proteins. E1 activates ubiquitin(-like) proteins by C-terminal adenylation, and subsequently forms a highly reactive thioester bond between its catalytic cysteine and the Ubl's C-terminus. E1 also associates with E2 and promotes ubiquitin transfer to the E2's catalytic cysteine. Post-translational modification by the Rub family of ubiquitin-like proteins (Ublps) activates SCF ubiquitin ligases and is involved in cell cycle control, signaling and embryogenesis. UBA3 contains both the nucleotide-binding motif involved in adenylation and the catalytic cysteine involved in the thioester intermediate and Ublp transfer to E2. 30123 cd00295: RNA 3' phosphate cyclase domain - RNA phosphate cyclases are enzymes that catalyze the ATP-dependent conversion of the 3'-phosphate at the end of RNA into a 2',3'-cyclic phosphodiester bond. The enzymes are conserved in eukaryotes, bacteria and archaea. The exact biological role of this enzyme is unknown, but it has been proposed that it is likely to function in cellular RNA metabolism and processing. RNA phosphate cyclase has been characterized in humans (with at least three isozymes) and E. coli, and it seems to be taxonomically widespread. 
The crystal structure of RNA phosphate cyclase shows that it consists of two domains. The larger domain contains three repeats of a fold originally identified in the bacterial translation initiation factor IF3. 30124 cd00874: RNA 3' phosphate cyclase domain (class II). These proteins function as RNA cyclases to catalyze the ATP-dependent conversion of a 3'-phosphate to a 2',3'-cyclic phosphodiester at the end of an RNA molecule. A conserved catalytic histidine residue is found in all members of this subfamily. 30125 cd00875: RNA 3' phosphate cyclase domain (class I). This subfamily of cyclase-like proteins is encoded in eukaryotic genomes. They lack a conserved catalytic histidine residue required for cyclase activity, so probably do not function as cyclases. They are believed to play a role in ribosomal RNA processing and assembly. 30127 cd01554: The enolpyruvate transferase family includes EPSP synthases and UDP-N-acetylglucosamine enolpyruvyl transferase. Both enzymes catalyze enolpyruvyl transfer. 30128 cd01555: UDP-N-acetylglucosamine enolpyruvyl transferase catalyzes enolpyruvyl transfer as part of the first step in the biosynthesis of peptidoglycan, a component of the bacterial cell wall. The reaction is phosphoenolpyruvate + UDP-N-acetyl-D-glucosamine = phosphate + UDP-N-acetyl-3-(1-carboxyvinyl)-D-glucosamine. This enzyme is of interest as a potential target for anti-bacterial agents. The only other known enolpyruvyl transferase is the related 5-enolpyruvylshikimate-3-phosphate synthase. 30129 cd01556: EPSP synthase domain. 3-phosphoshikimate 1-carboxyvinyltransferase (5-enolpyruvylshikimate-3-phosphate synthase) (EC 2.5.1.19) catalyses the reaction between shikimate-3-phosphate (S3P) and phosphoenolpyruvate (PEP) to form 5-enolpyruvylshikimate-3-phosphate (EPSP), an intermediate in the shikimate pathway leading to aromatic amino acid biosynthesis. 
The reaction is phosphoenolpyruvate + 3-phosphoshikimate = phosphate + 5-O-(1-carboxyvinyl)-3-phosphoshikimate. It is found in bacteria and plants but not animals. The enzyme is the target of the widely used herbicide glyphosate, which has been shown to occupy the active site. In bacteria and plants, it is a single domain protein, while in fungi, the domain is found as part of a multidomain protein with functions that are all part of the shikimate pathway. 30130 cd00354: Fructose-1,6-bisphosphatase, an enzyme that catalyzes the hydrolysis of fructose-1,6-bisphosphate into fructose-6-phosphate and is critical in the gluconeogenesis pathway. The alignment model also includes chloroplastic FBPases and sedoheptulose-1,7-bisphosphatases that play a role in the pentose phosphate pathway (Calvin cycle). 30131 cd01515: Archaeal fructose-1,6-bisphosphatase and related enzymes of inositol monophosphatase family (FBPase class IV). These are Mg++ dependent phosphatases. Members in this family may have both fructose-1,6-bisphosphatase and inositol-monophosphatase activity. In hyperthermophilic archaea, inositol monophosphatase is thought to play a role in the biosynthesis of di-myo-inositol-1,1'-phosphate, an osmolyte unique to hyperthermophiles. 30132 cd01516: Bacterial fructose-1,6-bisphosphatase, glpX-encoded. A dimeric enzyme dependent on Mg(2+). glpX-encoded FBPase (FBPase class II) differs from other members of the inositol-phosphatase superfamily by permutation of secondary structure elements. The core structure around the active site is well preserved. In E. coli, FBPase II is part of the glp regulon, which mediates growth on glycerol or sn-glycerol 3-phosphate as the sole carbon source. 30135 cd01637: Inositol-monophosphatase-like domains. This family of phosphatases is dependent on bivalent metal ions such as Mg++, and many members are inhibited by Li+ (which is thought to displace a bivalent ion in the active site). 
Substrates include fructose-1,6-bisphosphate, inositol poly- and monophosphates, PAP and PAPS, sedoheptulose-1,7-bisphosphate and probably others. 30136 cd01638: CysQ, a 3'-Phosphoadenosine-5'-phosphosulfate (PAPS) 3'-phosphatase, is a bacterial member of the inositol monophosphatase family. It has been proposed that CysQ helps control intracellular levels of PAPS, which is an intermediate in cysteine biosynthesis (a principal route of sulfur assimilation). 30137 cd01639: IMPase, inositol monophosphatase and related domains. A family of Mg++ dependent phosphatases, inhibited by lithium, many of which may act on inositol monophosphate substrate. They dephosphorylate inositol phosphate to generate inositol, which may be recycled into inositol lipids; in eukaryotes IMPase plays a vital role in intracellular signaling. IMPase is one of the proposed targets of Li+ therapy in manic-depressive illness. This family contains some bacterial members of the inositol monophosphatase family classified as SuhB-like. E. coli SuhB has been suggested to participate in posttranscriptional control of gene expression, and its inositol monophosphatase activity doesn't appear to be sufficient for its cellular function. It has been proposed that SuhB plays a role in the biosynthesis of phosphatidylinositol in mycobacteria. 30138 cd01640: IPPase; Inositol polyphosphate-1-phosphatase, a member of the Mg++ dependent family of inositol monophosphatase-like domains, hydrolyzes the 1' position phosphate from inositol 1,3,4-trisphosphate and inositol 1,4-bisphosphate. Members in this group may also exhibit 3'-phosphoadenosine 5'-phosphate phosphatase activity, and they all appear to be inhibited by lithium. IPPase is one of the proposed targets of Li+ therapy in manic-depressive illness. 30139 cd01641: Predominantly bacterial family of Mg++ dependent phosphatases, related to inositol monophosphatases. 
These enzymes may dephosphorylate fructose-1,6-bisphosphate, inositol monophosphate, 3'-phosphoadenosine-5'-phosphate, or similar substrates. 30140 cd01642: Putative fructose-1,6-bisphosphatase or related enzymes of inositol monophosphatase family. These are Mg++ dependent phosphatases. Members in this family may have fructose-1,6-bisphosphatase and/or inositol-monophosphatase activity. Fructose-1,6-bisphosphatase catalyzes the hydrolysis of fructose-1,6-bisphosphate into fructose-6-phosphate and is critical in the gluconeogenesis pathway. 30141 cd01643: Bacterial family of Mg++ dependent phosphatases, related to inositol monophosphatases. These enzymes may dephosphorylate inositol monophosphate or similar substrates. 30163 cd01948: EAL domain. This domain is found in diverse bacterial signaling proteins. It is called EAL after its conserved residues and is also known as domain of unknown function 2 (DUF2). The EAL domain has been shown to stimulate degradation of a second messenger, cyclic di-GMP, and is a good candidate for a diguanylate phosphodiesterase function. Together with the GGDEF domain, EAL might be involved in regulating cell surface adhesiveness in bacteria. 30164 cd01949: Diguanylate-cyclase (DGC) or GGDEF domain: Originally named after a conserved residue pattern, and initially described as domain of unknown function 1 (DUF1). It is widely present in bacteria and often links to a wide range of non-homologous domains in a variety of cell signaling proteins. The domain has been suggested to be homologous to the adenylyl cyclase catalytic domain. This prediction correlates with the functional information available on two GGDEF-containing proteins, namely diguanylate cyclase and phosphodiesterase A of Acetobacter xylinum, both of which regulate the turnover of cyclic diguanosine monophosphate. Together with the EAL domain, GGDEF might be involved in regulating cell surface adhesiveness in bacteria. 30165 cd00293: Usp: Universal stress protein family. 
The universal stress protein Usp is a small cytoplasmic bacterial protein whose expression is enhanced when the cell is exposed to stress agents. Usp enhances the rate of cell survival during prolonged exposure to such conditions, and may provide a general "stress endurance" activity. The crystal structure of Haemophilus influenzae Usp reveals an alpha/beta fold similar to that of the Methanococcus jannaschii MJ0577 protein, which binds ATP, although Usp lacks ATP-binding activity. 30166 cd00553: NAD+ synthase is a homodimer, which catalyzes the final step in de novo nicotinamide adenine dinucleotide (NAD+) biosynthesis, an amide transfer from either ammonia or glutamine to nicotinic acid adenine dinucleotide (NaAD). The conversion of NaAD to NAD+ occurs via an NAD-adenylate intermediate and requires ATP and Mg2+. The intermediate is subsequently cleaved into NAD+ and AMP. In many prokaryotes, such as E. coli, NAD synthetase consists of a single domain and is strictly ammonia dependent. In contrast, eukaryotes and other prokaryotes have an additional N-terminal amidohydrolase domain that prefers glutamine. Interestingly, NAD+ synthases in these prokaryotes can also utilize ammonia as an amide source. 30167 cd01712: ThiI is required for thiazole synthesis in the thiamine biosynthesis pathway. It belongs to the Adenosine Nucleotide Hydrolysis superfamily and is predicted to bind adenosine nucleotides. 30168 cd01713: This domain is found in phosphoadenosine phosphosulphate (PAPS) reductase enzymes or PAPS sulphotransferase. PAPS reductase is part of the adenine nucleotide alpha hydrolases superfamily also including N type ATP PPases and ATP sulphurylases. A highly modified version of the P loop, the fingerprint peptide of mononucleotide-binding proteins, is present in the active site of the protein, which appears to be a positively charged cleft containing a number of conserved arginine and lysine residues. 
Although PAPS reductase has no ATPase activity, it shows a striking similarity to the structure of the ATP pyrophosphatase (ATP PPase) domain of GMP synthetase, indicating that both enzyme families have evolved from a common ancestral nucleotide-binding fold. The enzyme uses thioredoxin as an electron donor for the reduction of PAPS to phospho-adenosine-phosphate (PAP). It is also found in NodP nodulation protein P from Rhizobium meliloti, which has ATP sulphurylase activity (sulphate adenylate transferase). 30169 cd01714: The electron transfer flavoprotein (ETF) serves as a specific electron acceptor for various mitochondrial dehydrogenases. ETF transfers electrons to the main respiratory chain via ETF-ubiquinone oxidoreductase. ETF is a heterodimer that consists of an alpha and a beta subunit and binds one molecule of FAD per dimer. A similar system also exists in some bacteria. The homologous pair of proteins (FixA/FixB) are essential for nitrogen fixation. The beta subunit protein is distantly related to and forms a heterodimer with the alpha subunit. 30170 cd01715: The electron transfer flavoprotein (ETF) serves as a specific electron acceptor for various mitochondrial dehydrogenases. ETF transfers electrons to the main respiratory chain via ETF-ubiquinone oxidoreductase. ETF is a heterodimer that consists of an alpha and a beta subunit and binds one molecule of FAD per dimer. A similar system also exists in some bacteria. The homologous pair of proteins (FixA/FixB) are essential for nitrogen fixation. The alpha subunit of ETF is structurally related to the bacterial nitrogen fixation protein fixB which could play a role in a redox process and feed electrons to ferredoxin. 30172 cd01985: The electron transfer flavoprotein (ETF) serves as a specific electron acceptor for various mitochondrial dehydrogenases. ETF transfers electrons to the main respiratory chain via ETF-ubiquinone oxidoreductase.
ETF is a heterodimer that consists of an alpha and a beta subunit and binds one molecule of FAD per dimer. A similar system also exists in some bacteria. The homologous pair of proteins (FixA/FixB) are essential for nitrogen fixation. The alpha subunit of ETF is structurally related to the bacterial nitrogen fixation protein fixB which could play a role in a redox process and feed electrons to ferredoxin. The beta subunit protein is distantly related to and forms a heterodimer with the alpha subunit. 30174 cd01987: USP domain is located between the N-terminal sensor domain and C-terminal catalytic domain of this Osmosensitive K+ channel histidine kinase family. The family of KdpD sensor kinase proteins regulates the kdpFABC operon responsible for potassium transport. The USP domain is homologous to the universal stress protein Usp. Usp is a small cytoplasmic bacterial protein whose expression is enhanced when the cell is exposed to stress agents. Usp enhances the rate of cell survival during prolonged exposure to such conditions, and may provide a general "stress endurance" activity. 30175 cd01988: The C-terminal domain of a subfamily of Na+/H+ antiporters found in bacteria and archaea. Na+/H+ exchange proteins eject protons from cells, effectively eliminating excess acid from actively metabolising cells. Na+/H+ exchange activity is also crucial for the regulation of cell volume, and for the reabsorption of NaCl across renal, intestinal, and other epithelia. These antiports exchange Na+ for H+ in an electroneutral manner, and this activity is carried out by a family of Na+/H+ exchangers, or NHEs, which are known to be present in both prokaryotic and eukaryotic cells. These exchangers are highly-regulated (glyco)phosphoproteins, which, based on their primary structure, appear to contain 10-12 membrane-spanning regions (M) at the N-terminus and a large cytoplasmic region at the C-terminus.
The transmembrane regions M3-M12 share identity with other members of the family. The M6 and M7 regions are highly conserved. Thus, this is thought to be the region that is involved in the transport of sodium and hydrogen ions. The cytoplasmic region or C-terminal has homology with the universal stress protein (Usp) family. Usp is a small cytoplasmic bacterial protein whose expression is enhanced when the cell is exposed to stress agents. Usp enhances the rate of cell survival during prolonged exposure to such conditions, and may provide a general "stress endurance" activity. 30176 cd01989: The N-terminal domain of Eukaryotic Serine Threonine kinases. The Serine Threonine kinases are enzymes that belong to a very extensive family of proteins which share a conserved catalytic core common with both serine/threonine and tyrosine protein kinases. The N-terminal domain is homologous to the USP family which has an ATP-binding fold. The N-terminal domain is predicted to be involved in ATP binding. 30177 cd01990: This is a subfamily of Adenine nucleotide alpha hydrolases superfamily. Adenine nucleotide alpha hydrolases superfamily includes N type ATP PPases and ATP sulphurylases. It forms an alpha/beta/alpha fold which binds to Adenosine group. This subfamily of proteins probably binds ATP. This domain is about 200 amino acids long with a strongly conserved motif SGGKD at the N terminus. 30178 cd01991: The C-terminal domain of Asparagine Synthase B. This domain is always found associated with an N-terminal amidotransferase domain. Family members that contain this domain catalyse the conversion of aspartate to asparagine. Asparagine synthetase B catalyzes the assembly of asparagine from aspartate, Mg(2+)ATP, and glutamine. The three-dimensional architecture of the N-terminal domain of asparagine synthetase B is similar to that observed for glutamine phosphoribosylpyrophosphate amidotransferase while the molecular motif of the C-domain is reminiscent of that observed for GMP synthetase.
30179 cd01992: N-terminal domain of predicted ATPase of the PP-loop family implicated in cell cycle control [Cell division and chromosome partitioning]. This is a subfamily of Adenine nucleotide alpha hydrolases superfamily. Adenine nucleotide alpha hydrolases superfamily includes N type ATP PPases and ATP sulphurylases. It forms an alpha/beta/alpha fold which binds to Adenosine group. This domain has a strongly conserved motif SGGXD at the N terminus. 30180 cd01993: This is a subfamily of Adenine nucleotide alpha hydrolases superfamily. Adenine nucleotide alpha hydrolases superfamily includes N type ATP PPases and ATP sulphurylases. It forms an alpha/beta/alpha fold which binds to Adenosine group. This subfamily of proteins is predicted to bind ATP. This domain has a strongly conserved motif SGGKD at the N terminus. 30181 cd01994: This is a subfamily of Adenine nucleotide alpha hydrolases superfamily. Adenine nucleotide alpha hydrolases superfamily includes N type ATP PPases and ATP sulphurylases. It forms an alpha/beta/alpha fold which binds to Adenosine group. This subfamily of proteins is predicted to bind ATP. This domain has a strongly conserved motif SGGKD at the N terminus. 30183 cd01996: This is a subfamily of Adenine nucleotide alpha hydrolases superfamily. Adenine nucleotide alpha hydrolases superfamily includes N type ATP PPases and ATP sulphurylases. It forms an alpha/beta/alpha fold which binds to Adenosine group. This subfamily of proteins is predicted to bind ATP. This domain has a strongly conserved motif SGGKD at the N terminus. 30184 cd01997: The C-terminal domain of GMP synthetase. It contains two subdomains: the ATP pyrophosphatase (ATP-PPase) domain, which is close to the N-terminus, and the dimerization domain at the C-terminal end. The ATP-PPase is a twisted, five-stranded parallel beta-sheet sandwiched between helical layers.
It has a signature nucleotide-binding motif, or P-loop, at the end of the first beta strand. The dimerization domain is formed by the C-terminal 115 amino acids in prokaryotic proteins. It is adjacent to the ATP-binding site of the ATP-PPase subdomain. The largest differences between the primary sequences of prokaryotic and eukaryotic GMP synthetase map to the dimerization domain. Eukaryotic GMP synthetase has several large insertions relative to prokaryotes. 30185 cd01998: tRNA methyl transferase. This family represents tRNA(5-methylaminomethyl-2-thiouridine)-methyltransferase which is involved in the biosynthesis of the modified nucleoside 5-methylaminomethyl-2-thiouridine present in the wobble position of some tRNAs. This family of enzymes is present only in bacteria and eukaryotes. The archaeal counterpart of this enzyme performs the same function, but is completely unrelated in sequence. 30186 cd01999: Argininosuccinate synthase. The Argininosuccinate synthase is a urea cycle enzyme that catalyzes the penultimate step in arginine biosynthesis: the ATP-dependent ligation of citrulline to aspartate to form argininosuccinate, AMP and pyrophosphate. In humans, a defect in the AS gene causes citrullinemia, a genetic disease characterized by severe vomiting spells and mental retardation. AS is a homotetrameric enzyme of chains of about 400 amino-acid residues. An arginine seems to be important for the enzyme's catalytic mechanism. The sequences of AS from various prokaryotes, archaebacteria and eukaryotes show significant similarity. 30188 cd00464: Shikimate kinase (SK) is the fifth enzyme in the shikimate pathway, a seven-step biosynthetic pathway which converts erythrose-4-phosphate to chorismic acid, found in bacteria, fungi and plants. Chorismic acid is an important intermediate in the synthesis of aromatic compounds, such as aromatic amino acids, p-aminobenzoic acid, folate and ubiquinone.
Shikimate kinase catalyses the phosphorylation of the 3-hydroxyl group of shikimic acid using ATP. 30189 cd01428: Adenylate kinase (ADK) catalyzes the reversible phosphoryl transfer from adenosine triphosphate (ATP) to adenosine monophosphate (AMP) to yield adenosine diphosphate (ADP). This enzyme is required for the biosynthesis of ADP and is essential for homeostasis of adenosine phosphates. 30190 cd01672: Thymidine monophosphate kinase (TMPK), also known as thymidylate kinase, catalyzes the phosphorylation of thymidine monophosphate (TMP) to thymidine diphosphate (TDP) utilizing ATP as its preferred phosphoryl donor. TMPK represents the rate-limiting step in either de novo or salvage biosynthesis of thymidine triphosphate (TTP). 30191 cd01673: Deoxyribonucleoside kinase (dNK) catalyzes the phosphorylation of deoxyribonucleosides to yield corresponding monophosphates (dNMPs). This family consists of various deoxynucleoside kinases including deoxyribo- cytidine (EC 2.7.1.74), guanosine (EC 2.7.1.113), adenosine (EC 2.7.1.76), and thymidine (EC 2.7.1.21) kinases. They are key enzymes in the salvage of deoxyribonucleosides originating from extra- or intracellular breakdown of DNA. 30194 cd02021: Gluconate kinase (GntK) catalyzes the phosphoryl transfer from ATP to gluconate. The resulting product gluconate-6-phosphate is an important precursor of gluconate metabolism. GntK acts as a dimer composed of two identical subunits. 30195 cd02022: Dephospho-coenzyme A kinase (DPCK, EC 2.7.1.24) catalyzes the phosphorylation of dephosphocoenzyme A (dCoA) to yield CoA, which is the final step in CoA biosynthesis. 30196 cd02023: Uridine monophosphate kinase (UMPK, EC 2.7.1.48), also known as uridine kinase or uridine-cytidine kinase (UCK), catalyzes the reversible phosphoryl transfer from ATP to uridine or cytidine to yield UMP or CMP.
In the pyrimidine nucleotide-salvage pathway, this enzyme combined with nucleoside diphosphate kinases further phosphorylates UMP and CMP to form UTP and CTP. This kinase also catalyzes the phosphorylation of several cytotoxic ribonucleoside analogs such as 5-fluorouridine and cyclopentenyl-cytidine. 30197 cd02024: Nicotinamide riboside kinase (NRK) is an enzyme involved in the metabolism of nicotinamide adenine dinucleotide (NAD+). This enzyme catalyzes the phosphorylation of nicotinamide riboside (NR) to form nicotinamide mononucleotide (NMN). It defines the NR salvage pathway of NAD+ biosynthesis in addition to the pathways through nicotinic acid mononucleotide (NaMN). This enzyme can also phosphorylate the anticancer drug tiazofurin, which is an analog of nicotinamide riboside. 30198 cd02025: Pantothenate kinase (PanK) catalyzes the phosphorylation of pantothenic acid to form 4'-phosphopantothenic acid, which is the first of five steps in the coenzyme A (CoA) biosynthetic pathway. The reaction carried out by this enzyme is a key regulatory point in CoA biosynthesis. 30199 cd02026: Phosphoribulokinase (PRK) is an enzyme involved in the Benson-Calvin cycle in chloroplasts or photosynthetic prokaryotes. This enzyme catalyzes the phosphorylation of D-ribulose 5-phosphate to form D-ribulose 1,5-bisphosphate, using ATP and NADPH produced by the primary reactions of photosynthesis. 30200 cd02027: Adenosine 5'-phosphosulfate kinase (APSK) catalyzes the phosphorylation of adenosine 5'-phosphosulfate to form 3'-phosphoadenosine 5'-phosphosulfate (PAPS). The end-product PAPS is a biologically "activated" sulfate form important for the assimilation of inorganic sulfate. 30201 cd02028: Uridine monophosphate kinase_like (UMPK_like) is a family of proteins highly similar to the uridine monophosphate kinase (UMPK, EC 2.7.1.48), also known as uridine kinase or uridine-cytidine kinase (UCK).
30202 cd02029: Phosphoribulokinase-like (PRK-like) is a family of proteins similar to phosphoribulokinase (PRK), the enzyme involved in the Benson-Calvin cycle in chloroplasts or photosynthetic prokaryotes. PRK catalyzes the phosphorylation of D-ribulose 5-phosphate to form D-ribulose 1,5-bisphosphate, using ATP and NADPH produced by the primary reactions of photosynthesis. 30203 cd02030: NADH:Ubiquinone oxidoreductase, 42 kDa (NDUO42) is a family of proteins that are highly similar to deoxyribonucleoside kinases (dNK). Members of this family have been identified as one of the subunits of NADH:Ubiquinone oxidoreductase (complex I), a multi-protein complex located in the inner mitochondrial membrane. The main function of the complex is to transport electrons from NADH to ubiquinone, which is accompanied by the translocation of protons from the mitochondrial matrix to the intermembrane space. 30204 cd02065: B12 binding domain (B12-BD). Most of the members bind different cobalamin derivatives, like B12 (adenosylcobamide) or methylcobalamin or methyl-Co(III) 5-hydroxybenzimidazolylcobamide. This domain is found in several enzymes, such as glutamate mutase, methionine synthase and methylmalonyl-CoA mutase. Cobalamin undergoes a conformational change on binding the protein; the dimethylbenzimidazole group, which is coordinated to the cobalt in the free cofactor, moves away from the corrin and is replaced by a histidine contributed by the protein. The sequence Asp-X-His-X-X-Gly, which contains this histidine ligand, is conserved in many cobalamin-binding proteins. Not all members of this family contain the conserved binding motif. 30205 cd02067: B12 binding domain (B12-BD). This domain binds different cobalamin derivatives, like B12 (adenosylcobamide) or methylcobalamin or methyl-Co(III) 5-hydroxybenzimidazolylcobamide. It is found in several enzymes, such as glutamate mutase, methionine synthase and methylmalonyl-CoA mutase.
Cobalamin undergoes a conformational change on binding the protein; the dimethylbenzimidazole group, which is coordinated to the cobalt in the free cofactor, moves away from the corrin and is replaced by a histidine contributed by the protein. The sequence Asp-X-His-X-X-Gly, which contains this histidine ligand, is conserved in many cobalamin-binding proteins. 30206 cd02068: B12 binding domain_like associated with radical SAM domain. This domain shows similarity with B12 (adenosylcobamide) binding domains found in several enzymes, such as glutamate mutase, methionine synthase and methylmalonyl-CoA mutase, but it lacks the signature motif Asp-X-His-X-X-Gly, which contains the histidine that acts as a cobalt ligand. The function of this domain remains unclear. 30207 cd02069: B12 binding domain of methionine synthase. This domain binds methylcobalamin, which it uses as an intermediate methyl carrier from methyltetrahydrofolate (CH3H4folate) to homocysteine (Hcy).. 30208 cd02070: B12 binding domain of corrinoid proteins. A family of small methanogenic corrinoid proteins that bind methyl-Co(III) 5-hydroxybenzimidazolylcobamide as a cofactor. They play a role on the methanogenesis from trimethylamine, dimethylamine or monomethylamine, which is initiated by a series of corrinoid-dependent methyltransferases. 30209 cd02071: methylmalonyl CoA mutase B12 binding domain. This domain binds to B12 (adenosylcobamide), which initiates the conversion of succinyl CoA and methylmalonyl CoA by forming an adenosyl radical, which then undergoes a rearrangement exchanging a hydrogen atom with a group attached to a neighboring carbon atom. This family is present in both mammals and bacteria. Bacterial members are heterodimers and involved in the fermentation of pyruvate to propionate. Mammalian members are homodimers and responsible for the conversion of odd-chain fatty acids and branched-chain amino acids via propionyl CoA to succinyl CoA for further degradation. 
30210 cd02072: B12 binding domain of glutamate mutase (Glm). Glutamate mutase catalyzes the interconversion of (S)-glutamate and (2S,3S)-3-methylaspartate. The rearrangement reaction is initiated by the extraction of a hydrogen from the protein-bound substrate by a 5'-deoxyadenosyl radical, which is generated by the homolytic cleavage of the organometallic bond of the cofactor B12. Glm is a heterotetrameric molecule consisting of two alpha and two epsilon polypeptide chains. 30211 cd00286: Tubulin/FtsZ: Family includes tubulin alpha-, beta-, gamma-, delta-, and epsilon-tubulins as well as FtsZ, all of which are involved in polymer formation. Tubulin is the major component of microtubules, but also exists as a heterodimer and as a curved oligomer. Microtubules exist in all eukaryotic cells and are responsible for many functions, including cellular transport, cell motility, and mitosis. FtsZ forms a ring-shaped septum at the site of bacterial cell division, which is required for constriction of cell membrane and cell envelope to yield two daughter cells. FtsZ can polymerize into tubes, sheets, and rings in vitro and is ubiquitous in eubacteria, archaea, and chloroplasts. 30212 cd00396: AIR synthase related protein, N-terminal domain. This family includes Hydrogen expression/formation protein HypE, AIR synthases, FGAM synthase and Selenophosphate synthetase (SelD). The N-terminal domain of AIR synthase forms the dimer interface of the protein, and is suggested as a putative ATP binding domain. 30213 cd00448: YjgF, YER057c, and UK114 belong to a large family of proteins present in bacteria, archaea, and eukaryotes with no definitive function. The conserved domain is similar in structure to chorismate mutase; however, there is no sequence similarity and no functional connection.
Members of this family have been implicated in isoleucine (Yeo7, Ibm1, aldR) and purine (YjgF) biosynthesis, as well as threonine anaerobic degradation (tdcF) and mitochondrial DNA maintenance (Ibm1). This domain homotrimerizes forming a distinct intersubunit cavity that may serve as a small molecule binding site. 30214 cd00554: MECDP_synthase (2-C-methyl-D-erythritol-2,4-cyclodiphosphate synthase), encoded by the ispF gene, catalyzes the formation of 2-C-methyl-D-erythritol 2,4-cyclodiphosphate (MEC) in the nonmevalonate deoxyxylulose (DOXP) pathway for isoprenoid biosynthesis. This pathway is present in some bacterial and apicomplexans but is distinct from that used by mammals. MECDP_synthase forms a homotrimer, built around a beta prism, carrying three active sites, each of which is formed in a cleft between pairs of subunits. 30215 cd02184: YgbB family. The ygbB protein is a putative enzyme of deoxy-xylulose pathway (terpenoid biosynthesis).. 30216 cd02185: Chorismate mutase (AroH) is one of at least five chorismate-utilizing enzymes present in microorganisms which catalyze the rearrangement of chorismate to prephenic acid, the first committed step in the biosynthesis of aromatic amino acids. In prokaryotes, chorismate mutase may be fused to prephenate dehydratase, prephenate dehydrogenase, or 3-deoxy-D-arabino-heptulosonat-7-phosphate (DAHP) as part of a bifunctional enzyme. The AroH domain forms a homotrimer with three-fold symmetry. 30217 cd02186: The tubulin superfamily includes five distinct families, the alpha-, beta-, gamma-, delta-, and epsilon-tubulins and a sixth family (zeta-tubulin) which is present only in kinetoplastid protozoa. The alpha- and beta-tubulins are the major components of microtubules, while gamma-tubulin plays a major role in the nucleation of microtubule assembly. The delta- and epsilon-tubulins are widespread but unlike the alpha, beta, and gamma-tubulins they are not ubiquitous among eukaryotes. 
The alpha/beta-tubulin heterodimer is the structural subunit of microtubules. The alpha- and beta-tubulins share 40% amino-acid sequence identity, exist in several isotype forms, and undergo a variety of posttranslational modifications. The structures of alpha- and beta-tubulin are basically identical: each monomer is formed by a core of two beta-sheets surrounded by alpha-helices. The monomer structure is very compact, but can be divided into three regions based on function: the amino-terminal nucleotide-binding region, an intermediate taxol-binding region and the carboxy-terminal region which probably constitutes the binding surface for motor proteins. 30218 cd02187: The tubulin superfamily includes five distinct families, the alpha-, beta-, gamma-, delta-, and epsilon-tubulins and a sixth family (zeta-tubulin) which is present only in kinetoplastid protozoa. The alpha- and beta-tubulins are the major components of microtubules, while gamma-tubulin plays a major role in the nucleation of microtubule assembly. The delta- and epsilon-tubulins are widespread but unlike the alpha, beta, and gamma-tubulins they are not ubiquitous among eukaryotes. The alpha/beta-tubulin heterodimer is the structural subunit of microtubules. The alpha- and beta-tubulins share 40% amino-acid sequence identity, exist in several isotype forms, and undergo a variety of posttranslational modifications. The structures of alpha- and beta-tubulin are basically identical: each monomer is formed by a core of two beta-sheets surrounded by alpha-helices. The monomer structure is very compact, but can be divided into three regions based on function: the amino-terminal nucleotide-binding region, an intermediate taxol-binding region and the carboxy-terminal region which probably constitutes the binding surface for motor proteins. 30219 cd02188: Gamma-tubulin is a ubiquitous phylogenetically conserved member of tubulin superfamily. 
Gamma is a low abundance protein present within the cells in both various types of microtubule-organizing centers and cytoplasmic protein complexes. Gamma-tubulin recruits the alpha/beta-tubulin dimers that form the minus ends of microtubules and is thought to be involved in microtubule nucleation and capping. 30222 cd02191: FtsZ is a GTPase that is similar to the eukaryotic tubulins and is essential for cell division in prokaryotes. FtsZ is capable of polymerizing in a GTP-driven process into structures similar to those formed by tubulin. FtsZ forms a ring-shaped septum at the site of bacterial cell division, which is required for constriction of cell membrane and cell envelope to yield two daughter cells. 30223 cd02192: This family of unknown function has a predicted N-terminal ATP-binding domain which is similar in sequence to ThiL (Thiamine-monophosphate kinase). This domain belongs to a family of ATP-binding domains that includes hydrogen expression/formation protein HypE, the AIR synthases, FGAM synthase and selenophosphate synthetase (SelD).. 30224 cd02193: Formylglycinamide ribonucleotide amidotransferase (FGAR-AT) catalyzes the ATP-dependent conversion of formylglycinamide ribonucleotide (FGAR) and glutamine to formylglycinamidine ribonucleotide (FGAM), ADP, phosphate, and glutamate in the fourth step of the purine biosynthetic pathway. In eukaryotes and Gram-negative bacteria, FGAR-AT is encoded by the purL gene as a multidomain protein with a molecular mass of about 140 kDa. In Gram-positive bacteria and archaea FGAR-AT is a complex of three proteins: PurS, PurL, and PurQ. PurL itself contains two tandem N- and C-terminal domains (four domains altogether). The N-terminal domains bind ATP and are related to the ATP-binding domains of HypE, ThiL, SelD and PurM. 30225 cd02194: ThiL (Thiamine-monophosphate kinase) plays a dual role in de novo biosynthesis and in salvage of exogenous thiamin. 
Thiamine salvage occurs in two steps, with thiamine kinase catalyzing the formation of thiamine phosphate, and ThiL catalyzing the conversion of this intermediate to thiamine pyrophosphate. The N-terminal domain of ThiL is thought to bind ATP and is related to the ATP-binding domains of hydrogen expression/formation protein HypE, the AIR synthases, FGAM synthase and selenophosphate synthetase (SelD). 30226 cd02195: Selenophosphate synthetase (SelD) catalyzes the conversion of selenium to selenophosphate which is required by a number of bacterial, archaeal and eukaryotic organisms for synthesis of Secys-tRNA, the precursor of selenocysteine in selenoenzymes. The N-terminal domain of SelD is related to the ATP-binding domains of hydrogen expression/formation protein HypE, the AIR synthases, and FGAM synthase and is thought to bind ATP. 30227 cd02196: PurM (Aminoimidazole Ribonucleotide [AIR] synthetase), one of eleven enzymes required for purine biosynthesis, catalyzes the conversion of formylglycinamidine ribonucleotide (FGAM) and ATP to AIR, ADP, and Pi, the fifth step in de novo purine biosynthesis. The N-terminal domain of PurM is related to the ATP-binding domains of hydrogen expression/formation protein HypE, the AIR synthases, selenophosphate synthetase (SelD), and FGAM synthase and is thought to bind ATP. 30228 cd02197: HypE (Hydrogenase expression/formation protein). HypE is involved in Ni-Fe hydrogenase biosynthesis. HypE dehydrates its own carbamoyl moiety in an ATP-dependent process to yield the enzyme thiocyanate. The N-terminal domain of SelD is related to the ATP-binding domains of hydrogen expression/formation protein HypE, the AIR synthases, selenophosphate synthetase (SelD), and FGAM synthase and is thought to bind ATP. 30229 cd02198: YjgH belong to a large family of proteins present in bacteria, archaea, and eukaryotes with no definitive function.
The conserved domain is similar in structure to chorismate mutase however there is no sequence similarity and no functional connection. Members of this family have been implicated in isoleucine (Yeo7, Ibm1, aldR) and purine (YjgF) biosynthesis, as well as threonine anaerobic degradation (tdcF) and mitochondrial DNA maintenance (Ibm1). This domain homotrimerizes forming a distinct intersubunit cavity that may serve as a small molecule binding site. 30230 cd02199: YjgF_like1 belong to a large family of proteins present in bacteria, archaea, and eukaryotes with no definitive function. The conserved domain is similar in structure to chorismate mutase however there is no sequence similarity and no functional connection. Members of this family have been implicated in isoleucine (Yeo7, Ibm1, aldR) and purine (YjgF) biosynthesis, as well as threonine anaerobic degradation (tdcF) and mitochondrial DNA maintenance (Ibm1). This domain homotrimerizes forming a distinct intersubunit cavity that may serve as a small molecule binding site. 30231 cd02201: FtsZ is a GTPase that is similar to the eukaryotic tubulins and is essential for cell division in prokaryotes. FtsZ is capable of polymerizing in a GTP-driven process into structures similar to those formed by tubulin. FtsZ forms a ring-shaped septum at the site of bacterial cell division, which is required for constriction of cell membrane and cell envelope to yield two daughter cells. 30234 cd02204: Formylglycinamide ribonucleotide amidotransferase (FGAR-AT) catalyzes the ATP-dependent conversion of formylglycinamide ribonucleotide (FGAR) and glutamine to formylglycinamidine ribonucleotide (FGAM), ADP, phosphate, and glutamate in the fourth step of the purine biosynthetic pathway. In eukaryotes and Gram-negative bacteria, FGAR-AT is encoded by the purL gene as a multidomain protein with a molecular mass of about 140 kDa. In Gram-positive bacteria and archaea FGAR-AT is a complex of three proteins: PurS, PurL, and PurQ. 
PurL itself contains two tandem N- and C-terminal domains (four domains altogether). The N-terminal domains bind ATP and are related to the ATP-binding domains of HypE, ThiL, SelD and PurM. 30235 cd02691: HypE (Hydrogenase expression/formation protein). HypE is involved in Ni-Fe hydrogenase biosynthesis. HypE dehydrates its own carbamoyl moiety in an ATP-dependent process to yield the enzyme thiocyanate. The N-terminal domain of SelD is related to the ATP-binding domains of hydrogen expression/formation protein HypE, the AIR synthases, selenophosphate synthetase (SelD), and FGAM synthase and is thought to bind ATP. 30236 cd02205: CBS domain; originally identified in cystathionine beta-synthase. This domain is found in a wide range of proteins, often in tandem arrangements and together with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase); retinitis pigmentosa (IMP dehydrogenase-1); congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members); and homocystinuria (cystathionine beta-synthase). 30237 cd02249: Zinc finger, ZZ type. Zinc finger present in dystrophin, CBP/p300 and many other proteins. The ZZ motif coordinates one or two zinc ions and most likely participates in ligand binding or molecular scaffolding. Many proteins containing ZZ motifs have other zinc-binding motifs as well, and the majority serve as scaffolds in pathways involving acetyltransferase, protein kinase, or ubiquitin-related activity.
ZZ proteins can be grouped into the following functional classes: chromatin modifying, cytoskeletal scaffolding, ubiquitin binding or conjugating, and membrane receptor or ion-channel modifying proteins. 30238 cd02334: Zinc finger, ZZ type. Zinc finger present in dystrophin and dystrobrevin. The ZZ motif coordinates two zinc ions and most likely participates in ligand binding or molecular scaffolding. Dystrophin attaches actin filaments to an integral membrane glycoprotein complex in muscle cells. The ZZ domain in dystrophin has been shown to be essential for binding to the membrane protein beta-dystroglycan. 30239 cd02335: Zinc finger, ZZ type. Zinc finger present in ADA2, a putative transcriptional adaptor, and related proteins. The ZZ motif coordinates two zinc ions and most likely participates in ligand binding or molecular scaffolding. 30240 cd02336: Zinc finger, ZZ type. Zinc finger present in RSC8 and related proteins. RSC8 is a component of the RSC complex, which is closely related to the SWI/SNF complex and is involved in remodeling chromatin structure. The ZZ motif coordinates a zinc ion and most likely participates in ligand binding or molecular scaffolding. 30241 cd02337: Zinc finger, ZZ type. Zinc finger present in CBP/p300 and related proteins. The ZZ motif coordinates two zinc ions and most likely participates in ligand binding or molecular scaffolding. CREB-binding protein (CBP) is a large multidomain protein that provides binding sites for transcriptional coactivators, the role of the ZZ domain in CBP/p300 is unclear. 30242 cd02338: Zinc finger, ZZ type. Zinc finger present in potassium channel modulatory factor (PCMF) 1 and related proteins. The ZZ motif coordinates two zinc ions and most likely participates in ligand binding or molecular scaffolding. Human potassium channel modulatory factor 1 or FIGC has been shown to possess intrinsic E3 ubiquitin ligase activity and to promote ubiquitination. 30243 cd02339: Zinc finger, ZZ type. 
Zinc finger present in Drosophila Mind bomb (D-mib) and related proteins. The ZZ motif coordinates two zinc ions and most likely participates in ligand binding or molecular scaffolding. Mind bomb is an E3 ubiquitin ligase that has been shown to regulate signaling by the Notch ligand Delta in Drosophila melanogaster. 30244 cd02340: Zinc finger, ZZ type. Zinc finger present in Drosophila ref(2)P, NBR1, Human sequestosome 1 and related proteins. The ZZ motif coordinates two zinc ions and most likely participates in ligand binding or molecular scaffolding. Drosophila ref(2)P appears to control the multiplication of sigma rhabdovirus. NBR1 (Next to BRCA1 gene 1 protein) interacts with fasciculation and elongation protein zeta-1 (FEZ1) and calcium and integrin binding protein (CIB), and may function in cell signalling pathways. Sequestosome 1 is a phosphotyrosine independent ligand for the Lck SH2 domain and binds noncovalently to ubiquitin via its UBA domain. 30245 cd02341: Zinc finger, ZZ type. Zinc finger present in ZZZ3 (ZZ finger containing 3) and related proteins. The ZZ motif coordinates two zinc ions and most likely participates in ligand binding or molecular scaffolding. 30246 cd02342: Zinc finger, ZZ type. Zinc finger present in plant ubiquitin-associated (UBA) proteins. The ZZ motif coordinates a zinc ion and most likely participates in ligand binding or molecular scaffolding. 30247 cd02343: Zinc finger, ZZ type. Zinc finger present in proteins with an EF_hand motif. The ZZ motif coordinates two zinc ions and most likely participates in ligand binding or molecular scaffolding. 30248 cd02344: Zinc finger, ZZ type. Zinc finger present in HERC2 and related proteins. HERC2 is a potential E3 ubiquitin protein ligase and/or guanine nucleotide exchange factor. The ZZ motif coordinates two zinc ions and most likely participates in ligand binding or molecular scaffolding. 30249 cd02345: Zinc finger, ZZ type. Zinc finger present in Drosophila dah and related proteins. 
The ZZ motif coordinates two zinc ions and most likely participates in ligand binding or molecular scaffolding. Dah (discontinuous actin hexagon) is a membrane associated protein essential for cortical furrow formation in Drosophila. 30252 cd02658: A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome. 30253 cd02659: A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome. 30254 cd02660: A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. 
The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome. 30255 cd02661: A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome. 30262 cd02668: A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome. 30263 cd02669: A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. 
They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome. 30269 cd02259: Peptidase family C39 mostly contains bacteriocin-processing endopeptidases from bacteria. The cysteine peptidases in family C39 cleave the "double-glycine" leader peptides from the precursors of various bacteriocins (mostly non-lantibiotic). The cleavage is mediated by the transporter as part of the secretion process. Bacteriocins are antibiotic proteins secreted by some species of bacteria that inhibit the growth of other bacterial species. The bacteriocin is synthesized as a precursor with an N-terminal leader peptide, and processing involves removal of the leader peptide by cleavage at a Gly-Gly bond, followed by translocation of the mature bacteriocin across the cytoplasmic membrane. Most endopeptidases of family C39 are N-terminal domains in larger proteins (ABC transporters) that serve both functions. The proposed protease active site is not conserved in all sub-families. 30270 cd02417: A sub-family of peptidase C39 which contains Cyclolysin and Hemolysin processing peptidases. Peptidase family C39 mostly contains bacteriocin-processing endopeptidases from bacteria. The cysteine peptidases in family C39 cleave the "double-glycine" leader peptides from the precursors of various bacteriocins (mostly non-lantibiotic). The cleavage is mediated by the transporter as part of the secretion process. Bacteriocins are antibiotic proteins secreted by some species of bacteria that inhibit the growth of other bacterial species. 
The bacteriocin is synthesized as a precursor with an N-terminal leader peptide, and processing involves removal of the leader peptide by cleavage at a Gly-Gly bond, followed by translocation of the mature bacteriocin across the cytoplasmic membrane. Most endopeptidases of family C39 are N-terminal domains in larger proteins (ABC transporters) that serve both functions. The proposed protease active site is not conserved in this sub-family. 30271 cd02418: A sub-family of peptidase family C39. Peptidase family C39 mostly contains bacteriocin-processing endopeptidases from bacteria. The cysteine peptidases in family C39 cleave the "double-glycine" leader peptides from the precursors of various bacteriocins (mostly non-lantibiotic). The cleavage is mediated by the transporter as part of the secretion process. Bacteriocins are antibiotic proteins secreted by some species of bacteria that inhibit the growth of other bacterial species. The bacteriocin is synthesized as a precursor with an N-terminal leader peptide, and processing involves removal of the leader peptide by cleavage at a Gly-Gly bond, followed by translocation of the mature bacteriocin across the cytoplasmic membrane. Most endopeptidases of family C39 are N-terminal domains in larger proteins (ABC transporters) that serve both functions. The proposed protease active site is conserved in this sub-family. 30272 cd02419: A sub-family of peptidase family C39. Peptidase family C39 mostly contains bacteriocin-processing endopeptidases from bacteria. The cysteine peptidases in family C39 cleave the "double-glycine" leader peptides from the precursors of various bacteriocins (mostly non-lantibiotic). The cleavage is mediated by the transporter as part of the secretion process. Bacteriocins are antibiotic proteins secreted by some species of bacteria that inhibit the growth of other bacterial species. 
The bacteriocin is synthesized as a precursor with an N-terminal leader peptide, and processing involves removal of the leader peptide by cleavage at a Gly-Gly bond, followed by translocation of the mature bacteriocin across the cytoplasmic membrane. Most endopeptidases of family C39 are N-terminal domains in larger proteins (ABC transporters) that serve both functions. The proposed protease active site is conserved in this sub-family. 30273 cd02420: A sub-family of peptidase family C39. Peptidase family C39 mostly contains bacteriocin-processing endopeptidases from bacteria. The cysteine peptidases in family C39 cleave the "double-glycine" leader peptides from the precursors of various bacteriocins (mostly non-lantibiotic). The cleavage is mediated by the transporter as part of the secretion process. Bacteriocins are antibiotic proteins secreted by some species of bacteria that inhibit the growth of other bacterial species. The bacteriocin is synthesized as a precursor with an N-terminal leader peptide, and processing involves removal of the leader peptide by cleavage at a Gly-Gly bond, followed by translocation of the mature bacteriocin across the cytoplasmic membrane. Most endopeptidases of family C39 are N-terminal domains in larger proteins (ABC transporters) that serve both functions. The proposed protease active site is conserved in this sub-family. 30274 cd02421: A sub-family of peptidase family C39. Peptidase family C39 mostly contains bacteriocin-processing endopeptidases from bacteria. The cysteine peptidases in family C39 cleave the "double-glycine" leader peptides from the precursors of various bacteriocins (mostly non-lantibiotic). The cleavage is mediated by the transporter as part of the secretion process. Bacteriocins are antibiotic proteins secreted by some species of bacteria that inhibit the growth of other bacterial species. 
The bacteriocin is synthesized as a precursor with an N-terminal leader peptide, and processing involves removal of the leader peptide by cleavage at a Gly-Gly bond, followed by translocation of the mature bacteriocin across the cytoplasmic membrane. Most endopeptidases of family C39 are N-terminal domains in larger proteins (ABC transporters) that serve both functions. The proposed protease active site is not conserved in this sub-family. 30275 cd02423: A sub-family of peptidase family C39. Peptidase family C39 mostly contains bacteriocin-processing endopeptidases from bacteria. The cysteine peptidases in family C39 cleave the "double-glycine" leader peptides from the precursors of various bacteriocins (mostly non-lantibiotic). The cleavage is mediated by the transporter as part of the secretion process. Bacteriocins are antibiotic proteins secreted by some species of bacteria that inhibit the growth of other bacterial species. The bacteriocin is synthesized as a precursor with an N-terminal leader peptide, and processing involves removal of the leader peptide by cleavage at a Gly-Gly bond, followed by translocation of the mature bacteriocin across the cytoplasmic membrane. Most endopeptidases of family C39 are N-terminal domains in larger proteins (ABC transporters) that serve both functions. The proposed protease active site is conserved in this sub-family of proteins with a single peptidase domain, which are lacking the nucleotide-binding transporter signature. 30276 cd02424: A sub-family of peptidase family C39. Peptidase family C39 mostly contains bacteriocin-processing endopeptidases from bacteria. The cysteine peptidases in family C39 cleave the "double-glycine" leader peptides from the precursors of various bacteriocins (mostly non-lantibiotic). The cleavage is mediated by the transporter as part of the secretion process. Bacteriocins are antibiotic proteins secreted by some species of bacteria that inhibit the growth of other bacterial species. 
The bacteriocin is synthesized as a precursor with an N-terminal leader peptide, and processing involves removal of the leader peptide by cleavage at a Gly-Gly bond, followed by translocation of the mature bacteriocin across the cytoplasmic membrane. Most endopeptidases of family C39 are N-terminal domains in larger proteins (ABC transporters) that serve both functions. The proposed protease active site is conserved in this sub-family, which contains Colicin V processing peptidase. 30277 cd02425: A sub-family of peptidase family C39. Peptidase family C39 mostly contains bacteriocin-processing endopeptidases from bacteria. The cysteine peptidases in family C39 cleave the "double-glycine" leader peptides from the precursors of various bacteriocins (mostly non-lantibiotic). The cleavage is mediated by the transporter as part of the secretion process. Bacteriocins are antibiotic proteins secreted by some species of bacteria that inhibit the growth of other bacterial species. The bacteriocin is synthesized as a precursor with an N-terminal leader peptide, and processing involves removal of the leader peptide by cleavage at a Gly-Gly bond, followed by translocation of the mature bacteriocin across the cytoplasmic membrane. Most endopeptidases of family C39 are N-terminal domains in larger proteins (ABC transporters) that serve both functions. The proposed protease active site is conserved in this sub-family. 30278 cd02549: A sub-family of peptidase family C39. Peptidase family C39 mostly contains bacteriocin-processing endopeptidases from bacteria. The cysteine peptidases in family C39 cleave the "double-glycine" leader peptides from the precursors of various bacteriocins (mostly non-lantibiotic). The cleavage is mediated by the transporter as part of the secretion process. Bacteriocins are antibiotic proteins secreted by some species of bacteria that inhibit the growth of other bacterial species. 
The bacteriocin is synthesized as a precursor with an N-terminal leader peptide, and processing involves removal of the leader peptide by cleavage at a Gly-Gly bond, followed by translocation of the mature bacteriocin across the cytoplasmic membrane. Most endopeptidases of family C39 are N-terminal domains in larger proteins (ABC transporters) that serve both functions. The proposed protease active site is conserved in this sub-family of proteins with a single peptidase domain, which are lacking the nucleotide-binding transporter signature or have different domain architectures. 30279 cd02325: R3H domain. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. R3H domains are found in proteins together with ATPase domains, SF1 helicase domains, SF2 DEAH helicase domains, Cys-rich repeats, ring-type zinc fingers, and KH domains. The function of the domain is predicted to be binding ssDNA or ssRNA in a sequence-specific manner. 30280 cd02636: R3H domain of a group of metazoan proteins that is related to the sperm-associated antigen 7. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to be binding ssDNA or ssRNA in a sequence-specific manner. 30281 cd02637: R3H domain of Poly(A)-specific ribonuclease (PARN). PARN is a poly(A)-specific 3' exonuclease from the RNase D family that, in Xenopus, deadenylates a specific class of maternal mRNAs which results in their translational repression. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to be binding ssDNA or ssRNA. 30282 cd02638: R3H domain of a group of eukaryotic proteins with unknown function. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. 
The function of the domain is predicted to be binding ssDNA or ssRNA in a sequence-specific manner. 30283 cd02639: R3H domain of a group of mainly fungal proteins with unknown function, which also contain an RNA recognition motif (RRM) domain. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to be binding ssDNA or ssRNA. 30284 cd02640: R3H domain of the NF-kappaB-repression factor (NRF). NRF is a nuclear inhibitor of NF-kappaB proteins that can silence the IFNbeta promoter via binding to a negative regulatory element (NRE). Besides the R3H domain, NRF also contains a G-patch domain. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to be binding ssDNA or ssRNA in a sequence-specific manner. 30285 cd02641: R3H domain of Smubp-2_like proteins. Smubp-2_like proteins also contain a helicase_like and an AN1-like Zinc finger domain and have been shown to bind single-stranded DNA. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to be binding ssDNA or ssRNA. 30286 cd02642: R3H domain of encore-like proteins. Drosophila encore is involved in the germline exit after four mitotic divisions, by facilitating SCF-ubiquitin-proteasome-dependent proteolysis. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to be binding ssDNA or ssRNA in a sequence-specific manner. 30287 cd02643: R3H domain of the X1 box binding protein (NF-X1) and related proteins. Human NF-X1 is a transcription factor that regulates the expression of class II major histocompatibility complex (MHC) genes. 
The Drosophila homolog shuttle craft (STC) has been shown to be a DNA- or RNA-binding protein required for proper axon guidance in the central nervous system, and the yeast homolog FAP1 encodes a dosage suppressor of rapamycin toxicity. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to be binding ssDNA or ssRNA in a sequence-specific manner. 30288 cd02644: R3H domain found in proteins homologous to Bacillus subtilis Jag, which is associated with SpoIIIJ. SpoIIIJ is necessary for the third stage of sporulation. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to be binding ssDNA or ssRNA in a sequence-specific manner. 30289 cd02645: R3H domain of a group of proteins with unknown function, which also contain an AAA-ATPase (AAA) domain. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to be binding ssDNA or ssRNA in a sequence-specific manner. 30290 cd02646: R3H domain of a group of fungal proteins with unknown function, which also contain a G-patch domain. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the R3H domain is predicted to be binding ssDNA or ssRNA in a sequence-specific manner. 30291 cd00585: Peptidase C1B subfamily (MEROPS database nomenclature); composed of eukaryotic bleomycin hydrolases (BH) and bacterial aminopeptidases C (pepC). The proteins of this subfamily contain a large insert relative to the C1A peptidase (papain) subfamily. BH is a cysteine peptidase that detoxifies bleomycin by hydrolysis of an amide group. It acts as a carboxypeptidase on its C-terminus to convert itself into an aminopeptidase and peptide ligase. 
BH is found in all tissues in mammals as well as in many other eukaryotes. Bleomycin, a glycopeptide derived from the bacterium Streptomyces verticillus, is an effective anticancer drug due to its ability to induce DNA strand breaks. Human BH is the major cause of tumor cell resistance to bleomycin chemotherapy, and is also genetically linked to Alzheimer's disease. In addition to its peptidase activity, the yeast BH (Gal6) binds DNA and acts as a repressor in the Gal4 regulatory system. BH forms a hexameric ring barrel structure with the active sites embedded in the central channel. The bacterial homolog of BH, called pepC, is a cysteine aminopeptidase possessing broad specificity. Although its crystal structure has not been solved, biochemical analysis shows that pepC also forms a hexamer. 30292 cd02248: Peptidase C1A subfamily (MEROPS database nomenclature); composed of cysteine peptidases (CPs) similar to papain, including the mammalian CPs (cathepsins B, C, F, H, L, K, O, S, V, X and W). Papain is an endopeptidase with specific substrate preferences, primarily for bulky hydrophobic or aromatic residues at the S2 subsite, a hydrophobic pocket in papain that accommodates the P2 sidechain of the substrate (the second residue away from the scissile bond). Most members of the papain subfamily are endopeptidases. Some exceptions to this rule can be explained by specific details of the catalytic domains like the occluding loop in cathepsin B which confers an additional carboxydipeptidyl activity and the mini-chain of cathepsin H resulting in an N-terminal exopeptidase activity. Papain-like CPs have different functions in various organisms. Plant CPs are used to mobilize storage proteins in seeds. Parasitic CPs act extracellularly to help invade tissues and cells, to hatch or to evade the host immune system. Mammalian CPs are primarily lysosomal enzymes with the exception of cathepsin W, which is retained in the endoplasmic reticulum. 
They are responsible for protein degradation in the lysosome. Papain-like CPs are synthesized as inactive proenzymes with N-terminal propeptide regions, which are removed upon activation. In addition to its inhibitory role, the propeptide is required for proper folding of the newly synthesized enzyme and its stabilization in denaturing pH conditions. Residues within the propeptide region also play a role in the transport of the proenzyme to lysosomes or acidified vesicles. Also included in this subfamily are proteins classified as non-peptidase homologs, which lack peptidase activity or have missing active site residues. 30293 cd02619: C1 Peptidase family (MEROPS database nomenclature), also referred to as the papain family; composed of two subfamilies of cysteine peptidases (CPs), C1A (papain) and C1B (bleomycin hydrolase). Papain-like enzymes are mostly endopeptidases with some exceptions like cathepsins B, C, H and X, which are exopeptidases. Papain-like CPs have different functions in various organisms. Plant CPs are used to mobilize storage proteins in seeds while mammalian CPs are primarily lysosomal enzymes responsible for protein degradation in the lysosome. Papain-like CPs are synthesized as inactive proenzymes with N-terminal propeptide regions, which are removed upon activation. Bleomycin hydrolase (BH) is a CP that detoxifies bleomycin by hydrolysis of an amide group. It acts as a carboxypeptidase on its C-terminus to convert itself into an aminopeptidase and peptide ligase. BH is found in all tissues in mammals as well as in many other eukaryotes. It forms a hexameric ring barrel structure with the active sites imbedded in the central channel. Some members of the C1 family are proteins classified as non-peptidase homologs which lack peptidase activity or have missing active site residues. 30294 cd02620: Cathepsin B group; composed of cathepsin B and similar proteins, including tubulointerstitial nephritis antigen (TIN-Ag). 
Cathepsin B is a lysosomal papain-like cysteine peptidase which is expressed in all tissues and functions primarily as an exopeptidase through its carboxydipeptidyl activity. Together with other cathepsins, it is involved in the degradation of proteins, proenzyme activation, Ag processing, metabolism and apoptosis. Cathepsin B has been implicated in a number of human diseases such as cancer, rheumatoid arthritis, osteoporosis and Alzheimer's disease. The unique carboxydipeptidyl activity of cathepsin B is attributed to the presence of an occluding loop in its active site which favors the binding of the C-termini of substrate proteins. Some members of this group do not possess the occluding loop. TIN-Ag is an extracellular matrix basement protein which was originally identified as a target Ag involved in anti-tubular basement membrane antibody-mediated interstitial nephritis. It plays a role in renal tubulogenesis and is defective in hereditary tubulointerstitial disorders. TIN-Ag is exclusively expressed in kidney tissues. 30295 cd02621: Cathepsin C; also known as Dipeptidyl Peptidase I (DPPI), an atypical papain-like cysteine peptidase with chloride dependency and dipeptidyl aminopeptidase activity, resulting from its tetrameric structure which limits substrate access. Each subunit of the tetramer is composed of three peptides: the heavy and light chains, which together adopt the papain fold and form the catalytic domain; and the residual propeptide region, which forms a beta barrel and points towards the substrate's N-terminus. The subunit composition is the result of the unique characteristic of procathepsin C maturation involving the cleavage of the catalytic domain and the non-autocatalytic excision of an activation peptide within its propeptide region. By removing N-terminal dipeptide extensions, cathepsin C activates granule serine peptidases (granzymes) involved in cell-mediated apoptosis, inflammation and tissue remodelling. 
Loss-of-function mutations in cathepsin C are associated with Papillon-Lefevre and Haim-Munk syndromes, rare diseases characterized by hyperkeratosis and early-onset periodontitis. Cathepsin C is widely expressed in many tissues with high levels in lung, kidney and placenta. It is also highly expressed in cytotoxic lymphocytes and mature myeloid cells. 30296 cd02698: Cathepsin X; the only papain-like lysosomal cysteine peptidase exhibiting carboxymonopeptidase activity. It can also act as a carboxydipeptidase, like cathepsin B, but has been shown to preferentially cleave substrates through a monopeptidyl carboxypeptidase pathway. The propeptide region of cathepsin X, the shortest among papain-like peptidases, is covalently attached to the active site cysteine in the inactive form of the enzyme. Little is known about the biological function of cathepsin X. Some studies point to a role in early tumorigenesis. A more recent study indicates that cathepsin X expression is restricted to immune cells suggesting a role in phagocytosis and the regulation of the immune response. 30297 cd02656: MIT: domain contained within Microtubule Interacting and Trafficking molecules. The MIT domain is found in sorting nexins, the nuclear thiol protease PalBH, the AAA protein spastin and archaebacterial proteins with similar domain architecture, vacuolar sorting proteins and others. The molecular function of the MIT domain is unclear. 30298 cd02677: MIT: domain contained within Microtubule Interacting and Trafficking molecules. This MIT domain sub-family is found in sorting nexin 15 and related proteins. The molecular function of the MIT domain is unclear. 30299 cd02678: MIT: domain contained within Microtubule Interacting and Trafficking molecules. This sub-family of MIT domains is found in intracellular protein transport proteins of the AAA-ATPase family. The molecular function of the MIT domain is unclear. 
30300 cd02679: MIT: domain contained within Microtubule Interacting and Trafficking molecules. This MIT domain sub-family is found in the AAA protein spastin, a probable ATPase involved in the assembly or function of nuclear protein complexes; spastins might also be involved in microtubule dynamics. The molecular function of the MIT domain is unclear. 30301 cd02680: MIT: domain contained within Microtubule Interacting and Trafficking molecules. This sub-family of MIT domains is found in the nuclear thiol protease PalBH. The molecular function of the MIT domain is unclear. 30302 cd02681: MIT: domain contained within Microtubule Interacting and Trafficking molecules. This sub-family of MIT domains is found in the nuclear thiol protease PalBH. The molecular function of the MIT domain is unclear. 30303 cd02682: MIT: domain contained within Microtubule Interacting and Trafficking molecules. This sub-family of MIT domains is found in mostly archaebacterial AAA-ATPases. The molecular function of the MIT domain is unclear. 30304 cd02683: MIT: domain contained within Microtubule Interacting and Trafficking molecules. This sub-family of MIT domains is found in proteins with unknown function, co-occurring with an as yet undescribed domain. The molecular function of the MIT domain is unclear. 30305 cd02684: MIT: domain contained within Microtubule Interacting and Trafficking molecules. This sub-family of MIT domains is found in proteins with an N-terminal serine/threonine kinase domain. The molecular function of the MIT domain is unclear. 30306 cd00508: This CD includes formate dehydrogenases (Fdh) H and N; nitrate reductases, Nap and Nas; and other related proteins. Formate dehydrogenase H is a component of the anaerobic formate hydrogen lyase complex and catalyzes the reversible oxidation of formate to CO2 with the release of a proton and two electrons. 
Formate dehydrogenase N (alpha subunit) is the major electron donor to the bacterial nitrate respiratory chain and nitrate reductases, Nap and Nas, catalyze the reduction of nitrate to nitrite. This CD (MopB_CT_Fdh-Nap-like) is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs. 30307 cd02775: Molybdopterin-Binding, C-terminal (MopB_CT) domain of the MopB superfamily of proteins, a large, diverse, heterogeneous superfamily of enzymes that, in general, bind molybdopterin as a cofactor. The MopB domain is found in a wide variety of molybdenum- and tungsten-containing enzymes, including formate dehydrogenase-H (Fdh-H) and -N (Fdh-N), several forms of nitrate reductase (Nap, Nas, NarG), dimethylsulfoxide reductase (DMSOR), thiosulfate reductase, formylmethanofuran dehydrogenase, and arsenite oxidase. Molybdenum is present in most of these enzymes in the form of molybdopterin, a modified pterin ring with a dithiolene side chain, which is responsible for ligating the Mo. In many bacterial and archaeal species, molybdopterin is in the form of a dinucleotide, with two molybdopterin dinucleotide units per molybdenum. These proteins can function as monomers, heterodimers, or heterotrimers, depending on the protein and organism. Also included in the MopB superfamily is the eukaryotic/eubacterial protein domain family of the 75-kDa subunit/Nad11/NuoG (second domain) of respiratory complex 1/NADH-quinone oxidoreductase which is postulated to have lost an ancestral formate dehydrogenase activity and only vestigial sequence evidence remains of a molybdopterin binding site. This hierarchy is of the conserved MopB_CT domain present in many, but not all, MopB homologs. 30308 cd02776: Respiratory nitrate reductase A (NarGHI), alpha chain (NarG) and related proteins. Under anaerobic conditions in the presence of nitrate, E. 
coli synthesizes the cytoplasmic membrane-bound quinol-nitrate oxidoreductase (NarGHI), which reduces nitrate to nitrite and forms part of a redox loop generating a proton-motive force. Found in prokaryotes and some archaea, NarGHI usually functions as a heterotrimer. The alpha chain contains the molybdenum cofactor-containing Mo-bisMGD catalytic subunit. This CD (MopB_CT_Nitrate-R-NarG-like) is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs. 30309 cd02777: The MopB_CT_DMSOR-like CD contains dimethylsulfoxide reductase (DMSOR), biotin sulfoxide reductase (BSOR), trimethylamine N-oxide reductase (TMAOR) and other related proteins. DMSOR always catalyzes the reduction of DMSO to dimethylsulfide, but its cellular location and oligomerization state are organism-dependent. For example, in Rhodobacter sphaeroides and Rhodobacter capsulatus, it is an 82-kDa monomeric soluble protein found in the periplasmic space; in E. coli, it is membrane-bound and exists as a heterotrimer. BSOR catalyzes the reduction of biotin sulfoxide to biotin, and is unique among Mo enzymes because no additional auxiliary proteins or cofactors are required. TMAOR is similar to DMSOR, but its only natural substrate is TMAO. Also included in this group is the pyrogallol-phloroglucinol transhydroxylase from Pelobacter acidigallici. This CD is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs. 30310 cd02778: The MopB_CT_Thiosulfate-R-like CD contains thiosulfate-, sulfur-, and polysulfide-reductases, and other related proteins. Thiosulfate reductase catalyzes the cleavage of sulfur-sulfur bonds in thiosulfate. Polysulfide reductase is a membrane-bound enzyme that catalyzes the reduction of polysulfide using either hydrogen or formate as the electron donor. 
Also included in this CD is the phenylacetyl-CoA:acceptor oxidoreductase, large subunit (PadB2), which has been characterized as a membrane-bound molybdenum-iron-sulfur enzyme involved in anaerobic metabolism of phenylalanine in the denitrifying bacterium Thauera aromatica. The MopB_CT_Thiosulfate-R-like CD is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs. 30311 cd02779: This CD contains the molybdopterin_binding C-terminal (MopB_CT) region of Arsenite oxidase (Arsenite-Ox) and related proteins. Arsenite oxidase oxidizes arsenite to the less toxic arsenate; it transfers the electrons obtained from the oxidation of arsenite towards the soluble periplasmic electron carriers cytochrome c and/or amicyanin. 30312 cd02780: This CD contains the molybdopterin_binding C-terminal (MopB_CT) region of tetrathionate reductase, subunit A, (TtrA); respiratory arsenate As(V) reductase, catalytic subunit (ArrA); and other related proteins. 30313 cd02781: The MopB_CT_Acetylene-hydratase CD contains acetylene hydratase (Ahy) and other related proteins. The acetylene hydratase of Pelobacter acetylenicus is a tungsten iron-sulfur protein involved in the fermentation of acetylene to ethanol and acetate. This CD is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs. 30314 cd02782: The MopB_CT_1 CD includes a group of related uncharacterized bacterial molybdopterin-binding oxidoreductase-like domains with a putative N-terminal iron-sulfur [4Fe-4S] cluster binding site and molybdopterin cofactor binding site. This CD is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs. 
30315 cd02783: The MopB_CT_2 CD includes a group of related uncharacterized bacterial and archaeal molybdopterin-binding oxidoreductase-like domains with a putative N-terminal iron-sulfur [4Fe-4S] cluster binding site and molybdopterin cofactor binding site. This CD is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs. 30316 cd02784: The MopB_CT_PHLH CD includes a group of related uncharacterized putative hydrogenase-like homologs (PHLH) of molybdopterin binding proteins. This CD is of the PHLH region homologous to the conserved molybdopterin-binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs. 30317 cd02785: The MopB_CT_4 CD includes a group of related uncharacterized bacterial and archaeal molybdopterin-binding oxidoreductase-like domains with a putative N-terminal iron-sulfur [4Fe-4S] cluster binding site and molybdopterin cofactor binding site. This CD is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs. 30318 cd02786: The MopB_CT_3 CD includes a group of related uncharacterized bacterial molybdopterin-binding oxidoreductase-like domains with a putative N-terminal iron-sulfur [4Fe-4S] cluster binding site and molybdopterin cofactor binding site. This CD is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs. 30319 cd02787: The MopB_CT_ydeP CD includes a group of related uncharacterized bacterial molybdopterin-binding oxidoreductase-like domains with a putative molybdopterin cofactor binding site. This CD is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs. 30320 cd02788: MopB_CT_NDH-1_NuoG2-N7: C-terminal region of the NuoG-like subunit (of the variant with a [4Fe-4S] cluster, N7) of the NADH-quinone oxidoreductase/NADH dehydrogenase-1 (NDH-1) found in various bacteria. 
The NDH-1 is the first energy-transducing complex in the respiratory chain and functions as a redox pump that uses the redox energy to translocate H+ ions across the membrane, resulting in a significant contribution to energy production. In Escherichia coli NDH-1, the largest subunit is encoded by the nuoG gene, and is one of the 14 distinct subunits constituting the functional enzyme. The NuoG subunit is made of two domains: the first contains three binding sites for FeS clusters (the fer2 domain); the second is of unknown function or, as postulated, has lost an ancestral formate dehydrogenase activity that became redundant during the evolution of the complex I enzyme. Unique to this group, compared to the other prokaryotic and eukaryotic groups in this domain protein family (NADH-Q-OR-NuoG2), is an N-terminal [4Fe-4S] cluster (N7/N1c) present in the second domain and a C-terminal region (this CD) homologous to the formate dehydrogenase C-terminal molybdopterin_binding (MopB) region. 30321 cd02789: The MopB_FmdC-FwdD CD includes the C-terminus of subunit C of molybdenum formylmethanofuran dehydrogenase (FmdC) and subunit D of tungsten formylmethanofuran dehydrogenase (FwdD), and other related proteins. Formylmethanofuran dehydrogenase catalyzes the first step in methane formation from CO2 in methanogenic archaea and some eubacteria. Members of this CD belong to the molybdopterin_binding superfamily of proteins. This CD is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs. 30322 cd02790: Formate dehydrogenase H (Formate-Dh-H) catalyzes the reversible oxidation of formate to CO2 with the release of a proton and two electrons. It is a component of the anaerobic formate hydrogen lyase complex. The E. coli formate dehydrogenase H (Fdh-H) is a monomer composed of a single polypeptide chain with a Mo active site region and a [4Fe-4S] center. 
This CD (MopB_CT_Formate-Dh_H) is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs. 30323 cd02791: Nitrate reductases, NapA (Nitrate-R-NapA), NasA, and NarB catalyze the reduction of nitrate to nitrite. Monomeric Nas is located in the cytoplasm and participates in nitrogen assimilation. Dimeric Nap is located in the periplasm and is coupled to quinol oxidation via a membrane-anchored tetraheme cytochrome. This CD (MopB_CT_Nitrate-R-Nap) is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs. 30324 cd02792: Formate dehydrogenase N, alpha subunit (Formate-Dh-Na) is a major component of nitrate respiration in bacteria such as in the E. coli formate dehydrogenase N (Fdh-N). Fdh-N is a membrane protein that is a complex of three different subunits and is the major electron donor to the nitrate respiratory chain. Also included in this CD is the Desulfovibrio gigas tungsten formate dehydrogenase, DgW-FDH. In contrast to Fdh-N, which is a functional heterotrimer, DgW-FDH is a heterodimer. The DgW-FDH complex is composed of a large subunit carrying the W active site and one [4Fe-4S] center, and a small subunit that harbors a series of three [4Fe-4S] clusters as well as a putative vacant binding site for a fourth cluster. The smaller subunit is not included in this alignment. This CD (MopB_CT_Formate-Dh-Na-like) is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs. 30325 cd02793: The MopB_DMSOR-BSOR-TMAOR CD contains dimethylsulfoxide reductase (DMSOR), biotin sulfoxide reductase (BSOR), trimethylamine N-oxide reductase (TMAOR) and other related proteins. DMSOR always catalyzes the reduction of DMSO to dimethylsulfide, but its cellular location and oligomerization state are organism-dependent. 
For example, in Rhodobacter sphaeroides and Rhodobacter capsulatus, it is an 82-kDa monomeric soluble protein found in the periplasmic space; in E. coli, it is membrane-bound and exists as a heterotrimer. BSOR catalyzes the reduction of biotin sulfoxide to biotin, and is unique among Mo enzymes because no additional auxiliary proteins or cofactors are required. TMAOR is similar to DMSOR, but its only natural substrate is TMAO. This CD is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs. 30326 cd02794: The MopB_CT_DmsA-EC CD includes the DmsA enzyme of the dmsABC operon encoding the anaerobic dimethylsulfoxide reductase (DMSOR) of Escherichia coli and other related DMSOR-like enzymes. Unlike other DMSOR-like enzymes, this group has a predicted N-terminal iron-sulfur [4Fe-4S] cluster binding site. This CD is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs. 30327 cd02825: PAZ domain, named after the proteins Piwi, Argonaute, and Zwille. PAZ is found in two families of proteins that are essential components of RNA-mediated gene-silencing pathways, including RNA interference, the Piwi and Dicer families. PAZ functions as a nucleic-acid binding domain, with a strong preference for single-stranded nucleic acids (RNA or DNA) or RNA duplexes with single-stranded 3' overhangs. It has been suggested that the PAZ domain provides a unique mode for the recognition of the two 3'-terminal nucleotides in single-stranded nucleic acids and buries the 3' OH group, and that it might recognize characteristic 3' overhangs in siRNAs within RISC (RNA-induced silencing) and other complexes. This parent model also contains structures of an archaeal PAZ domain. 30328 cd02843: PAZ domain, dicer_like subfamily. Dicer is an RNase involved in cleaving dsRNA in the RNA interference pathway. 
It generates dsRNAs which are approximately 20 bp long (siRNAs), which in turn target hydrolysis of homologous RNAs. PAZ domains are named after the proteins Piwi, Argonaute, and Zwille. PAZ is found in two families of proteins that are essential components of RNA-mediated gene-silencing pathways, including RNA interference, the Piwi and Dicer families. PAZ functions as a nucleic-acid binding domain, with a strong preference for single-stranded nucleic acids (RNA or DNA) or RNA duplexes with single-stranded 3' overhangs. It has been suggested that the PAZ domain provides a unique mode for the recognition of the two 3'-terminal nucleotides in single-stranded nucleic acids and buries the 3' OH group, and that it might recognize characteristic 3' overhangs in siRNAs within RISC (RNA-induced silencing) and other complexes. 30329 cd02844: PAZ domain, CAF_like subfamily. CAF (for carpel factory) is a plant homolog of Dicer. CAF has been implicated in flower morphogenesis and in early Arabidopsis development and might function through posttranscriptional regulation of specific mRNA molecules. PAZ domains are named after the proteins Piwi, Argonaute, and Zwille. PAZ is found in two families of proteins that are essential components of RNA-mediated gene-silencing pathways, including RNA interference, the Piwi and Dicer families. PAZ functions as a nucleic-acid binding domain, with a strong preference for single-stranded nucleic acids (RNA or DNA) or RNA duplexes with single-stranded 3' overhangs. It has been suggested that the PAZ domain provides a unique mode for the recognition of the two 3'-terminal nucleotides in single-stranded nucleic acids and buries the 3' OH group, and that it might recognize characteristic 3' overhangs in siRNAs within RISC (RNA-induced silencing) and other complexes. 30330 cd02845: PAZ domain, Piwi_like subfamily. In multi-cellular organisms, the Piwi protein appears to be essential for the maintenance of germline stem cells. 
In the Drosophila male germline, Piwi was shown to be involved in the silencing of retrotransposons in the male gametes. The Piwi proteins share their domain architecture with other members of the argonaute family. The PAZ domain has been named after the proteins Piwi, Argonaute, and Zwille. PAZ is found in two families of proteins that are essential components of RNA-mediated gene-silencing pathways, including RNA interference, the Piwi and Dicer families. PAZ functions as a nucleic acid binding domain, with a strong preference for single-stranded nucleic acids (RNA or DNA) or RNA duplexes with single-stranded 3' overhangs. It has been suggested that the PAZ domain provides a unique mode for the recognition of the two 3'-terminal nucleotides in single-stranded nucleic acids and buries the 3' OH group, and that it might recognize characteristic 3' overhangs in siRNAs within RISC (RNA-induced silencing) and other complexes. 30331 cd02846: PAZ domain, argonaute_like subfamily. Argonaute is part of the RNA-induced silencing complex (RISC), and is an endonuclease that plays a key role in the RNA interference pathway. The PAZ domain has been named after the proteins Piwi, Argonaute, and Zwille. PAZ is found in two families of proteins that are essential components of RNA-mediated gene-silencing pathways, including RNA interference, the Piwi and Dicer families. PAZ functions as a nucleic acid binding domain, with a strong preference for single-stranded nucleic acids (RNA or DNA) or RNA duplexes with single-stranded 3' overhangs. It has been suggested that the PAZ domain provides a unique mode for the recognition of the two 3'-terminal nucleotides in single-stranded nucleic acids and buries the 3' OH group, and that it might recognize characteristic 3' overhangs in siRNAs within RISC (RNA-induced silencing) and other complexes. 
30332 cd00331: Indole-3-glycerol phosphate synthase (IGPS); an enzyme in the tryptophan biosynthetic pathway, catalyzing the ring closure reaction of 1-(o-carboxyphenylamino)-1-deoxyribulose-5-phosphate (CdRP) to indole-3-glycerol phosphate (IGP), accompanied by the release of carbon dioxide and water. IGPS is active as a separate monomer in most organisms, but is also found fused to other enzymes as part of a bifunctional or multifunctional enzyme involved in tryptophan biosynthesis. 30333 cd02688: E or "early" set of sugar utilizing enzymes which may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase. 30334 cd02847: Chitobiase C-terminus domain. Chitobiase (AKA N-acetylglucosaminidase) digests the beta-1,4 glycosidic bonds of the N-acetylglucosamine (NAG) oligomers found in chitin, an important structural element of fungal cell walls and arthropod exoskeletons. It is thought to proceed through an acid-base reaction mechanism, in which one protein carboxylate acts as catalytic acid, while the nucleophile is the polar acetamido group of the sugar in a substrate-assisted reaction with retention of the anomeric configuration. The C-terminus of chitobiase may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase. 30335 cd02848: Chitinase N-terminus domain. 
Chitinases hydrolyze the abundant natural biopolymer chitin, producing smaller chito-oligosaccharides. Chitin consists of multiple N-acetyl-D-glucosamine (NAG) residues connected via beta-1,4-glycosidic linkages and is an important structural element of fungal cell walls and arthropod exoskeletons. On the basis of the mode of chitin hydrolysis, chitinases are classified as random, endo-, and exo-chitinases; based on sequence criteria, they belong to families 18 and 19 of glycosyl hydrolases. The N-terminus of chitinase may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase. 30336 cd02849: CGTase (cyclodextrin glycosyltransferase) C-terminus domain. Enzymes such as amylases, cyclomaltodextrinase (CDase), and CGTase degrade starch to smaller oligosaccharides by hydrolyzing the alpha-D-(1,4) linkages between glucose residues present in starch. In the case of CGTases, an additional cyclization reaction is catalyzed yielding mixtures of cyclic oligosaccharides which are referred to as alpha-, beta-, or gamma-cyclodextrins (CDs) (consisting of six, seven, or eight glucoses, respectively). CGTases are classified according to the major product of the cyclization reaction. Besides having similar catalytic site residues, amylases and CGTases contain carbohydrate binding domains that are distant from the active site and which are implicated in attaching the enzyme to raw starch granules and in guiding the amylose chain into the active site. The C-terminus of CGTase may be related to the immunoglobulin and/or fibronectin type III superfamilies. 
These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase. 30337 cd02850: Cellulase N-terminus domain. Cellulases are O-glycosyl hydrolases (GHs) that hydrolyze beta-1,4 glucosidic bonds in cellulose. They are usually categorized into either exoglucanases, which sequentially release sugar units from the cellulose chain, or endoglucanases, which also attack the chain internally. The N-terminus of cellulase may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase. 30338 cd02851: Galactose oxidase C-terminus domain. Galactose oxidase is an extracellular monomeric enzyme which catalyses the stereospecific oxidation of a broad range of primary alcohol substrates and possesses a unique mononuclear copper site essential for catalysing a two-electron transfer reaction during the oxidation of primary alcohols to corresponding aldehydes. The second redox active center necessary for the reaction was found to be situated at a tyrosine residue. The C-terminus of galactose oxidase may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. 
Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase. 30340 cd02853: Maltooligosyl trehalose synthase (MTSase) N-terminus domain. MTSase and maltooligosyl trehalose trehalohydrolase (MTHase) work together to produce trehalose. MTSase is responsible for converting the alpha-1,4-glucosidic linkage to an alpha,alpha-1,1-glucosidic linkage at the reducing end of the maltooligosaccharide through an intramolecular transglucosylation reaction, while MTHase hydrolyzes the penultimate alpha-1,4 linkage of the reducing end, resulting in the release of trehalose. The N-terminus of MTSase may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase. 30341 cd02854: Glycogen branching enzyme-like N-terminus domain. Glycogen branching enzyme (AKA 1,4 alpha glucan branching enzyme) catalyzes the formation of alpha-1,6 branch points in either glycogen or starch by cleavage of the alpha-1,4 glucosidic linkage yielding a non-reducing end oligosaccharide chain and subsequent attachment to the alpha-1,6 position. By increasing the number of non-reducing ends glycogen is more reactive to synthesis and digestion as well as being more soluble. The N-terminus of the glycogen branching enzyme-like proteins may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. 
Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase. 30342 cd02855: Glycogen branching enzyme N-terminus domain. Glycogen branching enzyme (AKA 1,4 alpha glucan branching enzyme) catalyzes the formation of alpha-1,6 branch points in either glycogen or starch by cleavage of the alpha-1,4 glucosidic linkage yielding a non-reducing end oligosaccharide chain and subsequent attachment to the alpha-1,6 position. By increasing the number of non-reducing ends glycogen is more reactive to synthesis and digestion as well as being more soluble. The N-terminus of the 1,4 alpha glucan branching enzyme may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase. 30343 cd02856: Glycogen_debranching_enzyme N-terminal domain. Glycogen debranching enzymes have both 4-alpha-glucanotransferase and amylo-1,6-glucosidase activities. As a transferase it transfers a segment of a 1,4-alpha-D-glucan to a new 4-position in an acceptor, which may be glucose or another 1,4-alpha-D-glucan. As a glucosidase it catalyzes the endohydrolysis of 1,6-alpha-D-glucoside linkages at points of branching in chains of 1,4-linked alpha-D-glucose residues. The N-terminus of the glycogen debranching enzyme may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. 
Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase. 30344 cd02857: CD and pullulan-degrading enzymes N-terminus domain. Members of this subgroup include: Cyclomaltodextrinase (CDase), maltogenic amylase, and neopullulanase all of which are capable of hydrolyzing all or two of the following three types of substrates: cyclomaltodextrins (CDs), pullulan, and starch. These enzymes hydrolyze CDs and starch to maltose and pullulan to panose by cleavage of alpha-1,4 glycosidic bonds whereas alpha-amylases essentially lack activity on CDs and pullulan. They also catalyze transglycosylation of oligosaccharides to the C3-, C4- or C6-hydroxyl groups of various acceptor sugar molecules. The N-terminus of the CD and pullulan-degrading enzymes may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase. 30345 cd02858: Esterase N-terminal domain. Esterases catalyze the hydrolysis of organic esters to release an alcohol or thiol and acid. The term can be applied to enzymes that hydrolyze carboxylate, phosphate and sulphate esters, but is more often restricted to the first class of substrate. The N-terminus of esterase may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. 
Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase. 30346 cd02859: AMP-activated protein kinase (AMPK) beta subunit glycogen binding domain (GBD). AMPK is a metabolic stress sensing protein that senses AMP/ATP and has recently been found to act as a glycogen sensor as well. The protein functions as an alpha-beta-gamma heterotrimer. This domain is the glycogen binding domain of the beta subunit. 30347 cd02860: Pullulanase domain N-terminus. Pullulanase (AKA dextrinase; alpha-dextrin endo-1,6-alpha glucosidase) is an enzyme with action similar to that of isoamylase; it cleaves 1,6-alpha-glucosidic linkages in pullulan, amylopectin, and glycogen, and in alpha- and beta-amylase limit-dextrins of amylopectin and glycogen. The N-terminus of pullulanase may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase. 30348 cd02861: E or "early" set-like proteins. These alpha amylase-like sugar utilizing enzymes, which may be related to the immunoglobulin and/or fibronectin type III superfamilies, are associated with different types of catalytic domains at either the N-terminal or C-terminal end. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase. 30349 smart00735: ZASP-like motif; Short motif (26 amino acids) present in an alpha-actinin-binding protein, ZASP, and similar molecules. 30350 COG0001: Glutamate-1-semialdehyde aminotransferase [Coenzyme metabolism]. 
30351 COG0002: Acetylglutamate semialdehyde dehydrogenase [Amino acid transport and metabolism]. 30352 COG0003: Predicted ATPase involved in chromosome partitioning [Cell division and chromosome partitioning]. 30353 COG0004: Ammonia permease [Inorganic ion transport and metabolism]. 30354 COG0005: Purine nucleoside phosphorylase [Nucleotide transport and metabolism]. 30356 COG0006: Xaa-Pro aminopeptidase [Amino acid transport and metabolism]. 30357 COG0007: Uroporphyrinogen-III methylase [Coenzyme metabolism]. 30358 COG0008: Glutamyl- and glutaminyl-tRNA synthetases [Translation, ribosomal structure and biogenesis]. 30359 COG0009: Putative translation factor (SUA5) [Translation, ribosomal structure and biogenesis]. 30360 COG0010: Arginase/agmatinase/formiminoglutamate hydrolase, arginase family [Amino acid transport and metabolism]. 30361 COG0011: Uncharacterized conserved protein [Function unknown]. 30362 COG0012: Predicted GTPase, probable translation factor [Translation, ribosomal structure and biogenesis]. 30363 COG0013: Alanyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 30364 COG0014: Gamma-glutamyl phosphate reductase [Amino acid transport and metabolism]. 30365 COG0015: Adenylosuccinate lyase [Nucleotide transport and metabolism]. 30366 COG0016: Phenylalanyl-tRNA synthetase alpha subunit [Translation, ribosomal structure and biogenesis]. 30367 COG0017: Aspartyl/asparaginyl-tRNA synthetases [Translation, ribosomal structure and biogenesis]. 30368 COG0018: Arginyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 30369 COG0019: Diaminopimelate decarboxylase [Amino acid transport and metabolism]. 30370 COG0020: Undecaprenyl pyrophosphate synthase [Lipid metabolism]. 30371 COG0021: Transketolase [Carbohydrate transport and metabolism]. 30372 COG0022: Pyruvate/2-oxoglutarate dehydrogenase complex, dehydrogenase (E1) component, eukaryotic type, beta subunit [Energy production and conversion]. 
30373 COG0023: Translation initiation factor 1 (eIF-1/SUI1) and related proteins [Translation, ribosomal structure and biogenesis]. 30374 COG0024: Methionine aminopeptidase [Translation, ribosomal structure and biogenesis]. 30375 COG0025: NhaP-type Na+/H+ and K+/H+ antiporters [Inorganic ion transport and metabolism]. 30376 COG0026: Phosphoribosylaminoimidazole carboxylase (NCAIR synthetase) [Nucleotide transport and metabolism]. 30377 COG0027: Formate-dependent phosphoribosylglycinamide formyltransferase (GAR transformylase) [Nucleotide transport and metabolism]. 30378 COG0028: Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] [Amino acid transport and metabolism / Coenzyme metabolism]. 30379 COG0029: Aspartate oxidase [Coenzyme metabolism]. 30380 COG0030: Dimethyladenosine transferase (rRNA methylation) [Translation, ribosomal structure and biogenesis]. 30381 COG0031: Cysteine synthase [Amino acid transport and metabolism]. 30382 COG0033: Phosphoglucomutase [Carbohydrate transport and metabolism]. 30383 COG0034: Glutamine phosphoribosylpyrophosphate amidotransferase [Nucleotide transport and metabolism]. 30384 COG0035: Uracil phosphoribosyltransferase [Nucleotide transport and metabolism]. 30385 COG0036: Pentose-5-phosphate-3-epimerase [Carbohydrate transport and metabolism]. 30386 COG0037: Predicted ATPase of the PP-loop superfamily implicated in cell cycle control [Cell division and chromosome partitioning]. 30387 COG0038: Chloride channel protein EriC [Inorganic ion transport and metabolism]. 30388 COG0039: Malate/lactate dehydrogenases [Energy production and conversion]. 30389 COG0040: ATP phosphoribosyltransferase [Amino acid transport and metabolism]. 30390 COG0041: Phosphoribosylcarboxyaminoimidazole (NCAIR) mutase [Nucleotide transport and metabolism]. 30391 COG0042: tRNA-dihydrouridine synthase [Translation, ribosomal structure and biogenesis]. 
30392 COG0043: 3-polyprenyl-4-hydroxybenzoate decarboxylase and related decarboxylases [Coenzyme metabolism]. 30393 COG0044: Dihydroorotase and related cyclic amidohydrolases [Nucleotide transport and metabolism]. 30394 COG0045: Succinyl-CoA synthetase, beta subunit [Energy production and conversion]. 30395 COG0046: Phosphoribosylformylglycinamidine (FGAM) synthase, synthetase domain [Nucleotide transport and metabolism]. 30396 COG0047: Phosphoribosylformylglycinamidine (FGAM) synthase, glutamine amidotransferase domain [Nucleotide transport and metabolism]. 30397 COG0048: Ribosomal protein S12 [Translation, ribosomal structure and biogenesis]. 30398 COG0049: Ribosomal protein S7 [Translation, ribosomal structure and biogenesis]. 30399 COG0050: GTPases - translation elongation factors [Translation, ribosomal structure and biogenesis]. 30400 COG0051: Ribosomal protein S10 [Translation, ribosomal structure and biogenesis]. 30401 COG0052: Ribosomal protein S2 [Translation, ribosomal structure and biogenesis]. 30402 COG0053: Predicted Co/Zn/Cd cation transporters [Inorganic ion transport and metabolism]. 30403 COG0054: Riboflavin synthase beta-chain [Coenzyme metabolism]. 30404 COG0055: F0F1-type ATP synthase, beta subunit [Energy production and conversion]. 30405 COG0056: F0F1-type ATP synthase, alpha subunit [Energy production and conversion]. 30406 COG0057: Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase [Carbohydrate transport and metabolism]. 30407 COG0058: Glucan phosphorylase [Carbohydrate transport and metabolism]. 30408 COG0059: Ketol-acid reductoisomerase [Amino acid transport and metabolism / Coenzyme metabolism]. 30409 COG0060: Isoleucyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 30410 COG0061: Predicted sugar kinase [Carbohydrate transport and metabolism]. 30411 COG0062: Uncharacterized conserved protein [Function unknown]. 30412 COG0063: Predicted sugar kinase [Carbohydrate transport and metabolism]. 
30413 COG0064: Asp-tRNAAsn/Glu-tRNAGln amidotransferase B subunit (PET112 homolog) [Translation, ribosomal structure and biogenesis]. 30414 COG0065: 3-isopropylmalate dehydratase large subunit [Amino acid transport and metabolism]. 30415 COG0066: 3-isopropylmalate dehydratase small subunit [Amino acid transport and metabolism]. 30416 COG0067: Glutamate synthase domain 1 [Amino acid transport and metabolism]. 30417 COG0068: Hydrogenase maturation factor [Posttranslational modification, protein turnover, chaperones]. 30418 COG0069: Glutamate synthase domain 2 [Amino acid transport and metabolism]. 30419 COG0070: Glutamate synthase domain 3 [Amino acid transport and metabolism]. 30420 COG0071: Molecular chaperone (small heat shock protein) [Posttranslational modification, protein turnover, chaperones]. 30421 COG0072: Phenylalanyl-tRNA synthetase beta subunit [Translation, ribosomal structure and biogenesis]. 30422 COG0073: EMAP domain [General function prediction only]. 30423 COG0074: Succinyl-CoA synthetase, alpha subunit [Energy production and conversion]. 30424 COG0075: Serine-pyruvate aminotransferase/archaeal aspartate aminotransferase [Amino acid transport and metabolism]. 30425 COG0076: Glutamate decarboxylase and related PLP-dependent proteins [Amino acid transport and metabolism]. 30426 COG0077: Prephenate dehydratase [Amino acid transport and metabolism]. 30427 COG0078: Ornithine carbamoyltransferase [Amino acid transport and metabolism]. 30428 COG0079: Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase [Amino acid transport and metabolism]. 30429 COG0080: Ribosomal protein L11 [Translation, ribosomal structure and biogenesis]. 30430 COG0081: Ribosomal protein L1 [Translation, ribosomal structure and biogenesis]. 30431 COG0082: Chorismate synthase [Amino acid transport and metabolism]. 30432 COG0083: Homoserine kinase [Amino acid transport and metabolism]. 
30433 COG0084: Mg-dependent DNase [DNA replication, recombination, and repair]. 30434 COG0085: DNA-directed RNA polymerase, beta subunit/140 kD subunit [Transcription]. 30435 COG0086: DNA-directed RNA polymerase, beta' subunit/160 kD subunit [Transcription]. 30436 COG0087: Ribosomal protein L3 [Translation, ribosomal structure and biogenesis]. 30437 COG0088: Ribosomal protein L4 [Translation, ribosomal structure and biogenesis]. 30438 COG0089: Ribosomal protein L23 [Translation, ribosomal structure and biogenesis]. 30439 COG0090: Ribosomal protein L2 [Translation, ribosomal structure and biogenesis]. 30440 COG0091: Ribosomal protein L22 [Translation, ribosomal structure and biogenesis]. 30441 COG0092: Ribosomal protein S3 [Translation, ribosomal structure and biogenesis]. 30442 COG0093: Ribosomal protein L14 [Translation, ribosomal structure and biogenesis]. 30443 COG0094: Ribosomal protein L5 [Translation, ribosomal structure and biogenesis]. 30444 COG0095: Lipoate-protein ligase A [Coenzyme metabolism]. 30445 COG0096: Ribosomal protein S8 [Translation, ribosomal structure and biogenesis]. 30446 COG0097: Ribosomal protein L6P/L9E [Translation, ribosomal structure and biogenesis]. 30447 COG0098: Ribosomal protein S5 [Translation, ribosomal structure and biogenesis]. 30448 COG0099: Ribosomal protein S13 [Translation, ribosomal structure and biogenesis]. 30449 COG0100: Ribosomal protein S11 [Translation, ribosomal structure and biogenesis]. 30450 COG0101: Pseudouridylate synthase [Translation, ribosomal structure and biogenesis]. 30451 COG0102: Ribosomal protein L13 [Translation, ribosomal structure and biogenesis]. 30452 COG0103: Ribosomal protein S9 [Translation, ribosomal structure and biogenesis]. 30453 COG0104: Adenylosuccinate synthase [Nucleotide transport and metabolism]. 30454 COG0105: Nucleoside diphosphate kinase [Nucleotide transport and metabolism]. 
30455 COG0106: Phosphoribosylformimino-5-aminoimidazole carboxamide ribonucleotide (ProFAR) isomerase [Amino acid transport and metabolism]. 30456 COG0107: Imidazoleglycerol-phosphate synthase [Amino acid transport and metabolism]. 30457 COG0108: 3,4-dihydroxy-2-butanone 4-phosphate synthase [Coenzyme metabolism]. 30458 COG0109: Polyprenyltransferase (cytochrome oxidase assembly factor) [Posttranslational modification, protein turnover, chaperones]. 30459 COG0110: Acetyltransferase (isoleucine patch superfamily) [General function prediction only]. 30460 COG0111: Phosphoglycerate dehydrogenase and related dehydrogenases [Amino acid transport and metabolism]. 30461 COG0112: Glycine/serine hydroxymethyltransferase [Amino acid transport and metabolism]. 30462 COG0113: Delta-aminolevulinic acid dehydratase [Coenzyme metabolism]. 30463 COG0114: Fumarase [Energy production and conversion]. 30464 COG0115: Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase [Amino acid transport and metabolism / Coenzyme metabolism]. 30465 COG0116: Predicted N6-adenine-specific DNA methylase [DNA replication, recombination, and repair]. 30466 COG0117: Pyrimidine deaminase [Coenzyme metabolism]. 30467 COG0118: Glutamine amidotransferase [Amino acid transport and metabolism]. 30468 COG0119: Isopropylmalate/homocitrate/citramalate synthases [Amino acid transport and metabolism]. 30469 COG0120: Ribose 5-phosphate isomerase [Carbohydrate transport and metabolism]. 30470 COG0121: Predicted glutamine amidotransferase [General function prediction only]. 30471 COG0122: 3-methyladenine DNA glycosylase/8-oxoguanine DNA glycosylase [DNA replication, recombination, and repair]. 30472 COG0123: Deacetylases, including yeast histone deacetylase and acetoin utilization protein [Chromatin structure and dynamics / Secondary metabolites biosynthesis, transport, and catabolism]. 30473 COG0124: Histidyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 
30474 COG0125: Thymidylate kinase [Nucleotide transport and metabolism]. 30475 COG0126: 3-phosphoglycerate kinase [Carbohydrate transport and metabolism]. 30476 COG0127: Xanthosine triphosphate pyrophosphatase [Nucleotide transport and metabolism]. 30477 COG0128: 5-enolpyruvylshikimate-3-phosphate synthase [Amino acid transport and metabolism]. 30478 COG0129: Dihydroxyacid dehydratase/phosphogluconate dehydratase [Amino acid transport and metabolism / Carbohydrate transport and metabolism]. 30479 COG0130: Pseudouridine synthase [Translation, ribosomal structure and biogenesis]. 30480 COG0131: Imidazoleglycerol-phosphate dehydratase [Amino acid transport and metabolism]. 30481 COG0132: Dethiobiotin synthetase [Coenzyme metabolism]. 30482 COG0133: Tryptophan synthase beta chain [Amino acid transport and metabolism]. 30483 COG0134: Indole-3-glycerol phosphate synthase [Amino acid transport and metabolism]. 30484 COG0135: Phosphoribosylanthranilate isomerase [Amino acid transport and metabolism]. 30485 COG0136: Aspartate-semialdehyde dehydrogenase [Amino acid transport and metabolism]. 30486 COG0137: Argininosuccinate synthase [Amino acid transport and metabolism]. 30487 COG0138: AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) [Nucleotide transport and metabolism]. 30488 COG0139: Phosphoribosyl-AMP cyclohydrolase [Amino acid transport and metabolism]. 30489 COG0140: Phosphoribosyl-ATP pyrophosphohydrolase [Amino acid transport and metabolism]. 30490 COG0141: Histidinol dehydrogenase [Amino acid transport and metabolism]. 30491 COG0142: Geranylgeranyl pyrophosphate synthase [Coenzyme metabolism]. 30492 COG0143: Methionyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 30493 COG0144: tRNA and rRNA cytosine-C5-methylases [Translation, ribosomal structure and biogenesis]. 
30494 COG0145: N-methylhydantoinase A/acetone carboxylase, beta subunit [Amino acid transport and metabolism / Secondary metabolites biosynthesis, transport, and catabolism]. 30495 COG0146: N-methylhydantoinase B/acetone carboxylase, alpha subunit [Amino acid transport and metabolism / Secondary metabolites biosynthesis, transport, and catabolism]. 30496 COG0147: Anthranilate/para-aminobenzoate synthases component I [Amino acid transport and metabolism / Coenzyme metabolism]. 30497 COG0148: Enolase [Carbohydrate transport and metabolism]. 30498 COG0149: Triosephosphate isomerase [Carbohydrate transport and metabolism]. 30499 COG0150: Phosphoribosylaminoimidazole (AIR) synthetase [Nucleotide transport and metabolism]. 30500 COG0151: Phosphoribosylamine-glycine ligase [Nucleotide transport and metabolism]. 30501 COG0152: Phosphoribosylaminoimidazolesuccinocarboxamide (SAICAR) synthase [Nucleotide transport and metabolism]. 30502 COG0153: Galactokinase [Carbohydrate transport and metabolism]. 30503 COG0154: Asp-tRNAAsn/Glu-tRNAGln amidotransferase A subunit and related amidases [Translation, ribosomal structure and biogenesis]. 30504 COG0155: Sulfite reductase, beta subunit (hemoprotein) [Inorganic ion transport and metabolism]. 30505 COG0156: 7-keto-8-aminopelargonate synthetase and related enzymes [Coenzyme metabolism]. 30506 COG0157: Nicotinate-nucleotide pyrophosphorylase [Coenzyme metabolism]. 30507 COG0158: Fructose-1,6-bisphosphatase [Carbohydrate transport and metabolism]. 30508 COG0159: Tryptophan synthase alpha chain [Amino acid transport and metabolism]. 30509 COG0160: 4-aminobutyrate aminotransferase and related aminotransferases [Amino acid transport and metabolism]. 30510 COG0161: Adenosylmethionine-8-amino-7-oxononanoate aminotransferase [Coenzyme metabolism]. 30511 COG0162: Tyrosyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 30512 COG0163: 3-polyprenyl-4-hydroxybenzoate decarboxylase [Coenzyme metabolism]. 
30513 COG0164: Ribonuclease HII [DNA replication, recombination, and repair]. 30514 COG0165: Argininosuccinate lyase [Amino acid transport and metabolism]. 30515 COG0166: Glucose-6-phosphate isomerase [Carbohydrate transport and metabolism]. 30516 COG0167: Dihydroorotate dehydrogenase [Nucleotide transport and metabolism]. 30517 COG0168: Trk-type K+ transport systems, membrane components [Inorganic ion transport and metabolism]. 30518 COG0169: Shikimate 5-dehydrogenase [Amino acid transport and metabolism]. 30519 COG0170: Dolichol kinase [Lipid metabolism]. 30520 COG0171: NAD synthase [Coenzyme metabolism]. 30521 COG0172: Seryl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 30522 COG0173: Aspartyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 30523 COG0174: Glutamine synthetase [Amino acid transport and metabolism]. 30524 COG0175: 3'-phosphoadenosine 5'-phosphosulfate sulfotransferase (PAPS reductase)/FAD synthetase and related enzymes [Amino acid transport and metabolism / Coenzyme metabolism]. 30525 COG0176: Transaldolase [Carbohydrate transport and metabolism]. 30526 COG0177: Predicted EndoIII-related endonuclease [DNA replication, recombination, and repair]. 30527 COG0178: Excinuclease ATPase subunit [DNA replication, recombination, and repair]. 30528 COG0179: 2-keto-4-pentenoate hydratase/2-oxohepta-3-ene-1,7-dioic acid hydratase (catechol pathway) [Secondary metabolites biosynthesis, transport, and catabolism]. 30529 COG0180: Tryptophanyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 30530 COG0181: Porphobilinogen deaminase [Coenzyme metabolism]. 30531 COG0182: Predicted translation initiation factor 2B subunit, eIF-2B alpha/beta/delta family [Translation, ribosomal structure and biogenesis]. 30532 COG0183: Acetyl-CoA acetyltransferase [Lipid metabolism]. 30533 COG0184: Ribosomal protein S15P/S13E [Translation, ribosomal structure and biogenesis]. 
30534 COG0185: Ribosomal protein S19 [Translation, ribosomal structure and biogenesis]. 30535 COG0186: Ribosomal protein S17 [Translation, ribosomal structure and biogenesis]. 30536 COG0187: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit [DNA replication, recombination, and repair]. 30537 COG0188: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit [DNA replication, recombination, and repair]. 30538 COG0189: Glutathione synthase/Ribosomal protein S6 modification enzyme (glutaminyl transferase) [Coenzyme metabolism / Translation, ribosomal structure and biogenesis]. 30539 COG0190: 5,10-methylene-tetrahydrofolate dehydrogenase/Methenyl tetrahydrofolate cyclohydrolase [Coenzyme metabolism]. 30540 COG0191: Fructose/tagatose bisphosphate aldolase [Carbohydrate transport and metabolism]. 30541 COG0192: S-adenosylmethionine synthetase [Coenzyme metabolism]. 30542 COG0193: Peptidyl-tRNA hydrolase [Translation, ribosomal structure and biogenesis]. 30543 COG0194: Guanylate kinase [Nucleotide transport and metabolism]. 30544 COG0195: Transcription elongation factor [Transcription]. 30545 COG0196: FAD synthase [Coenzyme metabolism]. 30546 COG0197: Ribosomal protein L16/L10E [Translation, ribosomal structure and biogenesis]. 30547 COG0198: Ribosomal protein L24 [Translation, ribosomal structure and biogenesis]. 30548 COG0199: Ribosomal protein S14 [Translation, ribosomal structure and biogenesis]. 30549 COG0200: Ribosomal protein L15 [Translation, ribosomal structure and biogenesis]. 30550 COG0201: Preprotein translocase subunit SecY [Intracellular trafficking and secretion]. 30551 COG0202: DNA-directed RNA polymerase, alpha subunit/40 kD subunit [Transcription]. 30552 COG0203: Ribosomal protein L17 [Translation, ribosomal structure and biogenesis]. 30553 COG0204: 1-acyl-sn-glycerol-3-phosphate acyltransferase [Lipid metabolism]. 30554 COG0205: 6-phosphofructokinase [Carbohydrate transport and metabolism]. 
30555 COG0206: Cell division GTPase [Cell division and chromosome partitioning]. 30556 COG0207: Thymidylate synthase [Nucleotide transport and metabolism]. 30557 COG0208: Ribonucleotide reductase, beta subunit [Nucleotide transport and metabolism]. 30559 COG0210: Superfamily I DNA and RNA helicases [DNA replication, recombination, and repair]. 30560 COG0211: Ribosomal protein L27 [Translation, ribosomal structure and biogenesis]. 30561 COG0212: 5-formyltetrahydrofolate cyclo-ligase [Coenzyme metabolism]. 30562 COG0213: Thymidine phosphorylase [Nucleotide transport and metabolism]. 30563 COG0214: Pyridoxine biosynthesis enzyme [Coenzyme metabolism]. 30564 COG0215: Cysteinyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 30565 COG0216: Protein chain release factor A [Translation, ribosomal structure and biogenesis]. 30566 COG0217: Uncharacterized conserved protein [Function unknown]. 30567 COG0218: Predicted GTPase [General function prediction only]. 30568 COG0219: Predicted rRNA methylase (SpoU class) [Translation, ribosomal structure and biogenesis]. 30569 COG0220: Predicted S-adenosylmethionine-dependent methyltransferase [General function prediction only]. 30570 COG0221: Inorganic pyrophosphatase [Energy production and conversion]. 30571 COG0222: Ribosomal protein L7/L12 [Translation, ribosomal structure and biogenesis]. 30572 COG0223: Methionyl-tRNA formyltransferase [Translation, ribosomal structure and biogenesis]. 30573 COG0224: F0F1-type ATP synthase, gamma subunit [Energy production and conversion]. 30574 COG0225: Peptide methionine sulfoxide reductase [Posttranslational modification, protein turnover, chaperones]. 30575 COG0226: ABC-type phosphate transport system, periplasmic component [Inorganic ion transport and metabolism]. 30576 COG0227: Ribosomal protein L28 [Translation, ribosomal structure and biogenesis]. 30577 COG0228: Ribosomal protein S16 [Translation, ribosomal structure and biogenesis]. 
30578 COG0229: Conserved domain frequently associated with peptide methionine sulfoxide reductase [Posttranslational modification, protein turnover, chaperones]. 30579 COG0230: Ribosomal protein L34 [Translation, ribosomal structure and biogenesis]. 30580 COG0231: Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) [Translation, ribosomal structure and biogenesis]. 30581 COG0232: dGTP triphosphohydrolase [Nucleotide transport and metabolism]. 30582 COG0233: Ribosome recycling factor [Translation, ribosomal structure and biogenesis]. 30583 COG0234: Co-chaperonin GroES (HSP10) [Posttranslational modification, protein turnover, chaperones]. 30584 COG0235: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases [Carbohydrate transport and metabolism]. 30585 COG0236: Acyl carrier protein [Lipid metabolism / Secondary metabolites biosynthesis, transport, and catabolism]. 30586 COG0237: Dephospho-CoA kinase [Coenzyme metabolism]. 30587 COG0238: Ribosomal protein S18 [Translation, ribosomal structure and biogenesis]. 30588 COG0239: Integral membrane protein possibly involved in chromosome condensation [Cell division and chromosome partitioning]. 30589 COG0240: Glycerol-3-phosphate dehydrogenase [Energy production and conversion]. 30590 COG0241: Histidinol phosphatase and related phosphatases [Amino acid transport and metabolism]. 30591 COG0242: N-formylmethionyl-tRNA deformylase [Translation, ribosomal structure and biogenesis]. 30592 COG0243: Anaerobic dehydrogenases, typically selenocysteine-containing [Energy production and conversion]. 30593 COG0244: Ribosomal protein L10 [Translation, ribosomal structure and biogenesis]. 30594 COG0245: 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase [Lipid metabolism]. 30595 COG0246: Mannitol-1-phosphate/altronate dehydrogenases [Carbohydrate transport and metabolism]. 30596 COG0247: Fe-S oxidoreductase [Energy production and conversion]. 
30597 COG0248: Exopolyphosphatase [Nucleotide transport and metabolism / Inorganic ion transport and metabolism]. 30598 COG0249: Mismatch repair ATPase (MutS family) [DNA replication, recombination, and repair]. 30599 COG0250: Transcription antiterminator [Transcription]. 30600 COG0251: Putative translation initiation inhibitor, yjgF family [Translation, ribosomal structure and biogenesis]. 30601 COG0252: L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D [Amino acid transport and metabolism / Translation, ribosomal structure and biogenesis]. 30602 COG0253: Diaminopimelate epimerase [Amino acid transport and metabolism]. 30603 COG0254: Ribosomal protein L31 [Translation, ribosomal structure and biogenesis]. 30604 COG0255: Ribosomal protein L29 [Translation, ribosomal structure and biogenesis]. 30605 COG0256: Ribosomal protein L18 [Translation, ribosomal structure and biogenesis]. 30606 COG0257: Ribosomal protein L36 [Translation, ribosomal structure and biogenesis]. 30607 COG0258: 5'-3' exonuclease (including N-terminal domain of PolI) [DNA replication, recombination, and repair]. 30608 COG0259: Pyridoxamine-phosphate oxidase [Coenzyme metabolism]. 30609 COG0260: Leucyl aminopeptidase [Amino acid transport and metabolism]. 30610 COG0261: Ribosomal protein L21 [Translation, ribosomal structure and biogenesis]. 30611 COG0262: Dihydrofolate reductase [Coenzyme metabolism]. 30612 COG0263: Glutamate 5-kinase [Amino acid transport and metabolism]. 30613 COG0264: Translation elongation factor Ts [Translation, ribosomal structure and biogenesis]. 30614 COG0265: Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]. 30615 COG0266: Formamidopyrimidine-DNA glycosylase [DNA replication, recombination, and repair]. 30616 COG0267: Ribosomal protein L33 [Translation, ribosomal structure and biogenesis]. 
30617 COG0268: Ribosomal protein S20 [Translation, ribosomal structure and biogenesis]. 30618 COG0269: 3-hexulose-6-phosphate synthase and related proteins [Carbohydrate transport and metabolism]. 30619 COG0270: Site-specific DNA methylase [DNA replication, recombination, and repair]. 30620 COG0271: Stress-induced morphogen (activity unknown) [Signal transduction mechanisms]. 30621 COG0272: NAD-dependent DNA ligase (contains BRCT domain type II) [DNA replication, recombination, and repair]. 30622 COG0274: Deoxyribose-phosphate aldolase [Nucleotide transport and metabolism]. 30623 COG0275: Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis [Cell envelope biogenesis, outer membrane]. 30624 COG0276: Protoheme ferro-lyase (ferrochelatase) [Coenzyme metabolism]. 30625 COG0277: FAD/FMN-containing dehydrogenases [Energy production and conversion]. 30626 COG0278: Glutaredoxin-related protein [Posttranslational modification, protein turnover, chaperones]. 30627 COG0279: Phosphoheptose isomerase [Carbohydrate transport and metabolism]. 30628 COG0280: Phosphotransacetylase [Energy production and conversion]. 30629 COG0281: Malic enzyme [Energy production and conversion]. 30630 COG0282: Acetate kinase [Energy production and conversion]. 30631 COG0283: Cytidylate kinase [Nucleotide transport and metabolism]. 30632 COG0284: Orotidine-5'-phosphate decarboxylase [Nucleotide transport and metabolism]. 30633 COG0285: Folylpolyglutamate synthase [Coenzyme metabolism]. 30634 COG0286: Type I restriction-modification system methyltransferase subunit [Defense mechanisms]. 30635 COG0287: Prephenate dehydrogenase [Amino acid transport and metabolism]. 30636 COG0288: Carbonic anhydrase [Inorganic ion transport and metabolism]. 30637 COG0289: Dihydrodipicolinate reductase [Amino acid transport and metabolism]. 30638 COG0290: Translation initiation factor 3 (IF-3) [Translation, ribosomal structure and biogenesis]. 
30639 COG0291: Ribosomal protein L35 [Translation, ribosomal structure and biogenesis]. 30640 COG0292: Ribosomal protein L20 [Translation, ribosomal structure and biogenesis]. 30641 COG0293: 23S rRNA methylase [Translation, ribosomal structure and biogenesis]. 30642 COG0294: Dihydropteroate synthase and related enzymes [Coenzyme metabolism]. 30643 COG0295: Cytidine deaminase [Nucleotide transport and metabolism]. 30644 COG0296: 1,4-alpha-glucan branching enzyme [Carbohydrate transport and metabolism]. 30645 COG0297: Glycogen synthase [Carbohydrate transport and metabolism]. 30646 COG0298: Hydrogenase maturation factor [Posttranslational modification, protein turnover, chaperones]. 30647 COG0299: Folate-dependent phosphoribosylglycinamide formyltransferase PurN [Nucleotide transport and metabolism]. 30648 COG0300: Short-chain dehydrogenases of various substrate specificities [General function prediction only]. 30649 COG0301: Thiamine biosynthesis ATP pyrophosphatase [Coenzyme metabolism]. 30650 COG0302: GTP cyclohydrolase I [Coenzyme metabolism]. 30651 COG0303: Molybdopterin biosynthesis enzyme [Coenzyme metabolism]. 30652 COG0304: 3-oxoacyl-(acyl-carrier-protein) synthase [Lipid metabolism / Secondary metabolites biosynthesis, transport, and catabolism]. 30653 COG0305: Replicative DNA helicase [DNA replication, recombination, and repair]. 30654 COG0306: Phosphate/sulphate permeases [Inorganic ion transport and metabolism]. 30655 COG0307: Riboflavin synthase alpha chain [Coenzyme metabolism]. 30656 COG0308: Aminopeptidase N [Amino acid transport and metabolism]. 30657 COG0309: Hydrogenase maturation factor [Posttranslational modification, protein turnover, chaperones]. 30658 COG0310: ABC-type Co2+ transport system, permease component [Inorganic ion transport and metabolism]. 30659 COG0311: Predicted glutamine amidotransferase involved in pyridoxine biosynthesis [Coenzyme metabolism]. 
30660 COG0312: Predicted Zn-dependent proteases and their inactivated homologs [General function prediction only]. 30661 COG0313: Predicted methyltransferases [General function prediction only]. 30662 COG0314: Molybdopterin converting factor, large subunit [Coenzyme metabolism]. 30663 COG0315: Molybdenum cofactor biosynthesis enzyme [Coenzyme metabolism]. 30664 COG0316: Uncharacterized conserved protein [Function unknown]. 30665 COG0317: Guanosine polyphosphate pyrophosphohydrolases/synthetases [Signal transduction mechanisms / Transcription]. 30666 COG0318: Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II [Lipid metabolism / Secondary metabolites biosynthesis, transport, and catabolism]. 30667 COG0319: Predicted metal-dependent hydrolase [General function prediction only]. 30668 COG0320: Lipoate synthase [Coenzyme metabolism]. 30669 COG0321: Lipoate-protein ligase B [Coenzyme metabolism]. 30670 COG0322: Nuclease subunit of the excinuclease complex [DNA replication, recombination, and repair]. 30671 COG0323: DNA mismatch repair enzyme (predicted ATPase) [DNA replication, recombination, and repair]. 30672 COG0324: tRNA delta(2)-isopentenylpyrophosphate transferase [Translation, ribosomal structure and biogenesis]. 30673 COG0325: Predicted enzyme with a TIM-barrel fold [General function prediction only]. 30674 COG0326: Molecular chaperone, HSP90 family [Posttranslational modification, protein turnover, chaperones]. 30675 COG0327: Uncharacterized conserved protein [Function unknown]. 30676 COG0328: Ribonuclease HI [DNA replication, recombination, and repair]. 30677 COG0329: Dihydrodipicolinate synthase/N-acetylneuraminate lyase [Amino acid transport and metabolism / Cell envelope biogenesis, outer membrane]. 30678 COG0330: Membrane protease subunits, stomatin/prohibitin homologs [Posttranslational modification, protein turnover, chaperones]. 30679 COG0331: (acyl-carrier-protein) S-malonyltransferase [Lipid metabolism]. 
30680 COG0332: 3-oxoacyl-[acyl-carrier-protein]. 30681 COG0333: Ribosomal protein L32 [Translation, ribosomal structure and biogenesis]. 30682 COG0334: Glutamate dehydrogenase/leucine dehydrogenase [Amino acid transport and metabolism]. 30683 COG0335: Ribosomal protein L19 [Translation, ribosomal structure and biogenesis]. 30684 COG0336: tRNA-(guanine-N1)-methyltransferase [Translation, ribosomal structure and biogenesis]. 30685 COG0337: 3-dehydroquinate synthetase [Amino acid transport and metabolism]. 30686 COG0338: Site-specific DNA methylase [DNA replication, recombination, and repair]. 30687 COG0339: Zn-dependent oligopeptidases [Amino acid transport and metabolism]. 30688 COG0340: Biotin-(acetyl-CoA carboxylase) ligase [Coenzyme metabolism]. 30689 COG0341: Preprotein translocase subunit SecF [Intracellular trafficking and secretion]. 30690 COG0342: Preprotein translocase subunit SecD [Intracellular trafficking and secretion]. 30691 COG0343: Queuine/archaeosine tRNA-ribosyltransferase [Translation, ribosomal structure and biogenesis]. 30692 COG0344: Predicted membrane protein [Function unknown]. 30693 COG0345: Pyrroline-5-carboxylate reductase [Amino acid transport and metabolism]. 30694 COG0346: Lactoylglutathione lyase and related lyases [Amino acid transport and metabolism]. 30695 COG0347: Nitrogen regulatory protein PII [Amino acid transport and metabolism]. 30696 COG0348: Polyferredoxin [Energy production and conversion]. 30697 COG0349: Ribonuclease D [Translation, ribosomal structure and biogenesis]. 30699 COG0350: Methylated DNA-protein cysteine methyltransferase [DNA replication, recombination, and repair]. 30700 COG0351: Hydroxymethylpyrimidine/phosphomethylpyrimidine kinase [Coenzyme metabolism]. 30701 COG0352: Thiamine monophosphate synthase [Coenzyme metabolism]. 30702 COG0353: Recombinational DNA repair protein (RecF pathway) [DNA replication, recombination, and repair]. 
30703 COG0354: Predicted aminomethyltransferase related to GcvT [General function prediction only]. 30704 COG0355: F0F1-type ATP synthase, epsilon subunit (mitochondrial delta subunit) [Energy production and conversion]. 30705 COG0356: F0F1-type ATP synthase, subunit a [Energy production and conversion]. 30706 COG0357: Predicted S-adenosylmethionine-dependent methyltransferase involved in bacterial cell division [Cell envelope biogenesis, outer membrane]. 30707 COG0358: DNA primase (bacterial type) [DNA replication, recombination, and repair]. 30708 COG0359: Ribosomal protein L9 [Translation, ribosomal structure and biogenesis]. 30709 COG0360: Ribosomal protein S6 [Translation, ribosomal structure and biogenesis]. 30710 COG0361: Translation initiation factor 1 (IF-1) [Translation, ribosomal structure and biogenesis]. 30711 COG0362: 6-phosphogluconate dehydrogenase [Carbohydrate transport and metabolism]. 30712 COG0363: 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase [Carbohydrate transport and metabolism]. 30713 COG0364: Glucose-6-phosphate 1-dehydrogenase [Carbohydrate transport and metabolism]. 30714 COG0365: Acyl-coenzyme A synthetases/AMP-(fatty) acid ligases [Lipid metabolism]. 30715 COG0366: Glycosidases [Carbohydrate transport and metabolism]. 30716 COG0367: Asparagine synthase (glutamine-hydrolyzing) [Amino acid transport and metabolism]. 30717 COG0368: Cobalamin-5-phosphate synthase [Coenzyme metabolism]. 30718 COG0369: Sulfite reductase, alpha subunit (flavoprotein) [Inorganic ion transport and metabolism]. 30719 COG0370: Fe2+ transport system protein B [Inorganic ion transport and metabolism]. 30720 COG0371: Glycerol dehydrogenase and related enzymes [Energy production and conversion]. 30721 COG0372: Citrate synthase [Energy production and conversion]. 30722 COG0373: Glutamyl-tRNA reductase [Coenzyme metabolism]. 30723 COG0374: Ni,Fe-hydrogenase I large subunit [Energy production and conversion]. 
30724 COG0375: Zn finger protein HypA/HybF (possibly regulating hydrogenase expression) [General function prediction only]. 30725 COG0376: Catalase (peroxidase I) [Inorganic ion transport and metabolism]. 30726 COG0377: NADH:ubiquinone oxidoreductase 20 kD subunit and related Fe-S oxidoreductases [Energy production and conversion]. 30727 COG0378: Ni2+-binding GTPase involved in regulation of expression and maturation of urease and hydrogenase [Posttranslational modification, protein turnover, chaperones / Transcription]. 30728 COG0379: Quinolinate synthase [Coenzyme metabolism]. 30729 COG0380: Trehalose-6-phosphate synthase [Carbohydrate transport and metabolism]. 30730 COG0381: UDP-N-acetylglucosamine 2-epimerase [Cell envelope biogenesis, outer membrane]. 30731 COG0382: 4-hydroxybenzoate polyprenyltransferase and related prenyltransferases [Coenzyme metabolism]. 30732 COG0383: Alpha-mannosidase [Carbohydrate transport and metabolism]. 30733 COG0384: Predicted epimerase, PhzC/PhzF homolog [General function prediction only]. 30734 COG0385: Predicted Na+-dependent transporter [General function prediction only]. 30735 COG0386: Glutathione peroxidase [Posttranslational modification, protein turnover, chaperones]. 30736 COG0387: Ca2+/H+ antiporter [Inorganic ion transport and metabolism]. 30737 COG0388: Predicted amidohydrolase [General function prediction only]. 30738 COG0389: Nucleotidyltransferase/DNA polymerase involved in DNA repair [DNA replication, recombination, and repair]. 30739 COG0390: ABC-type uncharacterized transport system, permease component [General function prediction only]. 30740 COG0391: Uncharacterized conserved protein [Function unknown]. 30741 COG0392: Predicted integral membrane protein [Function unknown]. 30742 COG0393: Uncharacterized conserved protein [Function unknown]. 30743 COG0394: Protein-tyrosine-phosphatase [Signal transduction mechanisms]. 
30744 COG0395: ABC-type sugar transport system, permease component [Carbohydrate transport and metabolism]. 30745 COG0396: ABC-type transport system involved in Fe-S cluster assembly, ATPase component [Posttranslational modification, protein turnover, chaperones]. 30746 COG0397: Uncharacterized conserved protein [Function unknown]. 30747 COG0398: Uncharacterized conserved protein [Function unknown]. 30748 COG0399: Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis [Cell envelope biogenesis, outer membrane]. 30749 COG0400: Predicted esterase [General function prediction only]. 30750 COG0401: Uncharacterized homolog of Blt101 [Function unknown]. 30751 COG0402: Cytosine deaminase and related metal-dependent hydrolases [Nucleotide transport and metabolism / General function prediction only]. 30752 COG0403: Glycine cleavage system protein P (pyridoxal-binding), N-terminal domain [Amino acid transport and metabolism]. 30753 COG0404: Glycine cleavage system T protein (aminomethyltransferase) [Amino acid transport and metabolism]. 30754 COG0405: Gamma-glutamyltransferase [Amino acid transport and metabolism]. 30755 COG0406: Fructose-2,6-bisphosphatase [Carbohydrate transport and metabolism]. 30756 COG0407: Uroporphyrinogen-III decarboxylase [Coenzyme metabolism]. 30757 COG0408: Coproporphyrinogen III oxidase [Coenzyme metabolism]. 30758 COG0409: Hydrogenase maturation factor [Posttranslational modification, protein turnover, chaperones]. 30759 COG0410: ABC-type branched-chain amino acid transport systems, ATPase component [Amino acid transport and metabolism]. 30760 COG0411: ABC-type branched-chain amino acid transport systems, ATPase component [Amino acid transport and metabolism]. 30761 COG0412: Dienelactone hydrolase and related enzymes [Secondary metabolites biosynthesis, transport, and catabolism]. 30762 COG0413: Ketopantoate hydroxymethyltransferase [Coenzyme metabolism]. 
30763 COG0414: Pantothenate synthetase [Coenzyme metabolism]. 30764 COG0415: Deoxyribodipyrimidine photolyase [DNA replication, recombination, and repair]. 30765 COG0416: Fatty acid/phospholipid biosynthesis enzyme [Lipid metabolism]. 30766 COG0417: DNA polymerase elongation subunit (family B) [DNA replication, recombination, and repair]. 30767 COG0418: Dihydroorotase [Nucleotide transport and metabolism]. 30768 COG0419: ATPase involved in DNA repair [DNA replication, recombination, and repair]. 30769 COG0420: DNA repair exonuclease [DNA replication, recombination, and repair]. 30770 COG0421: Spermidine synthase [Amino acid transport and metabolism]. 30771 COG0422: Thiamine biosynthesis protein ThiC [Coenzyme metabolism]. 30772 COG0423: Glycyl-tRNA synthetase (class II) [Translation, ribosomal structure and biogenesis]. 30773 COG0424: Nucleotide-binding protein implicated in inhibition of septum formation [Cell division and chromosome partitioning]. 30774 COG0425: Predicted redox protein, regulator of disulfide bond formation [Posttranslational modification, protein turnover, chaperones]. 30775 COG0426: Uncharacterized flavoproteins [Energy production and conversion]. 30776 COG0427: Acetyl-CoA hydrolase [Energy production and conversion]. 30777 COG0428: Predicted divalent heavy-metal cations transporter [Inorganic ion transport and metabolism]. 30778 COG0429: Predicted hydrolase of the alpha/beta-hydrolase fold [General function prediction only]. 30779 COG0430: RNA 3'-terminal phosphate cyclase [RNA processing and modification]. 30780 COG0431: Predicted flavoprotein [General function prediction only]. 30781 COG0432: Uncharacterized conserved protein [Function unknown]. 30782 COG0433: Predicted ATPase [General function prediction only]. 30783 COG0434: Predicted TIM-barrel enzyme [General function prediction only]. 30784 COG0435: Predicted glutathione S-transferase [Posttranslational modification, protein turnover, chaperones]. 
30785 COG0436: Aspartate/tyrosine/aromatic aminotransferase [Amino acid transport and metabolism]. 30786 COG0437: Fe-S-cluster-containing hydrogenase components 1 [Energy production and conversion]. 30787 COG0438: Glycosyltransferase [Cell envelope biogenesis, outer membrane]. 30788 COG0439: Biotin carboxylase [Lipid metabolism]. 30789 COG0440: Acetolactate synthase, small (regulatory) subunit [Amino acid transport and metabolism]. 30790 COG0441: Threonyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 30791 COG0442: Prolyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 30792 COG0443: Molecular chaperone [Posttranslational modification, protein turnover, chaperones]. 30793 COG0444: ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component [Amino acid transport and metabolism / Inorganic ion transport and metabolism]. 30794 COG0445: NAD/FAD-utilizing enzyme apparently involved in cell division [Cell division and chromosome partitioning]. 30795 COG0446: Uncharacterized NAD(FAD)-dependent dehydrogenases [General function prediction only]. 30796 COG0447: Dihydroxynaphthoic acid synthase [Coenzyme metabolism]. 30797 COG0448: ADP-glucose pyrophosphorylase [Carbohydrate transport and metabolism]. 30798 COG0449: Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains [Cell envelope biogenesis, outer membrane]. 30799 COG0450: Peroxiredoxin [Posttranslational modification, protein turnover, chaperones]. 30800 COG0451: Nucleoside-diphosphate-sugar epimerases [Cell envelope biogenesis, outer membrane / Carbohydrate transport and metabolism]. 30801 COG0452: Phosphopantothenoylcysteine synthetase/decarboxylase [Coenzyme metabolism]. 30802 COG0454: Histone acetyltransferase HPA2 and related acetyltransferases [Transcription / General function prediction only]. 30803 COG0455: ATPases involved in chromosome partitioning [Cell division and chromosome partitioning]. 
30804 COG0456: Acetyltransferases [General function prediction only]. 30805 COG0457: FOG: TPR repeat [General function prediction only]. 30806 COG0458: Carbamoylphosphate synthase large subunit (split gene in MJ) [Amino acid transport and metabolism / Nucleotide transport and metabolism]. 30807 COG0459: Chaperonin GroEL (HSP60 family) [Posttranslational modification, protein turnover, chaperones]. 30808 COG0460: Homoserine dehydrogenase [Amino acid transport and metabolism]. 30809 COG0461: Orotate phosphoribosyltransferase [Nucleotide transport and metabolism]. 30810 COG0462: Phosphoribosylpyrophosphate synthetase [Nucleotide transport and metabolism / Amino acid transport and metabolism]. 30811 COG0463: Glycosyltransferases involved in cell wall biogenesis [Cell envelope biogenesis, outer membrane]. 30812 COG0464: ATPases of the AAA+ class [Posttranslational modification, protein turnover, chaperones]. 30813 COG0465: ATP-dependent Zn proteases [Posttranslational modification, protein turnover, chaperones]. 30814 COG0466: ATP-dependent Lon protease, bacterial type [Posttranslational modification, protein turnover, chaperones]. 30815 COG0467: RecA-superfamily ATPases implicated in signal transduction [Signal transduction mechanisms]. 30816 COG0468: RecA/RadA recombinase [DNA replication, recombination, and repair]. 30817 COG0469: Pyruvate kinase [Carbohydrate transport and metabolism]. 30819 COG0471: Di- and tricarboxylate transporters [Inorganic ion transport and metabolism]. 30820 COG0472: UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase [Cell envelope biogenesis, outer membrane]. 30821 COG0473: Isocitrate/isopropylmalate dehydrogenase [Amino acid transport and metabolism]. 30822 COG0474: Cation transport ATPase [Inorganic ion transport and metabolism]. 30823 COG0475: Kef-type K+ transport systems, membrane components [Inorganic ion transport and metabolism]. 
30824 COG0476: Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 2 [Coenzyme metabolism]. 30825 COG0477: Permeases of the major facilitator superfamily [Carbohydrate transport and metabolism / Amino acid transport and metabolism / Inorganic ion transport and metabolism / General function prediction only]. 30826 COG0478: RIO-like serine/threonine protein kinase fused to N-terminal HTH domain [Signal transduction mechanisms]. 30827 COG0479: Succinate dehydrogenase/fumarate reductase, Fe-S protein subunit [Energy production and conversion]. 30828 COG0480: Translation elongation factors (GTPases) [Translation, ribosomal structure and biogenesis]. 30829 COG0481: Membrane GTPase LepA [Cell envelope biogenesis, outer membrane]. 30830 COG0482: Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain [Translation, ribosomal structure and biogenesis]. 30831 COG0483: Archaeal fructose-1,6-bisphosphatase and related enzymes of inositol monophosphatase family [Carbohydrate transport and metabolism]. 30832 COG0484: DnaJ-class molecular chaperone with C-terminal Zn finger domain [Posttranslational modification, protein turnover, chaperones]. 30833 COG0486: Predicted GTPase [General function prediction only]. 30834 COG0488: ATPase components of ABC transporters with duplicated ATPase domains [General function prediction only]. 30835 COG0489: ATPases involved in chromosome partitioning [Cell division and chromosome partitioning]. 30836 COG0490: Putative regulatory, ligand-binding protein related to C-terminal domains of K+ channels [Inorganic ion transport and metabolism]. 30837 COG0491: Zn-dependent hydrolases, including glyoxylases [General function prediction only]. 30838 COG0492: Thioredoxin reductase [Posttranslational modification, protein turnover, chaperones]. 
30839 COG0493: NADPH-dependent glutamate synthase beta chain and related oxidoreductases [Amino acid transport and metabolism / General function prediction only]. 30840 COG0494: NTP pyrophosphohydrolases including oxidative damage repair enzymes [DNA replication, recombination, and repair / General function prediction only]. 30841 COG0495: Leucyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 30842 COG0496: Predicted acid phosphatase [General function prediction only]. 30843 COG0497: ATPase involved in DNA repair [DNA replication, recombination, and repair]. 30844 COG0498: Threonine synthase [Amino acid transport and metabolism]. 30845 COG0499: S-adenosylhomocysteine hydrolase [Coenzyme metabolism]. 30846 COG0500: SAM-dependent methyltransferases [Secondary metabolites biosynthesis, transport, and catabolism / General function prediction only]. 30847 COG0501: Zn-dependent protease with chaperone function [Posttranslational modification, protein turnover, chaperones]. 30848 COG0502: Biotin synthase and related enzymes [Coenzyme metabolism]. 30849 COG0503: Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins [Nucleotide transport and metabolism]. 30850 COG0504: CTP synthase (UTP-ammonia lyase) [Nucleotide transport and metabolism]. 30851 COG0505: Carbamoylphosphate synthase small subunit [Amino acid transport and metabolism / Nucleotide transport and metabolism]. 30852 COG0506: Proline dehydrogenase [Amino acid transport and metabolism]. 30853 COG0507: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member [DNA replication, recombination, and repair]. 30854 COG0508: Pyruvate/2-oxoglutarate dehydrogenase complex, dihydrolipoamide acyltransferase (E2) component, and related enzymes [Energy production and conversion]. 30855 COG0509: Glycine cleavage system H protein (lipoate-binding) [Amino acid transport and metabolism]. 
30856 COG0510: Predicted choline kinase involved in LPS biosynthesis [Cell envelope biogenesis, outer membrane]. 30857 COG0511: Biotin carboxyl carrier protein [Lipid metabolism]. 30858 COG0512: Anthranilate/para-aminobenzoate synthases component II [Amino acid transport and metabolism / Coenzyme metabolism]. 30859 COG0513: Superfamily II DNA and RNA helicases [DNA replication, recombination, and repair / Transcription / Translation, ribosomal structure and biogenesis]. 30860 COG0514: Superfamily II DNA helicase [DNA replication, recombination, and repair]. 30861 COG0515: Serine/threonine protein kinase [General function prediction only / Signal transduction mechanisms / Transcription / DNA replication, recombination, and repair]. 30862 COG0516: IMP dehydrogenase/GMP reductase [Nucleotide transport and metabolism]. 30863 COG0517: FOG: CBS domain [General function prediction only]. 30864 COG0518: GMP synthase - Glutamine amidotransferase domain [Nucleotide transport and metabolism]. 30865 COG0519: GMP synthase, PP-ATPase domain/subunit [Nucleotide transport and metabolism]. 30866 COG0520: Selenocysteine lyase [Amino acid transport and metabolism]. 30867 COG0521: Molybdopterin biosynthesis enzymes [Coenzyme metabolism]. 30868 COG0522: Ribosomal protein S4 and related proteins [Translation, ribosomal structure and biogenesis]. 30869 COG0523: Putative GTPases (G3E family) [General function prediction only]. 30870 COG0524: Sugar kinases, ribokinase family [Carbohydrate transport and metabolism]. 30871 COG0525: Valyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 30872 COG0526: Thiol-disulfide isomerase and thioredoxins [Posttranslational modification, protein turnover, chaperones / Energy production and conversion]. 30873 COG0527: Aspartokinases [Amino acid transport and metabolism]. 30874 COG0528: Uridylate kinase [Nucleotide transport and metabolism]. 
30875 COG0529: Adenylylsulfate kinase and related kinases [Inorganic ion transport and metabolism]. 30876 COG0530: Ca2+/Na+ antiporter [Inorganic ion transport and metabolism]. 30877 COG0531: Amino acid transporters [Amino acid transport and metabolism]. 30878 COG0532: Translation initiation factor 2 (IF-2; GTPase) [Translation, ribosomal structure and biogenesis]. 30879 COG0533: Metal-dependent proteases with possible chaperone activity [Posttranslational modification, protein turnover, chaperones]. 30880 COG0534: Na+-driven multidrug efflux pump [Defense mechanisms]. 30881 COG0535: Predicted Fe-S oxidoreductases [General function prediction only]. 30882 COG0536: Predicted GTPase [General function prediction only]. 30883 COG0537: Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases [Nucleotide transport and metabolism / Carbohydrate transport and metabolism / General function prediction only]. 30884 COG0538: Isocitrate dehydrogenases [Energy production and conversion]. 30885 COG0539: Ribosomal protein S1 [Translation, ribosomal structure and biogenesis]. 30886 COG0540: Aspartate carbamoyltransferase, catalytic chain [Nucleotide transport and metabolism]. 30887 COG0541: Signal recognition particle GTPase [Intracellular trafficking and secretion]. 30888 COG0542: ATPases with chaperone activity, ATP-binding subunit [Posttranslational modification, protein turnover, chaperones]. 30889 COG0543: 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases [Coenzyme metabolism / Energy production and conversion]. 30890 COG0544: FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) [Posttranslational modification, protein turnover, chaperones]. 30891 COG0545: FKBP-type peptidyl-prolyl cis-trans isomerases 1 [Posttranslational modification, protein turnover, chaperones]. 30892 COG0546: Predicted phosphatases [General function prediction only]. 
30893 COG0547: Anthranilate phosphoribosyltransferase [Amino acid transport and metabolism]. 30894 COG0548: Acetylglutamate kinase [Amino acid transport and metabolism]. 30895 COG0549: Carbamate kinase [Amino acid transport and metabolism]. 30896 COG0550: Topoisomerase IA [DNA replication, recombination, and repair]. 30897 COG0551: Zn-finger domain associated with topoisomerase type I [DNA replication, recombination, and repair]. 30898 COG0552: Signal recognition particle GTPase [Intracellular trafficking and secretion]. 30899 COG0553: Superfamily II DNA/RNA helicases, SNF2 family [Transcription / DNA replication, recombination, and repair]. 30900 COG0554: Glycerol kinase [Energy production and conversion]. 30901 COG0555: ABC-type sulfate transport system, permease component [Posttranslational modification, protein turnover, chaperones]. 30902 COG0556: Helicase subunit of the DNA excision repair complex [DNA replication, recombination, and repair]. 30903 COG0557: Exoribonuclease R [Transcription]. 30904 COG0558: Phosphatidylglycerophosphate synthase [Lipid metabolism]. 30905 COG0559: Branched-chain amino acid ABC-type transport system, permease components [Amino acid transport and metabolism]. 30906 COG0560: Phosphoserine phosphatase [Amino acid transport and metabolism]. 30907 COG0561: Predicted hydrolases of the HAD superfamily [General function prediction only]. 30908 COG0562: UDP-galactopyranose mutase [Cell envelope biogenesis, outer membrane]. 30909 COG0563: Adenylate kinase and related kinases [Nucleotide transport and metabolism]. 30910 COG0564: Pseudouridylate synthases, 23S RNA-specific [Translation, ribosomal structure and biogenesis]. 30911 COG0565: rRNA methylase [Translation, ribosomal structure and biogenesis]. 30912 COG0566: rRNA methylases [Translation, ribosomal structure and biogenesis]. 30913 COG0567: 2-oxoglutarate dehydrogenase complex, dehydrogenase (E1) component, and related enzymes [Energy production and conversion]. 
30914 COG0568: DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) [Transcription]. 30915 COG0569: K+ transport systems, NAD-binding component [Inorganic ion transport and metabolism]. 30916 COG0571: dsRNA-specific ribonuclease [Transcription]. 30917 COG0572: Uridine kinase [Nucleotide transport and metabolism]. 30918 COG0573: ABC-type phosphate transport system, permease component [Inorganic ion transport and metabolism]. 30919 COG0574: Phosphoenolpyruvate synthase/pyruvate phosphate dikinase [Carbohydrate transport and metabolism]. 30920 COG0575: CDP-diglyceride synthetase [Lipid metabolism]. 30921 COG0576: Molecular chaperone GrpE (heat shock protein) [Posttranslational modification, protein turnover, chaperones]. 30922 COG0577: ABC-type antimicrobial peptide transport system, permease component [Defense mechanisms]. 30923 COG0578: Glycerol-3-phosphate dehydrogenase [Energy production and conversion]. 30924 COG0579: Predicted dehydrogenase [General function prediction only]. 30925 COG0580: Glycerol uptake facilitator and related permeases (Major Intrinsic Protein Family) [Carbohydrate transport and metabolism]. 30926 COG0581: ABC-type phosphate transport system, permease component [Inorganic ion transport and metabolism]. 30927 COG0582: Integrase [DNA replication, recombination, and repair]. 30928 COG0583: Transcriptional regulator [Transcription]. 30929 COG0584: Glycerophosphoryl diester phosphodiesterase [Energy production and conversion]. 30930 COG0585: Uncharacterized conserved protein [Function unknown]. 30931 COG0586: Uncharacterized membrane-associated protein [Function unknown]. 30932 COG0587: DNA polymerase III, alpha subunit [DNA replication, recombination, and repair]. 30933 COG0588: Phosphoglycerate mutase 1 [Carbohydrate transport and metabolism]. 30934 COG0589: Universal stress protein UspA and related nucleotide-binding proteins [Signal transduction mechanisms]. 
30935 COG0590: Cytosine/adenosine deaminases [Nucleotide transport and metabolism / Translation, ribosomal structure and biogenesis]. 30936 COG0591: Na+/proline symporter [Amino acid transport and metabolism / General function prediction only]. 30937 COG0592: DNA polymerase sliding clamp subunit (PCNA homolog) [DNA replication, recombination, and repair]. 30938 COG0593: ATPase involved in DNA replication initiation [DNA replication, recombination, and repair]. 30939 COG0594: RNase P protein component [Translation, ribosomal structure and biogenesis]. 30940 COG0595: Predicted hydrolase of the metallo-beta-lactamase superfamily [General function prediction only]. 30941 COG0596: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) [General function prediction only]. 30942 COG0597: Lipoprotein signal peptidase [Cell envelope biogenesis, outer membrane / Intracellular trafficking and secretion]. 30943 COG0598: Mg2+ and Co2+ transporters [Inorganic ion transport and metabolism]. 30944 COG0599: Uncharacterized homolog of gamma-carboxymuconolactone decarboxylase subunit [Function unknown]. 30945 COG0600: ABC-type nitrate/sulfonate/bicarbonate transport system, permease component [Inorganic ion transport and metabolism]. 30946 COG0601: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components [Amino acid transport and metabolism / Inorganic ion transport and metabolism]. 30947 COG0602: Organic radical activating enzymes [Posttranslational modification, protein turnover, chaperones]. 30948 COG0603: Predicted PP-loop superfamily ATPase [General function prediction only]. 30949 COG0604: NADPH:quinone reductase and related Zn-dependent oxidoreductases [Energy production and conversion / General function prediction only]. 30950 COG0605: Superoxide dismutase [Inorganic ion transport and metabolism]. 30951 COG0606: Predicted ATPase with chaperone activity [Posttranslational modification, protein turnover, chaperones]. 
30952 COG0607: Rhodanese-related sulfurtransferase [Inorganic ion transport and metabolism]. 30953 COG0608: Single-stranded DNA-specific exonuclease [DNA replication, recombination, and repair]. 30954 COG0609: ABC-type Fe3+-siderophore transport system, permease component [Inorganic ion transport and metabolism]. 30955 COG0610: Type I site-specific restriction-modification system, R (restriction) subunit and related helicases [Defense mechanisms]. 30956 COG0611: Thiamine monophosphate kinase [Coenzyme metabolism]. 30957 COG0612: Predicted Zn-dependent peptidases [General function prediction only]. 30958 COG0613: Predicted metal-dependent phosphoesterases (PHP family) [General function prediction only]. 30959 COG0614: ABC-type Fe3+-hydroxamate transport system, periplasmic component [Inorganic ion transport and metabolism]. 30960 COG0615: Cytidylyltransferase [Cell envelope biogenesis, outer membrane / Lipid metabolism]. 30961 COG0616: Periplasmic serine proteases (ClpP class) [Posttranslational modification, protein turnover, chaperones / Intracellular trafficking and secretion]. 30962 COG0617: tRNA nucleotidyltransferase/poly(A) polymerase [Translation, ribosomal structure and biogenesis]. 30963 COG0618: Exopolyphosphatase-related proteins [General function prediction only]. 30964 COG0619: ABC-type cobalt transport system, permease component CbiQ and related transporters [Inorganic ion transport and metabolism]. 30965 COG0620: Methionine synthase II (cobalamin-independent) [Amino acid transport and metabolism]. 30966 COG0621: 2-methylthioadenine synthetase [Translation, ribosomal structure and biogenesis]. 30967 COG0622: Predicted phosphoesterase [General function prediction only]. 30968 COG0623: Enoyl-[acyl-carrier-protein]. 30969 COG0624: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases [Amino acid transport and metabolism]. 
30970 COG0625: Glutathione S-transferase [Posttranslational modification, protein turnover, chaperones]. 30971 COG0626: Cystathionine beta-lyases/cystathionine gamma-synthases [Amino acid transport and metabolism]. 30972 COG0627: Predicted esterase [General function prediction only]. 30973 COG0628: Predicted permease [General function prediction only]. 30974 COG0629: Single-stranded DNA-binding protein [DNA replication, recombination, and repair]. 30975 COG0630: Type IV secretory pathway, VirB11 components, and related ATPases involved in archaeal flagella biosynthesis [Cell motility and secretion / Intracellular trafficking and secretion]. 30976 COG0631: Serine/threonine protein phosphatase [Signal transduction mechanisms]. 30977 COG0632: Holliday junction resolvasome, DNA-binding subunit [DNA replication, recombination, and repair]. 30978 COG0633: Ferredoxin [Energy production and conversion]. 30979 COG0634: Hypoxanthine-guanine phosphoribosyltransferase [Nucleotide transport and metabolism]. 30980 COG0635: Coproporphyrinogen III oxidase and related Fe-S oxidoreductases [Coenzyme metabolism]. 30981 COG0636: F0F1-type ATP synthase, subunit c/Archaeal/vacuolar-type H+-ATPase, subunit K [Energy production and conversion]. 30982 COG0637: Predicted phosphatase/phosphohexomutase [General function prediction only]. 30983 COG0638: 20S proteasome, alpha and beta subunits [Posttranslational modification, protein turnover, chaperones]. 30984 COG0639: Diadenosine tetraphosphatase and related serine/threonine protein phosphatases [Signal transduction mechanisms]. 30985 COG0640: Predicted transcriptional regulators [Transcription]. 30986 COG0641: Arylsulfatase regulator (Fe-S oxidoreductase) [General function prediction only]. 30987 COG0642: Signal transduction histidine kinase [Signal transduction mechanisms]. 30988 COG0643: Chemotaxis protein histidine kinase and related kinases [Cell motility and secretion / Signal transduction mechanisms]. 
30989 COG0644: Dehydrogenases (flavoproteins) [Energy production and conversion]. 30990 COG0645: Predicted kinase [General function prediction only]. 30991 COG0646: Methionine synthase I (cobalamin-dependent), methyltransferase domain [Amino acid transport and metabolism]. 30992 COG0647: Predicted sugar phosphatases of the HAD superfamily [Carbohydrate transport and metabolism]. 30993 COG0648: Endonuclease IV [DNA replication, recombination, and repair]. 30994 COG0649: NADH:ubiquinone oxidoreductase 49 kD subunit 7 [Energy production and conversion]. 30995 COG0650: Formate hydrogenlyase subunit 4 [Energy production and conversion]. 30996 COG0651: Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter, MnhD subunit [Energy production and conversion / Inorganic ion transport and metabolism]. 30997 COG0652: Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family [Posttranslational modification, protein turnover, chaperones]. 30998 COG0653: Preprotein translocase subunit SecA (ATPase, RNA helicase) [Intracellular trafficking and secretion]. 30999 COG0654: 2-polyprenyl-6-methoxyphenol hydroxylase and related FAD-dependent oxidoreductases [Coenzyme metabolism / Energy production and conversion]. 31000 COG0655: Multimeric flavodoxin WrbA [General function prediction only]. 31001 COG0656: Aldo/keto reductases, related to diketogulonate reductase [General function prediction only]. 31002 COG0657: Esterase/lipase [Lipid metabolism]. 31003 COG0658: Predicted membrane metal-binding protein [General function prediction only]. 31004 COG0659: Sulfate permease and related transporters (MFS superfamily) [Inorganic ion transport and metabolism]. 31005 COG0661: Predicted unusual protein kinase [General function prediction only]. 31006 COG0662: Mannose-6-phosphate isomerase [Carbohydrate transport and metabolism]. 31007 COG0663: Carbonic anhydrases/acetyltransferases, isoleucine patch superfamily [General function prediction only]. 
31008 COG0664: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases [Signal transduction mechanisms]. 31009 COG0665: Glycine/D-amino acid oxidases (deaminating) [Amino acid transport and metabolism]. 31010 COG0666: FOG: Ankyrin repeat [General function prediction only]. 31011 COG0667: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) [Energy production and conversion]. 31012 COG0668: Small-conductance mechanosensitive channel [Cell envelope biogenesis, outer membrane]. 31013 COG0669: Phosphopantetheine adenylyltransferase [Coenzyme metabolism]. 31014 COG0670: Integral membrane protein, interacts with FtsH [General function prediction only]. 31015 COG0671: Membrane-associated phospholipid phosphatase [Lipid metabolism]. 31016 COG0672: High-affinity Fe2+/Pb2+ permease [Inorganic ion transport and metabolism]. 31017 COG0673: Predicted dehydrogenases and related proteins [General function prediction only]. 31018 COG0674: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit [Energy production and conversion]. 31019 COG0675: Transposase and inactivated derivatives [DNA replication, recombination, and repair]. 31020 COG0676: Uncharacterized enzymes related to aldose 1-epimerase [Carbohydrate transport and metabolism]. 31021 COG0677: UDP-N-acetyl-D-mannosaminuronate dehydrogenase [Cell envelope biogenesis, outer membrane]. 31022 COG0678: Peroxiredoxin [Posttranslational modification, protein turnover, chaperones]. 31023 COG0679: Predicted permeases [General function prediction only]. 31024 COG0680: Ni,Fe-hydrogenase maturation factor [Energy production and conversion]. 31025 COG0681: Signal peptidase I [Intracellular trafficking and secretion]. 31026 COG0682: Prolipoprotein diacylglyceryltransferase [Cell envelope biogenesis, outer membrane]. 
31027 COG0683: ABC-type branched-chain amino acid transport systems, periplasmic component [Amino acid transport and metabolism]. 31028 COG0684: Demethylmenaquinone methyltransferase [Coenzyme metabolism]. 31029 COG0685: 5,10-methylenetetrahydrofolate reductase [Amino acid transport and metabolism]. 31030 COG0686: Alanine dehydrogenase [Amino acid transport and metabolism]. 31031 COG0687: Spermidine/putrescine-binding periplasmic protein [Amino acid transport and metabolism]. 31032 COG0688: Phosphatidylserine decarboxylase [Lipid metabolism]. 31033 COG0689: RNase PH [Translation, ribosomal structure and biogenesis]. 31034 COG0690: Preprotein translocase subunit SecE [Intracellular trafficking and secretion]. 31035 COG0691: tmRNA-binding protein [Posttranslational modification, protein turnover, chaperones]. 31036 COG0692: Uracil DNA glycosylase [DNA replication, recombination, and repair]. 31037 COG0693: Putative intracellular protease/amidase [General function prediction only]. 31038 COG0694: Thioredoxin-like proteins and domains [Posttranslational modification, protein turnover, chaperones]. 31039 COG0695: Glutaredoxin and related proteins [Posttranslational modification, protein turnover, chaperones]. 31040 COG0696: Phosphoglyceromutase [Carbohydrate transport and metabolism]. 31041 COG0697: Permeases of the drug/metabolite transporter (DMT) superfamily [Carbohydrate transport and metabolism / Amino acid transport and metabolism / General function prediction only]. 31042 COG0698: Ribose 5-phosphate isomerase RpiB [Carbohydrate transport and metabolism]. 31043 COG0699: Predicted GTPases (dynamin-related) [General function prediction only]. 31044 COG0700: Uncharacterized membrane protein [Function unknown]. 31045 COG0701: Predicted permeases [General function prediction only]. 31046 COG0702: Predicted nucleoside-diphosphate-sugar epimerases [Cell envelope biogenesis, outer membrane / Carbohydrate transport and metabolism]. 
31047 COG0703: Shikimate kinase [Amino acid transport and metabolism]. 31048 COG0704: Phosphate uptake regulator [Inorganic ion transport and metabolism]. 31049 COG0705: Uncharacterized membrane protein (homolog of Drosophila rhomboid) [General function prediction only]. 31050 COG0706: Preprotein translocase subunit YidC [Intracellular trafficking and secretion]. 31051 COG0707: UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase [Cell envelope biogenesis, outer membrane]. 31052 COG0708: Exonuclease III [DNA replication, recombination, and repair]. 31053 COG0709: Selenophosphate synthase [Amino acid transport and metabolism]. 31054 COG0710: 3-dehydroquinate dehydratase [Amino acid transport and metabolism]. 31055 COG0711: F0F1-type ATP synthase, subunit b [Energy production and conversion]. 31056 COG0712: F0F1-type ATP synthase, delta subunit (mitochondrial oligomycin sensitivity protein) [Energy production and conversion]. 31057 COG0713: NADH:ubiquinone oxidoreductase subunit 11 or 4L (chain K) [Energy production and conversion]. 31058 COG0714: MoxR-like ATPases [General function prediction only]. 31059 COG0715: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components [Inorganic ion transport and metabolism]. 31060 COG0716: Flavodoxins [Energy production and conversion]. 31061 COG0717: Deoxycytidine deaminase [Nucleotide transport and metabolism]. 31062 COG0718: Uncharacterized protein conserved in bacteria [Function unknown]. 31063 COG0719: ABC-type transport system involved in Fe-S cluster assembly, permease component [Posttranslational modification, protein turnover, chaperones]. 31064 COG0720: 6-pyruvoyl-tetrahydropterin synthase [Coenzyme metabolism]. 31065 COG0721: Asp-tRNAAsn/Glu-tRNAGln amidotransferase C subunit [Translation, ribosomal structure and biogenesis]. 31066 COG0722: 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase [Amino acid transport and metabolism]. 
31067 COG0723: Rieske Fe-S protein [Energy production and conversion]. 31068 COG0724: RNA-binding proteins (RRM domain) [General function prediction only]. 31069 COG0725: ABC-type molybdate transport system, periplasmic component [Inorganic ion transport and metabolism]. 31070 COG0726: Predicted xylanase/chitin deacetylase [Carbohydrate transport and metabolism]. 31071 COG0727: Predicted Fe-S-cluster oxidoreductase [General function prediction only]. 31072 COG0728: Uncharacterized membrane protein, putative virulence factor [General function prediction only]. 31073 COG0729: Outer membrane protein [Cell envelope biogenesis, outer membrane]. 31074 COG0730: Predicted permeases [General function prediction only]. 31075 COG0731: Fe-S oxidoreductases [Energy production and conversion]. 31076 COG0732: Restriction endonuclease S subunits [Defense mechanisms]. 31077 COG0733: Na+-dependent transporters of the SNF family [General function prediction only]. 31078 COG0735: Fe2+/Zn2+ uptake regulation proteins [Inorganic ion transport and metabolism]. 31079 COG0736: Phosphopantetheinyl transferase (holo-ACP synthase) [Lipid metabolism]. 31080 COG0737: 5'-nucleotidase/2',3'-cyclic phosphodiesterase and related esterases [Nucleotide transport and metabolism]. 31081 COG0738: Fucose permease [Carbohydrate transport and metabolism]. 31082 COG0739: Membrane proteins related to metalloendopeptidases [Cell envelope biogenesis, outer membrane]. 31083 COG0740: Protease subunit of ATP-dependent Clp proteases [Posttranslational modification, protein turnover, chaperones / Intracellular trafficking and secretion]. 31084 COG0741: Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) [Cell envelope biogenesis, outer membrane]. 31085 COG0742: N6-adenine-specific methylase [DNA replication, recombination, and repair]. 31086 COG0743: 1-deoxy-D-xylulose 5-phosphate reductoisomerase [Lipid metabolism]. 
31087 COG0744: Membrane carboxypeptidase (penicillin-binding protein) [Cell envelope biogenesis, outer membrane]. 31088 COG0745: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain [Signal transduction mechanisms / Transcription]. 31089 COG0746: Molybdopterin-guanine dinucleotide biosynthesis protein A [Coenzyme metabolism]. 31090 COG0747: ABC-type dipeptide transport system, periplasmic component [Amino acid transport and metabolism]. 31091 COG0748: Putative heme iron utilization protein [Inorganic ion transport and metabolism]. 31092 COG0749: DNA polymerase I - 3'-5' exonuclease and polymerase domains [DNA replication, recombination, and repair]. 31093 COG0750: Predicted membrane-associated Zn-dependent proteases 1 [Cell envelope biogenesis, outer membrane]. 31094 COG0751: Glycyl-tRNA synthetase, beta subunit [Translation, ribosomal structure and biogenesis]. 31095 COG0752: Glycyl-tRNA synthetase, alpha subunit [Translation, ribosomal structure and biogenesis]. 31096 COG0753: Catalase [Inorganic ion transport and metabolism]. 31097 COG0754: Glutathionylspermidine synthase [Amino acid transport and metabolism]. 31098 COG0755: ABC-type transport system involved in cytochrome c biogenesis, permease component [Posttranslational modification, protein turnover, chaperones]. 31099 COG0756: dUTPase [Nucleotide transport and metabolism]. 31100 COG0757: 3-dehydroquinate dehydratase II [Amino acid transport and metabolism]. 31101 COG0758: Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake [DNA replication, recombination, and repair / Intracellular trafficking and secretion]. 31102 COG0759: Uncharacterized conserved protein [Function unknown]. 31103 COG0760: Parvulin-like peptidyl-prolyl isomerase [Posttranslational modification, protein turnover, chaperones]. 31104 COG0761: Penicillin tolerance protein [Lipid metabolism / Cell envelope biogenesis, outer membrane]. 
31105 COG0762: Predicted integral membrane protein [Function unknown]. 31106 COG0763: Lipid A disaccharide synthetase [Cell envelope biogenesis, outer membrane]. 31107 COG0764: 3-hydroxymyristoyl/3-hydroxydecanoyl-(acyl carrier protein) dehydratases [Lipid metabolism]. 31108 COG0765: ABC-type amino acid transport system, permease component [Amino acid transport and metabolism]. 31109 COG0766: UDP-N-acetylglucosamine enolpyruvyl transferase [Cell envelope biogenesis, outer membrane]. 31110 COG0767: ABC-type transport system involved in resistance to organic solvents, permease component [Secondary metabolites biosynthesis, transport, and catabolism]. 31111 COG0768: Cell division protein FtsI/penicillin-binding protein 2 [Cell envelope biogenesis, outer membrane]. 31112 COG0769: UDP-N-acetylmuramyl tripeptide synthase [Cell envelope biogenesis, outer membrane]. 31113 COG0770: UDP-N-acetylmuramyl pentapeptide synthase [Cell envelope biogenesis, outer membrane]. 31114 COG0771: UDP-N-acetylmuramoylalanine-D-glutamate ligase [Cell envelope biogenesis, outer membrane]. 31115 COG0772: Bacterial cell division membrane protein [Cell division and chromosome partitioning]. 31116 COG0773: UDP-N-acetylmuramate-alanine ligase [Cell envelope biogenesis, outer membrane]. 31117 COG0774: UDP-3-O-acyl-N-acetylglucosamine deacetylase [Cell envelope biogenesis, outer membrane]. 31118 COG0775: Nucleoside phosphorylase [Nucleotide transport and metabolism]. 31119 COG0776: Bacterial nucleoid DNA-binding protein [DNA replication, recombination, and repair]. 31120 COG0777: Acetyl-CoA carboxylase beta subunit [Lipid metabolism]. 31121 COG0778: Nitroreductase [Energy production and conversion]. 31122 COG0779: Uncharacterized protein conserved in bacteria [Function unknown]. 31123 COG0780: Enzyme related to GTP cyclohydrolase I [General function prediction only]. 31124 COG0781: Transcription termination factor [Transcription]. 31125 COG0782: Transcription elongation factor [Transcription]. 
31126 COG0783: DNA-binding ferritin-like protein (oxidative damage protectant) [Inorganic ion transport and metabolism]. 31127 COG0784: FOG: CheY-like receiver [Signal transduction mechanisms]. 31128 COG0785: Cytochrome c biogenesis protein [Posttranslational modification, protein turnover, chaperones]. 31129 COG0786: Na+/glutamate symporter [Amino acid transport and metabolism]. 31130 COG0787: Alanine racemase [Cell envelope biogenesis, outer membrane]. 31131 COG0788: Formyltetrahydrofolate hydrolase [Nucleotide transport and metabolism]. 31132 COG0789: Predicted transcriptional regulators [Transcription]. 31133 COG0790: FOG: TPR repeat, SEL1 subfamily [General function prediction only]. 31134 COG0791: Cell wall-associated hydrolases (invasion-associated proteins) [Cell envelope biogenesis, outer membrane]. 31135 COG0792: Predicted endonuclease distantly related to archaeal Holliday junction resolvase [DNA replication, recombination, and repair]. 31136 COG0793: Periplasmic protease [Cell envelope biogenesis, outer membrane]. 31137 COG0794: Predicted sugar phosphate isomerase involved in capsule formation [Cell envelope biogenesis, outer membrane]. 31138 COG0795: Predicted permeases [General function prediction only]. 31139 COG0796: Glutamate racemase [Cell envelope biogenesis, outer membrane]. 31140 COG0797: Lipoproteins [Cell envelope biogenesis, outer membrane]. 31141 COG0798: Arsenite efflux pump ACR3 and related permeases [Inorganic ion transport and metabolism]. 31142 COG0799: Uncharacterized homolog of plant Iojap protein [Function unknown]. 31143 COG0800: 2-keto-3-deoxy-6-phosphogluconate aldolase [Carbohydrate transport and metabolism]. 31144 COG0801: 7,8-dihydro-6-hydroxymethylpterin-pyrophosphokinase [Coenzyme metabolism]. 31145 COG0802: Predicted ATPase or kinase [General function prediction only]. 31146 COG0803: ABC-type metal ion transport system, periplasmic component/surface adhesin [Inorganic ion transport and metabolism]. 
31147 COG0804: Urea amidohydrolase (urease) alpha subunit [Amino acid transport and metabolism]. 31148 COG0805: Sec-independent protein secretion pathway component TatC [Intracellular trafficking and secretion]. 31149 COG0806: RimM protein, required for 16S rRNA processing [Translation, ribosomal structure and biogenesis]. 31150 COG0807: GTP cyclohydrolase II [Coenzyme metabolism]. 31151 COG0809: S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) [Translation, ribosomal structure and biogenesis]. 31152 COG0810: Periplasmic protein TonB, links inner and outer membranes [Cell envelope biogenesis, outer membrane]. 31153 COG0811: Biopolymer transport proteins [Intracellular trafficking and secretion]. 31154 COG0812: UDP-N-acetylmuramate dehydrogenase [Cell envelope biogenesis, outer membrane]. 31155 COG0813: Purine-nucleoside phosphorylase [Nucleotide transport and metabolism]. 31156 COG0814: Amino acid permeases [Amino acid transport and metabolism]. 31157 COG0815: Apolipoprotein N-acyltransferase [Cell envelope biogenesis, outer membrane]. 31158 COG0816: Predicted endonuclease involved in recombination (possible Holliday junction resolvase in Mycoplasmas and B. subtilis) [DNA replication, recombination, and repair]. 31159 COG0817: Holliday junction resolvasome, endonuclease subunit [DNA replication, recombination, and repair]. 31160 COG0818: Diacylglycerol kinase [Cell envelope biogenesis, outer membrane]. 31161 COG0819: Putative transcription activator [Transcription]. 31162 COG0820: Predicted Fe-S-cluster redox enzyme [General function prediction only]. 31163 COG0821: Enzyme involved in the deoxyxylulose pathway of isoprenoid biosynthesis [Lipid metabolism]. 31164 COG0822: NifU homolog involved in Fe-S cluster formation [Energy production and conversion]. 31165 COG0823: Periplasmic component of the Tol biopolymer transport system [Intracellular trafficking and secretion]. 
31166 COG0824: Predicted thioesterase [General function prediction only]. 31167 COG0825: Acetyl-CoA carboxylase alpha subunit [Lipid metabolism]. 31168 COG0826: Collagenase and related proteases [Posttranslational modification, protein turnover, chaperones]. 31169 COG0827: Adenine-specific DNA methylase [DNA replication, recombination, and repair]. 31170 COG0828: Ribosomal protein S21 [Translation, ribosomal structure and biogenesis]. 31171 COG0829: Urease accessory protein UreH [Posttranslational modification, protein turnover, chaperones]. 31172 COG0830: Urease accessory protein UreF [Posttranslational modification, protein turnover, chaperones]. 31173 COG0831: Urea amidohydrolase (urease) gamma subunit [Amino acid transport and metabolism]. 31174 COG0832: Urea amidohydrolase (urease) beta subunit [Amino acid transport and metabolism]. 31175 COG0833: Amino acid transporters [Amino acid transport and metabolism]. 31176 COG0834: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain [Amino acid transport and metabolism / Signal transduction mechanisms]. 31177 COG0835: Chemotaxis signal transduction protein [Cell motility and secretion / Signal transduction mechanisms]. 31178 COG0836: Mannose-1-phosphate guanylyltransferase [Cell envelope biogenesis, outer membrane]. 31179 COG0837: Glucokinase [Carbohydrate transport and metabolism]. 31180 COG0838: NADH:ubiquinone oxidoreductase subunit 3 (chain A) [Energy production and conversion]. 31181 COG0839: NADH:ubiquinone oxidoreductase subunit 6 (chain J) [Energy production and conversion]. 31182 COG0840: Methyl-accepting chemotaxis protein [Cell motility and secretion / Signal transduction mechanisms]. 31183 COG0841: Cation/multidrug efflux pump [Defense mechanisms]. 31184 COG0842: ABC-type multidrug transport system, permease component [Defense mechanisms]. 31185 COG0843: Heme/copper-type cytochrome/quinol oxidases, subunit 1 [Energy production and conversion]. 
31186 COG0845: Membrane-fusion protein [Cell envelope biogenesis, outer membrane]. 31187 COG0846: NAD-dependent protein deacetylases, SIR2 family [Transcription]. 31188 COG0847: DNA polymerase III, epsilon subunit and related 3'-5' exonucleases [DNA replication, recombination, and repair]. 31189 COG0848: Biopolymer transport protein [Intracellular trafficking and secretion]. 31190 COG0849: Actin-like ATPase involved in cell division [Cell division and chromosome partitioning]. 31191 COG0850: Septum formation inhibitor [Cell division and chromosome partitioning]. 31192 COG0851: Septum formation topological specificity factor [Cell division and chromosome partitioning]. 31193 COG0852: NADH:ubiquinone oxidoreductase 27 kD subunit [Energy production and conversion]. 31194 COG0853: Aspartate 1-decarboxylase [Coenzyme metabolism]. 31195 COG0854: Pyridoxal phosphate biosynthesis protein [Coenzyme metabolism]. 31196 COG0855: Polyphosphate kinase [Inorganic ion transport and metabolism]. 31197 COG0856: Orotate phosphoribosyltransferase homologs [Nucleotide transport and metabolism]. 31198 COG0857: BioD-like N-terminal domain of phosphotransacetylase [General function prediction only]. 31199 COG0858: Ribosome-binding factor A [Translation, ribosomal structure and biogenesis]. 31200 COG0859: ADP-heptose:LPS heptosyltransferase [Cell envelope biogenesis, outer membrane]. 31201 COG0860: N-acetylmuramoyl-L-alanine amidase [Cell envelope biogenesis, outer membrane]. 31202 COG0861: Membrane protein TerC, possibly involved in tellurium resistance [Inorganic ion transport and metabolism]. 31203 COG0863: DNA modification methylase [DNA replication, recombination, and repair]. 31204 COG0864: Predicted transcriptional regulators containing the CopG/Arc/MetJ DNA-binding domain and a metal-binding domain [Transcription]. 31205 COG1001: Adenine deaminase [Nucleotide transport and metabolism]. 31206 COG1002: Type II restriction enzyme, methylase subunits [Defense mechanisms]. 
31207 COG1003: Glycine cleavage system protein P (pyridoxal-binding), C-terminal domain [Amino acid transport and metabolism]. 31208 COG1004: Predicted UDP-glucose 6-dehydrogenase [Cell envelope biogenesis, outer membrane]. 31209 COG1005: NADH:ubiquinone oxidoreductase subunit 1 (chain H) [Energy production and conversion]. 31210 COG1006: Multisubunit Na+/H+ antiporter, MnhC subunit [Inorganic ion transport and metabolism]. 31211 COG1007: NADH:ubiquinone oxidoreductase subunit 2 (chain N) [Energy production and conversion]. 31212 COG1008: NADH:ubiquinone oxidoreductase subunit 4 (chain M) [Energy production and conversion]. 31213 COG1009: NADH:ubiquinone oxidoreductase subunit 5 (chain L)/Multisubunit Na+/H+ antiporter, MnhA subunit [Energy production and conversion / Inorganic ion transport and metabolism]. 31214 COG1010: Precorrin-3B methylase [Coenzyme metabolism]. 31215 COG1011: Predicted hydrolase (HAD superfamily) [General function prediction only]. 31216 COG1012: NAD-dependent aldehyde dehydrogenases [Energy production and conversion]. 31217 COG1013: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit [Energy production and conversion]. 31218 COG1014: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit [Energy production and conversion]. 31219 COG1015: Phosphopentomutase [Carbohydrate transport and metabolism]. 31220 COG1017: Hemoglobin-like flavoprotein [Energy production and conversion]. 31221 COG1018: Flavodoxin reductases (ferredoxin-NADPH reductases) family 1 [Energy production and conversion]. 31222 COG1019: Predicted nucleotidyltransferase [General function prediction only]. 31223 COG1020: Non-ribosomal peptide synthetase modules and related proteins [Secondary metabolites biosynthesis, transport, and catabolism]. 31224 COG1021: Peptide arylation enzymes [Secondary metabolites biosynthesis, transport, and catabolism]. 
31225 COG1022: Long-chain acyl-CoA synthetases (AMP-forming) [Lipid metabolism]. 31226 COG1023: Predicted 6-phosphogluconate dehydrogenase [Carbohydrate transport and metabolism]. 31227 COG1024: Enoyl-CoA hydratase/carnitine racemase [Lipid metabolism]. 31228 COG1025: Secreted/periplasmic Zn-dependent peptidases, insulinase-like [Posttranslational modification, protein turnover, chaperones]. 31229 COG1026: Predicted Zn-dependent peptidases, insulinase-like [General function prediction only]. 31230 COG1027: Aspartate ammonia-lyase [Amino acid transport and metabolism]. 31231 COG1028: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) [Secondary metabolites biosynthesis, transport, and catabolism / General function prediction only]. 31232 COG1029: Formylmethanofuran dehydrogenase subunit B [Energy production and conversion]. 31233 COG1030: Membrane-bound serine protease (ClpP class) [Posttranslational modification, protein turnover, chaperones]. 31234 COG1031: Uncharacterized Fe-S oxidoreductase [Energy production and conversion]. 31235 COG1032: Fe-S oxidoreductase [Energy production and conversion]. 31236 COG1033: Predicted exporters of the RND superfamily [General function prediction only]. 31237 COG1034: NADH dehydrogenase/NADH:ubiquinone oxidoreductase 75 kD subunit (chain G) [Energy production and conversion]. 31238 COG1035: Coenzyme F420-reducing hydrogenase, beta subunit [Energy production and conversion]. 31239 COG1036: Archaeal flavoproteins [Energy production and conversion]. 31240 COG1038: Pyruvate carboxylase [Energy production and conversion]. 31241 COG1039: Ribonuclease HIII [DNA replication, recombination, and repair]. 31242 COG1040: Predicted amidophosphoribosyltransferases [General function prediction only]. 31243 COG1041: Predicted DNA modification methylase [DNA replication, recombination, and repair]. 31244 COG1042: Acyl-CoA synthetase (NDP forming) [Energy production and conversion]. 
31245 COG1043: Acyl-[acyl carrier protein]. 31246 COG1044: UDP-3-O-[3-hydroxymyristoyl]. 31247 COG1045: Serine acetyltransferase [Amino acid transport and metabolism]. 31248 COG1047: FKBP-type peptidyl-prolyl cis-trans isomerases 2 [Posttranslational modification, protein turnover, chaperones]. 31249 COG1048: Aconitase A [Energy production and conversion]. 31250 COG1049: Aconitase B [Energy production and conversion]. 31251 COG1051: ADP-ribose pyrophosphatase [Nucleotide transport and metabolism]. 31252 COG1052: Lactate dehydrogenase and related dehydrogenases [Energy production and conversion / Coenzyme metabolism / General function prediction only]. 31253 COG1053: Succinate dehydrogenase/fumarate reductase, flavoprotein subunit [Energy production and conversion]. 31254 COG1054: Predicted sulfurtransferase [General function prediction only]. 31255 COG1055: Na+/H+ antiporter NhaD and related arsenite permeases [Inorganic ion transport and metabolism]. 31256 COG1056: Nicotinamide mononucleotide adenylyltransferase [Coenzyme metabolism]. 31257 COG1057: Nicotinic acid mononucleotide adenylyltransferase [Coenzyme metabolism]. 31258 COG1058: Predicted nucleotide-utilizing enzyme related to molybdopterin-biosynthesis enzyme MoeA [General function prediction only]. 31259 COG1059: Thermostable 8-oxoguanine DNA glycosylase [DNA replication, recombination, and repair]. 31260 COG1060: Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes [Coenzyme metabolism / General function prediction only]. 31261 COG1061: DNA or RNA helicases of superfamily II [Transcription / DNA replication, recombination, and repair]. 31262 COG1062: Zn-dependent alcohol dehydrogenases, class III [Energy production and conversion]. 31263 COG1063: Threonine dehydrogenase and related Zn-dependent dehydrogenases [Amino acid transport and metabolism / General function prediction only]. 31264 COG1064: Zn-dependent alcohol dehydrogenases [General function prediction only]. 
31265 COG1066: Predicted ATP-dependent serine protease [Posttranslational modification, protein turnover, chaperones]. 31266 COG1067: Predicted ATP-dependent protease [Posttranslational modification, protein turnover, chaperones]. 31267 COG1069: Ribulose kinase [Energy production and conversion]. 31268 COG1070: Sugar (pentulose and hexulose) kinases [Carbohydrate transport and metabolism]. 31269 COG1071: Pyruvate/2-oxoglutarate dehydrogenase complex, dehydrogenase (E1) component, eukaryotic type, alpha subunit [Energy production and conversion]. 31270 COG1072: Pantothenate kinase [Coenzyme metabolism]. 31271 COG1073: Hydrolases of the alpha/beta superfamily [General function prediction only]. 31272 COG1074: ATP-dependent exoDNAse (exonuclease V) beta subunit (contains helicase and exonuclease domains) [DNA replication, recombination, and repair]. 31273 COG1075: Predicted acetyltransferases and hydrolases with the alpha/beta hydrolase fold [General function prediction only]. 31274 COG1076: DnaJ-domain-containing proteins 1 [Posttranslational modification, protein turnover, chaperones]. 31275 COG1077: Actin-like ATPase involved in cell morphogenesis [Cell division and chromosome partitioning]. 31276 COG1078: HD superfamily phosphohydrolases [General function prediction only]. 31277 COG1079: Uncharacterized ABC-type transport system, permease component [General function prediction only]. 31278 COG1080: Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) [Carbohydrate transport and metabolism]. 31279 COG1082: Sugar phosphate isomerases/epimerases [Carbohydrate transport and metabolism]. 31280 COG1083: CMP-N-acetylneuraminic acid synthetase [Cell envelope biogenesis, outer membrane]. 31281 COG1084: Predicted GTPase [General function prediction only]. 31282 COG1085: Galactose-1-phosphate uridylyltransferase [Energy production and conversion]. 
31283 COG1086: Predicted nucleoside-diphosphate sugar epimerases [Cell envelope biogenesis, outer membrane / Carbohydrate transport and metabolism]. 31284 COG1087: UDP-glucose 4-epimerase [Cell envelope biogenesis, outer membrane]. 31285 COG1088: dTDP-D-glucose 4,6-dehydratase [Cell envelope biogenesis, outer membrane]. 31286 COG1089: GDP-D-mannose dehydratase [Cell envelope biogenesis, outer membrane]. 31287 COG1090: Predicted nucleoside-diphosphate sugar epimerase [General function prediction only]. 31288 COG1091: dTDP-4-dehydrorhamnose reductase [Cell envelope biogenesis, outer membrane]. 31289 COG1092: Predicted SAM-dependent methyltransferases [General function prediction only]. 31290 COG1093: Translation initiation factor 2, alpha subunit (eIF-2alpha) [Translation, ribosomal structure and biogenesis]. 31291 COG1094: Predicted RNA-binding protein (contains KH domains) [General function prediction only]. 31292 COG1095: DNA-directed RNA polymerase, subunit E' [Transcription]. 31293 COG1096: Predicted RNA-binding protein (consists of S1 domain and a Zn-ribbon domain) [Translation, ribosomal structure and biogenesis]. 31294 COG1097: RNA-binding protein Rrp4 and related proteins (contain S1 domain and KH domain) [Translation, ribosomal structure and biogenesis]. 31295 COG1098: Predicted RNA binding protein (contains ribosomal protein S1 domain) [Translation, ribosomal structure and biogenesis]. 31296 COG1099: Predicted metal-dependent hydrolases with the TIM-barrel fold [General function prediction only]. 31297 COG1100: GTPase SAR1 and related small G proteins [General function prediction only]. 31298 COG1101: ABC-type uncharacterized transport system, ATPase component [General function prediction only]. 31299 COG1102: Cytidylate kinase [Nucleotide transport and metabolism]. 31300 COG1103: Archaea-specific pyridoxal phosphate-dependent enzymes [General function prediction only]. 
31301 COG1104: Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes [Amino acid transport and metabolism]. 31302 COG1105: Fructose-1-phosphate kinase and related fructose-6-phosphate kinase (PfkB) [Carbohydrate transport and metabolism]. 31303 COG1106: Predicted ATPases [General function prediction only]. 31304 COG1107: Archaea-specific RecJ-like exonuclease, contains DnaJ-type Zn finger domain [DNA replication, recombination, and repair]. 31305 COG1108: ABC-type Mn2+/Zn2+ transport systems, permease components [Inorganic ion transport and metabolism]. 31306 COG1109: Phosphomannomutase [Carbohydrate transport and metabolism]. 31307 COG1110: Reverse gyrase [DNA replication, recombination, and repair]. 31308 COG1111: ERCC4-like helicases [DNA replication, recombination, and repair]. 31309 COG1112: Superfamily I DNA and RNA helicases and helicase subunits [DNA replication, recombination, and repair]. 31310 COG1113: Gamma-aminobutyrate permease and related permeases [Amino acid transport and metabolism]. 31311 COG1114: Branched-chain amino acid permeases [Amino acid transport and metabolism]. 31312 COG1115: Na+/alanine symporter [Amino acid transport and metabolism]. 31313 COG1116: ABC-type nitrate/sulfonate/bicarbonate transport system, ATPase component [Inorganic ion transport and metabolism]. 31314 COG1117: ABC-type phosphate transport system, ATPase component [Inorganic ion transport and metabolism]. 31315 COG1118: ABC-type sulfate/molybdate transport systems, ATPase component [Inorganic ion transport and metabolism]. 31316 COG1119: ABC-type molybdenum transport system, ATPase component/photorepair protein PhrA [Inorganic ion transport and metabolism]. 31317 COG1120: ABC-type cobalamin/Fe3+-siderophores transport systems, ATPase components [Inorganic ion transport and metabolism / Coenzyme metabolism]. 31318 COG1121: ABC-type Mn/Zn transport systems, ATPase component [Inorganic ion transport and metabolism]. 
31319 COG1122: ABC-type cobalt transport system, ATPase component [Inorganic ion transport and metabolism]. 31320 COG1123: ATPase components of various ABC-type transport systems, contain duplicated ATPase [General function prediction only]. 31321 COG1124: ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component [Amino acid transport and metabolism / Inorganic ion transport and metabolism]. 31322 COG1125: ABC-type proline/glycine betaine transport systems, ATPase components [Amino acid transport and metabolism]. 31323 COG1126: ABC-type polar amino acid transport system, ATPase component [Amino acid transport and metabolism]. 31324 COG1127: ABC-type transport system involved in resistance to organic solvents, ATPase component [Secondary metabolites biosynthesis, transport, and catabolism]. 31325 COG1129: ABC-type sugar transport system, ATPase component [Carbohydrate transport and metabolism]. 31326 COG1131: ABC-type multidrug transport system, ATPase component [Defense mechanisms]. 31327 COG1132: ABC-type multidrug transport system, ATPase and permease components [Defense mechanisms]. 31328 COG1133: ABC-type long-chain fatty acid transport system, fused permease and ATPase components [Lipid metabolism]. 31329 COG1134: ABC-type polysaccharide/polyol phosphate transport system, ATPase component [Carbohydrate transport and metabolism / Cell envelope biogenesis, outer membrane]. 31330 COG1135: ABC-type metal ion transport system, ATPase component [Inorganic ion transport and metabolism]. 31331 COG1136: ABC-type antimicrobial peptide transport system, ATPase component [Defense mechanisms]. 31332 COG1137: ABC-type (unclassified) transport system, ATPase component [General function prediction only]. 31333 COG1138: Cytochrome c biogenesis factor [Posttranslational modification, protein turnover, chaperones]. 31334 COG1139: Uncharacterized conserved protein containing a ferredoxin-like domain [Energy production and conversion]. 
31335 COG1140: Nitrate reductase beta subunit [Energy production and conversion]. 31336 COG1141: Ferredoxin [Energy production and conversion]. 31337 COG1142: Fe-S-cluster-containing hydrogenase components 2 [Energy production and conversion]. 31338 COG1143: Formate hydrogenlyase subunit 6/NADH:ubiquinone oxidoreductase 23 kD subunit (chain I) [Energy production and conversion]. 31339 COG1144: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, delta subunit [Energy production and conversion]. 31340 COG1145: Ferredoxin [Energy production and conversion]. 31341 COG1146: Ferredoxin [Energy production and conversion]. 31342 COG1148: Heterodisulfide reductase, subunit A and related polyferredoxins [Energy production and conversion]. 31343 COG1149: MinD superfamily P-loop ATPase containing an inserted ferredoxin domain [Energy production and conversion]. 31344 COG1150: Heterodisulfide reductase, subunit C [Energy production and conversion]. 31345 COG1151: 6Fe-6S prismane cluster-containing protein [Energy production and conversion]. 31346 COG1152: CO dehydrogenase/acetyl-CoA synthase alpha subunit [Energy production and conversion]. 31347 COG1153: Formylmethanofuran dehydrogenase subunit D [Energy production and conversion]. 31348 COG1154: Deoxyxylulose-5-phosphate synthase [Coenzyme metabolism / Lipid metabolism]. 31349 COG1155: Archaeal/vacuolar-type H+-ATPase subunit A [Energy production and conversion]. 31350 COG1156: Archaeal/vacuolar-type H+-ATPase subunit B [Energy production and conversion]. 31351 COG1157: Flagellar biosynthesis/type III secretory pathway ATPase [Cell motility and secretion / Intracellular trafficking and secretion]. 31352 COG1158: Transcription termination factor [Transcription]. 31353 COG1159: GTPase [General function prediction only]. 31354 COG1160: Predicted GTPases [General function prediction only]. 31355 COG1161: Predicted GTPases [General function prediction only]. 
31356 COG1162: Predicted GTPases [General function prediction only]. 31357 COG1163: Predicted GTPase [General function prediction only]. 31358 COG1164: Oligoendopeptidase F [Amino acid transport and metabolism]. 31359 COG1165: 2-succinyl-6-hydroxy-2,4-cyclohexadiene-1-carboxylate synthase [Coenzyme metabolism]. 31360 COG1166: Arginine decarboxylase (spermidine biosynthesis) [Amino acid transport and metabolism]. 31361 COG1167: Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs [Transcription / Amino acid transport and metabolism]. 31362 COG1168: Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities [Amino acid transport and metabolism]. 31363 COG1169: Isochorismate synthase [Coenzyme metabolism / Secondary metabolites biosynthesis, transport, and catabolism]. 31364 COG1171: Threonine dehydratase [Amino acid transport and metabolism]. 31365 COG1172: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components [Carbohydrate transport and metabolism]. 31366 COG1173: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components [Amino acid transport and metabolism / Inorganic ion transport and metabolism]. 31367 COG1174: ABC-type proline/glycine betaine transport systems, permease component [Amino acid transport and metabolism]. 31368 COG1175: ABC-type sugar transport systems, permease components [Carbohydrate transport and metabolism]. 31369 COG1176: ABC-type spermidine/putrescine transport system, permease component I [Amino acid transport and metabolism]. 31370 COG1177: ABC-type spermidine/putrescine transport system, permease component II [Amino acid transport and metabolism]. 31371 COG1178: ABC-type Fe3+ transport system, permease component [Inorganic ion transport and metabolism]. 
31372 COG1179: Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 1 [Coenzyme metabolism]. 31373 COG1180: Pyruvate-formate lyase-activating enzyme [Posttranslational modification, protein turnover, chaperones]. 31374 COG1181: D-alanine-D-alanine ligase and related ATP-grasp enzymes [Cell envelope biogenesis, outer membrane]. 31375 COG1182: Acyl carrier protein phosphodiesterase [Lipid metabolism]. 31376 COG1183: Phosphatidylserine synthase [Lipid metabolism]. 31377 COG1184: Translation initiation factor 2B subunit, eIF-2B alpha/beta/delta family [Translation, ribosomal structure and biogenesis]. 31378 COG1185: Polyribonucleotide nucleotidyltransferase (polynucleotide phosphorylase) [Translation, ribosomal structure and biogenesis]. 31379 COG1186: Protein chain release factor B [Translation, ribosomal structure and biogenesis]. 31380 COG1187: 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases [Translation, ribosomal structure and biogenesis]. 31381 COG1188: Ribosome-associated heat shock protein implicated in the recycling of the 50S subunit (S4 paralog) [Translation, ribosomal structure and biogenesis]. 31382 COG1189: Predicted rRNA methylase [Translation, ribosomal structure and biogenesis]. 31383 COG1190: Lysyl-tRNA synthetase (class II) [Translation, ribosomal structure and biogenesis]. 31384 COG1191: DNA-directed RNA polymerase specialized sigma subunit [Transcription]. 31385 COG1192: ATPases involved in chromosome partitioning [Cell division and chromosome partitioning]. 31386 COG1193: Mismatch repair ATPase (MutS family) [DNA replication, recombination, and repair]. 31387 COG1194: A/G-specific DNA glycosylase [DNA replication, recombination, and repair]. 31388 COG1195: Recombinational DNA repair ATPase (RecF pathway) [DNA replication, recombination, and repair]. 31389 COG1196: Chromosome segregation ATPases [Cell division and chromosome partitioning]. 
31390 COG1197: Transcription-repair coupling factor (superfamily II helicase) [DNA replication, recombination, and repair / Transcription]. 31391 COG1198: Primosomal protein N' (replication factor Y) - superfamily II helicase [DNA replication, recombination, and repair]. 31392 COG1199: Rad3-related DNA helicases [Transcription / DNA replication, recombination, and repair]. 31393 COG1200: RecG-like helicase [DNA replication, recombination, and repair / Transcription]. 31394 COG1201: Lhr-like helicases [General function prediction only]. 31395 COG1202: Superfamily II helicase, archaea-specific [General function prediction only]. 31396 COG1203: Predicted helicases [General function prediction only]. 31397 COG1204: Superfamily II helicase [General function prediction only]. 31398 COG1205: Distinct helicase family with a unique C-terminal domain including a metal-binding cysteine cluster [General function prediction only]. 31399 COG1206: NAD(FAD)-utilizing enzyme possibly involved in translation [Translation, ribosomal structure and biogenesis]. 31400 COG1207: N-acetylglucosamine-1-phosphate uridyltransferase (contains nucleotidyltransferase and I-patch acetyltransferase domains) [Cell envelope biogenesis, outer membrane]. 31401 COG1208: Nucleoside-diphosphate-sugar pyrophosphorylase involved in lipopolysaccharide biosynthesis/translation initiation factor 2B, gamma/epsilon subunits (eIF-2Bgamma/eIF-2Bepsilon) [Cell envelope biogenesis, outer membrane / Translation, ribosomal structure and biogenesis]. 31402 COG1209: dTDP-glucose pyrophosphorylase [Cell envelope biogenesis, outer membrane]. 31403 COG1210: UDP-glucose pyrophosphorylase [Cell envelope biogenesis, outer membrane]. 31404 COG1211: 4-diphosphocytidyl-2-methyl-D-erythritol synthase [Lipid metabolism]. 31405 COG1212: CMP-2-keto-3-deoxyoctulosonic acid synthetase [Cell envelope biogenesis, outer membrane]. 31406 COG1213: Predicted sugar nucleotidyltransferases [Cell envelope biogenesis, outer membrane]. 
31407 COG1214: Inactive homolog of metal-dependent proteases, putative molecular chaperone [Posttranslational modification, protein turnover, chaperones]. 31408 COG1215: Glycosyltransferases, probably involved in cell wall biogenesis [Cell envelope biogenesis, outer membrane]. 31409 COG1216: Predicted glycosyltransferases [General function prediction only]. 31410 COG1217: Predicted membrane GTPase involved in stress response [Signal transduction mechanisms]. 31411 COG1218: 3'-Phosphoadenosine 5'-phosphosulfate (PAPS) 3'-phosphatase [Inorganic ion transport and metabolism]. 31412 COG1219: ATP-dependent protease Clp, ATPase subunit [Posttranslational modification, protein turnover, chaperones]. 31413 COG1220: ATP-dependent protease HslVU (ClpYQ), ATPase subunit [Posttranslational modification, protein turnover, chaperones]. 31414 COG1221: Transcriptional regulators containing an AAA-type ATPase domain and a DNA-binding domain [Transcription / Signal transduction mechanisms]. 31415 COG1222: ATP-dependent 26S proteasome regulatory subunit [Posttranslational modification, protein turnover, chaperones]. 31416 COG1223: Predicted ATPase (AAA+ superfamily) [General function prediction only]. 31417 COG1224: DNA helicase TIP49, TBP-interacting protein [Transcription]. 31418 COG1225: Peroxiredoxin [Posttranslational modification, protein turnover, chaperones]. 31419 COG1226: Kef-type K+ transport systems, predicted NAD-binding component [Inorganic ion transport and metabolism]. 31420 COG1227: Inorganic pyrophosphatase/exopolyphosphatase [Energy production and conversion]. 31421 COG1228: Imidazolonepropionase and related amidohydrolases [Secondary metabolites biosynthesis, transport, and catabolism]. 31422 COG1229: Formylmethanofuran dehydrogenase subunit A [Energy production and conversion]. 31423 COG1230: Co/Zn/Cd efflux system component [Inorganic ion transport and metabolism]. 31424 COG1231: Monoamine oxidase [Amino acid transport and metabolism]. 
31425 COG1232: Protoporphyrinogen oxidase [Coenzyme metabolism]. 31426 COG1233: Phytoene dehydrogenase and related proteins [Secondary metabolites biosynthesis, transport, and catabolism]. 31427 COG1234: Metal-dependent hydrolases of the beta-lactamase superfamily III [General function prediction only]. 31428 COG1235: Metal-dependent hydrolases of the beta-lactamase superfamily I [General function prediction only]. 31429 COG1236: Predicted exonuclease of the beta-lactamase fold involved in RNA processing [Translation, ribosomal structure and biogenesis]. 31430 COG1237: Metal-dependent hydrolases of the beta-lactamase superfamily II [General function prediction only]. 31431 COG1238: Predicted membrane protein [Function unknown]. 31432 COG1239: Mg-chelatase subunit ChlI [Coenzyme metabolism]. 31433 COG1240: Mg-chelatase subunit ChlD [Coenzyme metabolism]. 31434 COG1241: Predicted ATPase involved in replication control, Cdc46/Mcm family [DNA replication, recombination, and repair]. 31435 COG1242: Predicted Fe-S oxidoreductase [General function prediction only]. 31436 COG1243: Histone acetyltransferase [Transcription / Chromatin structure and dynamics]. 31437 COG1244: Predicted Fe-S oxidoreductase [General function prediction only]. 31438 COG1245: Predicted ATPase, RNase L inhibitor (RLI) homolog [General function prediction only]. 31439 COG1246: N-acetylglutamate synthase and related acetyltransferases [Amino acid transport and metabolism]. 31440 COG1247: Sortase and related acyltransferases [Cell envelope biogenesis, outer membrane]. 31441 COG1249: Pyruvate/2-oxoglutarate dehydrogenase complex, dihydrolipoamide dehydrogenase (E3) component, and related enzymes [Energy production and conversion]. 31442 COG1250: 3-hydroxyacyl-CoA dehydrogenase [Lipid metabolism]. 31443 COG1251: NAD(P)H-nitrite reductase [Energy production and conversion]. 31444 COG1252: NADH dehydrogenase, FAD-containing subunit [Energy production and conversion]. 
31445 COG1253: Hemolysins and related proteins containing CBS domains [General function prediction only]. 31446 COG1254: Acylphosphatases [Energy production and conversion]. 31447 COG1255: Uncharacterized protein conserved in archaea [Function unknown]. 31448 COG1256: Flagellar hook-associated protein [Cell motility and secretion]. 31449 COG1257: Hydroxymethylglutaryl-CoA reductase [Lipid metabolism]. 31450 COG1258: Predicted pseudouridylate synthase [Translation, ribosomal structure and biogenesis]. 31451 COG1259: Uncharacterized conserved protein [Function unknown]. 31452 COG1260: Myo-inositol-1-phosphate synthase [Lipid metabolism]. 31453 COG1261: Flagellar basal body P-ring biosynthesis protein [Cell motility and secretion / Posttranslational modification, protein turnover, chaperones]. 31454 COG1262: Uncharacterized conserved protein [Function unknown]. 31455 COG1263: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific [Carbohydrate transport and metabolism]. 31456 COG1264: Phosphotransferase system IIB components [Carbohydrate transport and metabolism]. 31457 COG1266: Predicted metal-dependent membrane protease [General function prediction only]. 31458 COG1267: Phosphatidylglycerophosphatase A and related proteins [Lipid metabolism]. 31459 COG1268: Uncharacterized conserved protein [General function prediction only]. 31460 COG1269: Archaeal/vacuolar-type H+-ATPase subunit I [Energy production and conversion]. 31461 COG1270: Cobalamin biosynthesis protein CobD/CbiB [Coenzyme metabolism]. 31462 COG1271: Cytochrome bd-type quinol oxidase, subunit 1 [Energy production and conversion]. 31463 COG1272: Predicted membrane protein, hemolysin III homolog [General function prediction only]. 31464 COG1273: Uncharacterized conserved protein [Function unknown]. 31465 COG1274: Phosphoenolpyruvate carboxykinase (GTP) [Energy production and conversion]. 
31466 COG1275: Tellurite resistance protein and related permeases [Inorganic ion transport and metabolism]. 31467 COG1276: Putative copper export protein [Inorganic ion transport and metabolism]. 31468 COG1277: ABC-type transport system involved in multi-copper enzyme maturation, permease component [General function prediction only]. 31469 COG1278: Cold shock proteins [Transcription]. 31470 COG1279: Lysine efflux permease [General function prediction only]. 31471 COG1280: Putative threonine efflux protein [Amino acid transport and metabolism]. 31472 COG1281: Disulfide bond chaperones of the HSP33 family [Posttranslational modification, protein turnover, chaperones]. 31473 COG1282: NAD/NADP transhydrogenase beta subunit [Energy production and conversion]. 31474 COG1283: Na+/phosphate symporter [Inorganic ion transport and metabolism]. 31475 COG1284: Uncharacterized conserved protein [Function unknown]. 31476 COG1285: Uncharacterized membrane protein [Function unknown]. 31477 COG1286: Uncharacterized membrane protein, required for colicin V production [General function prediction only]. 31478 COG1287: Uncharacterized membrane protein, required for N-linked glycosylation [General function prediction only]. 31479 COG1288: Predicted membrane protein [Function unknown]. 31480 COG1289: Predicted membrane protein [Function unknown]. 31481 COG1290: Cytochrome b subunit of the bc complex [Energy production and conversion]. 31482 COG1291: Flagellar motor component [Cell motility and secretion]. 31483 COG1292: Choline-glycine betaine transporter [Cell envelope biogenesis, outer membrane]. 31484 COG1293: Predicted RNA-binding protein homologous to eukaryotic snRNP [Transcription]. 31485 COG1294: Cytochrome bd-type quinol oxidase, subunit 2 [Energy production and conversion]. 31486 COG1295: Predicted membrane protein [Function unknown]. 31487 COG1296: Predicted branched-chain amino acid permease (azaleucine resistance) [Amino acid transport and metabolism]. 
31488 COG1297: Predicted membrane protein [Function unknown]. 31489 COG1298: Flagellar biosynthesis pathway, component FlhA [Cell motility and secretion / Intracellular trafficking and secretion]. 31490 COG1299: Phosphotransferase system, fructose-specific IIC component [Carbohydrate transport and metabolism]. 31491 COG1300: Uncharacterized membrane protein [Function unknown]. 31492 COG1301: Na+/H+-dicarboxylate symporters [Energy production and conversion]. 31493 COG1302: Uncharacterized protein conserved in bacteria [Function unknown]. 31494 COG1303: Uncharacterized protein conserved in archaea [Function unknown]. 31495 COG1304: L-lactate dehydrogenase (FMN-dependent) and related alpha-hydroxy acid dehydrogenases [Energy production and conversion]. 31496 COG1305: Transglutaminase-like enzymes, putative cysteine proteases [Amino acid transport and metabolism]. 31497 COG1306: Uncharacterized conserved protein [Function unknown]. 31498 COG1307: Uncharacterized protein conserved in bacteria [Function unknown]. 31499 COG1308: Transcription factor homologous to NACalpha-BTF3 [Transcription]. 31500 COG1309: Transcriptional regulator [Transcription]. 31501 COG1310: Predicted metal-dependent protease of the PAD1/JAB1 superfamily [General function prediction only]. 31502 COG1311: Archaeal DNA polymerase II, small subunit/DNA polymerase delta, subunit B [DNA replication, recombination, and repair]. 31503 COG1312: D-mannonate dehydratase [Carbohydrate transport and metabolism]. 31504 COG1313: Uncharacterized Fe-S protein PflX, homolog of pyruvate formate lyase activating proteins [General function prediction only]. 31505 COG1314: Preprotein translocase subunit SecG [Intracellular trafficking and secretion]. 31506 COG1315: Predicted polymerase, most proteins contain PALM domain, HD hydrolase domain and Zn-ribbon domain [DNA replication, recombination, and repair]. 31507 COG1316: Transcriptional regulator [Transcription]. 
31508 COG1317: Flagellar biosynthesis/type III secretory pathway protein [Cell motility and secretion / Intracellular trafficking and secretion]. 31509 COG1318: Predicted transcriptional regulators [Transcription]. 31510 COG1319: Aerobic-type carbon monoxide dehydrogenase, middle subunit CoxM/CutM homologs [Energy production and conversion]. 31511 COG1320: Multisubunit Na+/H+ antiporter, MnhG subunit [Inorganic ion transport and metabolism]. 31512 COG1321: Mn-dependent transcriptional regulator [Transcription]. 31513 COG1322: Uncharacterized protein conserved in bacteria [Function unknown]. 31514 COG1323: Predicted nucleotidyltransferase [General function prediction only]. 31515 COG1324: Uncharacterized protein involved in tolerance to divalent cations [Inorganic ion transport and metabolism]. 31516 COG1325: Predicted exosome subunit [Translation, ribosomal structure and biogenesis]. 31517 COG1326: Uncharacterized archaeal Zn-finger protein [General function prediction only]. 31518 COG1327: Predicted transcriptional regulator, consists of a Zn-ribbon and ATP-cone domains [Transcription]. 31519 COG1328: Oxygen-sensitive ribonucleoside-triphosphate reductase [Nucleotide transport and metabolism]. 31520 COG1329: Transcriptional regulators, similar to M. xanthus CarD [Transcription]. 31521 COG1330: Exonuclease V gamma subunit [DNA replication, recombination, and repair]. 31522 COG1331: Highly conserved protein containing a thioredoxin domain [Posttranslational modification, protein turnover, chaperones]. 31523 COG1332: Uncharacterized protein predicted to be involved in DNA repair (RAMP superfamily) [DNA replication, recombination, and repair]. 31524 COG1333: ResB protein required for cytochrome c biosynthesis [Posttranslational modification, protein turnover, chaperones]. 31525 COG1334: Uncharacterized flagellar protein FlaG [Cell motility and secretion]. 31526 COG1335: Amidases related to nicotinamidase [Secondary metabolites biosynthesis, transport, and catabolism]. 
31527 COG1336: Uncharacterized protein predicted to be involved in DNA repair (RAMP superfamily) [DNA replication, recombination, and repair]. 31528 COG1337: Uncharacterized protein predicted to be involved in DNA repair (RAMP superfamily) [DNA replication, recombination, and repair]. 31529 COG1338: Flagellar biosynthesis pathway, component FliP [Cell motility and secretion / Intracellular trafficking and secretion]. 31530 COG1339: Transcriptional regulator of a riboflavin/FAD biosynthetic operon [Transcription / Coenzyme metabolism]. 31531 COG1340: Uncharacterized archaeal coiled-coil protein [Function unknown]. 31532 COG1341: Predicted GTPase or GTP-binding protein [General function prediction only]. 31533 COG1342: Predicted DNA-binding proteins [General function prediction only]. 31534 COG1343: Uncharacterized protein predicted to be involved in DNA repair [DNA replication, recombination, and repair]. 31535 COG1344: Flagellin and related hook-associated proteins [Cell motility and secretion]. 31536 COG1345: Flagellar capping protein [Cell motility and secretion]. 31537 COG1346: Putative effector of murein hydrolase [Cell envelope biogenesis, outer membrane]. 31538 COG1347: Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrD [Energy production and conversion]. 31539 COG1348: Nitrogenase subunit NifH (ATPase) [Inorganic ion transport and metabolism]. 31540 COG1349: Transcriptional regulators of sugar metabolism [Transcription / Carbohydrate transport and metabolism]. 31541 COG1350: Predicted alternative tryptophan synthase beta-subunit (paralog of TrpB) [General function prediction only]. 31542 COG1351: Predicted alternative thymidylate synthase [Nucleotide transport and metabolism]. 31543 COG1352: Methylase of chemotaxis methyl-accepting proteins [Cell motility and secretion / Signal transduction mechanisms]. 31544 COG1353: Predicted hydrolase of the HD superfamily (permuted catalytic motifs) [General function prediction only]. 
31545 COG1354: Uncharacterized conserved protein [Function unknown]. 31546 COG1355: Predicted dioxygenase [General function prediction only]. 31547 COG1356: Uncharacterized protein conserved in archaea [Function unknown]. 31548 COG1357: Uncharacterized low-complexity proteins [Function unknown]. 31549 COG1358: Ribosomal protein HS6-type (S12/L30/L7a) [Translation, ribosomal structure and biogenesis]. 31550 COG1359: Uncharacterized conserved protein [Function unknown]. 31551 COG1360: Flagellar motor protein [Cell motility and secretion]. 31552 COG1361: S-layer domain [Cell envelope biogenesis, outer membrane]. 31553 COG1362: Aspartyl aminopeptidase [Amino acid transport and metabolism]. 31554 COG1363: Cellulase M and related proteins [Carbohydrate transport and metabolism]. 31555 COG1364: N-acetylglutamate synthase (N-acetylornithine aminotransferase) [Amino acid transport and metabolism]. 31556 COG1365: Predicted ATPase (PP-loop superfamily) [General function prediction only]. 31557 COG1366: Anti-anti-sigma regulatory factor (antagonist of anti-sigma factor) [Signal transduction mechanisms]. 31558 COG1367: Uncharacterized protein predicted to be involved in DNA repair (RAMP superfamily) [DNA replication, recombination, and repair]. 31559 COG1368: Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily [Cell envelope biogenesis, outer membrane]. 31560 COG1369: RNase P/RNase MRP subunit POP5 [Translation, ribosomal structure and biogenesis]. 31561 COG1370: Prefoldin, molecular chaperone implicated in de novo protein folding, alpha subunit [Posttranslational modification, protein turnover, chaperones]. 31562 COG1371: Uncharacterized conserved protein [Function unknown]. 31563 COG1372: Intein/homing endonuclease [DNA replication, recombination, and repair]. 31564 COG1373: Predicted ATPase (AAA+ superfamily) [General function prediction only]. 
31565 COG1374: Protein involved in ribosomal biogenesis, contains PUA domain [Translation, ribosomal structure and biogenesis]. 31566 COG1376: Uncharacterized protein conserved in bacteria [Function unknown]. 31567 COG1377: Flagellar biosynthesis pathway, component FlhB [Cell motility and secretion / Intracellular trafficking and secretion]. 31568 COG1378: Predicted transcriptional regulators [Transcription]. 31569 COG1379: Uncharacterized conserved protein [Function unknown]. 31570 COG1380: Putative effector of murein hydrolase LrgA [General function prediction only]. 31571 COG1381: Recombinational DNA repair protein (RecF pathway) [DNA replication, recombination, and repair]. 31572 COG1382: Prefoldin, chaperonin cofactor [Posttranslational modification, protein turnover, chaperones]. 31573 COG1383: Ribosomal protein S17E [Translation, ribosomal structure and biogenesis]. 31574 COG1384: Lysyl-tRNA synthetase (class I) [Translation, ribosomal structure and biogenesis]. 31575 COG1385: Uncharacterized protein conserved in bacteria [Function unknown]. 31576 COG1386: Predicted transcriptional regulator containing the HTH domain [Transcription]. 31577 COG1387: Histidinol phosphatase and related hydrolases of the PHP family [Amino acid transport and metabolism / General function prediction only]. 31578 COG1388: FOG: LysM repeat [Cell envelope biogenesis, outer membrane]. 31579 COG1389: DNA topoisomerase VI, subunit B [DNA replication, recombination, and repair]. 31580 COG1390: Archaeal/vacuolar-type H+-ATPase subunit E [Energy production and conversion]. 31581 COG1391: Glutamine synthetase adenylyltransferase [Posttranslational modification, protein turnover, chaperones / Signal transduction mechanisms]. 31582 COG1392: Phosphate transport regulator (distant homolog of PhoU) [Inorganic ion transport and metabolism]. 31583 COG1393: Arsenate reductase and related proteins, glutaredoxin family [Inorganic ion transport and metabolism]. 
31584 COG1394: Archaeal/vacuolar-type H+-ATPase subunit D [Energy production and conversion]. 31585 COG1395: Predicted transcriptional regulator [Transcription]. 31586 COG1396: Predicted transcriptional regulators [Transcription]. 31587 COG1397: ADP-ribosylglycohydrolase [Posttranslational modification, protein turnover, chaperones]. 31588 COG1398: Fatty-acid desaturase [Lipid metabolism]. 31589 COG1399: Predicted metal-binding, possibly nucleic acid-binding protein [General function prediction only]. 31590 COG1400: Signal recognition particle 19 kDa protein [Intracellular trafficking and secretion]. 31591 COG1401: GTPase subunit of restriction endonuclease [Defense mechanisms]. 31592 COG1402: Uncharacterized protein, putative amidase [General function prediction only]. 31593 COG1403: Restriction endonuclease [Defense mechanisms]. 31594 COG1404: Subtilisin-like serine proteases [Posttranslational modification, protein turnover, chaperones]. 31595 COG1405: Transcription initiation factor TFIIIB, Brf1 subunit/Transcription initiation factor TFIIB [Transcription]. 31596 COG1406: Predicted inhibitor of MCP methylation, homolog of CheC [Cell motility and secretion]. 31597 COG1407: Predicted ICC-like phosphoesterases [General function prediction only]. 31598 COG1408: Predicted phosphohydrolases [General function prediction only]. 31599 COG1409: Predicted phosphohydrolases [General function prediction only]. 31600 COG1410: Methionine synthase I, cobalamin-binding domain [Amino acid transport and metabolism]. 31601 COG1411: Uncharacterized protein related to proFAR isomerase (HisA) [General function prediction only]. 31602 COG1412: Uncharacterized proteins of PilT N-term./VapC superfamily [General function prediction only]. 31603 COG1413: FOG: HEAT repeat [Energy production and conversion]. 31604 COG1414: Transcriptional regulator [Transcription]. 31605 COG1415: Uncharacterized conserved protein [Function unknown]. 
31606 COG1416: Uncharacterized conserved protein [Function unknown]. 31607 COG1417: Uncharacterized conserved protein [Function unknown]. 31608 COG1418: Predicted HD superfamily hydrolase [General function prediction only]. 31609 COG1419: Flagellar GTP-binding protein [Cell motility and secretion]. 31610 COG1420: Transcriptional regulator of heat shock gene [Transcription]. 31611 COG1421: Uncharacterized protein predicted to be involved in DNA repair [DNA replication, recombination, and repair]. 31612 COG1422: Predicted membrane protein [Function unknown]. 31613 COG1423: ATP-dependent DNA ligase, homolog of eukaryotic ligase III [DNA replication, recombination, and repair]. 31614 COG1424: Pimeloyl-CoA synthetase [Coenzyme metabolism]. 31615 COG1426: Uncharacterized protein conserved in bacteria [Function unknown]. 31616 COG1427: Predicted periplasmic solute-binding protein [General function prediction only]. 31617 COG1428: Deoxynucleoside kinases [Nucleotide transport and metabolism]. 31618 COG1429: Cobalamin biosynthesis protein CobN and related Mg-chelatases [Coenzyme metabolism]. 31619 COG1430: Uncharacterized conserved protein [Function unknown]. 31620 COG1431: Uncharacterized protein containing piwi/argonaute domain [Translation, ribosomal structure and biogenesis]. 31621 COG1432: Uncharacterized conserved protein [Function unknown]. 31622 COG1433: Uncharacterized conserved protein [Function unknown]. 31623 COG1434: Uncharacterized conserved protein [Function unknown]. 31624 COG1435: Thymidine kinase [Nucleotide transport and metabolism]. 31625 COG1436: Archaeal/vacuolar-type H+-ATPase subunit F [Energy production and conversion]. 31626 COG1437: Adenylate cyclase, class 2 (thermophilic) [Nucleotide transport and metabolism]. 31627 COG1438: Arginine repressor [Transcription]. 31628 COG1439: Predicted nucleic acid-binding protein, consists of a PIN domain and a Zn-ribbon module [General function prediction only]. 
31629 COG1440: Phosphotransferase system cellobiose-specific component IIB [Carbohydrate transport and metabolism]. 31630 COG1441: O-succinylbenzoate synthase [Coenzyme metabolism]. 31631 COG1442: Lipopolysaccharide biosynthesis proteins, LPS:glycosyltransferases [Cell envelope biogenesis, outer membrane]. 31632 COG1443: Isopentenyldiphosphate isomerase [Lipid metabolism]. 31633 COG1444: Predicted P-loop ATPase fused to an acetyltransferase [General function prediction only]. 31634 COG1445: Phosphotransferase system fructose-specific component IIB [Carbohydrate transport and metabolism]. 31635 COG1446: Asparaginase [Amino acid transport and metabolism]. 31636 COG1447: Phosphotransferase system cellobiose-specific component IIA [Carbohydrate transport and metabolism]. 31637 COG1448: Aspartate/tyrosine/aromatic aminotransferase [Amino acid transport and metabolism]. 31638 COG1449: Alpha-amylase/alpha-mannosidase [Carbohydrate transport and metabolism]. 31639 COG1450: Type II secretory pathway, component PulD [Cell motility and secretion / Intracellular trafficking and secretion]. 31640 COG1451: Predicted metal-dependent hydrolase [General function prediction only]. 31641 COG1452: Organic solvent tolerance protein OstA [Cell envelope biogenesis, outer membrane]. 31642 COG1453: Predicted oxidoreductases of the aldo/keto reductase family [General function prediction only]. 31643 COG1454: Alcohol dehydrogenase, class IV [Energy production and conversion]. 31644 COG1455: Phosphotransferase system cellobiose-specific component IIC [Carbohydrate transport and metabolism]. 31645 COG1456: CO dehydrogenase/acetyl-CoA synthase gamma subunit (corrinoid Fe-S protein) [Energy production and conversion]. 31646 COG1457: Purine-cytosine permease and related proteins [Nucleotide transport and metabolism]. 31647 COG1458: Predicted DNA-binding protein containing PIN domain [General function prediction only]. 
31648 COG1459: Type II secretory pathway, component PulF [Cell motility and secretion / Intracellular trafficking and secretion]. 31649 COG1460: Uncharacterized protein conserved in archaea [Function unknown]. 31650 COG1461: Predicted kinase related to dihydroxyacetone kinase [General function prediction only]. 31651 COG1462: Uncharacterized protein involved in formation of curli polymers [Cell envelope biogenesis, outer membrane]. 31652 COG1463: ABC-type transport system involved in resistance to organic solvents, periplasmic component [Secondary metabolites biosynthesis, transport, and catabolism]. 31653 COG1464: ABC-type metal ion transport system, periplasmic component/surface antigen [Inorganic ion transport and metabolism]. 31654 COG1465: Predicted alternative 3-dehydroquinate synthase [Amino acid transport and metabolism]. 31655 COG1466: DNA polymerase III, delta subunit [DNA replication, recombination, and repair]. 31656 COG1467: Eukaryotic-type DNA primase, catalytic (small) subunit [DNA replication, recombination, and repair]. 31657 COG1468: RecB family exonuclease [DNA replication, recombination, and repair]. 31658 COG1469: Uncharacterized conserved protein [Function unknown]. 31659 COG1470: Predicted membrane protein [Function unknown]. 31660 COG1471: Ribosomal protein S4E [Translation, ribosomal structure and biogenesis]. 31661 COG1472: Beta-glucosidase-related glycosidases [Carbohydrate transport and metabolism]. 31662 COG1473: Metal-dependent amidase/aminoacylase/carboxypeptidase [General function prediction only]. 31663 COG1474: Cdc6-related protein, AAA superfamily ATPase [DNA replication, recombination, and repair / Posttranslational modification, protein turnover, chaperones]. 31664 COG1475: Predicted transcriptional regulators [Transcription]. 31665 COG1476: Predicted transcriptional regulators [Transcription]. 31666 COG1477: Membrane-associated lipoprotein involved in thiamine biosynthesis [Coenzyme metabolism]. 
31667 COG1478: Uncharacterized conserved protein [Function unknown]. 31668 COG1479: Uncharacterized conserved protein [Function unknown]. 31669 COG1480: Predicted membrane-associated HD superfamily hydrolase [General function prediction only]. 31670 COG1481: Uncharacterized protein conserved in bacteria [Function unknown]. 31671 COG1482: Phosphomannose isomerase [Carbohydrate transport and metabolism]. 31672 COG1483: Predicted ATPase (AAA+ superfamily) [General function prediction only]. 31673 COG1484: DNA replication protein [DNA replication, recombination, and repair]. 31674 COG1485: Predicted ATPase [General function prediction only]. 31675 COG1486: Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases [Carbohydrate transport and metabolism]. 31676 COG1487: Predicted nucleic acid-binding protein, contains PIN domain [General function prediction only]. 31677 COG1488: Nicotinic acid phosphoribosyltransferase [Coenzyme metabolism]. 31678 COG1489: DNA-binding protein, stimulates sugar fermentation [General function prediction only]. 31679 COG1490: D-Tyr-tRNAtyr deacylase [Translation, ribosomal structure and biogenesis]. 31680 COG1491: Predicted RNA-binding protein [Translation, ribosomal structure and biogenesis]. 31681 COG1492: Cobyric acid synthase [Coenzyme metabolism]. 31682 COG1493: Serine kinase of the HPr protein, regulates carbohydrate metabolism [Signal transduction mechanisms]. 31683 COG1494: Fructose-1,6-bisphosphatase/sedoheptulose 1,7-bisphosphatase and related proteins [Carbohydrate transport and metabolism]. 31684 COG1495: Disulfide bond formation protein DsbB [Posttranslational modification, protein turnover, chaperones]. 31685 COG1496: Uncharacterized conserved protein [Function unknown]. 31686 COG1497: Predicted transcriptional regulator [Transcription]. 31687 COG1498: Protein implicated in ribosomal biogenesis, Nop56p homolog [Translation, ribosomal structure and biogenesis]. 
31688 COG1499: NMD protein affecting ribosome stability and mRNA decay [Translation, ribosomal structure and biogenesis]. 31689 COG1500: Predicted exosome subunit [Translation, ribosomal structure and biogenesis]. 31690 COG1501: Alpha-glucosidases, family 31 of glycosyl hydrolases [Carbohydrate transport and metabolism]. 31691 COG1502: Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes [Lipid metabolism]. 31692 COG1503: Peptide chain release factor 1 (eRF1) [Translation, ribosomal structure and biogenesis]. 31693 COG1504: Uncharacterized conserved protein [Function unknown]. 31694 COG1505: Serine proteases of the peptidase family S9A [Amino acid transport and metabolism]. 31695 COG1506: Dipeptidyl aminopeptidases/acylaminoacyl-peptidases [Amino acid transport and metabolism]. 31696 COG1507: Uncharacterized conserved protein [Function unknown]. 31697 COG1508: DNA-directed RNA polymerase specialized sigma subunit, sigma54 homolog [Transcription]. 31698 COG1509: Lysine 2,3-aminomutase [Amino acid transport and metabolism]. 31699 COG1510: Predicted transcriptional regulators [Transcription]. 31700 COG1511: Predicted membrane protein [Function unknown]. 31701 COG1512: Beta-propeller domains of methanol dehydrogenase type [General function prediction only]. 31702 COG1513: Cyanate lyase [Inorganic ion transport and metabolism]. 31703 COG1514: 2'-5' RNA ligase [Translation, ribosomal structure and biogenesis]. 31704 COG1515: Deoxyinosine 3'-endonuclease (endonuclease V) [DNA replication, recombination, and repair]. 31705 COG1516: Flagellin-specific chaperone FliS [Cell motility and secretion / Intracellular trafficking and secretion / Posttranslational modification, protein turnover, chaperones]. 31706 COG1517: Uncharacterized protein predicted to be involved in DNA repair [DNA replication, recombination, and repair]. 
31707 COG1518: Uncharacterized protein predicted to be involved in DNA repair [DNA replication, recombination, and repair]. 31708 COG1519: 3-deoxy-D-manno-octulosonic-acid transferase [Cell envelope biogenesis, outer membrane]. 31709 COG1520: FOG: WD40-like repeat [Function unknown]. 31710 COG1521: Putative transcriptional regulator, homolog of Bvg accessory factor [Transcription]. 31711 COG1522: Transcriptional regulators [Transcription]. 31712 COG1523: Type II secretory pathway, pullulanase PulA and related glycosidases [Carbohydrate transport and metabolism]. 31713 COG1524: Uncharacterized proteins of the AP superfamily [General function prediction only]. 31714 COG1525: Micrococcal nuclease (thermonuclease) homologs [DNA replication, recombination, and repair]. 31715 COG1526: Uncharacterized protein required for formate dehydrogenase activity [Energy production and conversion]. 31716 COG1527: Archaeal/vacuolar-type H+-ATPase subunit C [Energy production and conversion]. 31717 COG1528: Ferritin-like protein [Inorganic ion transport and metabolism]. 31718 COG1529: Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs [Energy production and conversion]. 31719 COG1530: Ribonucleases G and E [Translation, ribosomal structure and biogenesis]. 31720 COG1531: Uncharacterized protein conserved in archaea [Function unknown]. 31721 COG1532: Predicted RNA-binding protein [General function prediction only]. 31722 COG1533: DNA repair photolyase [DNA replication, recombination, and repair]. 31723 COG1534: Predicted RNA-binding protein containing KH domain, possibly ribosomal protein [Translation, ribosomal structure and biogenesis]. 31724 COG1535: Isochorismate hydrolase [Secondary metabolites biosynthesis, transport, and catabolism]. 31725 COG1536: Flagellar motor switch protein [Cell motility and secretion]. 31726 COG1537: Predicted RNA-binding proteins [General function prediction only]. 
31727 COG1538: Outer membrane protein [Cell envelope biogenesis, outer membrane / Intracellular trafficking and secretion]. 31728 COG1539: Dihydroneopterin aldolase [Coenzyme metabolism]. 31729 COG1540: Uncharacterized proteins, homologs of lactam utilization protein B [General function prediction only]. 31730 COG1541: Coenzyme F390 synthetase [Coenzyme metabolism]. 31731 COG1542: Uncharacterized conserved protein [Function unknown]. 31732 COG1543: Uncharacterized conserved protein [Function unknown]. 31733 COG1544: Ribosome-associated protein Y (PSrp-1) [Translation, ribosomal structure and biogenesis]. 31734 COG1545: Predicted nucleic-acid-binding protein containing a Zn-ribbon [General function prediction only]. 31735 COG1546: Uncharacterized protein (competence- and mitomycin-induced) [General function prediction only]. 31736 COG1547: Uncharacterized conserved protein [Function unknown]. 31737 COG1548: Predicted transcriptional regulator/sugar kinase [Transcription / Carbohydrate transport and metabolism]. 31738 COG1549: Queuine tRNA-ribosyltransferases, contain PUA domain [Translation, ribosomal structure and biogenesis]. 31739 COG1550: Uncharacterized protein conserved in bacteria [Function unknown]. 31740 COG1551: Carbon storage regulator (could also regulate swarming and quorum sensing) [Signal transduction mechanisms]. 31741 COG1552: Ribosomal protein L40E [Translation, ribosomal structure and biogenesis]. 31742 COG1553: Uncharacterized conserved protein involved in intracellular sulfur reduction [Inorganic ion transport and metabolism]. 31743 COG1554: Trehalose and maltose hydrolases (possible phosphorylases) [Carbohydrate transport and metabolism]. 31744 COG1555: DNA uptake protein and related DNA-binding proteins [DNA replication, recombination, and repair]. 31745 COG1556: Uncharacterized conserved protein [Function unknown]. 31746 COG1558: Flagellar basal body rod protein [Cell motility and secretion]. 
31747 COG1559: Predicted periplasmic solute-binding protein [General function prediction only]. 31748 COG1560: Lauroyl/myristoyl acyltransferase [Cell envelope biogenesis, outer membrane]. 31749 COG1561: Uncharacterized stress-induced protein [Function unknown]. 31750 COG1562: Phytoene/squalene synthetase [Lipid metabolism]. 31751 COG1563: Predicted subunit of the Multisubunit Na+/H+ antiporter [Inorganic ion transport and metabolism]. 31752 COG1564: Thiamine pyrophosphokinase [Coenzyme metabolism]. 31753 COG1565: Uncharacterized conserved protein [Function unknown]. 31754 COG1566: Multidrug resistance efflux pump [Defense mechanisms]. 31755 COG1567: Uncharacterized protein predicted to be involved in DNA repair (RAMP superfamily) [DNA replication, recombination, and repair]. 31756 COG1568: Predicted methyltransferases [General function prediction only]. 31757 COG1569: Predicted nucleic acid-binding protein, contains PIN domain [General function prediction only]. 31758 COG1570: Exonuclease VII, large subunit [DNA replication, recombination, and repair]. 31759 COG1571: Predicted DNA-binding protein containing a Zn-ribbon domain [General function prediction only]. 31760 COG1572: Uncharacterized conserved protein [Function unknown]. 31761 COG1573: Uracil-DNA glycosylase [DNA replication, recombination, and repair]. 31762 COG1574: Predicted metal-dependent hydrolase with the TIM-barrel fold [General function prediction only]. 31763 COG1575: 1,4-dihydroxy-2-naphthoate octaprenyltransferase [Coenzyme metabolism]. 31764 COG1576: Uncharacterized conserved protein [Function unknown]. 31765 COG1577: Mevalonate kinase [Lipid metabolism]. 31766 COG1578: Uncharacterized conserved protein [Function unknown]. 31767 COG1579: Zn-ribbon protein, possibly nucleic acid-binding [General function prediction only]. 31768 COG1580: Flagellar basal body-associated protein [Cell motility and secretion]. 31769 COG1581: Archaeal DNA-binding protein [Transcription]. 
31770 COG1582: Uncharacterized protein, possibly involved in motility [Cell motility and secretion]. 31771 COG1583: Uncharacterized protein predicted to be involved in DNA repair (RAMP superfamily) [DNA replication, recombination, and repair]. 31772 COG1584: Predicted membrane protein [Function unknown]. 31773 COG1585: Membrane protein implicated in regulation of membrane protease activity [Posttranslational modification, protein turnover, chaperones / Intracellular trafficking and secretion]. 31774 COG1586: S-adenosylmethionine decarboxylase [Amino acid transport and metabolism]. 31775 COG1587: Uroporphyrinogen-III synthase [Coenzyme metabolism]. 31776 COG1588: RNase P/RNase MRP subunit p29 [Translation, ribosomal structure and biogenesis]. 31777 COG1589: Cell division septal protein [Cell envelope biogenesis, outer membrane]. 31778 COG1590: Uncharacterized conserved protein [Function unknown]. 31779 COG1591: Holliday junction resolvase - archaeal type [DNA replication, recombination, and repair]. 31780 COG1592: Rubrerythrin [Energy production and conversion]. 31781 COG1593: TRAP-type C4-dicarboxylate transport system, large permease component [Carbohydrate transport and metabolism]. 31782 COG1594: DNA-directed RNA polymerase, subunit M/Transcription elongation factor TFIIS [Transcription]. 31783 COG1595: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog [Transcription]. 31784 COG1596: Periplasmic protein involved in polysaccharide export [Cell envelope biogenesis, outer membrane]. 31785 COG1597: Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase [Lipid metabolism / General function prediction only]. 31786 COG1598: Uncharacterized conserved protein [Function unknown]. 31787 COG1599: Single-stranded DNA-binding replication protein A (RPA), large (70 kD) subunit and related ssDNA-binding proteins [DNA replication, recombination, and repair]. 31788 COG1600: Uncharacterized Fe-S protein [Energy production and conversion]. 
31789 COG1601: Translation initiation factor 2, beta subunit (eIF-2beta)/eIF-5 N-terminal domain [Translation, ribosomal structure and biogenesis]. 31790 COG1602: Uncharacterized conserved protein [Function unknown]. 31791 COG1603: RNase P/RNase MRP subunit p30 [Translation, ribosomal structure and biogenesis]. 31792 COG1604: Uncharacterized protein predicted to be involved in DNA repair (RAMP superfamily) [DNA replication, recombination, and repair]. 31793 COG1605: Chorismate mutase [Amino acid transport and metabolism]. 31794 COG1606: ATP-utilizing enzymes of the PP-loop superfamily [General function prediction only]. 31795 COG1607: Acyl-CoA hydrolase [Lipid metabolism]. 31796 COG1608: Predicted archaeal kinase [General function prediction only]. 31797 COG1609: Transcriptional regulators [Transcription]. 31798 COG1610: Uncharacterized conserved protein [Function unknown]. 31799 COG1611: Predicted Rossmann fold nucleotide-binding protein [General function prediction only]. 31800 COG1612: Uncharacterized protein required for cytochrome oxidase assembly [Posttranslational modification, protein turnover, chaperones]. 31801 COG1613: ABC-type sulfate transport system, periplasmic component [Inorganic ion transport and metabolism]. 31802 COG1614: CO dehydrogenase/acetyl-CoA synthase beta subunit [Energy production and conversion]. 31803 COG1615: Uncharacterized conserved protein [Function unknown]. 31804 COG1617: Uncharacterized conserved protein [Function unknown]. 31805 COG1618: Predicted nucleotide kinase [Nucleotide transport and metabolism]. 31806 COG1619: Uncharacterized proteins, homologs of microcin C7 resistance protein MccF [Defense mechanisms]. 31807 COG1620: L-lactate permease [Energy production and conversion]. 31808 COG1621: Beta-fructosidases (levanase/invertase) [Carbohydrate transport and metabolism]. 31809 COG1622: Heme/copper-type cytochrome/quinol oxidases, subunit 2 [Energy production and conversion]. 
31810 COG1623: Predicted nucleic-acid-binding protein (contains the HHH domain) [General function prediction only]. 31811 COG1624: Uncharacterized conserved protein [Function unknown]. 31812 COG1625: Fe-S oxidoreductase, related to NifB/MoaA family [Energy production and conversion]. 31813 COG1626: Neutral trehalase [Carbohydrate transport and metabolism]. 31814 COG1627: Uncharacterized protein conserved in archaea [Function unknown]. 31815 COG1628: Uncharacterized conserved protein [Function unknown]. 31816 COG1629: Outer membrane receptor proteins, mostly Fe transport [Inorganic ion transport and metabolism]. 31817 COG1630: Uncharacterized protein conserved in archaea [Function unknown]. 31818 COG1631: Ribosomal protein L44E [Translation, ribosomal structure and biogenesis]. 31819 COG1632: Ribosomal protein L15E [Translation, ribosomal structure and biogenesis]. 31820 COG1633: Uncharacterized conserved protein [Function unknown]. 31821 COG1634: Uncharacterized Rossmann fold enzyme [General function prediction only]. 31822 COG1635: Flavoprotein involved in thiazole biosynthesis [Coenzyme metabolism]. 31823 COG1636: Uncharacterized protein conserved in bacteria [Function unknown]. 31824 COG1637: Predicted nuclease of the RecB family [DNA replication, recombination, and repair]. 31825 COG1638: TRAP-type C4-dicarboxylate transport system, periplasmic component [Carbohydrate transport and metabolism]. 31826 COG1639: Predicted signal transduction protein [Signal transduction mechanisms]. 31827 COG1640: 4-alpha-glucanotransferase [Carbohydrate transport and metabolism]. 31828 COG1641: Uncharacterized conserved protein [Function unknown]. 31829 COG1643: HrpA-like helicases [DNA replication, recombination, and repair]. 31830 COG1644: DNA-directed RNA polymerase, subunit N (RpoN/RPB10) [Transcription]. 31831 COG1645: Uncharacterized Zn-finger containing protein [General function prediction only]. 
31832 COG1646: Predicted phosphate-binding enzymes, TIM-barrel fold [General function prediction only]. 31833 COG1647: Esterase/lipase [General function prediction only]. 31834 COG1648: Siroheme synthase (precorrin-2 oxidase/ferrochelatase domain) [Coenzyme metabolism]. 31835 COG1649: Uncharacterized protein conserved in bacteria [Function unknown]. 31836 COG1650: Uncharacterized protein conserved in archaea [Function unknown]. 31837 COG1651: Protein-disulfide isomerase [Posttranslational modification, protein turnover, chaperones]. 31838 COG1652: Uncharacterized protein containing LysM domain [Function unknown]. 31839 COG1653: ABC-type sugar transport system, periplasmic component [Carbohydrate transport and metabolism]. 31840 COG1654: Biotin operon repressor [Transcription]. 31841 COG1655: Uncharacterized protein conserved in bacteria [Function unknown]. 31842 COG1656: Uncharacterized conserved protein [Function unknown]. 31843 COG1657: Squalene cyclase [Lipid metabolism]. 31844 COG1658: Small primase-like proteins (Toprim domain) [DNA replication, recombination, and repair]. 31845 COG1659: Uncharacterized protein, linocin/CFP29 homolog [Function unknown]. 31846 COG1660: Predicted P-loop-containing kinase [General function prediction only]. 31847 COG1661: Predicted DNA-binding protein with PD1-like DNA-binding motif [General function prediction only]. 31848 COG1662: Transposase and inactivated derivatives, IS1 family [DNA replication, recombination, and repair]. 31849 COG1663: Tetraacyldisaccharide-1-P 4'-kinase [Cell envelope biogenesis, outer membrane]. 31850 COG1664: Integral membrane protein CcmA involved in cell shape determination [Cell envelope biogenesis, outer membrane]. 31851 COG1665: Uncharacterized protein conserved in archaea [Function unknown]. 31852 COG1666: Uncharacterized protein conserved in bacteria [Function unknown]. 31853 COG1667: Uncharacterized protein conserved in archaea [Function unknown]. 
31854 COG1668: ABC-type Na+ efflux pump, permease component [Energy production and conversion / Inorganic ion transport and metabolism]. 31855 COG1669: Predicted nucleotidyltransferases [General function prediction only]. 31856 COG1670: Acetyltransferases, including N-acetylases of ribosomal proteins [Translation, ribosomal structure and biogenesis]. 31857 COG1671: Uncharacterized protein conserved in bacteria [Function unknown]. 31858 COG1672: Predicted ATPase (AAA+ superfamily) [General function prediction only]. 31859 COG1673: Uncharacterized protein conserved in archaea [Function unknown]. 31860 COG1674: DNA segregation ATPase FtsK/SpoIIIE and related proteins [Cell division and chromosome partitioning]. 31861 COG1675: Transcription initiation factor IIE, alpha subunit [Transcription]. 31862 COG1676: tRNA splicing endonuclease [Translation, ribosomal structure and biogenesis]. 31863 COG1677: Flagellar hook-basal body protein [Cell motility and secretion / Intracellular trafficking and secretion]. 31864 COG1678: Putative transcriptional regulator [Transcription]. 31865 COG1679: Uncharacterized conserved protein [Function unknown]. 31866 COG1680: Beta-lactamase class C and other penicillin binding proteins [Defense mechanisms]. 31867 COG1681: Archaeal flagellins [Cell motility and secretion]. 31868 COG1682: ABC-type polysaccharide/polyol phosphate export systems, permease component [Carbohydrate transport and metabolism / Cell envelope biogenesis, outer membrane]. 31869 COG1683: Uncharacterized conserved protein [Function unknown]. 31870 COG1684: Flagellar biosynthesis pathway, component FliR [Cell motility and secretion / Intracellular trafficking and secretion]. 31871 COG1685: Archaeal shikimate kinase [Amino acid transport and metabolism / Coenzyme metabolism]. 31872 COG1686: D-alanyl-D-alanine carboxypeptidase [Cell envelope biogenesis, outer membrane]. 
31873 COG1687: Predicted branched-chain amino acid permeases (azaleucine resistance) [Amino acid transport and metabolism]. 31874 COG1688: Uncharacterized protein predicted to be involved in DNA repair (RAMP superfamily) [DNA replication, recombination, and repair]. 31875 COG1689: Uncharacterized protein conserved in archaea [Function unknown]. 31876 COG1690: Uncharacterized conserved protein [Function unknown]. 31877 COG1691: NCAIR mutase (PurE)-related proteins [General function prediction only]. 31878 COG1692: Uncharacterized protein conserved in bacteria [Function unknown]. 31879 COG1693: Uncharacterized protein conserved in archaea [Function unknown]. 31880 COG1694: Predicted pyrophosphatase [General function prediction only]. 31881 COG1695: Predicted transcriptional regulators [Transcription]. 31882 COG1696: Predicted membrane protein involved in D-alanine export [Cell envelope biogenesis, outer membrane]. 31883 COG1697: DNA topoisomerase VI, subunit A [DNA replication, recombination, and repair]. 31884 COG1698: Uncharacterized protein conserved in archaea [Function unknown]. 31885 COG1699: Uncharacterized protein conserved in bacteria [Function unknown]. 31886 COG1700: Uncharacterized conserved protein [Function unknown]. 31887 COG1701: Uncharacterized protein conserved in archaea [Function unknown]. 31888 COG1702: Phosphate starvation-inducible protein PhoH, predicted ATPase [Signal transduction mechanisms]. 31889 COG1703: Putative periplasmic protein kinase ArgK and related GTPases of G3E family [Amino acid transport and metabolism]. 31890 COG1704: Uncharacterized conserved protein [Function unknown]. 31891 COG1705: Muramidase (flagellum-specific) [Cell motility and secretion / Intracellular trafficking and secretion]. 31892 COG1706: Flagellar basal-body P-ring protein [Cell motility and secretion]. 31893 COG1707: ACT domain-containing protein [General function prediction only]. 
31894 COG1708: Predicted nucleotidyltransferases [General function prediction only]. 31895 COG1709: Predicted transcriptional regulator [Transcription]. 31896 COG1710: Uncharacterized protein conserved in archaea [Function unknown]. 31897 COG1711: Uncharacterized protein conserved in archaea [Function unknown]. 31898 COG1712: Predicted dinucleotide-utilizing enzyme [General function prediction only]. 31899 COG1713: Predicted HD superfamily hydrolase involved in NAD metabolism [Coenzyme metabolism]. 31900 COG1714: Predicted membrane protein/domain [Function unknown]. 31901 COG1715: Restriction endonuclease [Defense mechanisms]. 31902 COG1716: FOG: FHA domain [Signal transduction mechanisms]. 31903 COG1717: Ribosomal protein L32E [Translation, ribosomal structure and biogenesis]. 31904 COG1718: Serine/threonine protein kinase involved in cell cycle control [Signal transduction mechanisms / Cell division and chromosome partitioning]. 31905 COG1719: Predicted hydrocarbon binding protein (contains V4R domain) [General function prediction only]. 31906 COG1720: Uncharacterized conserved protein [Function unknown]. 31907 COG1721: Uncharacterized conserved protein (some members contain a von Willebrand factor type A (vWA) domain) [General function prediction only]. 31908 COG1722: Exonuclease VII small subunit [DNA replication, recombination, and repair]. 31909 COG1723: Uncharacterized conserved protein [Function unknown]. 31910 COG1724: Predicted periplasmic or secreted lipoprotein [Cell motility and secretion]. 31911 COG1725: Predicted transcriptional regulators [Transcription]. 31912 COG1726: Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrA [Energy production and conversion]. 31913 COG1727: Ribosomal protein L18E [Translation, ribosomal structure and biogenesis]. 31914 COG1728: Uncharacterized protein conserved in bacteria [Function unknown]. 31915 COG1729: Uncharacterized protein conserved in bacteria [Function unknown]. 
31916 COG1730: Predicted prefoldin, molecular chaperone implicated in de novo protein folding [Posttranslational modification, protein turnover, chaperones]. 31917 COG1731: Archaeal riboflavin synthase [Coenzyme metabolism]. 31918 COG1732: Periplasmic glycine betaine/choline-binding (lipo)protein of an ABC-type transport system (osmoprotectant binding protein) [Cell envelope biogenesis, outer membrane]. 31919 COG1733: Predicted transcriptional regulators [Transcription]. 31920 COG1734: DnaK suppressor protein [Signal transduction mechanisms]. 31921 COG1735: Predicted metal-dependent hydrolase with the TIM-barrel fold [General function prediction only]. 31922 COG1736: Diphthamide synthase subunit DPH2 [Translation, ribosomal structure and biogenesis]. 31923 COG1737: Transcriptional regulators [Transcription]. 31924 COG1738: Uncharacterized conserved protein [Function unknown]. 31925 COG1739: Uncharacterized conserved protein [Function unknown]. 31926 COG1740: Ni,Fe-hydrogenase I small subunit [Energy production and conversion]. 31927 COG1741: Pirin-related protein [General function prediction only]. 31928 COG1742: Uncharacterized conserved protein [Function unknown]. 31929 COG1743: Adenine-specific DNA methylase containing a Zn-ribbon [DNA replication, recombination, and repair]. 31930 COG1744: Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein [General function prediction only]. 31931 COG1745: Predicted metal-binding protein [General function prediction only]. 31932 COG1746: tRNA nucleotidyltransferase (CCA-adding enzyme) [Translation, ribosomal structure and biogenesis]. 31933 COG1747: Uncharacterized N-terminal domain of the transcription elongation factor GreA [Function unknown]. 31934 COG1748: Saccharopine dehydrogenase and related proteins [Amino acid transport and metabolism]. 31935 COG1749: Flagellar hook protein FlgE [Cell motility and secretion]. 31936 COG1750: Archaeal serine proteases [General function prediction only]. 
31937 COG1751: Uncharacterized conserved protein [Function unknown]. 31938 COG1752: Predicted esterase of the alpha-beta hydrolase superfamily [General function prediction only]. 31939 COG1753: Uncharacterized conserved protein [Function unknown]. 31940 COG1754: Uncharacterized C-terminal domain of topoisomerase IA [General function prediction only]. 31941 COG1755: Uncharacterized protein conserved in bacteria [Function unknown]. 31942 COG1756: Uncharacterized conserved protein [Function unknown]. 31943 COG1757: Na+/H+ antiporter [Energy production and conversion]. 31944 COG1758: DNA-directed RNA polymerase, subunit K/omega [Transcription]. 31945 COG1759: ATP-utilizing enzymes of ATP-grasp superfamily (probably carboligases) [General function prediction only]. 31946 COG1760: L-serine deaminase [Amino acid transport and metabolism]. 31947 COG1761: DNA-directed RNA polymerase, subunit L [Transcription]. 31948 COG1762: Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) [Carbohydrate transport and metabolism / Signal transduction mechanisms]. 31949 COG1763: Molybdopterin-guanine dinucleotide biosynthesis protein [Coenzyme metabolism]. 31950 COG1764: Predicted redox protein, regulator of disulfide bond formation [Posttranslational modification, protein turnover, chaperones]. 31951 COG1765: Predicted redox protein, regulator of disulfide bond formation [Posttranslational modification, protein turnover, chaperones]. 31952 COG1766: Flagellar biosynthesis/type III secretory pathway lipoprotein [Cell motility and secretion / Intracellular trafficking and secretion]. 31953 COG1767: Triphosphoribosyl-dephospho-CoA synthetase [Coenzyme metabolism]. 31954 COG1768: Predicted phosphohydrolase [General function prediction only]. 31955 COG1769: Uncharacterized protein predicted to be involved in DNA repair (RAMP superfamily) [DNA replication, recombination, and repair]. 31956 COG1770: Protease II [Amino acid transport and metabolism]. 
31957 COG1771: Uncharacterized protein conserved in archaea [Function unknown]. 31958 COG1772: Uncharacterized protein conserved in archaea [Function unknown]. 31959 COG1773: Rubredoxin [Energy production and conversion]. 31960 COG1774: Uncharacterized homolog of PSP1 [Function unknown]. 31961 COG1775: Benzoyl-CoA reductase/2-hydroxyglutaryl-CoA dehydratase subunit, BcrC/BadD/HgdB [Amino acid transport and metabolism]. 31962 COG1776: Chemotaxis protein CheC, inhibitor of MCP methylation [Cell motility and secretion / Signal transduction mechanisms]. 31963 COG1777: Predicted transcriptional regulators [Transcription]. 31964 COG1778: Low specificity phosphatase (HAD superfamily) [General function prediction only]. 31965 COG1779: C4-type Zn-finger protein [General function prediction only]. 31966 COG1780: Protein involved in ribonucleotide reduction [Nucleotide transport and metabolism]. 31967 COG1781: Aspartate carbamoyltransferase, regulatory subunit [Nucleotide transport and metabolism]. 31968 COG1782: Predicted metal-dependent RNase, consists of a metallo-beta-lactamase domain and an RNA-binding KH domain [General function prediction only]. 31969 COG1783: Phage terminase large subunit [General function prediction only]. 31970 COG1784: Predicted membrane protein [Function unknown]. 31971 COG1785: Alkaline phosphatase [Inorganic ion transport and metabolism]. 31972 COG1786: Uncharacterized conserved protein [Function unknown]. 31973 COG1787: Predicted endonuclease distantly related to archaeal Holliday junction resolvase and Mrr-like restriction enzymes [Defense mechanisms]. 31974 COG1788: Acyl CoA:acetate/3-ketoacid CoA transferase, alpha subunit [Lipid metabolism]. 31975 COG1790: Uncharacterized protein conserved in archaea [Function unknown]. 31976 COG1791: Uncharacterized conserved protein, contains double-stranded beta-helix domain [Function unknown]. 31977 COG1792: Cell shape-determining protein [Cell envelope biogenesis, outer membrane]. 
31978 COG1793: ATP-dependent DNA ligase [DNA replication, recombination, and repair]. 31979 COG1794: Aspartate racemase [Cell envelope biogenesis, outer membrane]. 31980 COG1795: Uncharacterized conserved protein [Function unknown]. 31981 COG1796: DNA polymerase IV (family X) [DNA replication, recombination, and repair]. 31982 COG1797: Cobyrinic acid a,c-diamide synthase [Coenzyme metabolism]. 31983 COG1798: Diphthamide biosynthesis methyltransferase [Translation, ribosomal structure and biogenesis]. 31984 COG1799: Uncharacterized protein conserved in bacteria [Function unknown]. 31985 COG1800: Predicted transglutaminase-like proteases [General function prediction only]. 31986 COG1801: Uncharacterized conserved protein [Function unknown]. 31987 COG1802: Transcriptional regulators [Transcription]. 31988 COG1803: Methylglyoxal synthase [Carbohydrate transport and metabolism]. 31989 COG1804: Predicted acyl-CoA transferases/carnitine dehydratase [Energy production and conversion]. 31990 COG1805: Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrB [Energy production and conversion]. 31991 COG1806: Uncharacterized protein conserved in bacteria [Function unknown]. 31992 COG1807: 4-amino-4-deoxy-L-arabinose transferase and related glycosyltransferases of PMT family [Cell envelope biogenesis, outer membrane]. 31993 COG1808: Predicted membrane protein [Function unknown]. 31994 COG1809: Uncharacterized conserved protein [Function unknown]. 31995 COG1810: Uncharacterized protein conserved in archaea [Function unknown]. 31996 COG1811: Uncharacterized membrane protein, possible Na+ channel or pump [General function prediction only]. 31997 COG1812: Archaeal S-adenosylmethionine synthetase [Amino acid transport and metabolism]. 31998 COG1813: Predicted transcription factor, homolog of eukaryotic MBF1 [Transcription]. 31999 COG1814: Uncharacterized membrane protein [Function unknown]. 32000 COG1815: Flagellar basal body protein [Cell motility and secretion]. 
32001 COG1816: Adenosine deaminase [Nucleotide transport and metabolism]. 32002 COG1817: Uncharacterized protein conserved in archaea [Function unknown]. 32003 COG1818: Predicted RNA-binding protein, contains THUMP domain [General function prediction only]. 32004 COG1819: Glycosyl transferases, related to UDP-glucuronosyltransferase [Carbohydrate transport and metabolism / Signal transduction mechanisms]. 32005 COG1820: N-acetylglucosamine-6-phosphate deacetylase [Carbohydrate transport and metabolism]. 32006 COG1821: Predicted ATP-utilizing enzyme (ATP-grasp superfamily) [General function prediction only]. 32007 COG1822: Predicted archaeal membrane protein [Function unknown]. 32008 COG1823: Predicted Na+/dicarboxylate symporter [General function prediction only]. 32009 COG1824: Permease, similar to cation transporters [Inorganic ion transport and metabolism]. 32010 COG1825: Ribosomal protein L25 (general stress protein Ctc) [Translation, ribosomal structure and biogenesis]. 32011 COG1826: Sec-independent protein secretion pathway components [Intracellular trafficking and secretion]. 32012 COG1827: Predicted small molecule binding protein (contains 3H domain) [General function prediction only]. 32013 COG1828: Phosphoribosylformylglycinamidine (FGAM) synthase, PurS component [Nucleotide transport and metabolism]. 32014 COG1829: Predicted archaeal kinase (sugar kinase superfamily) [General function prediction only]. 32015 COG1830: DhnA-type fructose-1,6-bisphosphate aldolase and related enzymes [Carbohydrate transport and metabolism]. 32016 COG1831: Predicted metal-dependent hydrolase (urease superfamily) [General function prediction only]. 32017 COG1832: Predicted CoA-binding protein [General function prediction only]. 32018 COG1833: Uncharacterized conserved protein [Function unknown]. 32019 COG1834: N-Dimethylarginine dimethylaminohydrolase [Amino acid transport and metabolism]. 32020 COG1835: Predicted acyltransferases [Lipid metabolism]. 
32021 COG1836: Predicted membrane protein [Function unknown]. 32022 COG1837: Predicted RNA-binding protein (contains KH domain) [General function prediction only]. 32023 COG1838: Tartrate dehydratase beta subunit/Fumarate hydratase class I, C-terminal domain [Energy production and conversion]. 32024 COG1839: Uncharacterized conserved protein [Function unknown]. 32025 COG1840: ABC-type Fe3+ transport system, periplasmic component [Inorganic ion transport and metabolism]. 32026 COG1841: Ribosomal protein L30/L7E [Translation, ribosomal structure and biogenesis]. 32027 COG1842: Phage shock protein A (IM30), suppresses sigma54-dependent transcription [Transcription / Signal transduction mechanisms]. 32028 COG1843: Flagellar hook capping protein [Cell motility and secretion]. 32029 COG1844: Uncharacterized protein conserved in archaea [Function unknown]. 32030 COG1845: Heme/copper-type cytochrome/quinol oxidase, subunit 3 [Energy production and conversion]. 32031 COG1846: Transcriptional regulators [Transcription]. 32032 COG1847: Predicted RNA-binding protein [General function prediction only]. 32033 COG1848: Predicted nucleic acid-binding protein, contains PIN domain [General function prediction only]. 32034 COG1849: Uncharacterized protein conserved in archaea [Function unknown]. 32035 COG1850: Ribulose 1,5-bisphosphate carboxylase, large subunit [Carbohydrate transport and metabolism]. 32036 COG1851: Uncharacterized conserved protein [Function unknown]. 32037 COG1852: Uncharacterized conserved protein [Function unknown]. 32038 COG1853: Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family [General function prediction only]. 32039 COG1854: LuxS protein involved in autoinducer AI2 synthesis [Signal transduction mechanisms]. 32040 COG1855: ATPase (PilT family) [General function prediction only]. 32041 COG1856: Uncharacterized homolog of biotin synthetase [Function unknown]. 
32042 COG1857: Uncharacterized protein predicted to be involved in DNA repair [DNA replication, recombination, and repair]. 32043 COG1858: Cytochrome c peroxidase [Inorganic ion transport and metabolism]. 32044 COG1859: RNA:NAD 2'-phosphotransferase [Translation, ribosomal structure and biogenesis]. 32045 COG1860: Uncharacterized protein conserved in archaea [Function unknown]. 32046 COG1861: Spore coat polysaccharide biosynthesis protein F, CMP-KDO synthetase homolog [Cell envelope biogenesis, outer membrane]. 32047 COG1862: Preprotein translocase subunit YajC [Intracellular trafficking and secretion]. 32048 COG1863: Multisubunit Na+/H+ antiporter, MnhE subunit [Inorganic ion transport and metabolism]. 32049 COG1864: DNA/RNA endonuclease G, NUC1 [Nucleotide transport and metabolism]. 32050 COG1865: Uncharacterized conserved protein [Function unknown]. 32051 COG1866: Phosphoenolpyruvate carboxykinase (ATP) [Energy production and conversion]. 32052 COG1867: N2,N2-dimethylguanosine tRNA methyltransferase [Translation, ribosomal structure and biogenesis]. 32053 COG1868: Flagellar motor switch protein [Cell motility and secretion]. 32054 COG1869: ABC-type ribose transport system, auxiliary component [Carbohydrate transport and metabolism]. 32055 COG1871: Chemotaxis protein; stimulates methylation of MCP proteins [Cell motility and secretion / Signal transduction mechanisms]. 32056 COG1872: Uncharacterized conserved protein [Function unknown]. 32057 COG1873: Uncharacterized conserved protein [Function unknown]. 32058 COG1874: Beta-galactosidase [Carbohydrate transport and metabolism]. 32059 COG1875: Predicted ATPase related to phosphate starvation-inducible protein PhoH [Signal transduction mechanisms]. 32060 COG1876: D-alanyl-D-alanine carboxypeptidase [Cell envelope biogenesis, outer membrane]. 32061 COG1877: Trehalose-6-phosphatase [Carbohydrate transport and metabolism]. 32062 COG1878: Predicted metal-dependent hydrolase [General function prediction only]. 
32063 COG1879: ABC-type sugar transport system, periplasmic component [Carbohydrate transport and metabolism]. 32064 COG1880: CO dehydrogenase/acetyl-CoA synthase epsilon subunit [Energy production and conversion]. 32065 COG1881: Phospholipid-binding protein [General function prediction only]. 32066 COG1882: Pyruvate-formate lyase [Energy production and conversion]. 32067 COG1883: Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit [Energy production and conversion]. 32068 COG1884: Methylmalonyl-CoA mutase, N-terminal domain/subunit [Lipid metabolism]. 32069 COG1885: Uncharacterized protein conserved in archaea [Function unknown]. 32070 COG1886: Flagellar motor switch/type III secretory pathway protein [Cell motility and secretion / Intracellular trafficking and secretion]. 32071 COG1887: Putative glycosyl/glycerophosphate transferases involved in teichoic acid biosynthesis TagF/TagB/EpsJ/RodC [Cell envelope biogenesis, outer membrane]. 32072 COG1888: Uncharacterized protein conserved in archaea [Function unknown]. 32073 COG1889: Fibrillarin-like rRNA methylase [Translation, ribosomal structure and biogenesis]. 32074 COG1890: Ribosomal protein S3AE [Translation, ribosomal structure and biogenesis]. 32075 COG1891: Uncharacterized protein conserved in archaea [Function unknown]. 32076 COG1892: Uncharacterized protein conserved in archaea [Function unknown]. 32077 COG1893: Ketopantoate reductase [Coenzyme metabolism]. 32078 COG1894: NADH:ubiquinone oxidoreductase, NADH-binding (51 kD) subunit [Energy production and conversion]. 32079 COG1895: Uncharacterized conserved protein related to C-terminal domain of eukaryotic chaperone, SACSIN [Function unknown]. 32080 COG1896: Predicted hydrolases of HD superfamily [General function prediction only]. 32081 COG1897: Homoserine trans-succinylase [Amino acid transport and metabolism]. 32082 COG1898: dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes [Cell envelope biogenesis, outer membrane]. 
32083 COG1899: Deoxyhypusine synthase [Posttranslational modification, protein turnover, chaperones]. 32084 COG1900: Uncharacterized conserved protein [Function unknown]. 32085 COG1901: Uncharacterized conserved protein [Function unknown]. 32086 COG1902: NADH:flavin oxidoreductases, Old Yellow Enzyme family [Energy production and conversion]. 32087 COG1903: Cobalamin biosynthesis protein CbiD [Coenzyme metabolism]. 32088 COG1904: Glucuronate isomerase [Carbohydrate transport and metabolism]. 32089 COG1905: NADH:ubiquinone oxidoreductase 24 kD subunit [Energy production and conversion]. 32090 COG1906: Uncharacterized conserved protein [Function unknown]. 32091 COG1907: Predicted archaeal sugar kinases [General function prediction only]. 32092 COG1908: Coenzyme F420-reducing hydrogenase, delta subunit [Energy production and conversion]. 32093 COG1909: Uncharacterized protein conserved in archaea [Function unknown]. 32094 COG1910: Periplasmic molybdate-binding protein/domain [Inorganic ion transport and metabolism]. 32095 COG1911: Ribosomal protein L30E [Translation, ribosomal structure and biogenesis]. 32096 COG1912: Uncharacterized conserved protein [Function unknown]. 32097 COG1913: Predicted Zn-dependent proteases [General function prediction only]. 32098 COG1914: Mn2+ and Fe2+ transporters of the NRAMP family [Inorganic ion transport and metabolism]. 32099 COG1915: Uncharacterized conserved protein [Function unknown]. 32100 COG1916: Uncharacterized homolog of PrgY (pheromone shutdown protein) [Function unknown]. 32101 COG1917: Uncharacterized conserved protein, contains double-stranded beta-helix domain [Function unknown]. 32102 COG1918: Fe2+ transport system protein A [Inorganic ion transport and metabolism]. 32103 COG1920: Uncharacterized conserved protein [Function unknown]. 32104 COG1921: Selenocysteine synthase [seryl-tRNASer selenium transferase] [Amino acid transport and metabolism]. 
32105 COG1922: Teichoic acid biosynthesis proteins [Cell envelope biogenesis, outer membrane]. 32106 COG1923: Uncharacterized host factor I protein [General function prediction only]. 32107 COG1924: Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) [Lipid metabolism]. 32108 COG1925: Phosphotransferase system, HPr-related proteins [Carbohydrate transport and metabolism]. 32109 COG1926: Predicted phosphoribosyltransferases [General function prediction only]. 32110 COG1927: Coenzyme F420-dependent N(5),N(10)-methenyltetrahydromethanopterin dehydrogenase [Energy production and conversion]. 32111 COG1928: Dolichyl-phosphate-mannose--protein O-mannosyl transferase [Posttranslational modification, protein turnover, chaperones]. 32112 COG1929: Glycerate kinase [Carbohydrate transport and metabolism]. 32113 COG1930: ABC-type cobalt transport system, periplasmic component [Inorganic ion transport and metabolism]. 32114 COG1931: Uncharacterized protein conserved in archaea [Function unknown]. 32115 COG1932: Phosphoserine aminotransferase [Coenzyme metabolism / Amino acid transport and metabolism]. 32116 COG1933: Archaeal DNA polymerase II, large subunit [DNA replication, recombination, and repair]. 32117 COG1934: Uncharacterized protein conserved in bacteria [Function unknown]. 32118 COG1935: Uncharacterized conserved protein [Function unknown]. 32119 COG1936: Predicted nucleotide kinase (related to CMP and AMP kinases) [Nucleotide transport and metabolism]. 32120 COG1937: Uncharacterized protein conserved in bacteria [Function unknown]. 32121 COG1938: Archaeal enzymes of ATP-grasp superfamily [General function prediction only]. 32122 COG1939: Uncharacterized protein conserved in bacteria [Function unknown]. 32123 COG1940: Transcriptional regulator/sugar kinase [Transcription / Carbohydrate transport and metabolism]. 32124 COG1941: Coenzyme F420-reducing hydrogenase, gamma subunit [Energy production and conversion]. 
32125 COG1942: Uncharacterized protein, 4-oxalocrotonate tautomerase homolog [General function prediction only]. 32126 COG1943: Transposase and inactivated derivatives [DNA replication, recombination, and repair]. 32127 COG1944: Uncharacterized conserved protein [Function unknown]. 32128 COG1945: Uncharacterized conserved protein [Function unknown]. 32129 COG1946: Acyl-CoA thioesterase [Lipid metabolism]. 32130 COG1947: 4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate synthase [Lipid metabolism]. 32131 COG1948: ERCC4-type nuclease [DNA replication, recombination, and repair]. 32132 COG1949: Oligoribonuclease (3'->5' exoribonuclease) [RNA processing and modification]. 32133 COG1950: Predicted membrane protein [Function unknown]. 32134 COG1951: Tartrate dehydratase alpha subunit/Fumarate hydratase class I, N-terminal domain [Energy production and conversion]. 32135 COG1952: Preprotein translocase subunit SecB [Intracellular trafficking and secretion]. 32136 COG1953: Cytosine/uracil/thiamine/allantoin permeases [Nucleotide transport and metabolism / Coenzyme metabolism]. 32137 COG1954: Glycerol-3-phosphate responsive antiterminator (mRNA-binding) [Transcription]. 32138 COG1955: Archaeal flagella assembly protein J [Cell motility and secretion / Intracellular trafficking and secretion]. 32139 COG1956: GAF domain-containing protein [Signal transduction mechanisms]. 32140 COG1957: Inosine-uridine nucleoside N-ribohydrolase [Nucleotide transport and metabolism]. 32141 COG1958: Small nuclear ribonucleoprotein (snRNP) homolog [Transcription]. 32142 COG1959: Predicted transcriptional regulator [Transcription]. 32143 COG1960: Acyl-CoA dehydrogenases [Lipid metabolism]. 32144 COG1961: Site-specific recombinases, DNA invertase Pin homologs [DNA replication, recombination, and repair]. 32145 COG1962: Tetrahydromethanopterin S-methyltransferase, subunit H [Coenzyme metabolism]. 32146 COG1963: Uncharacterized protein conserved in bacteria [Function unknown]. 
32148 COG1965: Protein implicated in iron transport, frataxin homolog [Inorganic ion transport and metabolism]. 32149 COG1966: Carbon starvation protein, predicted membrane protein [Signal transduction mechanisms]. 32150 COG1967: Predicted membrane protein [Function unknown]. 32151 COG1968: Uncharacterized bacitracin resistance protein [Defense mechanisms]. 32152 COG1969: Ni,Fe-hydrogenase I cytochrome b subunit [Energy production and conversion]. 32153 COG1970: Large-conductance mechanosensitive channel [Cell envelope biogenesis, outer membrane]. 32154 COG1971: Predicted membrane protein [Function unknown]. 32155 COG1972: Nucleoside permease [Nucleotide transport and metabolism]. 32156 COG1973: Hydrogenase maturation factor [Posttranslational modification, protein turnover, chaperones]. 32157 COG1974: SOS-response transcriptional repressors (RecA-mediated autopeptidases) [Transcription / Signal transduction mechanisms]. 32158 COG1975: Xanthine and CO dehydrogenases maturation factor, XdhC/CoxF family [Posttranslational modification, protein turnover, chaperones]. 32159 COG1976: Translation initiation factor 6 (eIF-6) [Translation, ribosomal structure and biogenesis]. 32160 COG1977: Molybdopterin converting factor, small subunit [Coenzyme metabolism]. 32161 COG1978: Uncharacterized protein conserved in bacteria [Function unknown]. 32162 COG1979: Uncharacterized oxidoreductases, Fe-dependent alcohol dehydrogenase family [Energy production and conversion]. 32163 COG1980: Archaeal fructose 1,6-bisphosphatase [Carbohydrate transport and metabolism]. 32164 COG1981: Predicted membrane protein [Function unknown]. 32165 COG1982: Arginine/lysine/ornithine decarboxylases [Amino acid transport and metabolism]. 32166 COG1983: Putative stress-responsive transcriptional regulator [Transcription / Signal transduction mechanisms]. 32167 COG1984: Allophanate hydrolase subunit 2 [Amino acid transport and metabolism]. 
32168 COG1985: Pyrimidine reductase, riboflavin biosynthesis [Coenzyme metabolism]. 32169 COG1986: Uncharacterized conserved protein [Function unknown]. 32170 COG1987: Flagellar biosynthesis pathway, component FliQ [Cell motility and secretion / Intracellular trafficking and secretion]. 32171 COG1988: Predicted membrane-bound metal-dependent hydrolases [General function prediction only]. 32172 COG1989: Type II secretory pathway, prepilin signal peptidase PulO and related peptidases [Cell motility and secretion / Posttranslational modification, protein turnover, chaperones / Intracellular trafficking and secretion]. 32173 COG1990: Uncharacterized conserved protein [Function unknown]. 32174 COG1991: Uncharacterized conserved protein [Function unknown]. 32175 COG1992: Uncharacterized conserved protein [Function unknown]. 32176 COG1993: Uncharacterized conserved protein [Function unknown]. 32177 COG1994: Zn-dependent proteases [General function prediction only]. 32178 COG1995: Pyridoxal phosphate biosynthesis protein [Coenzyme metabolism]. 32179 COG1996: DNA-directed RNA polymerase, subunit RPC10 (contains C4-type Zn-finger) [Transcription]. 32180 COG1997: Ribosomal protein L37AE/L43A [Translation, ribosomal structure and biogenesis]. 32181 COG1998: Ribosomal protein S27AE [Translation, ribosomal structure and biogenesis]. 32182 COG1999: Uncharacterized protein SCO1/SenC/PrrC, involved in biogenesis of respiratory and photosynthetic systems [General function prediction only]. 32183 COG2000: Predicted Fe-S protein [General function prediction only]. 32184 COG2001: Uncharacterized protein conserved in bacteria [Function unknown]. 32185 COG2002: Regulators of stationary/sporulation gene expression [Transcription]. 32186 COG2003: DNA repair proteins [DNA replication, recombination, and repair]. 32187 COG2004: Ribosomal protein S24E [Translation, ribosomal structure and biogenesis]. 
32188 COG2005: N-terminal domain of molybdenum-binding protein [General function prediction only]. 32189 COG2006: Uncharacterized conserved protein [Function unknown]. 32190 COG2007: Ribosomal protein S8E [Translation, ribosomal structure and biogenesis]. 32191 COG2008: Threonine aldolase [Amino acid transport and metabolism]. 32192 COG2009: Succinate dehydrogenase/fumarate reductase, cytochrome b subunit [Energy production and conversion]. 32193 COG2010: Cytochrome c, mono- and diheme variants [Energy production and conversion]. 32194 COG2011: ABC-type metal ion transport system, permease component [Inorganic ion transport and metabolism]. 32195 COG2012: DNA-directed RNA polymerase, subunit H, RpoH/RPB5 [Transcription]. 32196 COG2013: Uncharacterized conserved protein [Function unknown]. 32197 COG2014: Uncharacterized conserved protein [Function unknown]. 32198 COG2015: Alkyl sulfatase and related hydrolases [Secondary metabolites biosynthesis, transport, and catabolism]. 32199 COG2016: Predicted RNA-binding protein (contains PUA domain) [Translation, ribosomal structure and biogenesis]. 32200 COG2017: Galactose mutarotase and related enzymes [Carbohydrate transport and metabolism]. 32201 COG2018: Uncharacterized distant relative of homeotic protein bithoraxoid [General function prediction only]. 32202 COG2019: Archaeal adenylate kinase [Nucleotide transport and metabolism]. 32203 COG2020: Putative protein-S-isoprenylcysteine methyltransferase [Posttranslational modification, protein turnover, chaperones]. 32204 COG2021: Homoserine acetyltransferase [Amino acid transport and metabolism]. 32205 COG2022: Uncharacterized enzyme of thiazole biosynthesis [Nucleotide transport and metabolism]. 32206 COG2023: RNase P subunit RPR2 [Translation, ribosomal structure and biogenesis]. 32207 COG2024: Phenylalanyl-tRNA synthetase alpha subunit (archaeal type) [Translation, ribosomal structure and biogenesis]. 
32208 COG2025: Electron transfer flavoprotein, alpha subunit [Energy production and conversion]. 32209 COG2026: Cytotoxic translational repressor of toxin-antitoxin stability system [Translation, ribosomal structure and biogenesis / Cell division and chromosome partitioning]. 32210 COG2027: D-alanyl-D-alanine carboxypeptidase (penicillin-binding protein 4) [Cell envelope biogenesis, outer membrane]. 32211 COG2028: Uncharacterized conserved protein [Function unknown]. 32212 COG2029: Uncharacterized conserved protein [Function unknown]. 32213 COG2030: Acyl dehydratase [Lipid metabolism]. 32214 COG2031: Short chain fatty acids transporter [Lipid metabolism]. 32215 COG2032: Cu/Zn superoxide dismutase [Inorganic ion transport and metabolism]. 32216 COG2033: Desulfoferrodoxin [Energy production and conversion]. 32217 COG2034: Predicted membrane protein [Function unknown]. 32218 COG2035: Predicted membrane protein [Function unknown]. 32219 COG2036: Histones H3 and H4 [Chromatin structure and dynamics]. 32220 COG2037: Formylmethanofuran:tetrahydromethanopterin formyltransferase [Energy production and conversion]. 32221 COG2038: NaMN:DMB phosphoribosyltransferase [Coenzyme metabolism]. 32222 COG2039: Pyrrolidone-carboxylate peptidase (N-terminal pyroglutamyl peptidase) [Posttranslational modification, protein turnover, chaperones]. 32223 COG2040: Homocysteine/selenocysteine methylase (S-methylmethionine-dependent) [Amino acid transport and metabolism]. 32224 COG2041: Sulfite oxidase and related enzymes [General function prediction only]. 32225 COG2042: Uncharacterized conserved protein [Function unknown]. 32226 COG2043: Uncharacterized protein conserved in archaea [Function unknown]. 32227 COG2044: Predicted peroxiredoxins [General function prediction only]. 32228 COG2045: Phosphosulfolactate phosphohydrolase and related enzymes [Coenzyme metabolism / General function prediction only]. 
32229 COG2046: ATP sulfurylase (sulfate adenylyltransferase) [Inorganic ion transport and metabolism]. 32230 COG2047: Uncharacterized protein (ATP-grasp superfamily) [General function prediction only]. 32231 COG2048: Heterodisulfide reductase, subunit B [Energy production and conversion]. 32232 COG2049: Allophanate hydrolase subunit 1 [Amino acid transport and metabolism]. 32233 COG2050: Uncharacterized protein, possibly involved in aromatic compounds catabolism [Secondary metabolites biosynthesis, transport, and catabolism]. 32234 COG2051: Ribosomal protein S27E [Translation, ribosomal structure and biogenesis]. 32235 COG2052: Uncharacterized protein conserved in bacteria [Function unknown]. 32236 COG2053: Ribosomal protein S28E/S33 [Translation, ribosomal structure and biogenesis]. 32237 COG2054: Uncharacterized archaeal kinase related to aspartokinases, uridylate kinases [General function prediction only]. 32238 COG2055: Malate/L-lactate dehydrogenases [Energy production and conversion]. 32239 COG2056: Predicted permease [General function prediction only]. 32240 COG2057: Acyl CoA:acetate/3-ketoacid CoA transferase, beta subunit [Lipid metabolism]. 32241 COG2058: Ribosomal protein L12E/L44/L45/RPP1/RPP2 [Translation, ribosomal structure and biogenesis]. 32242 COG2059: Chromate transport protein ChrA [Inorganic ion transport and metabolism]. 32243 COG2060: K+-transporting ATPase, A chain [Inorganic ion transport and metabolism]. 32244 COG2061: ACT-domain-containing protein, predicted allosteric regulator of homoserine dehydrogenase [Amino acid transport and metabolism]. 32245 COG2062: Phosphohistidine phosphatase SixA [Signal transduction mechanisms]. 32246 COG2063: Flagellar basal body L-ring protein [Cell motility and secretion]. 32247 COG2064: Flp pilus assembly protein TadC [Cell motility and secretion / Intracellular trafficking and secretion]. 
32248 COG2065: Pyrimidine operon attenuation protein/uracil phosphoribosyltransferase [Nucleotide transport and metabolism]. 32249 COG2066: Glutaminase [Amino acid transport and metabolism]. 32250 COG2067: Long-chain fatty acid transport protein [Lipid metabolism]. 32251 COG2068: Uncharacterized MobA-related protein [General function prediction only]. 32252 COG2069: CO dehydrogenase/acetyl-CoA synthase delta subunit (corrinoid Fe-S protein) [Energy production and conversion]. 32253 COG2070: Dioxygenases related to 2-nitropropane dioxygenase [General function prediction only]. 32254 COG2071: Predicted glutamine amidotransferases [General function prediction only]. 32255 COG2072: Predicted flavoprotein involved in K+ transport [Inorganic ion transport and metabolism]. 32256 COG2073: Cobalamin biosynthesis protein CbiG [Coenzyme metabolism]. 32257 COG2074: 2-phosphoglycerate kinase [Carbohydrate transport and metabolism]. 32258 COG2075: Ribosomal protein L24E [Translation, ribosomal structure and biogenesis]. 32259 COG2076: Membrane transporters of cations and cationic drugs [Inorganic ion transport and metabolism]. 32260 COG2077: Peroxiredoxin [Posttranslational modification, protein turnover, chaperones]. 32261 COG2078: Uncharacterized conserved protein [Function unknown]. 32262 COG2079: Uncharacterized protein involved in propionate catabolism [General function prediction only]. 32263 COG2080: Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs [Energy production and conversion]. 32264 COG2081: Predicted flavoproteins [General function prediction only]. 32265 COG2082: Precorrin isomerase [Coenzyme metabolism]. 32266 COG2083: Uncharacterized protein conserved in archaea [Function unknown]. 32267 COG2084: 3-hydroxyisobutyrate dehydrogenase and related beta-hydroxyacid dehydrogenases [Lipid metabolism]. 32268 COG2085: Predicted dinucleotide-binding enzymes [General function prediction only]. 
32269 COG2086: Electron transfer flavoprotein, beta subunit [Energy production and conversion]. 32270 COG2087: Adenosyl cobinamide kinase/adenosyl cobinamide phosphate guanylyltransferase [Coenzyme metabolism]. 32271 COG2088: Uncharacterized protein, involved in the regulation of septum location [Cell envelope biogenesis, outer membrane]. 32272 COG2089: Sialic acid synthase [Cell envelope biogenesis, outer membrane]. 32273 COG2090: Uncharacterized protein conserved in archaea [Function unknown]. 32274 COG2091: Phosphopantetheinyl transferase [Coenzyme metabolism]. 32275 COG2092: Translation elongation factor EF-1beta [Translation, ribosomal structure and biogenesis]. 32276 COG2093: DNA-directed RNA polymerase, subunit E'' [Transcription]. 32277 COG2094: 3-methyladenine DNA glycosylase [DNA replication, recombination, and repair]. 32278 COG2095: Multiple antibiotic transporter [Intracellular trafficking and secretion]. 32279 COG2096: Uncharacterized conserved protein [Function unknown]. 32280 COG2097: Ribosomal protein L31E [Translation, ribosomal structure and biogenesis]. 32281 COG2098: Uncharacterized protein conserved in archaea [Function unknown]. 32282 COG2099: Precorrin-6x reductase [Coenzyme metabolism]. 32283 COG2100: Predicted Fe-S oxidoreductase [General function prediction only]. 32284 COG2101: TATA-box binding protein (TBP), component of TFIID and TFIIIB [Transcription]. 32285 COG2102: Predicted ATPases of PP-loop superfamily [General function prediction only]. 32286 COG2103: Predicted sugar phosphate isomerase [General function prediction only]. 32287 COG2104: Sulfur transfer protein involved in thiamine biosynthesis [Coenzyme metabolism]. 32288 COG2105: Uncharacterized conserved protein [Function unknown]. 32289 COG2106: Uncharacterized conserved protein [Function unknown]. 32290 COG2107: Predicted periplasmic solute-binding protein [General function prediction only]. 
32291 COG2108: Uncharacterized conserved protein related to pyruvate formate-lyase activating enzyme [General function prediction only]. 32292 COG2109: ATP:corrinoid adenosyltransferase [Coenzyme metabolism]. 32293 COG2110: Predicted phosphatase homologous to the C-terminal domain of histone macroH2A1 [General function prediction only]. 32294 COG2111: Multisubunit Na+/H+ antiporter, MnhB subunit [Inorganic ion transport and metabolism]. 32295 COG2112: Predicted Ser/Thr protein kinase [Signal transduction mechanisms]. 32296 COG2113: ABC-type proline/glycine betaine transport systems, periplasmic components [Amino acid transport and metabolism]. 32297 COG2114: Adenylate cyclase, family 3 (some proteins contain HAMP domain) [Signal transduction mechanisms]. 32298 COG2115: Xylose isomerase [Carbohydrate transport and metabolism]. 32299 COG2116: Formate/nitrite family of transporters [Inorganic ion transport and metabolism]. 32300 COG2117: Predicted subunit of tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain [Translation, ribosomal structure and biogenesis]. 32301 COG2118: DNA-binding protein [General function prediction only]. 32302 COG2119: Predicted membrane protein [Function unknown]. 32303 COG2120: Uncharacterized proteins, LmbE homologs [Function unknown]. 32304 COG2121: Uncharacterized protein conserved in bacteria [Function unknown]. 32305 COG2122: Uncharacterized conserved protein [Function unknown]. 32306 COG2123: RNase PH-related exoribonuclease [Translation, ribosomal structure and biogenesis]. 32307 COG2124: Cytochrome P450 [Secondary metabolites biosynthesis, transport, and catabolism]. 32308 COG2125: Ribosomal protein S6E (S10) [Translation, ribosomal structure and biogenesis]. 32309 COG2126: Ribosomal protein L37E [Translation, ribosomal structure and biogenesis]. 32310 COG2127: Uncharacterized conserved protein [Function unknown]. 32311 COG2128: Uncharacterized conserved protein [Function unknown]. 
32312 COG2129: Predicted phosphoesterases, related to the Icc protein [General function prediction only]. 32313 COG2130: Putative NADP-dependent oxidoreductases [General function prediction only]. 32314 COG2131: Deoxycytidylate deaminase [Nucleotide transport and metabolism]. 32315 COG2132: Putative multicopper oxidases [Secondary metabolites biosynthesis, transport, and catabolism]. 32316 COG2133: Glucose/sorbosone dehydrogenases [Carbohydrate transport and metabolism]. 32317 COG2134: CDP-diacylglycerol pyrophosphatase [Lipid metabolism]. 32318 COG2135: Uncharacterized conserved protein [Function unknown]. 32319 COG2136: Predicted exosome subunit/U3 small nucleolar ribonucleoprotein (snoRNP) component, contains IMP4 domain [Translation, ribosomal structure and biogenesis / RNA processing and modification]. 32320 COG2137: Uncharacterized protein conserved in bacteria [General function prediction only]. 32321 COG2138: Uncharacterized conserved protein [Function unknown]. 32322 COG2139: Ribosomal protein L21E [Translation, ribosomal structure and biogenesis]. 32323 COG2140: Thermophilic glucose-6-phosphate isomerase and related metalloenzymes [Carbohydrate transport and metabolism / General function prediction only]. 32324 COG2141: Coenzyme F420-dependent N5,N10-methylene tetrahydromethanopterin reductase and related flavin-dependent oxidoreductases [Energy production and conversion]. 32325 COG2142: Succinate dehydrogenase, hydrophobic anchor subunit [Energy production and conversion]. 32326 COG2143: Thioredoxin-related protein [Posttranslational modification, protein turnover, chaperones]. 32327 COG2144: Selenophosphate synthetase-related proteins [General function prediction only]. 32328 COG2145: Hydroxyethylthiazole kinase, sugar kinase family [Coenzyme metabolism]. 32329 COG2146: Ferredoxin subunits of nitrite reductase and ring-hydroxylating dioxygenases [Inorganic ion transport and metabolism / General function prediction only]. 
32330 COG2147: Ribosomal protein L19E [Translation, ribosomal structure and biogenesis]. 32331 COG2148: Sugar transferases involved in lipopolysaccharide synthesis [Cell envelope biogenesis, outer membrane]. 32332 COG2149: Predicted membrane protein [Function unknown]. 32333 COG2150: Predicted regulator of amino acid metabolism, contains ACT domain [General function prediction only]. 32334 COG2151: Predicted metal-sulfur cluster biosynthetic enzyme [General function prediction only]. 32335 COG2152: Predicted glycosylase [Carbohydrate transport and metabolism]. 32336 COG2153: Predicted acyltransferase [General function prediction only]. 32337 COG2154: Pterin-4a-carbinolamine dehydratase [Coenzyme metabolism]. 32338 COG2155: Uncharacterized conserved protein [Function unknown]. 32339 COG2156: K+-transporting ATPase, c chain [Inorganic ion transport and metabolism]. 32340 COG2157: Ribosomal protein L20A (L18A) [Translation, ribosomal structure and biogenesis]. 32341 COG2158: Uncharacterized protein containing a Zn-finger-like domain [General function prediction only]. 32342 COG2159: Predicted metal-dependent hydrolase of the TIM-barrel fold [General function prediction only]. 32343 COG2160: L-arabinose isomerase [Carbohydrate transport and metabolism]. 32344 COG2161: Antitoxin of toxin-antitoxin stability system [Cell division and chromosome partitioning]. 32345 COG2162: Arylamine N-acetyltransferase [Secondary metabolites biosynthesis, transport, and catabolism]. 32346 COG2163: Ribosomal protein L14E/L6E/L27E [Translation, ribosomal structure and biogenesis]. 32347 COG2164: Uncharacterized conserved protein [Function unknown]. 32348 COG2165: Type II secretory pathway, pseudopilin PulG [Cell motility and secretion / Intracellular trafficking and secretion]. 32349 COG2166: SufE protein probably involved in Fe-S center assembly [General function prediction only]. 32350 COG2167: Ribosomal protein L39E [Translation, ribosomal structure and biogenesis]. 
32351 COG2168: Uncharacterized conserved protein involved in oxidation of intracellular sulfur [Inorganic ion transport and metabolism]. 32352 COG2169: Adenosine deaminase [Nucleotide transport and metabolism]. 32353 COG2170: Uncharacterized conserved protein [Function unknown]. 32354 COG2171: Tetrahydrodipicolinate N-succinyltransferase [Amino acid transport and metabolism]. 32355 COG2172: Anti-sigma regulatory factor (Ser/Thr protein kinase) [Signal transduction mechanisms]. 32356 COG2173: D-alanyl-D-alanine dipeptidase [Cell envelope biogenesis, outer membrane]. 32357 COG2174: Ribosomal protein L34E [Translation, ribosomal structure and biogenesis]. 32358 COG2175: Probable taurine catabolism dioxygenase [Secondary metabolites biosynthesis, transport, and catabolism]. 32359 COG2176: DNA polymerase III, alpha subunit (gram-positive type) [DNA replication, recombination, and repair]. 32360 COG2177: Cell division protein [Cell division and chromosome partitioning]. 32361 COG2178: Predicted RNA-binding protein of the translin family [Translation, ribosomal structure and biogenesis]. 32362 COG2179: Predicted hydrolase of the HAD superfamily [General function prediction only]. 32363 COG2180: Nitrate reductase delta subunit [Energy production and conversion]. 32364 COG2181: Nitrate reductase gamma subunit [Energy production and conversion]. 32365 COG2182: Maltose-binding periplasmic proteins/domains [Carbohydrate transport and metabolism]. 32366 COG2183: Transcriptional accessory protein [Transcription]. 32367 COG2184: Protein involved in cell division [Cell division and chromosome partitioning]. 32368 COG2185: Methylmalonyl-CoA mutase, C-terminal domain/subunit (cobalamin-binding) [Lipid metabolism]. 32369 COG2186: Transcriptional regulators [Transcription]. 32370 COG2187: Uncharacterized protein conserved in bacteria [Function unknown]. 32371 COG2188: Transcriptional regulators [Transcription]. 
32372 COG2189: Adenine specific DNA methylase Mod [DNA replication, recombination, and repair]. 32373 COG2190: Phosphotransferase system IIA components [Carbohydrate transport and metabolism]. 32374 COG2191: Formylmethanofuran dehydrogenase subunit E [Energy production and conversion]. 32375 COG2192: Predicted carbamoyl transferase, NodU family [Posttranslational modification, protein turnover, chaperones]. 32376 COG2193: Bacterioferritin (cytochrome b1) [Inorganic ion transport and metabolism]. 32377 COG2194: Predicted membrane-associated, metal-dependent hydrolase [General function prediction only]. 32378 COG2195: Di- and tripeptidases [Amino acid transport and metabolism]. 32379 COG2197: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain [Signal transduction mechanisms / Transcription]. 32380 COG2198: FOG: HPt domain [Signal transduction mechanisms]. 32381 COG2199: FOG: GGDEF domain [Signal transduction mechanisms]. 32382 COG2200: FOG: EAL domain [Signal transduction mechanisms]. 32383 COG2201: Chemotaxis response regulator containing a CheY-like receiver domain and a methylesterase domain [Cell motility and secretion / Signal transduction mechanisms]. 32384 COG2202: FOG: PAS/PAC domain [Signal transduction mechanisms]. 32385 COG2203: FOG: GAF domain [Signal transduction mechanisms]. 32386 COG2204: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains [Signal transduction mechanisms]. 32387 COG2205: Osmosensitive K+ channel histidine kinase [Signal transduction mechanisms]. 32388 COG2206: HD-GYP domain [Signal transduction mechanisms]. 32389 COG2207: AraC-type DNA-binding domain-containing proteins [Transcription]. 32390 COG2208: Serine phosphatase RsbU, regulator of sigma subunit [Signal transduction mechanisms / Transcription]. 32391 COG2209: Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrE [Energy production and conversion]. 
32392 COG2210: Uncharacterized conserved protein [Function unknown]. 32393 COG2211: Na+/melibiose symporter and related transporters [Carbohydrate transport and metabolism]. 32394 COG2212: Multisubunit Na+/H+ antiporter, MnhF subunit [Inorganic ion transport and metabolism]. 32395 COG2213: Phosphotransferase system, mannitol-specific IIBC component [Carbohydrate transport and metabolism]. 32396 COG2214: DnaJ-class molecular chaperone [Posttranslational modification, protein turnover, chaperones]. 32397 COG2215: ABC-type uncharacterized transport system, permease component [General function prediction only]. 32398 COG2216: High-affinity K+ transport system, ATPase chain B [Inorganic ion transport and metabolism]. 32399 COG2217: Cation transport ATPase [Inorganic ion transport and metabolism]. 32400 COG2218: Formylmethanofuran dehydrogenase subunit C [Energy production and conversion]. 32401 COG2219: Eukaryotic-type DNA primase, large subunit [DNA replication, recombination, and repair]. 32402 COG2220: Predicted Zn-dependent hydrolases of the beta-lactamase fold [General function prediction only]. 32403 COG2221: Dissimilatory sulfite reductase (desulfoviridin), alpha and beta subunits [Energy production and conversion]. 32404 COG2222: Predicted phosphosugar isomerases [Cell envelope biogenesis, outer membrane]. 32405 COG2223: Nitrate/nitrite transporter [Inorganic ion transport and metabolism]. 32406 COG2224: Isocitrate lyase [Energy production and conversion]. 32407 COG2225: Malate synthase [Energy production and conversion]. 32408 COG2226: Methylase involved in ubiquinone/menaquinone biosynthesis [Coenzyme metabolism]. 32409 COG2227: 2-polyprenyl-3-methyl-5-hydroxy-6-methoxy-1,4-benzoquinol methylase [Coenzyme metabolism]. 32410 COG2229: Predicted GTPase [General function prediction only]. 32411 COG2230: Cyclopropane fatty acid synthase and related methyltransferases [Cell envelope biogenesis, outer membrane]. 
32412 COG2231: Uncharacterized protein related to Endonuclease III [DNA replication, recombination, and repair]. 32413 COG2232: Predicted ATP-dependent carboligase related to biotin carboxylase [General function prediction only]. 32414 COG2233: Xanthine/uracil permeases [Nucleotide transport and metabolism]. 32415 COG2234: Predicted aminopeptidases [General function prediction only]. 32416 COG2235: Arginine deiminase [Amino acid transport and metabolism]. 32417 COG2236: Predicted phosphoribosyltransferases [General function prediction only]. 32418 COG2237: Predicted membrane protein [Function unknown]. 32419 COG2238: Ribosomal protein S19E (S16A) [Translation, ribosomal structure and biogenesis]. 32420 COG2239: Mg/Co/Ni transporter MgtE (contains CBS domain) [Inorganic ion transport and metabolism]. 32421 COG2240: Pyridoxal/pyridoxine/pyridoxamine kinase [Coenzyme metabolism]. 32422 COG2241: Precorrin-6B methylase 1 [Coenzyme metabolism]. 32423 COG2242: Precorrin-6B methylase 2 [Coenzyme metabolism]. 32424 COG2243: Precorrin-2 methylase [Coenzyme metabolism]. 32425 COG2244: Membrane protein involved in the export of O-antigen and teichoic acid [General function prediction only]. 32426 COG2245: Predicted membrane protein [Function unknown]. 32427 COG2246: Predicted membrane protein [Function unknown]. 32428 COG2247: Putative cell wall-binding domain [Cell envelope biogenesis, outer membrane]. 32429 COG2248: Predicted hydrolase (metallo-beta-lactamase superfamily) [General function prediction only]. 32430 COG2249: Putative NADPH-quinone reductase (modulator of drug activity B) [General function prediction only]. 32431 COG2250: Uncharacterized conserved protein related to C-terminal domain of eukaryotic chaperone, SACSIN [Function unknown]. 32432 COG2251: Predicted nuclease (RecB family) [General function prediction only]. 32433 COG2252: Permeases [General function prediction only]. 32434 COG2253: Uncharacterized conserved protein [Function unknown]. 
32435 COG2254: Predicted HD superfamily hydrolase, possibly a nuclease [DNA replication, recombination, and repair]. 32436 COG2255: Holliday junction resolvasome, helicase subunit [DNA replication, recombination, and repair]. 32437 COG2256: ATPase related to the helicase subunit of the Holliday junction resolvase [DNA replication, recombination, and repair]. 32438 COG2257: Uncharacterized homolog of the cytoplasmic domain of flagellar protein FlhB [Function unknown]. 32439 COG2258: Uncharacterized protein conserved in bacteria [Function unknown]. 32440 COG2259: Predicted membrane protein [Function unknown]. 32441 COG2260: Predicted Zn-ribbon RNA-binding protein [Translation, ribosomal structure and biogenesis]. 32442 COG2261: Predicted membrane protein [Function unknown]. 32443 COG2262: GTPases [General function prediction only]. 32444 COG2263: Predicted RNA methylase [Translation, ribosomal structure and biogenesis]. 32445 COG2264: Ribosomal protein L11 methylase [Translation, ribosomal structure and biogenesis]. 32446 COG2265: SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase [Translation, ribosomal structure and biogenesis]. 32447 COG2266: GTP:adenosylcobinamide-phosphate guanylyltransferase [Coenzyme metabolism]. 32448 COG2267: Lysophospholipase [Lipid metabolism]. 32449 COG2268: Uncharacterized protein conserved in bacteria [Function unknown]. 32450 COG2269: Truncated, possibly inactive, lysyl-tRNA synthetase (class II) [Translation, ribosomal structure and biogenesis]. 32451 COG2270: Permeases of the major facilitator superfamily [General function prediction only]. 32452 COG2271: Sugar phosphate permease [Carbohydrate transport and metabolism]. 32453 COG2272: Carboxylesterase type B [Lipid metabolism]. 32454 COG2273: Beta-glucanase/Beta-glucan synthetase [Carbohydrate transport and metabolism]. 32455 COG2274: ABC-type bacteriocin/lantibiotic exporters, contain an N-terminal double-glycine peptidase domain [Defense mechanisms]. 
32456 COG2301: Citrate lyase beta subunit [Carbohydrate transport and metabolism]. 32457 COG2302: Uncharacterized conserved protein, contains S4-like domain [Function unknown]. 32458 COG2303: Choline dehydrogenase and related flavoproteins [Amino acid transport and metabolism]. 32459 COG2304: Uncharacterized protein containing a von Willebrand factor type A (vWA) domain [General function prediction only]. 32460 COG2306: Uncharacterized conserved protein [Function unknown]. 32461 COG2307: Uncharacterized protein conserved in bacteria [Function unknown]. 32462 COG2308: Uncharacterized conserved protein [Function unknown]. 32463 COG2309: Leucyl aminopeptidase (aminopeptidase T) [Amino acid transport and metabolism]. 32464 COG2310: Uncharacterized proteins involved in stress response, homologs of TerZ and putative cAMP-binding protein CABP1 [Signal transduction mechanisms]. 32465 COG2311: Predicted membrane protein [Function unknown]. 32466 COG2312: Erythromycin esterase homolog [General function prediction only]. 32467 COG2313: Uncharacterized enzyme involved in pigment biosynthesis [Secondary metabolites biosynthesis, transport, and catabolism]. 32468 COG2314: Predicted membrane protein [Function unknown]. 32469 COG2315: Uncharacterized protein conserved in bacteria [Function unknown]. 32470 COG2316: Predicted hydrolase (HD superfamily) [General function prediction only]. 32471 COG2317: Zn-dependent carboxypeptidase [Amino acid transport and metabolism]. 32472 COG2318: Uncharacterized protein conserved in bacteria [Function unknown]. 32473 COG2319: FOG: WD40 repeat [General function prediction only]. 32474 COG2320: Uncharacterized conserved protein [Function unknown]. 32475 COG2321: Predicted metalloprotease [General function prediction only]. 32476 COG2322: Predicted membrane protein [Function unknown]. 32477 COG2323: Predicted membrane protein [Function unknown]. 32478 COG2324: Predicted membrane protein [Function unknown]. 
32479 COG2326: Uncharacterized conserved protein [Function unknown]. 32480 COG2327: Uncharacterized conserved protein [Function unknown]. 32481 COG2329: Uncharacterized enzyme involved in biosynthesis of extracellular polysaccharides [General function prediction only]. 32482 COG2331: Uncharacterized protein conserved in bacteria [Function unknown]. 32483 COG2332: Cytochrome c-type biogenesis protein CcmE [Posttranslational modification, protein turnover, chaperones]. 32484 COG2333: Predicted hydrolase (metallo-beta-lactamase superfamily) [General function prediction only]. 32485 COG2334: Putative homoserine kinase type II (protein kinase fold) [General function prediction only]. 32486 COG2335: Secreted and surface protein containing fasciclin-like repeats [Cell envelope biogenesis, outer membrane]. 32487 COG2336: Growth regulator [Signal transduction mechanisms]. 32488 COG2337: Growth inhibitor [Signal transduction mechanisms]. 32489 COG2339: Predicted membrane protein [Function unknown]. 32490 COG2340: Uncharacterized protein with SCP/PR1 domains [Function unknown]. 32491 COG2342: Predicted extracellular endo alpha-1,4 polygalactosaminidase or related polysaccharide hydrolase [Carbohydrate transport and metabolism]. 32492 COG2343: Uncharacterized protein conserved in bacteria [Function unknown]. 32493 COG2344: AT-rich DNA-binding protein [General function prediction only]. 32494 COG2345: Predicted transcriptional regulator [Transcription]. 32495 COG2346: Truncated hemoglobins [General function prediction only]. 32496 COG2348: Uncharacterized protein involved in methicillin resistance [Defense mechanisms]. 32497 COG2350: Uncharacterized protein conserved in bacteria [Function unknown]. 32498 COG2351: Transthyretin-like protein [General function prediction only]. 32499 COG2352: Phosphoenolpyruvate carboxylase [Energy production and conversion]. 32500 COG2353: Uncharacterized conserved protein [Function unknown]. 
32501 COG2354: Uncharacterized protein conserved in bacteria [Function unknown]. 32502 COG2355: Zn-dependent dipeptidase, microsomal dipeptidase homolog [Amino acid transport and metabolism]. 32503 COG2356: Endonuclease I [DNA replication, recombination, and repair]. 32504 COG2357: Uncharacterized protein conserved in bacteria [Function unknown]. 32505 COG2358: TRAP-type uncharacterized transport system, periplasmic component [General function prediction only]. 32506 COG2359: Uncharacterized protein conserved in bacteria [Function unknown]. 32507 COG2360: Leu/Phe-tRNA-protein transferase [Posttranslational modification, protein turnover, chaperones]. 32508 COG2361: Uncharacterized conserved protein [Function unknown]. 32509 COG2362: D-aminopeptidase [Amino acid transport and metabolism]. 32510 COG2363: Uncharacterized small membrane protein [Function unknown]. 32511 COG2364: Predicted membrane protein [Function unknown]. 32512 COG2365: Protein tyrosine/serine phosphatase [Signal transduction mechanisms]. 32513 COG2366: Protein related to penicillin acylase [General function prediction only]. 32514 COG2367: Beta-lactamase class A [Defense mechanisms]. 32515 COG2368: Aromatic ring hydroxylase [Secondary metabolites biosynthesis, transport, and catabolism]. 32516 COG2369: Uncharacterized protein, homolog of phage Mu protein gp30 [Function unknown]. 32517 COG2370: Hydrogenase/urease accessory protein [Posttranslational modification, protein turnover, chaperones]. 32518 COG2371: Urease accessory protein UreE [Posttranslational modification, protein turnover, chaperones]. 32519 COG2372: Uncharacterized protein, homolog of Cu resistance protein CopC [General function prediction only]. 32520 COG2373: Large extracellular alpha-helical protein [General function prediction only]. 32521 COG2374: Predicted extracellular nuclease [General function prediction only]. 32522 COG2375: Siderophore-interacting protein [Inorganic ion transport and metabolism]. 
32523 COG2376: Dihydroxyacetone kinase [Carbohydrate transport and metabolism]. 32524 COG2377: Predicted molecular chaperone distantly related to HSP70-fold metalloproteases [Posttranslational modification, protein turnover, chaperones]. 32525 COG2378: Predicted transcriptional regulator [Transcription]. 32526 COG2379: Putative glycerate kinase [Carbohydrate transport and metabolism]. 32527 COG2380: Uncharacterized protein conserved in bacteria [Function unknown]. 32528 COG2382: Enterochelin esterase and related enzymes [Inorganic ion transport and metabolism]. 32529 COG2383: Uncharacterized conserved protein [Function unknown]. 32530 COG2384: Predicted SAM-dependent methyltransferase [General function prediction only]. 32531 COG2385: Sporulation protein and related proteins [Cell division and chromosome partitioning]. 32532 COG2386: ABC-type transport system involved in cytochrome c biogenesis, permease component [Posttranslational modification, protein turnover, chaperones]. 32533 COG2388: Predicted acetyltransferase [General function prediction only]. 32534 COG2389: Uncharacterized metal-binding protein [General function prediction only]. 32535 COG2390: Transcriptional regulator, contains sigma factor-related N-terminal domain [Transcription]. 32536 COG2391: Predicted transporter component [General function prediction only]. 32537 COG2401: ABC-type ATPase fused to a predicted acetyltransferase domain [General function prediction only]. 32538 COG2402: Predicted nucleic acid-binding protein, contains PIN domain [General function prediction only]. 32539 COG2403: Predicted GTPase [General function prediction only]. 32540 COG2404: Predicted phosphohydrolase (DHH superfamily) [General function prediction only]. 32541 COG2405: Predicted nucleic acid-binding protein, contains PIN domain [General function prediction only]. 32542 COG2406: Protein distantly related to bacterial ferritins [General function prediction only]. 
32543 COG2407: L-fucose isomerase and related proteins [Carbohydrate transport and metabolism]. 32544 COG2409: Predicted drug exporters of the RND superfamily [General function prediction only]. 32545 COG2410: Uncharacterized conserved protein [Function unknown]. 32546 COG2411: Uncharacterized conserved protein [Function unknown]. 32547 COG2412: Uncharacterized conserved protein [Function unknown]. 32548 COG2413: Predicted nucleotidyltransferase [General function prediction only]. 32549 COG2414: Aldehyde:ferredoxin oxidoreductase [Energy production and conversion]. 32550 COG2419: Uncharacterized conserved protein [Function unknown]. 32551 COG2421: Predicted acetamidase/formamidase [Energy production and conversion]. 32552 COG2423: Predicted ornithine cyclodeaminase, mu-crystallin homolog [Amino acid transport and metabolism]. 32553 COG2425: Uncharacterized protein containing a von Willebrand factor type A (vWA) domain [General function prediction only]. 32554 COG2426: Predicted membrane protein [Function unknown]. 32555 COG2427: Uncharacterized conserved protein [Function unknown]. 32556 COG2428: Uncharacterized conserved protein [Function unknown]. 32557 COG2429: Uncharacterized conserved protein [Function unknown]. 32558 COG2430: Uncharacterized conserved protein [Function unknown]. 32559 COG2431: Predicted membrane protein [Function unknown]. 32560 COG2433: Uncharacterized conserved protein [Function unknown]. 32561 COG2440: Ferredoxin-like protein [Energy production and conversion]. 32562 COG2441: Predicted butyrate kinase [Energy production and conversion]. 32563 COG2442: Uncharacterized conserved protein [Function unknown]. 32564 COG2443: Preprotein translocase subunit Sss1 [Intracellular trafficking and secretion]. 32565 COG2445: Uncharacterized conserved protein [Function unknown]. 32566 COG2450: Uncharacterized conserved protein [Function unknown]. 32567 COG2451: Ribosomal protein L35AE/L33A [Translation, ribosomal structure and biogenesis]. 
32568 COG2452: Predicted site-specific integrase-resolvase [DNA replication, recombination, and repair]. 32569 COG2453: Predicted protein-tyrosine phosphatase [Signal transduction mechanisms]. 32570 COG2454: Uncharacterized conserved protein [Function unknown]. 32571 COG2456: Uncharacterized conserved protein [Function unknown]. 32572 COG2457: Uncharacterized conserved protein [Function unknown]. 32573 COG2461: Uncharacterized conserved protein [Function unknown]. 32574 COG2469: Uncharacterized conserved protein [Function unknown]. 32575 COG2501: Uncharacterized conserved protein [Function unknown]. 32576 COG2502: Asparagine synthetase A [Amino acid transport and metabolism]. 32577 COG2503: Predicted secreted acid phosphatase [General function prediction only]. 32578 COG2508: Regulator of polyketide synthase expression [Signal transduction mechanisms / Secondary metabolites biosynthesis, transport, and catabolism]. 32579 COG2509: Uncharacterized FAD-dependent dehydrogenases [General function prediction only]. 32580 COG2510: Predicted membrane protein [Function unknown]. 32581 COG2511: Archaeal Glu-tRNAGln amidotransferase subunit E (contains GAD domain) [Translation, ribosomal structure and biogenesis]. 32582 COG2512: Uncharacterized membrane-associated protein/domain [Function unknown]. 32583 COG2513: PEP phosphonomutase and related enzymes [Carbohydrate transport and metabolism]. 32584 COG2514: Predicted ring-cleavage extradiol dioxygenase [General function prediction only]. 32585 COG2515: 1-aminocyclopropane-1-carboxylate deaminase [Amino acid transport and metabolism]. 32586 COG2516: Biotin synthase-related enzyme [General function prediction only]. 32587 COG2517: Predicted RNA-binding protein containing a C-terminal EMAP domain [General function prediction only]. 32588 COG2518: Protein-L-isoaspartate carboxylmethyltransferase [Posttranslational modification, protein turnover, chaperones]. 
32589 COG2519: tRNA(1-methyladenosine) methyltransferase and related methyltransferases [Translation, ribosomal structure and biogenesis]. 32590 COG2520: Predicted methyltransferase [General function prediction only]. 32591 COG2521: Predicted archaeal methyltransferase [General function prediction only]. 32592 COG2522: Predicted transcriptional regulator [General function prediction only]. 32593 COG2524: Predicted transcriptional regulator, contains C-terminal CBS domains [Transcription]. 32594 COG2602: Beta-lactamase class D [Defense mechanisms]. 32595 COG2603: Predicted ATPase [General function prediction only]. 32596 COG2604: Uncharacterized protein conserved in bacteria [Function unknown]. 32597 COG2605: Predicted kinase related to galactokinase and mevalonate kinase [General function prediction only]. 32598 COG2606: Uncharacterized conserved protein [Function unknown]. 32599 COG2607: Predicted ATPase (AAA+ superfamily) [General function prediction only]. 32600 COG2608: Copper chaperone [Inorganic ion transport and metabolism]. 32601 COG2609: Pyruvate dehydrogenase complex, dehydrogenase (E1) component [Energy production and conversion]. 32602 COG2610: H+/gluconate symporter and related permeases [Carbohydrate transport and metabolism / Amino acid transport and metabolism]. 32603 COG2703: Hemerythrin [Inorganic ion transport and metabolism]. 32604 COG2704: Anaerobic C4-dicarboxylate transporter [General function prediction only]. 32605 COG2706: 3-carboxymuconate cyclase [Carbohydrate transport and metabolism]. 32606 COG2707: Predicted membrane protein [Function unknown]. 32607 COG2710: Nitrogenase molybdenum-iron protein, alpha and beta chains [Energy production and conversion]. 32608 COG2715: Uncharacterized membrane protein, required for spore maturation in B. subtilis [General function prediction only]. 32609 COG2716: Glycine cleavage system regulatory protein [Amino acid transport and metabolism]. 
32610 COG2717: Predicted membrane protein [Function unknown]. 32611 COG2718: Uncharacterized conserved protein [Function unknown]. 32612 COG2719: Uncharacterized conserved protein [Function unknown]. 32613 COG2720: Uncharacterized vancomycin resistance protein [Defense mechanisms]. 32614 COG2721: Altronate dehydratase [Carbohydrate transport and metabolism]. 32615 COG2723: Beta-glucosidase/6-phospho-beta-glucosidase/beta-galactosidase [Carbohydrate transport and metabolism]. 32616 COG2730: Endoglucanase [Carbohydrate transport and metabolism]. 32617 COG2731: Beta-galactosidase, beta subunit [Carbohydrate transport and metabolism]. 32618 COG2732: Barstar, RNAse (barnase) inhibitor [Transcription]. 32619 COG2733: Predicted membrane protein [Function unknown]. 32620 COG2738: Predicted Zn-dependent protease [General function prediction only]. 32621 COG2739: Uncharacterized protein conserved in bacteria [Function unknown]. 32622 COG2740: Predicted nucleic-acid-binding protein implicated in transcription termination [Transcription]. 32623 COG2746: Aminoglycoside N3'-acetyltransferase [Defense mechanisms]. 32624 COG2747: Negative regulator of flagellin synthesis (anti-sigma28 factor) [Transcription / Cell motility and secretion / Intracellular trafficking and secretion]. 32625 COG2755: Lysophospholipase L1 and related esterases [Amino acid transport and metabolism]. 32626 COG2759: Formyltetrahydrofolate synthetase [Nucleotide transport and metabolism]. 32627 COG2761: Predicted dithiol-disulfide isomerase involved in polyketide biosynthesis [Secondary metabolites biosynthesis, transport, and catabolism]. 32628 COG2764: Uncharacterized protein conserved in bacteria [Function unknown]. 32629 COG2766: Putative Ser protein kinase [Signal transduction mechanisms]. 32630 COG2768: Uncharacterized Fe-S center protein [General function prediction only]. 32631 COG2770: FOG: HAMP domain [Signal transduction mechanisms]. 
32632 COG2771: DNA-binding HTH domain-containing proteins [Transcription]. 32633 COG2801: Transposase and inactivated derivatives [DNA replication, recombination, and repair]. 32634 COG2802: Uncharacterized protein, similar to the N-terminal domain of Lon protease [General function prediction only]. 32635 COG2804: Type II secretory pathway, ATPase PulE/Tfp pilus assembly pathway, ATPase PilB [Cell motility and secretion / Intracellular trafficking and secretion]. 32636 COG2805: Tfp pilus assembly protein, pilus retraction ATPase PilT [Cell motility and secretion / Intracellular trafficking and secretion]. 32637 COG2807: Cyanate permease [Inorganic ion transport and metabolism]. 32638 COG2808: Transcriptional regulator [Transcription]. 32639 COG2810: Predicted type IV restriction endonuclease [Defense mechanisms]. 32640 COG2811: Archaeal/vacuolar-type H+-ATPase subunit H [Energy production and conversion]. 32641 COG2812: DNA polymerase III, gamma/tau subunits [DNA replication, recombination, and repair]. 32642 COG2813: 16S RNA G1207 methylase RsmC [Translation, ribosomal structure and biogenesis]. 32643 COG2814: Arabinose efflux permease [Carbohydrate transport and metabolism]. 32644 COG2815: Uncharacterized protein conserved in bacteria [Function unknown]. 32645 COG2816: NTP pyrophosphohydrolases containing a Zn-finger, probably nucleic-acid-binding [DNA replication, recombination, and repair]. 32646 COG2818: 3-methyladenine DNA glycosylase [DNA replication, recombination, and repair]. 32647 COG2819: Predicted hydrolase of the alpha/beta superfamily [General function prediction only]. 32648 COG2820: Uridine phosphorylase [Nucleotide transport and metabolism]. 32649 COG2821: Membrane-bound lytic murein transglycosylase [Cell envelope biogenesis, outer membrane]. 32650 COG2822: Predicted periplasmic lipoprotein involved in iron transport [Inorganic ion transport and metabolism]. 
32651 COG2823: Predicted periplasmic or secreted lipoprotein [General function prediction only]. 32652 COG2824: Uncharacterized Zn-ribbon-containing protein involved in phosphonate metabolism [Inorganic ion transport and metabolism]. 32653 COG2825: Outer membrane protein [Cell envelope biogenesis, outer membrane]. 32654 COG2826: Transposase and inactivated derivatives, IS30 family [DNA replication, recombination, and repair]. 32655 COG2827: Predicted endonuclease containing a URI domain [DNA replication, recombination, and repair]. 32656 COG2828: Uncharacterized protein conserved in bacteria [Function unknown]. 32657 COG2829: Outer membrane phospholipase A [Cell envelope biogenesis, outer membrane]. 32658 COG2830: Uncharacterized protein conserved in bacteria [Function unknown]. 32659 COG2831: Hemolysin activation/secretion protein [Intracellular trafficking and secretion]. 32660 COG2832: Uncharacterized protein conserved in bacteria [Function unknown]. 32661 COG2833: Uncharacterized protein conserved in bacteria [Function unknown]. 32662 COG2834: Outer membrane lipoprotein-sorting protein [Cell envelope biogenesis, outer membrane]. 32663 COG2835: Uncharacterized conserved protein [Function unknown]. 32664 COG2836: Uncharacterized conserved protein [Function unknown]. 32665 COG2837: Predicted iron-dependent peroxidase [Inorganic ion transport and metabolism]. 32666 COG2838: Monomeric isocitrate dehydrogenase [Energy production and conversion]. 32667 COG2839: Uncharacterized protein conserved in bacteria [Function unknown]. 32668 COG2840: Uncharacterized protein conserved in bacteria [Function unknown]. 32669 COG2841: Uncharacterized protein conserved in bacteria [Function unknown]. 32670 COG2842: Uncharacterized ATPase, putative transposase [General function prediction only]. 32671 COG2843: Putative enzyme of poly-gamma-glutamate biosynthesis (capsule formation) [Cell envelope biogenesis, outer membrane]. 
32672 COG2844: UTP:GlnB (protein PII) uridylyltransferase [Posttranslational modification, protein turnover, chaperones]. 32673 COG2845: Uncharacterized protein conserved in bacteria [Function unknown]. 32674 COG2846: Regulator of cell morphogenesis and NO signaling [Cell division and chromosome partitioning]. 32675 COG2847: Uncharacterized protein conserved in bacteria [Function unknown]. 32676 COG2848: Uncharacterized conserved protein [Function unknown]. 32677 COG2849: Uncharacterized protein conserved in bacteria [Function unknown]. 32678 COG2850: Uncharacterized conserved protein [Function unknown]. 32679 COG2851: H+/citrate symporter [Energy production and conversion]. 32680 COG2852: Uncharacterized protein conserved in bacteria [Function unknown]. 32681 COG2853: Surface lipoprotein [Cell envelope biogenesis, outer membrane]. 32682 COG2854: ABC-type transport system involved in resistance to organic solvents, auxiliary component [Secondary metabolites biosynthesis, transport, and catabolism]. 32683 COG2855: Predicted membrane protein [Function unknown]. 32684 COG2856: Predicted Zn peptidase [Amino acid transport and metabolism]. 32685 COG2857: Cytochrome c1 [Energy production and conversion]. 32686 COG2859: Uncharacterized protein conserved in bacteria [Function unknown]. 32687 COG2860: Predicted membrane protein [Function unknown]. 32688 COG2861: Uncharacterized protein conserved in bacteria [Function unknown]. 32689 COG2862: Predicted membrane protein [Function unknown]. 32690 COG2863: Cytochrome c553 [Energy production and conversion]. 32691 COG2864: Cytochrome b subunit of formate dehydrogenase [Energy production and conversion]. 32692 COG2865: Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen [Transcription]. 32693 COG2866: Predicted carboxypeptidase [Amino acid transport and metabolism]. 32694 COG2867: Oligoketide cyclase/lipid transport protein [Lipid metabolism]. 
32695 COG2868: Predicted ribosomal protein [Translation, ribosomal structure and biogenesis]. 32696 COG2869: Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrC [Energy production and conversion]. 32697 COG2870: ADP-heptose synthase, bifunctional sugar kinase/adenylyltransferase [Cell envelope biogenesis, outer membrane]. 32698 COG2871: Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrF [Energy production and conversion]. 32699 COG2872: Predicted metal-dependent hydrolases related to alanyl-tRNA synthetase HxxxH domain [General function prediction only]. 32700 COG2873: O-acetylhomoserine sulfhydrylase [Amino acid transport and metabolism]. 32701 COG2874: Predicted ATPases involved in biogenesis of archaeal flagella [Cell motility and secretion / Intracellular trafficking and secretion]. 32702 COG2875: Precorrin-4 methylase [Coenzyme metabolism]. 32703 COG2876: 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase [Amino acid transport and metabolism]. 32704 COG2877: 3-deoxy-D-manno-octulosonic acid (KDO) 8-phosphate synthase [Cell envelope biogenesis, outer membrane]. 32705 COG2878: Predicted NADH:ubiquinone oxidoreductase, subunit RnfB [Energy production and conversion]. 32706 COG2879: Uncharacterized small protein [Function unknown]. 32707 COG2880: Uncharacterized protein conserved in archaea [Function unknown]. 32708 COG2881: Uncharacterized protein conserved in archaea [Function unknown]. 32709 COG2882: Flagellar biosynthesis chaperone [Cell motility and secretion / Intracellular trafficking and secretion / Posttranslational modification, protein turnover, chaperones]. 32710 COG2884: Predicted ATPase involved in cell division [Cell division and chromosome partitioning]. 32711 COG2885: Outer membrane protein and related peptidoglycan-associated (lipo)proteins [Cell envelope biogenesis, outer membrane]. 32712 COG2886: Uncharacterized small protein [Function unknown]. 
32713 COG2887: RecB family exonuclease [DNA replication, recombination, and repair]. 32714 COG2888: Predicted Zn-ribbon RNA-binding protein with a function in translation [Translation, ribosomal structure and biogenesis]. 32715 COG2890: Methylase of polypeptide chain release factors [Translation, ribosomal structure and biogenesis]. 32716 COG2891: Cell shape-determining protein [Cell envelope biogenesis, outer membrane]. 32717 COG2892: Uncharacterized protein conserved in archaea [Function unknown]. 32718 COG2893: Phosphotransferase system, mannose/fructose-specific component IIA [Carbohydrate transport and metabolism]. 32719 COG2894: Septum formation inhibitor-activating ATPase [Cell division and chromosome partitioning]. 32720 COG2895: GTPases - Sulfate adenylate transferase subunit 1 [Inorganic ion transport and metabolism]. 32721 COG2896: Molybdenum cofactor biosynthesis enzyme [Coenzyme metabolism]. 32722 COG2897: Rhodanese-related sulfurtransferase [Inorganic ion transport and metabolism]. 32723 COG2898: Uncharacterized conserved protein [Function unknown]. 32724 COG2899: Uncharacterized protein conserved in bacteria [Function unknown]. 32725 COG2900: Uncharacterized protein conserved in bacteria [Function unknown]. 32726 COG2901: Factor for inversion stimulation Fis, transcriptional activator [Transcription / DNA replication, recombination, and repair]. 32727 COG2902: NAD-specific glutamate dehydrogenase [Amino acid transport and metabolism]. 32728 COG2904: Uncharacterized protein conserved in bacteria [Function unknown]. 32729 COG2905: Predicted signal-transduction protein containing cAMP-binding and CBS domains [Signal transduction mechanisms]. 32730 COG2906: Bacterioferritin-associated ferredoxin [Inorganic ion transport and metabolism]. 32731 COG2907: Predicted NAD/FAD-binding protein [General function prediction only]. 32732 COG2908: Uncharacterized protein conserved in bacteria [Function unknown]. 
32733 COG2909: ATP-dependent transcriptional regulator [Transcription]. 32734 COG2910: Putative NADH-flavin reductase [General function prediction only]. 32735 COG2911: Uncharacterized protein conserved in bacteria [Function unknown]. 32736 COG2912: Uncharacterized conserved protein [Function unknown]. 32737 COG2913: Small protein A (tmRNA-binding) [Translation, ribosomal structure and biogenesis]. 32738 COG2914: Uncharacterized protein conserved in bacteria [Function unknown]. 32739 COG2915: Uncharacterized protein involved in purine metabolism [General function prediction only]. 32740 COG2916: DNA-binding protein H-NS [General function prediction only]. 32741 COG2917: Intracellular septation protein A [Cell division and chromosome partitioning]. 32742 COG2918: Gamma-glutamylcysteine synthetase [Coenzyme metabolism]. 32743 COG2919: Septum formation initiator [Cell division and chromosome partitioning]. 32744 COG2920: Dissimilatory sulfite reductase (desulfoviridin), gamma subunit [Inorganic ion transport and metabolism]. 32745 COG2921: Uncharacterized conserved protein [Function unknown]. 32746 COG2922: Uncharacterized protein conserved in bacteria [Function unknown]. 32747 COG2923: Uncharacterized protein involved in the oxidation of intracellular sulfur [Inorganic ion transport and metabolism]. 32748 COG2924: Uncharacterized protein conserved in bacteria [Function unknown]. 32749 COG2925: Exonuclease I [DNA replication, recombination, and repair]. 32750 COG2926: Uncharacterized protein conserved in bacteria [Function unknown]. 32751 COG2927: DNA polymerase III, chi subunit [DNA replication, recombination, and repair]. 32752 COG2928: Uncharacterized conserved protein [Function unknown]. 32753 COG2929: Uncharacterized protein conserved in bacteria [Function unknown]. 32754 COG2930: Uncharacterized conserved protein [Function unknown]. 32755 COG2931: RTX toxins and related Ca2+-binding proteins [Secondary metabolites biosynthesis, transport, and catabolism]. 
32756 COG2932: Predicted transcriptional regulator [Transcription]. 32757 COG2933: Predicted SAM-dependent methyltransferase [General function prediction only]. 32758 COG2935: Putative arginyl-tRNA:protein arginylyltransferase [Posttranslational modification, protein turnover, chaperones]. 32759 COG2936: Predicted acyl esterases [General function prediction only]. 32760 COG2937: Glycerol-3-phosphate O-acyltransferase [Lipid metabolism]. 32761 COG2938: Uncharacterized conserved protein [Function unknown]. 32762 COG2939: Carboxypeptidase C (cathepsin A) [Amino acid transport and metabolism]. 32763 COG2940: Proteins containing SET domain [General function prediction only]. 32764 COG2941: Ubiquinone biosynthesis protein COQ7 [Coenzyme metabolism]. 32765 COG2942: N-acyl-D-glucosamine 2-epimerase [Carbohydrate transport and metabolism]. 32766 COG2943: Membrane glycosyltransferase [Cell envelope biogenesis, outer membrane]. 32767 COG2944: Predicted transcriptional regulator [Transcription]. 32768 COG2945: Predicted hydrolase of the alpha/beta superfamily [General function prediction only]. 32769 COG2946: Putative phage replication protein RstA [DNA replication, recombination, and repair]. 32770 COG2947: Uncharacterized conserved protein [Function unknown]. 32771 COG2948: Type IV secretory pathway, VirB10 components [Intracellular trafficking and secretion]. 32772 COG2949: Uncharacterized membrane protein [Function unknown]. 32773 COG2951: Membrane-bound lytic murein transglycosylase B [Cell envelope biogenesis, outer membrane]. 32774 COG2952: Uncharacterized protein conserved in bacteria [Function unknown]. 32775 COG2954: Uncharacterized protein conserved in bacteria [Function unknown]. 32776 COG2956: Predicted N-acetylglucosaminyl transferase [Carbohydrate transport and metabolism]. 32777 COG2957: Peptidylarginine deiminase and related enzymes [Amino acid transport and metabolism]. 32778 COG2958: Uncharacterized protein conserved in bacteria [Function unknown]. 
32779 COG2959: Uncharacterized enzyme of heme biosynthesis [Coenzyme metabolism]. 32780 COG2960: Uncharacterized protein conserved in bacteria [Function unknown]. 32781 COG2961: Protein involved in catabolism of external DNA [General function prediction only]. 32782 COG2962: Predicted permeases [General function prediction only]. 32783 COG2963: Transposase and inactivated derivatives [DNA replication, recombination, and repair]. 32784 COG2964: Uncharacterized protein conserved in bacteria [Function unknown]. 32785 COG2965: Primosomal replication protein N [DNA replication, recombination, and repair]. 32786 COG2966: Uncharacterized conserved protein [Function unknown]. 32787 COG2967: Uncharacterized protein affecting Mg2+/Co2+ transport [Inorganic ion transport and metabolism]. 32788 COG2968: Uncharacterized conserved protein [Function unknown]. 32789 COG2969: Stringent starvation protein B [General function prediction only]. 32790 COG2971: Predicted N-acetylglucosamine kinase [Carbohydrate transport and metabolism]. 32791 COG2972: Predicted signal transduction protein with a C-terminal ATPase domain [Signal transduction mechanisms]. 32792 COG2973: Trp operon repressor [Transcription]. 32793 COG2974: DNA recombination-dependent growth factor C [DNA replication, recombination, and repair]. 32794 COG2975: Uncharacterized protein conserved in bacteria [Function unknown]. 32795 COG2976: Uncharacterized protein conserved in bacteria [Function unknown]. 32796 COG2977: Phosphopantetheinyl transferase component of siderophore synthetase [Secondary metabolites biosynthesis, transport, and catabolism]. 32797 COG2978: Putative p-aminobenzoyl-glutamate transporter [Coenzyme metabolism]. 32798 COG2979: Uncharacterized protein conserved in bacteria [Function unknown]. 32799 COG2980: Rare lipoprotein B [Cell envelope biogenesis, outer membrane]. 32800 COG2981: Uncharacterized protein involved in cysteine biosynthesis [Amino acid transport and metabolism]. 
32801 COG2982: Uncharacterized protein involved in outer membrane biogenesis [Cell envelope biogenesis, outer membrane]. 32802 COG2983: Uncharacterized conserved protein [Function unknown]. 32803 COG2984: ABC-type uncharacterized transport system, periplasmic component [General function prediction only]. 32804 COG2985: Predicted permease [General function prediction only]. 32805 COG2986: Histidine ammonia-lyase [Amino acid transport and metabolism]. 32806 COG2987: Urocanate hydratase [Amino acid transport and metabolism]. 32807 COG2988: Succinylglutamate desuccinylase [Amino acid transport and metabolism]. 32808 COG2989: Uncharacterized protein conserved in bacteria [Function unknown]. 32809 COG2990: Uncharacterized protein conserved in bacteria [Function unknown]. 32810 COG2991: Uncharacterized protein conserved in bacteria [Function unknown]. 32811 COG2992: Uncharacterized FlgJ-related protein [General function prediction only]. 32812 COG2993: Cbb3-type cytochrome oxidase, cytochrome c subunit [Energy production and conversion]. 32813 COG2994: ACP:hemolysin acyltransferase (hemolysin-activating protein) [Posttranslational modification, protein turnover, chaperones]. 32814 COG2995: Uncharacterized paraquat-inducible protein A [Function unknown]. 32815 COG2996: Uncharacterized protein conserved in bacteria [Function unknown]. 32816 COG2998: ABC-type tungstate transport system, permease component [Coenzyme metabolism]. 32817 COG2999: Glutaredoxin 2 [Posttranslational modification, protein turnover, chaperones]. 32818 COG3000: Sterol desaturase [Lipid metabolism]. 32819 COG3001: Uncharacterized protein conserved in bacteria [Function unknown]. 32820 COG3002: Uncharacterized protein conserved in bacteria [Function unknown]. 32821 COG3004: Na+/H+ antiporter [Inorganic ion transport and metabolism]. 32822 COG3005: Nitrate/TMAO reductases, membrane-bound tetraheme cytochrome c subunit [Energy production and conversion]. 
32823 COG3006: Uncharacterized protein involved in chromosome partitioning [Cell division and chromosome partitioning]. 32824 COG3007: Uncharacterized paraquat-inducible protein B [Function unknown]. 32825 COG3008: Paraquat-inducible protein B [General function prediction only]. 32826 COG3009: Uncharacterized protein conserved in bacteria [Function unknown]. 32827 COG3010: Putative N-acetylmannosamine-6-phosphate epimerase [Carbohydrate transport and metabolism]. 32828 COG3011: Uncharacterized protein conserved in bacteria [Function unknown]. 32829 COG3012: Uncharacterized protein conserved in bacteria [Function unknown]. 32830 COG3013: Uncharacterized conserved protein [Function unknown]. 32831 COG3014: Uncharacterized protein conserved in bacteria [Function unknown]. 32832 COG3015: Uncharacterized lipoprotein NlpE involved in copper resistance [Cell envelope biogenesis, outer membrane / Inorganic ion transport and metabolism]. 32833 COG3016: Uncharacterized iron-regulated protein [Function unknown]. 32834 COG3017: Outer membrane lipoprotein involved in outer membrane biogenesis [Cell envelope biogenesis, outer membrane]. 32835 COG3018: Uncharacterized protein conserved in bacteria [Function unknown]. 32836 COG3019: Predicted metal-binding protein [General function prediction only]. 32837 COG3021: Uncharacterized protein conserved in bacteria [Function unknown]. 32838 COG3022: Uncharacterized protein conserved in bacteria [Function unknown]. 32839 COG3023: Negative regulator of beta-lactamase expression [Defense mechanisms]. 32840 COG3024: Uncharacterized protein conserved in bacteria [Function unknown]. 32841 COG3025: Uncharacterized conserved protein [Function unknown]. 32842 COG3026: Negative regulator of sigma E activity [Signal transduction mechanisms]. 32843 COG3027: Uncharacterized protein conserved in bacteria [Function unknown]. 32844 COG3028: Uncharacterized protein conserved in bacteria [Function unknown]. 
32845 COG3029: Fumarate reductase subunit C [Energy production and conversion]. 32846 COG3030: Protein affecting phage T7 exclusion by the F plasmid [General function prediction only]. 32847 COG3031: Type II secretory pathway, component PulC [Intracellular trafficking and secretion]. 32848 COG3033: Tryptophanase [Amino acid transport and metabolism]. 32849 COG3034: Uncharacterized protein conserved in bacteria [Function unknown]. 32850 COG3036: Uncharacterized protein conserved in bacteria [Function unknown]. 32851 COG3037: Uncharacterized protein conserved in bacteria [Function unknown]. 32852 COG3038: Cytochrome B561 [Energy production and conversion]. 32853 COG3039: Transposase and inactivated derivatives, IS5 family [DNA replication, recombination, and repair]. 32854 COG3040: Bacterial lipocalin [Cell envelope biogenesis, outer membrane]. 32855 COG3041: Uncharacterized protein conserved in bacteria [Function unknown]. 32856 COG3042: Putative hemolysin [General function prediction only]. 32857 COG3043: Nitrate reductase cytochrome c-type subunit [Energy production and conversion]. 32858 COG3044: Predicted ATPase of the ABC class [General function prediction only]. 32859 COG3045: Uncharacterized protein conserved in bacteria [Function unknown]. 32860 COG3046: Uncharacterized protein related to deoxyribodipyrimidine photolyase [General function prediction only]. 32861 COG3047: Outer membrane protein W [Cell envelope biogenesis, outer membrane]. 32862 COG3048: D-serine dehydratase [Amino acid transport and metabolism]. 32863 COG3049: Penicillin V acylase and related amidases [Cell envelope biogenesis, outer membrane]. 32864 COG3050: DNA polymerase III, psi subunit [DNA replication, recombination, and repair]. 32865 COG3051: Citrate lyase, alpha subunit [Energy production and conversion]. 32866 COG3052: Citrate lyase, gamma subunit [Energy production and conversion]. 32867 COG3053: Citrate lyase synthetase [Energy production and conversion]. 
32868 COG3054: Predicted transcriptional regulator [General function prediction only]. 32869 COG3055: Uncharacterized protein conserved in bacteria [Function unknown]. 32870 COG3056: Uncharacterized lipoprotein [Cell envelope biogenesis, outer membrane]. 32871 COG3057: Negative regulator of replication initiation [DNA replication, recombination, and repair]. 32872 COG3058: Uncharacterized protein involved in formate dehydrogenase formation [Posttranslational modification, protein turnover, chaperones]. 32873 COG3059: Predicted membrane protein [Function unknown]. 32874 COG3060: Transcriptional regulator of met regulon [Transcription / Amino acid transport and metabolism]. 32875 COG3061: Cell envelope opacity-associated protein A [Cell envelope biogenesis, outer membrane]. 32876 COG3062: Uncharacterized protein involved in formation of periplasmic nitrate reductase [Inorganic ion transport and metabolism]. 32877 COG3063: Tfp pilus assembly protein PilF [Cell motility and secretion / Intracellular trafficking and secretion]. 32878 COG3064: Membrane protein involved in colicin uptake [Cell envelope biogenesis, outer membrane]. 32879 COG3065: Starvation-inducible outer membrane lipoprotein [Cell envelope biogenesis, outer membrane]. 32880 COG3066: DNA mismatch repair protein [DNA replication, recombination, and repair]. 32881 COG3067: Na+/H+ antiporter [Inorganic ion transport and metabolism]. 32882 COG3068: Uncharacterized protein conserved in bacteria [Function unknown]. 32883 COG3069: C4-dicarboxylate transporter [Energy production and conversion]. 32884 COG3070: Regulator of competence-specific genes [Transcription]. 32885 COG3071: Uncharacterized enzyme of heme biosynthesis [Coenzyme metabolism]. 32886 COG3072: Adenylate cyclase [Nucleotide transport and metabolism]. 32887 COG3073: Negative regulator of sigma E activity [Signal transduction mechanisms]. 32888 COG3074: Uncharacterized protein conserved in bacteria [Function unknown]. 
32889 COG3075: Anaerobic glycerol-3-phosphate dehydrogenase [Amino acid transport and metabolism]. 32890 COG3076: Uncharacterized protein conserved in bacteria [Function unknown]. 32891 COG3077: DNA-damage-inducible protein J [DNA replication, recombination, and repair]. 32892 COG3078: Uncharacterized protein conserved in bacteria [Function unknown]. 32893 COG3079: Uncharacterized protein conserved in bacteria [Function unknown]. 32894 COG3080: Fumarate reductase subunit D [Energy production and conversion]. 32895 COG3081: Nucleoid-associated protein [General function prediction only]. 32896 COG3082: Uncharacterized protein conserved in bacteria [Function unknown]. 32897 COG3083: Predicted hydrolase of alkaline phosphatase superfamily [General function prediction only]. 32898 COG3084: Uncharacterized protein conserved in bacteria [Function unknown]. 32899 COG3085: Uncharacterized protein conserved in bacteria [Function unknown]. 32900 COG3086: Positive regulator of sigma E activity [Signal transduction mechanisms]. 32901 COG3087: Cell division protein [Cell division and chromosome partitioning]. 32902 COG3088: Uncharacterized protein involved in biosynthesis of c-type cytochromes [Posttranslational modification, protein turnover, chaperones]. 32903 COG3089: Uncharacterized protein conserved in bacteria [Function unknown]. 32904 COG3090: TRAP-type C4-dicarboxylate transport system, small permease component [Carbohydrate transport and metabolism]. 32905 COG3091: Uncharacterized protein conserved in bacteria [Function unknown]. 32906 COG3092: Uncharacterized protein conserved in bacteria [Function unknown]. 32907 COG3093: Plasmid maintenance system antidote protein [General function prediction only]. 32908 COG3094: Uncharacterized protein conserved in bacteria [Function unknown]. 32909 COG3095: Uncharacterized protein involved in chromosome partitioning [Cell division and chromosome partitioning]. 
32910 COG3096: Uncharacterized protein involved in chromosome partitioning [Cell division and chromosome partitioning]. 32911 COG3097: Uncharacterized protein conserved in bacteria [Function unknown]. 32912 COG3098: Uncharacterized protein conserved in bacteria [Function unknown]. 32913 COG3099: Uncharacterized protein conserved in bacteria [Function unknown]. 32914 COG3100: Uncharacterized protein conserved in bacteria [Function unknown]. 32915 COG3101: Uncharacterized protein conserved in bacteria [Function unknown]. 32916 COG3102: Uncharacterized protein conserved in bacteria [Function unknown]. 32917 COG3103: SH3 domain protein [Signal transduction mechanisms]. 32918 COG3104: Dipeptide/tripeptide permease [Amino acid transport and metabolism]. 32919 COG3105: Uncharacterized protein conserved in bacteria [Function unknown]. 32920 COG3106: Predicted ATPase [General function prediction only]. 32921 COG3107: Putative lipoprotein [General function prediction only]. 32922 COG3108: Uncharacterized protein conserved in bacteria [Function unknown]. 32923 COG3109: Activator of osmoprotectant transporter ProP [Signal transduction mechanisms]. 32924 COG3110: Uncharacterized protein conserved in bacteria [Function unknown]. 32925 COG3111: Uncharacterized conserved protein [Function unknown]. 32926 COG3112: Uncharacterized protein conserved in bacteria [Function unknown]. 32927 COG3113: Predicted NTP binding protein (contains STAS domain) [General function prediction only]. 32928 COG3114: Heme exporter protein D [Intracellular trafficking and secretion]. 32929 COG3115: Cell division protein [Cell division and chromosome partitioning]. 32930 COG3116: Cell division protein [Cell division and chromosome partitioning]. 32931 COG3117: Uncharacterized protein conserved in bacteria [Function unknown]. 32932 COG3118: Thioredoxin domain-containing protein [Posttranslational modification, protein turnover, chaperones]. 
32933 COG3119: Arylsulfatase A and related enzymes [Inorganic ion transport and metabolism]. 32934 COG3120: Uncharacterized protein conserved in bacteria [Function unknown]. 32935 COG3121: P pilus assembly protein, chaperone PapD [Cell motility and secretion / Intracellular trafficking and secretion]. 32936 COG3122: Uncharacterized protein conserved in bacteria [Function unknown]. 32937 COG3123: Uncharacterized protein conserved in bacteria [Function unknown]. 32938 COG3124: Uncharacterized protein conserved in bacteria [Function unknown]. 32939 COG3125: Heme/copper-type cytochrome/quinol oxidase, subunit 4 [Energy production and conversion]. 32940 COG3126: Uncharacterized protein conserved in bacteria [Function unknown]. 32941 COG3127: Predicted ABC-type transport system involved in lysophospholipase L1 biosynthesis, permease component [Secondary metabolites biosynthesis, transport, and catabolism]. 32942 COG3128: Uncharacterized iron-regulated protein [Function unknown]. 32943 COG3129: Predicted SAM-dependent methyltransferase [General function prediction only]. 32944 COG3130: Ribosome modulation factor [Translation, ribosomal structure and biogenesis]. 32945 COG3131: Periplasmic glucans biosynthesis protein [Inorganic ion transport and metabolism]. 32946 COG3132: Uncharacterized protein conserved in bacteria [Function unknown]. 32947 COG3133: Outer membrane lipoprotein [Cell envelope biogenesis, outer membrane]. 32948 COG3134: Predicted outer membrane lipoprotein [Function unknown]. 32949 COG3135: Uncharacterized protein involved in benzoate metabolism [Secondary metabolites biosynthesis, transport, and catabolism]. 32950 COG3136: Uncharacterized membrane protein required for alginate biosynthesis [General function prediction only]. 32951 COG3137: Putative salt-induced outer membrane protein [Cell envelope biogenesis, outer membrane]. 32952 COG3138: Arginine/ornithine N-succinyltransferase beta subunit [Amino acid transport and metabolism]. 
32953 COG3139: Uncharacterized protein conserved in bacteria [Function unknown]. 32954 COG3140: Uncharacterized protein conserved in bacteria [Function unknown]. 32955 COG3141: Uncharacterized protein conserved in bacteria [Function unknown]. 32956 COG3142: Uncharacterized protein involved in copper resistance [Inorganic ion transport and metabolism]. 32957 COG3143: Chemotaxis protein [Cell motility and secretion / Signal transduction mechanisms]. 32958 COG3144: Flagellar hook-length control protein [Cell motility and secretion]. 32959 COG3145: Alkylated DNA repair protein [DNA replication, recombination, and repair]. 32960 COG3146: Uncharacterized protein conserved in bacteria [Function unknown]. 32961 COG3147: Uncharacterized protein conserved in bacteria [Function unknown]. 32962 COG3148: Uncharacterized conserved protein [Function unknown]. 32963 COG3149: Type II secretory pathway, component PulM [Intracellular trafficking and secretion]. 32964 COG3150: Predicted esterase [General function prediction only]. 32965 COG3151: Uncharacterized protein conserved in bacteria [Function unknown]. 32966 COG3152: Predicted membrane protein [Function unknown]. 32967 COG3153: Predicted acetyltransferase [General function prediction only]. 32968 COG3154: Putative lipid carrier protein [Lipid metabolism]. 32969 COG3155: Uncharacterized protein involved in an early stage of isoprenoid biosynthesis [Secondary metabolites biosynthesis, transport, and catabolism]. 32970 COG3156: Type II secretory pathway, component PulK [Intracellular trafficking and secretion]. 32971 COG3157: Hemolysin-coregulated protein (uncharacterized) [Function unknown]. 32972 COG3158: K+ transporter [Inorganic ion transport and metabolism]. 32973 COG3159: Uncharacterized protein conserved in bacteria [Function unknown]. 32974 COG3160: Regulator of sigma D [Transcription]. 32975 COG3161: 4-hydroxybenzoate synthetase (chorismate lyase) [Coenzyme metabolism]. 
32976 COG3162: Predicted membrane protein [Function unknown]. 32977 COG3164: Predicted membrane protein [Function unknown]. 32978 COG3165: Uncharacterized protein conserved in bacteria [Function unknown]. 32979 COG3166: Tfp pilus assembly protein PilN [Cell motility and secretion / Intracellular trafficking and secretion]. 32980 COG3167: Tfp pilus assembly protein PilO [Cell motility and secretion / Intracellular trafficking and secretion]. 32981 COG3168: Tfp pilus assembly protein PilP [Cell motility and secretion / Intracellular trafficking and secretion]. 32982 COG3169: Uncharacterized protein conserved in bacteria [Function unknown]. 32983 COG3170: Tfp pilus assembly protein FimV [Cell motility and secretion / Intracellular trafficking and secretion]. 32984 COG3171: Uncharacterized protein conserved in bacteria [Function unknown]. 32985 COG3172: Predicted ATPase/kinase involved in NAD metabolism [Coenzyme metabolism]. 32986 COG3173: Predicted aminoglycoside phosphotransferase [General function prediction only]. 32987 COG3174: Predicted membrane protein [Function unknown]. 32988 COG3175: Cytochrome oxidase assembly factor [Posttranslational modification, protein turnover, chaperones]. 32989 COG3176: Putative hemolysin [General function prediction only]. 32990 COG3177: Uncharacterized conserved protein [Function unknown]. 32991 COG3178: Predicted phosphotransferase related to Ser/Thr protein kinases [General function prediction only]. 32992 COG3179: Predicted chitinase [General function prediction only]. 32993 COG3180: Putative ammonia monooxygenase [General function prediction only]. 32994 COG3181: Uncharacterized protein conserved in bacteria [Function unknown]. 32995 COG3182: Uncharacterized iron-regulated membrane protein [Function unknown]. 32996 COG3183: Predicted restriction endonuclease [Defense mechanisms]. 32997 COG3184: Uncharacterized protein conserved in bacteria [Function unknown]. 
32998 COG3185: 4-hydroxyphenylpyruvate dioxygenase and related hemolysins [Amino acid transport and metabolism / General function prediction only]. 32999 COG3186: Phenylalanine-4-hydroxylase [Amino acid transport and metabolism]. 33000 COG3187: Heat shock protein [Posttranslational modification, protein turnover, chaperones]. 33001 COG3188: P pilus assembly protein, porin PapC [Cell motility and secretion / Intracellular trafficking and secretion]. 33002 COG3189: Uncharacterized conserved protein [Function unknown]. 33003 COG3190: Flagellar biogenesis protein [Cell motility and secretion]. 33004 COG3191: L-aminopeptidase/D-esterase [Amino acid transport and metabolism / Secondary metabolites biosynthesis, transport, and catabolism]. 33005 COG3192: Ethanolamine utilization protein [Amino acid transport and metabolism]. 33006 COG3193: Uncharacterized protein, possibly involved in utilization of glycolate and propanediol [General function prediction only]. 33007 COG3194: Ureidoglycolate hydrolase [Nucleotide transport and metabolism]. 33008 COG3195: Uncharacterized protein conserved in bacteria [Function unknown]. 33009 COG3196: Uncharacterized protein conserved in bacteria [Function unknown]. 33010 COG3197: Uncharacterized protein, possibly involved in nitrogen fixation [Inorganic ion transport and metabolism]. 33011 COG3198: Uncharacterized protein conserved in bacteria [Function unknown]. 33012 COG3199: Uncharacterized conserved protein [Function unknown]. 33013 COG3200: 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase [Amino acid transport and metabolism]. 33014 COG3201: Nicotinamide mononucleotide transporter [Coenzyme metabolism]. 33015 COG3202: ATP/ADP translocase [Energy production and conversion]. 33016 COG3203: Outer membrane protein (porin) [Cell envelope biogenesis, outer membrane]. 33017 COG3204: Uncharacterized protein conserved in bacteria [Function unknown]. 33018 COG3205: Predicted membrane protein [Function unknown]. 
33019 COG3206: Uncharacterized protein involved in exopolysaccharide biosynthesis [Cell envelope biogenesis, outer membrane]. 33020 COG3207: Pyoverdine/dityrosine biosynthesis protein [Secondary metabolites biosynthesis, transport, and catabolism]. 33021 COG3208: Predicted thioesterase involved in non-ribosomal peptide biosynthesis [Secondary metabolites biosynthesis, transport, and catabolism]. 33022 COG3209: Rhs family protein [Cell envelope biogenesis, outer membrane]. 33023 COG3210: Large exoproteins involved in heme utilization or adhesion [Intracellular trafficking and secretion]. 33024 COG3211: Predicted phosphatase [General function prediction only]. 33025 COG3212: Predicted membrane protein [Function unknown]. 33026 COG3213: Uncharacterized protein involved in response to NO [Inorganic ion transport and metabolism]. 33027 COG3214: Uncharacterized protein conserved in bacteria [Function unknown]. 33028 COG3215: Tfp pilus assembly protein PilZ [Cell motility and secretion / Intracellular trafficking and secretion]. 33029 COG3216: Uncharacterized protein conserved in bacteria [Function unknown]. 33030 COG3217: Uncharacterized Fe-S protein [General function prediction only]. 33031 COG3218: ABC-type uncharacterized transport system, auxiliary component [General function prediction only]. 33032 COG3219: Uncharacterized protein conserved in bacteria [Function unknown]. 33033 COG3220: Uncharacterized protein conserved in bacteria [Function unknown]. 33034 COG3221: ABC-type phosphate/phosphonate transport system, periplasmic component [Inorganic ion transport and metabolism]. 33035 COG3222: Uncharacterized protein conserved in bacteria [Function unknown]. 33036 COG3223: Predicted membrane protein [Function unknown]. 33037 COG3224: Uncharacterized protein conserved in bacteria [Function unknown]. 33038 COG3225: ABC-type uncharacterized transport system involved in gliding motility, auxiliary component [Cell motility and secretion]. 
33039 COG3226: Uncharacterized protein conserved in bacteria [Function unknown]. 33040 COG3227: Zinc metalloprotease (elastase) [Amino acid transport and metabolism]. 33041 COG3228: Uncharacterized protein conserved in bacteria [Function unknown]. 33042 COG3230: Heme oxygenase [Inorganic ion transport and metabolism]. 33043 COG3231: Aminoglycoside phosphotransferase [Translation, ribosomal structure and biogenesis]. 33044 COG3232: 5-carboxymethyl-2-hydroxymuconate isomerase [Amino acid transport and metabolism]. 33045 COG3233: Predicted deacetylase [General function prediction only]. 33046 COG3234: Uncharacterized protein conserved in bacteria [Function unknown]. 33047 COG3235: Predicted membrane protein [Function unknown]. 33048 COG3236: Uncharacterized protein conserved in bacteria [Function unknown]. 33049 COG3237: Uncharacterized protein conserved in bacteria [Function unknown]. 33050 COG3238: Uncharacterized protein conserved in bacteria [Function unknown]. 33051 COG3239: Fatty acid desaturase [Lipid metabolism]. 33052 COG3240: Phospholipase/lecithinase/hemolysin [Lipid metabolism / General function prediction only]. 33053 COG3241: Azurin [Energy production and conversion]. 33054 COG3242: Uncharacterized protein conserved in bacteria [Function unknown]. 33055 COG3243: Poly(3-hydroxyalkanoate) synthetase [Lipid metabolism]. 33056 COG3245: Cytochrome c5 [Energy production and conversion]. 33057 COG3246: Uncharacterized conserved protein [Function unknown]. 33058 COG3247: Uncharacterized conserved protein [Function unknown]. 33059 COG3248: Nucleoside-binding outer membrane protein [Cell envelope biogenesis, outer membrane]. 33060 COG3249: Uncharacterized protein conserved in bacteria [Function unknown]. 33061 COG3250: Beta-galactosidase/beta-glucuronidase [Carbohydrate transport and metabolism]. 33062 COG3251: Uncharacterized protein conserved in bacteria [Function unknown]. 33063 COG3252: Methenyltetrahydromethanopterin cyclohydrolase [Coenzyme metabolism]. 
33064 COG3253: Uncharacterized conserved protein [Function unknown]. 33065 COG3254: Uncharacterized conserved protein [Function unknown]. 33066 COG3255: Putative sterol carrier protein [Lipid metabolism]. 33067 COG3256: Nitric oxide reductase large subunit [Inorganic ion transport and metabolism]. 33068 COG3257: Uncharacterized protein, possibly involved in glyoxylate utilization [General function prediction only]. 33069 COG3258: Cytochrome c [Energy production and conversion]. 33070 COG3259: Coenzyme F420-reducing hydrogenase, alpha subunit [Energy production and conversion]. 33071 COG3260: Ni,Fe-hydrogenase III small subunit [Energy production and conversion]. 33072 COG3261: Ni,Fe-hydrogenase III large subunit [Energy production and conversion]. 33073 COG3262: Ni,Fe-hydrogenase III component G [Energy production and conversion]. 33074 COG3263: NhaP-type Na+/H+ and K+/H+ antiporters with a unique C-terminal domain [Inorganic ion transport and metabolism]. 33075 COG3264: Small-conductance mechanosensitive channel [Cell envelope biogenesis, outer membrane]. 33076 COG3265: Gluconate kinase [Carbohydrate transport and metabolism]. 33077 COG3266: Uncharacterized protein conserved in bacteria [Function unknown]. 33078 COG3267: Type II secretory pathway, component ExeA (predicted ATPase) [Intracellular trafficking and secretion]. 33079 COG3268: Uncharacterized conserved protein [Function unknown]. 33080 COG3269: Predicted RNA-binding protein, contains TRAM domain [General function prediction only]. 33081 COG3270: Uncharacterized conserved protein [Function unknown]. 33082 COG3271: Predicted double-glycine peptidase [General function prediction only]. 33083 COG3272: Uncharacterized conserved protein [Function unknown]. 33084 COG3273: Uncharacterized conserved protein [Function unknown]. 33085 COG3274: Uncharacterized protein conserved in bacteria [Function unknown]. 33086 COG3275: Putative regulator of cell autolysis [Signal transduction mechanisms]. 
33087 COG3276: Selenocysteine-specific translation elongation factor [Translation, ribosomal structure and biogenesis]. 33088 COG3277: RNA-binding protein involved in rRNA processing [Translation, ribosomal structure and biogenesis]. 33089 COG3278: Cbb3-type cytochrome oxidase, subunit 1 [Posttranslational modification, protein turnover, chaperones]. 33090 COG3279: Response regulator of the LytR/AlgR family [Transcription / Signal transduction mechanisms]. 33091 COG3280: Maltooligosyl trehalose synthase [Carbohydrate transport and metabolism]. 33092 COG3281: Uncharacterized protein, probably involved in trehalose biosynthesis [Carbohydrate transport and metabolism]. 33093 COG3283: Transcriptional regulator of aromatic amino acids metabolism [Transcription / Amino acid transport and metabolism]. 33094 COG3284: Transcriptional activator of acetoin/glycerol metabolism [Secondary metabolites biosynthesis, transport, and catabolism / Transcription]. 33095 COG3285: Predicted eukaryotic-type DNA primase [DNA replication, recombination, and repair]. 33096 COG3286: Uncharacterized protein conserved in archaea [Function unknown]. 33097 COG3287: Uncharacterized conserved protein [Function unknown]. 33098 COG3288: NAD/NADP transhydrogenase alpha subunit [Energy production and conversion]. 33099 COG3290: Signal transduction histidine kinase regulating citrate/malate metabolism [Signal transduction mechanisms]. 33100 COG3291: FOG: PKD repeat [General function prediction only]. 33101 COG3292: Predicted periplasmic ligand-binding sensor domain [Signal transduction mechanisms]. 33102 COG3293: Transposase and inactivated derivatives [DNA replication, recombination, and repair]. 33103 COG3294: Uncharacterized conserved protein [Function unknown]. 33104 COG3295: Uncharacterized protein conserved in bacteria [Function unknown]. 33105 COG3296: Uncharacterized protein conserved in bacteria [Function unknown]. 
33106 COG3297: Type II secretory pathway, component PulL [Intracellular trafficking and secretion]. 33108 COG3299: Uncharacterized homolog of phage Mu protein gp47 [Function unknown]. 33109 COG3300: MHYT domain (predicted integral membrane sensor domain) [Signal transduction mechanisms]. 33110 COG3301: Formate-dependent nitrite reductase, membrane component [Inorganic ion transport and metabolism]. 33111 COG3302: DMSO reductase anchor subunit [General function prediction only]. 33112 COG3303: Formate-dependent nitrite reductase, periplasmic cytochrome c552 subunit [Inorganic ion transport and metabolism]. 33113 COG3304: Predicted membrane protein [Function unknown]. 33114 COG3305: Predicted membrane protein [Function unknown]. 33115 COG3306: Glycosyltransferase involved in LPS biosynthesis [Cell envelope biogenesis, outer membrane]. 33116 COG3307: Lipid A core - O-antigen ligase and related enzymes [Cell envelope biogenesis, outer membrane]. 33117 COG3308: Predicted membrane protein [Function unknown]. 33118 COG3309: Uncharacterized virulence-associated protein D [Function unknown]. 33119 COG3310: Uncharacterized protein conserved in bacteria [Function unknown]. 33120 COG3311: Predicted transcriptional regulator [Transcription]. 33121 COG3312: F0F1-type ATP synthase, subunit I [Energy production and conversion]. 33122 COG3313: Predicted Fe-S protein [General function prediction only]. 33123 COG3314: Uncharacterized protein conserved in bacteria [Function unknown]. 33124 COG3315: O-Methyltransferase involved in polyketide biosynthesis [Secondary metabolites biosynthesis, transport, and catabolism]. 33125 COG3316: Transposase and inactivated derivatives [DNA replication, recombination, and repair]. 33126 COG3317: Uncharacterized lipoprotein [Cell envelope biogenesis, outer membrane]. 33127 COG3318: Predicted metal-binding protein related to the C-terminal domain of SecA [General function prediction only]. 
33128 COG3319: Thioesterase domains of type I polyketide synthases or non-ribosomal peptide synthetases [Secondary metabolites biosynthesis, transport, and catabolism]. 33129 COG3320: Putative dehydrogenase domain of multifunctional non-ribosomal peptide synthetases and related enzymes [Secondary metabolites biosynthesis, transport, and catabolism]. 33130 COG3321: Polyketide synthase modules and related proteins [Secondary metabolites biosynthesis, transport, and catabolism]. 33131 COG3322: Predicted periplasmic ligand-binding sensor domain [Signal transduction mechanisms]. 33132 COG3323: Uncharacterized protein conserved in bacteria [Function unknown]. 33133 COG3324: Predicted enzyme related to lactoylglutathione lyase [General function prediction only]. 33134 COG3325: Chitinase [Carbohydrate transport and metabolism]. 33135 COG3326: Predicted membrane protein [Function unknown]. 33136 COG3327: Phenylacetic acid-responsive transcriptional repressor [Transcription]. 33137 COG3328: Transposase and inactivated derivatives [DNA replication, recombination, and repair]. 33138 COG3329: Predicted permease [General function prediction only]. 33139 COG3330: Uncharacterized protein conserved in bacteria [Function unknown]. 33140 COG3331: Penicillin-binding protein-related factor A, putative recombinase [General function prediction only]. 33141 COG3332: Uncharacterized conserved protein [Function unknown]. 33142 COG3333: Uncharacterized protein conserved in bacteria [Function unknown]. 33143 COG3334: Uncharacterized conserved protein [Function unknown]. 33144 COG3335: Transposase and inactivated derivatives [DNA replication, recombination, and repair]. 33145 COG3336: Predicted membrane protein [Function unknown]. 33146 COG3337: Uncharacterized protein predicted to be involved in DNA repair [DNA replication, recombination, and repair]. 33147 COG3338: Carbonic anhydrase [Inorganic ion transport and metabolism]. 
33148 COG3339: Uncharacterized conserved protein [Function unknown]. 33149 COG3340: Peptidase E [Amino acid transport and metabolism]. 33150 COG3341: Predicted double-stranded RNA/RNA-DNA hybrid binding protein [General function prediction only]. 33151 COG3342: Uncharacterized conserved protein [Function unknown]. 33152 COG3343: DNA-directed RNA polymerase, delta subunit [Transcription]. 33153 COG3344: Retron-type reverse transcriptase [DNA replication, recombination, and repair]. 33154 COG3345: Alpha-galactosidase [Carbohydrate transport and metabolism]. 33155 COG3346: Uncharacterized conserved protein [Function unknown]. 33156 COG3347: Uncharacterized conserved protein [Function unknown]. 33157 COG3349: Uncharacterized conserved protein [Function unknown]. 33158 COG3350: Uncharacterized conserved protein [Function unknown]. 33159 COG3351: Putative archaeal flagellar protein D/E [Cell motility and secretion]. 33160 COG3352: Putative archaeal flagellar protein C [Cell motility and secretion]. 33161 COG3353: Putative archaeal flagellar protein F [Cell motility and secretion]. 33162 COG3354: Putative archaeal flagellar protein G [Cell motility and secretion]. 33163 COG3355: Predicted transcriptional regulator [Transcription]. 33164 COG3356: Predicted membrane protein [Function unknown]. 33165 COG3357: Predicted transcriptional regulator containing an HTH domain fused to a Zn-ribbon [Transcription]. 33166 COG3358: Uncharacterized conserved protein [Function unknown]. 33167 COG3359: Predicted exonuclease [DNA replication, recombination, and repair]. 33168 COG3360: Uncharacterized conserved protein [Function unknown]. 33169 COG3361: Uncharacterized conserved protein [Function unknown]. 33170 COG3363: Archaeal IMP cyclohydrolase [Nucleotide transport and metabolism]. 33171 COG3364: Zn-ribbon containing protein [General function prediction only]. 33172 COG3365: Uncharacterized protein conserved in archaea [Function unknown]. 
33173 COG3366: Uncharacterized protein conserved in archaea [Function unknown]. 33174 COG3367: Uncharacterized conserved protein [Function unknown]. 33175 COG3368: Predicted permease [General function prediction only]. 33176 COG3369: Uncharacterized conserved protein [Function unknown]. 33177 COG3370: Uncharacterized protein conserved in archaea [Function unknown]. 33178 COG3371: Predicted membrane protein [Function unknown]. 33179 COG3372: Uncharacterized conserved protein [Function unknown]. 33180 COG3373: Uncharacterized protein conserved in archaea [Function unknown]. 33181 COG3374: Predicted membrane protein [Function unknown]. 33182 COG3375: Uncharacterized conserved protein [Function unknown]. 33183 COG3376: High-affinity nickel permease [Inorganic ion transport and metabolism]. 33184 COG3377: Uncharacterized conserved protein [Function unknown]. 33185 COG3378: Predicted ATPase [General function prediction only]. 33186 COG3379: Uncharacterized conserved protein [Function unknown]. 33187 COG3380: Predicted NAD/FAD-dependent oxidoreductase [General function prediction only]. 33188 COG3381: Uncharacterized component of anaerobic dehydrogenases [General function prediction only]. 33189 COG3382: Uncharacterized conserved protein [Function unknown]. 33190 COG3383: Uncharacterized anaerobic dehydrogenase [General function prediction only]. 33191 COG3384: Uncharacterized conserved protein [Function unknown]. 33192 COG3385: FOG: Transposase and inactivated derivatives [DNA replication, recombination, and repair]. 33193 COG3386: Gluconolactonase [Carbohydrate transport and metabolism]. 33194 COG3387: Glucoamylase and related glycosyl hydrolases [Carbohydrate transport and metabolism]. 33195 COG3388: Uncharacterized protein conserved in archaea [Function unknown]. 33196 COG3389: Uncharacterized protein conserved in archaea [Function unknown]. 33197 COG3390: Uncharacterized protein conserved in archaea [Function unknown]. 
33198 COG3391: Uncharacterized conserved protein [Function unknown]. 33199 COG3392: Adenine-specific DNA methylase [DNA replication, recombination, and repair]. 33200 COG3393: Predicted acetyltransferase [General function prediction only]. 33201 COG3394: Uncharacterized protein conserved in bacteria [Function unknown]. 33202 COG3395: Uncharacterized protein conserved in bacteria [Function unknown]. 33203 COG3396: Uncharacterized conserved protein [Function unknown]. 33204 COG3397: Uncharacterized protein conserved in bacteria [Function unknown]. 33205 COG3398: Uncharacterized protein conserved in archaea [Function unknown]. 33206 COG3399: Uncharacterized protein conserved in bacteria [Function unknown]. 33207 COG3400: Uncharacterized protein conserved in bacteria [Function unknown]. 33208 COG3401: Fibronectin type 3 domain-containing protein [General function prediction only]. 33209 COG3402: Uncharacterized conserved protein [Function unknown]. 33210 COG3403: Uncharacterized conserved protein [Function unknown]. 33211 COG3404: Methenyl tetrahydrofolate cyclohydrolase [Amino acid transport and metabolism]. 33212 COG3405: Endoglucanase Y [Carbohydrate transport and metabolism]. 33213 COG3407: Mevalonate pyrophosphate decarboxylase [Lipid metabolism]. 33214 COG3408: Glycogen debranching enzyme [Carbohydrate transport and metabolism]. 33215 COG3409: Putative peptidoglycan-binding domain-containing protein [Cell envelope biogenesis, outer membrane]. 33216 COG3410: Uncharacterized conserved protein [Function unknown]. 33217 COG3411: Ferredoxin [Energy production and conversion]. 33218 COG3412: Uncharacterized protein conserved in bacteria [Function unknown]. 33219 COG3413: Predicted DNA binding protein [General function prediction only]. 33220 COG3414: Phosphotransferase system, galactitol-specific IIB component [Carbohydrate transport and metabolism]. 33221 COG3415: Transposase and inactivated derivatives [DNA replication, recombination, and repair]. 
33222 COG3416: Uncharacterized protein conserved in bacteria [Function unknown]. 33223 COG3417: Collagen-binding surface adhesin SpaP (antigen I/II family) [General function prediction only]. 33224 COG3418: Flagellar biosynthesis/type III secretory pathway chaperone [Cell motility and secretion / Intracellular trafficking and secretion / Posttranslational modification, protein turnover, chaperones]. 33225 COG3419: Tfp pilus assembly protein, tip-associated adhesin PilY1 [Cell motility and secretion / Intracellular trafficking and secretion]. 33226 COG3420: Nitrous oxidase accessory protein [Inorganic ion transport and metabolism]. 33227 COG3421: Uncharacterized protein conserved in bacteria [Function unknown]. 33228 COG3422: Uncharacterized conserved protein [Function unknown]. 33229 COG3423: Predicted transcriptional regulator [Transcription]. 33230 COG3424: Predicted naringenin-chalcone synthase [Secondary metabolites biosynthesis, transport, and catabolism]. 33231 COG3425: 3-hydroxy-3-methylglutaryl CoA synthase [Lipid metabolism]. 33232 COG3426: Butyrate kinase [Energy production and conversion]. 33233 COG3427: Uncharacterized conserved protein [Function unknown]. 33234 COG3428: Predicted membrane protein [Function unknown]. 33235 COG3429: Glucose-6-P dehydrogenase subunit [Carbohydrate transport and metabolism]. 33236 COG3430: Uncharacterized protein conserved in archaea [Function unknown]. 33237 COG3431: Predicted membrane protein [Function unknown]. 33238 COG3432: Predicted transcriptional regulator [Transcription]. 33239 COG3433: Aryl carrier domain [Secondary metabolites biosynthesis, transport, and catabolism]. 33240 COG3434: Predicted signal transduction protein containing EAL and modified HD-GYP domains [Signal transduction mechanisms]. 33241 COG3435: Gentisate 1,2-dioxygenase [Secondary metabolites biosynthesis, transport, and catabolism]. 33242 COG3436: Transposase and inactivated derivatives [DNA replication, recombination, and repair]. 
33243 COG3437: Response regulator containing a CheY-like receiver domain and an HD-GYP domain [Transcription / Signal transduction mechanisms]. 33244 COG3439: Uncharacterized conserved protein [Function unknown]. 33245 COG3440: Predicted restriction endonuclease [Defense mechanisms]. 33246 COG3442: Predicted glutamine amidotransferase [General function prediction only]. 33247 COG3443: Predicted periplasmic or secreted protein [General function prediction only]. 33248 COG3444: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIB [Carbohydrate transport and metabolism]. 33249 COG3445: Acid-induced glycyl radical enzyme [General function prediction only]. 33250 COG3447: Predicted integral membrane sensor domain [Signal transduction mechanisms]. 33251 COG3448: CBS-domain-containing membrane protein [Signal transduction mechanisms]. 33252 COG3449: DNA gyrase inhibitor [DNA replication, recombination, and repair]. 33253 COG3450: Predicted enzyme of the cupin superfamily [General function prediction only]. 33254 COG3451: Type IV secretory pathway, VirB4 components [Intracellular trafficking and secretion]. 33255 COG3452: Predicted periplasmic ligand-binding sensor domain [Signal transduction mechanisms]. 33256 COG3453: Uncharacterized protein conserved in bacteria [Function unknown]. 33257 COG3454: Metal-dependent hydrolase involved in phosphonate metabolism [Inorganic ion transport and metabolism]. 33258 COG3455: Uncharacterized protein conserved in bacteria [Function unknown]. 33259 COG3456: Uncharacterized conserved protein, contains FHA domain [Signal transduction mechanisms]. 33260 COG3457: Predicted amino acid racemase [Amino acid transport and metabolism]. 33261 COG3458: Acetyl esterase (deacetylase) [Secondary metabolites biosynthesis, transport, and catabolism]. 33262 COG3459: Cellobiose phosphorylase [Carbohydrate transport and metabolism]. 
33263 COG3460: Uncharacterized enzyme of phenylacetate metabolism [Secondary metabolites biosynthesis, transport, and catabolism]. 33264 COG3461: Uncharacterized conserved protein [Function unknown]. 33265 COG3462: Predicted membrane protein [Function unknown]. 33266 COG3463: Predicted membrane protein [Function unknown]. 33267 COG3464: Transposase and inactivated derivatives [DNA replication, recombination, and repair]. 33268 COG3465: Uncharacterized conserved protein [Function unknown]. 33269 COG3466: Putative transposon-encoded protein [Function unknown]. 33270 COG3467: Predicted flavin-nucleotide-binding protein [General function prediction only]. 33272 COG3469: Chitinase [Carbohydrate transport and metabolism]. 33273 COG3470: Uncharacterized protein probably involved in high-affinity Fe2+ transport [Inorganic ion transport and metabolism]. 33274 COG3471: Predicted periplasmic/secreted protein [Function unknown]. 33275 COG3472: Uncharacterized conserved protein [Function unknown]. 33276 COG3473: Maleate cis-trans isomerase [Secondary metabolites biosynthesis, transport, and catabolism]. 33277 COG3474: Cytochrome c2 [Energy production and conversion]. 33278 COG3475: LPS biosynthesis protein [Cell envelope biogenesis, outer membrane]. 33279 COG3476: Tryptophan-rich sensory protein (mitochondrial benzodiazepine receptor homolog) [Signal transduction mechanisms]. 33280 COG3477: Predicted periplasmic/secreted protein [Function unknown]. 33281 COG3478: Predicted nucleic-acid-binding protein containing a Zn-ribbon domain [General function prediction only]. 33282 COG3479: Phenolic acid decarboxylase [Secondary metabolites biosynthesis, transport, and catabolism]. 33283 COG3480: Predicted secreted protein containing a PDZ domain [Signal transduction mechanisms]. 33284 COG3481: Predicted HD-superfamily hydrolase [General function prediction only]. 33285 COG3482: Uncharacterized conserved protein [Function unknown]. 
33286 COG3483: Tryptophan 2,3-dioxygenase (vermilion) [Amino acid transport and metabolism]. 33287 COG3484: Predicted proteasome-type protease [Posttranslational modification, protein turnover, chaperones]. 33288 COG3485: Protocatechuate 3,4-dioxygenase beta subunit [Secondary metabolites biosynthesis, transport, and catabolism]. 33289 COG3486: Lysine/ornithine N-monooxygenase [Secondary metabolites biosynthesis, transport, and catabolism]. 33290 COG3487: Uncharacterized iron-regulated protein [Inorganic ion transport and metabolism]. 33291 COG3488: Predicted thiol oxidoreductase [Energy production and conversion]. 33292 COG3489: Predicted periplasmic lipoprotein [General function prediction only]. 33293 COG3490: Uncharacterized protein conserved in bacteria [Function unknown]. 33294 COG3491: Isopenicillin N synthase and related dioxygenases [General function prediction only]. 33295 COG3492: Uncharacterized protein conserved in bacteria [Function unknown]. 33296 COG3493: Na+/citrate symporter [Energy production and conversion]. 33297 COG3494: Uncharacterized protein conserved in bacteria [Function unknown]. 33298 COG3495: Uncharacterized protein conserved in bacteria [Function unknown]. 33299 COG3496: Uncharacterized conserved protein [Function unknown]. 33300 COG3497: Phage tail sheath protein FI [General function prediction only]. 33301 COG3498: Phage tail tube protein FII [General function prediction only]. 33302 COG3499: Phage protein U [General function prediction only]. 33303 COG3500: Phage protein D [General function prediction only]. 33304 COG3501: Uncharacterized protein conserved in bacteria [Function unknown]. 33305 COG3502: Uncharacterized protein conserved in bacteria [Function unknown]. 33306 COG3503: Predicted membrane protein [Function unknown]. 33307 COG3504: Type IV secretory pathway, VirB9 components [Intracellular trafficking and secretion]. 33308 COG3505: Type IV secretory pathway, VirD4 components [Intracellular trafficking and secretion]. 
33309 COG3506: Uncharacterized conserved protein [Function unknown]. 33310 COG3507: Beta-xylosidase [Carbohydrate transport and metabolism]. 33311 COG3508: Homogentisate 1,2-dioxygenase [Secondary metabolites biosynthesis, transport, and catabolism]. 33312 COG3509: Poly(3-hydroxybutyrate) depolymerase [Secondary metabolites biosynthesis, transport, and catabolism]. 33313 COG3510: Cephalosporin hydroxylase [Defense mechanisms]. 33314 COG3511: Phospholipase C [Cell envelope biogenesis, outer membrane]. 33315 COG3512: Uncharacterized protein conserved in bacteria [Function unknown]. 33316 COG3513: Uncharacterized protein conserved in bacteria [Function unknown]. 33317 COG3514: Uncharacterized protein conserved in bacteria [Function unknown]. 33318 COG3515: Uncharacterized protein conserved in bacteria [Function unknown]. 33319 COG3516: Uncharacterized protein conserved in bacteria [Function unknown]. 33320 COG3517: Uncharacterized protein conserved in bacteria [Function unknown]. 33321 COG3518: Uncharacterized protein conserved in bacteria [Function unknown]. 33322 COG3519: Uncharacterized protein conserved in bacteria [Function unknown]. 33323 COG3520: Uncharacterized protein conserved in bacteria [Function unknown]. 33324 COG3521: Uncharacterized protein conserved in bacteria [Function unknown]. 33325 COG3522: Uncharacterized protein conserved in bacteria [Function unknown]. 33326 COG3523: Uncharacterized protein conserved in bacteria [Function unknown]. 33327 COG3524: Capsule polysaccharide export protein [Cell envelope biogenesis, outer membrane]. 33328 COG3525: N-acetyl-beta-hexosaminidase [Carbohydrate transport and metabolism]. 33329 COG3526: Uncharacterized protein conserved in bacteria [Posttranslational modification, protein turnover, chaperones]. 33330 COG3527: Alpha-acetolactate decarboxylase [Secondary metabolites biosynthesis, transport, and catabolism]. 33331 COG3528: Uncharacterized protein conserved in bacteria [Function unknown]. 
33332 COG3529: Predicted nucleic-acid-binding protein containing a Zn-ribbon domain [General function prediction only]. 33333 COG3530: Uncharacterized protein conserved in bacteria [Function unknown]. 33334 COG3531: Predicted protein-disulfide isomerase [Posttranslational modification, protein turnover, chaperones]. 33335 COG3533: Uncharacterized protein conserved in bacteria [Function unknown]. 33336 COG3534: Alpha-L-arabinofuranosidase [Carbohydrate transport and metabolism]. 33337 COG3535: Uncharacterized conserved protein [Function unknown]. 33338 COG3536: Uncharacterized protein conserved in bacteria [Function unknown]. 33339 COG3537: Putative alpha-1,2-mannosidase [Carbohydrate transport and metabolism]. 33340 COG3538: Uncharacterized conserved protein [Function unknown]. 33341 COG3539: P pilus assembly protein, pilin FimA [Cell motility and secretion / Intracellular trafficking and secretion]. 33342 COG3540: Phosphodiesterase/alkaline phosphatase D [Inorganic ion transport and metabolism]. 33343 COG3541: Predicted nucleotidyltransferase [General function prediction only]. 33344 COG3542: Uncharacterized conserved protein [Function unknown]. 33345 COG3543: Uncharacterized conserved protein [Function unknown]. 33346 COG3544: Uncharacterized protein conserved in bacteria [Function unknown]. 33347 COG3545: Predicted esterase of the alpha/beta hydrolase fold [General function prediction only]. 33348 COG3546: Mn-containing catalase [Inorganic ion transport and metabolism]. 33349 COG3547: Transposase and inactivated derivatives [DNA replication, recombination, and repair]. 33350 COG3548: Predicted integral membrane protein [Function unknown]. 33351 COG3549: Plasmid maintenance system killer protein [General function prediction only]. 33352 COG3550: Uncharacterized protein related to capsule biosynthesis enzymes [General function prediction only]. 33353 COG3551: Uncharacterized protein conserved in bacteria [Function unknown]. 
33354 COG3552: Protein containing von Willebrand factor type A (vWA) domain [General function prediction only]. 33355 COG3553: Uncharacterized protein conserved in bacteria [Function unknown]. 33356 COG3554: Uncharacterized protein conserved in bacteria [Function unknown]. 33357 COG3555: Aspartyl/asparaginyl beta-hydroxylase and related dioxygenases [Posttranslational modification, protein turnover, chaperones]. 33358 COG3556: Predicted membrane protein [Function unknown]. 33359 COG3557: Uncharacterized domain/protein associated with RNAses G and E [Translation, ribosomal structure and biogenesis]. 33360 COG3558: Uncharacterized protein conserved in bacteria [Function unknown]. 33361 COG3559: Putative exporter of polyketide antibiotics [Cell envelope biogenesis, outer membrane]. 33362 COG3560: Predicted oxidoreductase related to nitroreductase [General function prediction only]. 33363 COG3561: Phage anti-repressor protein [Transcription]. 33364 COG3562: Capsule polysaccharide export protein [Cell envelope biogenesis, outer membrane]. 33365 COG3563: Capsule polysaccharide export protein [Cell envelope biogenesis, outer membrane]. 33366 COG3564: Uncharacterized protein conserved in bacteria [Function unknown]. 33367 COG3565: Predicted dioxygenase of extradiol dioxygenase family [General function prediction only]. 33368 COG3566: Uncharacterized protein conserved in bacteria [Function unknown]. 33369 COG3567: Uncharacterized protein conserved in bacteria [Function unknown]. 33370 COG3568: Metal-dependent hydrolase [General function prediction only]. 33371 COG3569: Topoisomerase IB [DNA replication, recombination, and repair]. 33372 COG3570: Streptomycin 6-kinase [Defense mechanisms]. 33373 COG3571: Predicted hydrolase of the alpha/beta-hydrolase fold [General function prediction only]. 33374 COG3572: Gamma-glutamylcysteine synthetase [Coenzyme metabolism]. 33375 COG3573: Predicted oxidoreductase [General function prediction only]. 
33376 COG3575: Uncharacterized protein conserved in bacteria [Function unknown]. 33377 COG3576: Predicted flavin-nucleotide-binding protein structurally related to pyridoxine 5'-phosphate oxidase [General function prediction only]. 33378 COG3577: Predicted aspartyl protease [General function prediction only]. 33379 COG3579: Aminopeptidase C [Amino acid transport and metabolism]. 33380 COG3580: Uncharacterized protein conserved in bacteria [Function unknown]. 33381 COG3581: Uncharacterized protein conserved in bacteria [Function unknown]. 33382 COG3582: Predicted nucleic acid binding protein containing the AN1-type Zn-finger [General function prediction only]. 33383 COG3583: Uncharacterized protein conserved in bacteria [Function unknown]. 33384 COG3584: Uncharacterized protein conserved in bacteria [Function unknown]. 33385 COG3585: Molybdopterin-binding protein [Coenzyme metabolism]. 33386 COG3586: Uncharacterized conserved protein [Function unknown]. 33387 COG3587: Restriction endonuclease [Defense mechanisms]. 33388 COG3588: Fructose-1,6-bisphosphate aldolase [Carbohydrate transport and metabolism]. 33389 COG3589: Uncharacterized conserved protein [Function unknown]. 33390 COG3590: Predicted metalloendopeptidase [Posttranslational modification, protein turnover, chaperones]. 33391 COG3591: V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]. 33392 COG3592: Uncharacterized conserved protein [Function unknown]. 33393 COG3593: Predicted ATP-dependent endonuclease of the OLD family [DNA replication, recombination, and repair]. 33394 COG3594: Fucose 4-O-acetylase and related acetyltransferases [Carbohydrate transport and metabolism]. 33395 COG3595: Uncharacterized conserved protein [Function unknown]. 33396 COG3596: Predicted GTPase [General function prediction only]. 33397 COG3597: Uncharacterized protein/domain associated with GTPases [Function unknown]. 33398 COG3598: RecA-family ATPase [DNA replication, recombination, and repair]. 
33399 COG3599: Cell division initiation protein [Cell division and chromosome partitioning]. 33400 COG3600: Uncharacterized phage-associated protein [Function unknown]. 33401 COG3601: Predicted membrane protein [Function unknown]. 33402 COG3602: Uncharacterized protein conserved in bacteria [Function unknown]. 33403 COG3603: Uncharacterized conserved protein [Function unknown]. 33404 COG3604: Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains [Transcription / Signal transduction mechanisms]. 33405 COG3605: Signal transduction protein containing GAF and PtsI domains [Signal transduction mechanisms]. 33406 COG3607: Predicted lactoylglutathione lyase [General function prediction only]. 33407 COG3608: Predicted deacylase [General function prediction only]. 33408 COG3609: Predicted transcriptional regulators containing the CopG/Arc/MetJ DNA-binding domain [Transcription]. 33409 COG3610: Uncharacterized conserved protein [Function unknown]. 33410 COG3611: Replication initiation/membrane attachment protein [DNA replication, recombination, and repair]. 33411 COG3612: Uncharacterized protein conserved in archaea [Function unknown]. 33412 COG3613: Nucleoside 2-deoxyribosyltransferase [Nucleotide transport and metabolism]. 33413 COG3614: Predicted periplasmic ligand-binding sensor domain [Signal transduction mechanisms]. 33414 COG3615: Uncharacterized protein/domain, possibly involved in tellurite resistance [Inorganic ion transport and metabolism]. 33415 COG3616: Predicted amino acid aldolase or racemase [Amino acid transport and metabolism]. 33416 COG3617: Prophage antirepressor [Transcription]. 33417 COG3618: Predicted metal-dependent hydrolase of the TIM-barrel fold [General function prediction only]. 33418 COG3619: Predicted membrane protein [Function unknown]. 33419 COG3620: Predicted transcriptional regulator with C-terminal CBS domains [Transcription]. 33420 COG3621: Patatin [General function prediction only]. 
33421 COG3622: Hydroxypyruvate isomerase [Carbohydrate transport and metabolism]. 33422 COG3623: Putative L-xylulose-5-phosphate 3-epimerase [Carbohydrate transport and metabolism]. 33423 COG3624: Uncharacterized enzyme of phosphonate metabolism [Inorganic ion transport and metabolism]. 33424 COG3625: Uncharacterized enzyme of phosphonate metabolism [Inorganic ion transport and metabolism]. 33425 COG3626: Uncharacterized enzyme of phosphonate metabolism [Inorganic ion transport and metabolism]. 33426 COG3627: Uncharacterized enzyme of phosphonate metabolism [Inorganic ion transport and metabolism]. 33427 COG3628: Phage baseplate assembly protein W [General function prediction only]. 33428 COG3629: DNA-binding transcriptional activator of the SARP family [Signal transduction mechanisms]. 33429 COG3630: Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, gamma subunit [Energy production and conversion]. 33430 COG3631: Ketosteroid isomerase-related protein [General function prediction only]. 33431 COG3633: Na+/serine symporter [Amino acid transport and metabolism]. 33432 COG3634: Alkyl hydroperoxide reductase, large subunit [Posttranslational modification, protein turnover, chaperones]. 33433 COG3635: Predicted phosphoglycerate mutase, AP superfamily [Carbohydrate transport and metabolism]. 33434 COG3636: Predicted transcriptional regulator [Transcription]. 33435 COG3637: Opacity protein and related surface antigens [Cell envelope biogenesis, outer membrane]. 33436 COG3638: ABC-type phosphate/phosphonate transport system, ATPase component [Inorganic ion transport and metabolism]. 33437 COG3639: ABC-type phosphate/phosphonate transport system, permease component [Inorganic ion transport and metabolism]. 33438 COG3640: CO dehydrogenase maturation factor [Cell division and chromosome partitioning]. 33439 COG3641: Predicted membrane protein, putative toxin regulator [General function prediction only]. 
33440 COG3642: Mn2+-dependent serine/threonine protein kinase [Signal transduction mechanisms]. 33441 COG3643: Glutamate formiminotransferase [Amino acid transport and metabolism]. 33442 COG3644: Uncharacterized protein conserved in bacteria [Function unknown]. 33443 COG3645: Uncharacterized phage-encoded protein [Function unknown]. 33444 COG3646: Uncharacterized phage-encoded protein [Function unknown]. 33445 COG3647: Predicted membrane protein [Function unknown]. 33446 COG3648: Uricase (urate oxidase) [Secondary metabolites biosynthesis, transport, and catabolism]. 33447 COG3649: Uncharacterized protein predicted to be involved in DNA repair [DNA replication, recombination, and repair]. 33448 COG3650: Predicted membrane protein [Function unknown]. 33449 COG3651: Uncharacterized protein conserved in bacteria [Function unknown]. 33450 COG3652: Predicted outer membrane protein [Function unknown]. 33451 COG3653: N-acyl-D-aspartate/D-glutamate deacylase [Secondary metabolites biosynthesis, transport, and catabolism]. 33452 COG3654: Prophage maintenance system killer protein [General function prediction only]. 33453 COG3655: Predicted transcriptional regulator [Transcription]. 33454 COG3656: Predicted periplasmic protein [Function unknown]. 33455 COG3657: Uncharacterized protein conserved in bacteria [Function unknown]. 33456 COG3658: Cytochrome b [Energy production and conversion]. 33457 COG3659: Carbohydrate-selective porin [Cell envelope biogenesis, outer membrane]. 33458 COG3660: Predicted nucleoside-diphosphate-sugar epimerase [Cell envelope biogenesis, outer membrane]. 33459 COG3661: Alpha-glucuronidase [Carbohydrate transport and metabolism]. 33460 COG3662: Uncharacterized protein conserved in bacteria [Function unknown]. 33461 COG3663: G:T/U mismatch-specific DNA glycosylase [DNA replication, recombination, and repair]. 33462 COG3664: Beta-xylosidase [Carbohydrate transport and metabolism]. 33463 COG3665: Uncharacterized conserved protein [Function unknown]. 
33464 COG3666: Transposase and inactivated derivatives [DNA replication, recombination, and repair]. 33465 COG3667: Uncharacterized protein involved in copper resistance [Inorganic ion transport and metabolism]. 33466 COG3668: Plasmid stabilization system protein [General function prediction only]. 33467 COG3669: Alpha-L-fucosidase [Carbohydrate transport and metabolism]. 33468 COG3670: Lignostilbene-alpha,beta-dioxygenase and related enzymes [Secondary metabolites biosynthesis, transport, and catabolism]. 33469 COG3671: Predicted membrane protein [Function unknown]. 33470 COG3672: Predicted periplasmic protein [Function unknown]. 33471 COG3673: Uncharacterized conserved protein [Function unknown]. 33472 COG3675: Predicted lipase [Lipid metabolism]. 33473 COG3676: Transposase and inactivated derivatives [DNA replication, recombination, and repair]. 33474 COG3677: Transposase and inactivated derivatives [DNA replication, recombination, and repair]. 33475 COG3678: P pilus assembly/Cpx signaling pathway, periplasmic inhibitor/zinc-resistance associated protein [Intracellular trafficking and secretion / Cell motility and secretion / Signal transduction mechanisms / Inorganic ion transport and metabolism]. 33476 COG3679: Uncharacterized conserved protein [Function unknown]. 33477 COG3680: Uncharacterized protein conserved in bacteria [Function unknown]. 33478 COG3681: Uncharacterized conserved protein [Function unknown]. 33479 COG3682: Predicted transcriptional regulator [Transcription]. 33480 COG3683: ABC-type uncharacterized transport system, periplasmic component [General function prediction only]. 33481 COG3684: Tagatose-1,6-bisphosphate aldolase [Carbohydrate transport and metabolism]. 33482 COG3685: Uncharacterized protein conserved in bacteria [Function unknown]. 33483 COG3686: Predicted membrane protein [Function unknown]. 33484 COG3687: Predicted metal-dependent hydrolase [General function prediction only]. 
33485 COG3688: Predicted RNA-binding protein containing a PIN domain [General function prediction only]. 33486 COG3689: Predicted membrane protein [Function unknown]. 33487 COG3691: Uncharacterized protein conserved in bacteria [Function unknown]. 33488 COG3692: Uncharacterized protein conserved in bacteria [Function unknown]. 33489 COG3693: Beta-1,4-xylanase [Carbohydrate transport and metabolism]. 33490 COG3694: ABC-type uncharacterized transport system, permease component [General function prediction only]. 33491 COG3695: Predicted methylated DNA-protein cysteine methyltransferase [DNA replication, recombination, and repair]. 33492 COG3696: Putative silver efflux pump [Inorganic ion transport and metabolism]. 33493 COG3697: Phosphoribosyl-dephospho-CoA transferase (holo-ACP synthetase) [Coenzyme metabolism / Lipid metabolism]. 33494 COG3698: Predicted periplasmic protein [Function unknown]. 33495 COG3700: Acid phosphatase (class B) [General function prediction only]. 33496 COG3701: Type IV secretory pathway, TrbF components [Intracellular trafficking and secretion]. 33497 COG3702: Type IV secretory pathway, VirB3 components [Intracellular trafficking and secretion]. 33498 COG3703: Uncharacterized protein involved in cation transport [Inorganic ion transport and metabolism]. 33499 COG3704: Type IV secretory pathway, VirB6 components [Intracellular trafficking and secretion]. 33500 COG3705: ATP phosphoribosyltransferase involved in histidine biosynthesis [Amino acid transport and metabolism]. 33501 COG3706: Response regulator containing a CheY-like receiver domain and a GGDEF domain [Signal transduction mechanisms]. 33502 COG3707: Response regulator with putative antiterminator output domain [Signal transduction mechanisms]. 33503 COG3708: Uncharacterized protein conserved in bacteria [Function unknown]. 33504 COG3709: Uncharacterized component of phosphonate metabolism [Inorganic ion transport and metabolism]. 
33505 COG3710: DNA-binding winged-HTH domains [Transcription]. 33506 COG3711: Transcriptional antiterminator [Transcription]. 33507 COG3712: Fe2+-dicitrate sensor, membrane component [Inorganic ion transport and metabolism / Signal transduction mechanisms]. 33508 COG3713: Outer membrane protein V [Cell envelope biogenesis, outer membrane]. 33509 COG3714: Predicted membrane protein [Function unknown]. 33510 COG3715: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIC [Carbohydrate transport and metabolism]. 33511 COG3716: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IID [Carbohydrate transport and metabolism]. 33512 COG3717: 5-keto 4-deoxyuronate isomerase [Carbohydrate transport and metabolism]. 33513 COG3718: Uncharacterized enzyme involved in inositol metabolism [Carbohydrate transport and metabolism]. 33514 COG3719: Ribonuclease I [Translation, ribosomal structure and biogenesis]. 33515 COG3720: Putative heme degradation protein [Inorganic ion transport and metabolism]. 33516 COG3721: Putative heme iron utilization protein [Inorganic ion transport and metabolism]. 33517 COG3722: Transcriptional regulator [Transcription]. 33518 COG3723: Recombinational DNA repair protein (RecE pathway) [DNA replication, recombination, and repair]. 33519 COG3724: Succinylarginine dihydrolase [Amino acid transport and metabolism]. 33520 COG3725: Membrane protein required for beta-lactamase induction [Defense mechanisms]. 33521 COG3726: Uncharacterized membrane protein affecting hemolysin expression [General function prediction only]. 33522 COG3727: DNA G:T-mismatch repair endonuclease [DNA replication, recombination, and repair]. 33523 COG3728: Phage terminase, small subunit [DNA replication, recombination, and repair]. 33524 COG3729: General stress protein [General function prediction only]. 33525 COG3730: Phosphotransferase system sorbitol-specific component IIC [Carbohydrate transport and metabolism]. 
33526 COG3731: Phosphotransferase system sorbitol-specific component IIA [Carbohydrate transport and metabolism]. 33527 COG3732: Phosphotransferase system sorbitol-specific component IIBC [Carbohydrate transport and metabolism]. 33528 COG3733: Cu2+-containing amine oxidase [Secondary metabolites biosynthesis, transport, and catabolism]. 33529 COG3734: 2-keto-3-deoxy-galactonokinase [Carbohydrate transport and metabolism]. 33530 COG3735: Uncharacterized protein conserved in bacteria [Function unknown]. 33531 COG3736: Type IV secretory pathway, component VirB8 [Intracellular trafficking and secretion]. 33532 COG3737: Uncharacterized conserved protein [Function unknown]. 33533 COG3738: Uncharacterized protein conserved in bacteria [Function unknown]. 33534 COG3739: Uncharacterized integral membrane protein [Function unknown]. 33535 COG3740: Phage head maturation protease [General function prediction only]. 33536 COG3741: N-formylglutamate amidohydrolase [Amino acid transport and metabolism]. 33537 COG3742: Uncharacterized protein conserved in bacteria [Function unknown]. 33538 COG3743: Uncharacterized conserved protein [Function unknown]. 33539 COG3744: Uncharacterized protein conserved in bacteria [Function unknown]. 33540 COG3745: Flp pilus assembly protein CpaB [Intracellular trafficking and secretion]. 33541 COG3746: Phosphate-selective porin [Inorganic ion transport and metabolism]. 33542 COG3747: Phage terminase, small subunit [DNA replication, recombination, and repair]. 33543 COG3748: Predicted membrane protein [Function unknown]. 33544 COG3749: Uncharacterized protein conserved in bacteria [Function unknown]. 33545 COG3750: Uncharacterized protein conserved in bacteria [Function unknown]. 33546 COG3751: Predicted proline hydroxylase [Posttranslational modification, protein turnover, chaperones]. 33547 COG3752: Predicted membrane protein [Function unknown]. 33548 COG3753: Uncharacterized protein conserved in bacteria [Function unknown]. 
33549 COG3754: Lipopolysaccharide biosynthesis protein [Cell envelope biogenesis, outer membrane]. 33550 COG3755: Uncharacterized protein conserved in bacteria [Function unknown]. 33551 COG3756: Uncharacterized protein conserved in bacteria [Function unknown]. 33552 COG3757: Lysozyme M1 (1,4-beta-N-acetylmuramidase) [Cell envelope biogenesis, outer membrane]. 33553 COG3758: Uncharacterized protein conserved in bacteria [Function unknown]. 33554 COG3759: Predicted membrane protein [Function unknown]. 33555 COG3760: Uncharacterized conserved protein [Function unknown]. 33556 COG3761: NADH:ubiquinone oxidoreductase 17.2 kD subunit [Energy production and conversion]. 33557 COG3762: Predicted membrane protein [Function unknown]. 33558 COG3763: Uncharacterized protein conserved in bacteria [Function unknown]. 33559 COG3764: Sortase (surface protein transpeptidase) [Cell envelope biogenesis, outer membrane]. 33560 COG3765: Chain length determinant protein [Cell envelope biogenesis, outer membrane]. 33561 COG3766: Predicted membrane protein [Function unknown]. 33562 COG3767: Uncharacterized low-complexity protein [Function unknown]. 33563 COG3768: Predicted membrane protein [Function unknown]. 33564 COG3769: Predicted hydrolase (HAD superfamily) [General function prediction only]. 33565 COG3770: Murein endopeptidase [Cell envelope biogenesis, outer membrane]. 33566 COG3771: Predicted membrane protein [Function unknown]. 33567 COG3772: Phage-related lysozyme (muramidase) [General function prediction only]. 33568 COG3773: Cell wall hydrolases involved in spore germination [Cell envelope biogenesis, outer membrane]. 33569 COG3774: Mannosyltransferase OCH1 and related enzymes [Cell envelope biogenesis, outer membrane]. 33570 COG3775: Phosphotransferase system, galactitol-specific IIC component [Carbohydrate transport and metabolism]. 33571 COG3776: Predicted membrane protein [Function unknown]. 33572 COG3777: Uncharacterized conserved protein [Function unknown]. 
33573 COG3778: Uncharacterized protein conserved in bacteria [Function unknown]. 33574 COG3779: Uncharacterized protein conserved in bacteria [Function unknown]. 33575 COG3780: DNA endonuclease related to intein-encoded endonucleases [DNA replication, recombination, and repair]. 33576 COG3781: Predicted membrane protein [Function unknown]. 33577 COG3782: Uncharacterized protein conserved in bacteria [Function unknown]. 33578 COG3783: Soluble cytochrome b562 [Energy production and conversion]. 33579 COG3784: Uncharacterized protein conserved in bacteria [Function unknown]. 33580 COG3785: Uncharacterized conserved protein [Function unknown]. 33581 COG3786: Uncharacterized protein conserved in bacteria [Function unknown]. 33582 COG3787: Uncharacterized protein conserved in bacteria [Function unknown]. 33583 COG3788: Uncharacterized relative of glutathione S-transferase, MAPEG superfamily [General function prediction only]. 33584 COG3789: Uncharacterized protein conserved in bacteria [Function unknown]. 33585 COG3790: Predicted membrane protein [Function unknown]. 33586 COG3791: Uncharacterized conserved protein [Function unknown]. 33587 COG3792: Uncharacterized protein conserved in bacteria [Function unknown]. 33588 COG3793: Tellurite resistance protein [Inorganic ion transport and metabolism]. 33589 COG3794: Plastocyanin [Energy production and conversion]. 33590 COG3795: Uncharacterized protein conserved in bacteria [Function unknown]. 33591 COG3797: Uncharacterized protein conserved in bacteria [Function unknown]. 33592 COG3798: Uncharacterized protein conserved in bacteria [Function unknown]. 33593 COG3799: Methylaspartate ammonia-lyase [Amino acid transport and metabolism]. 33594 COG3800: Predicted transcriptional regulator [General function prediction only]. 33595 COG3801: Uncharacterized protein conserved in bacteria [Function unknown]. 33596 COG3802: Uncharacterized protein conserved in bacteria [Function unknown]. 
33597 COG3803: Uncharacterized protein conserved in bacteria [Function unknown]. 33598 COG3804: Uncharacterized conserved protein related to dihydrodipicolinate reductase [Function unknown]. 33599 COG3805: Aromatic ring-cleaving dioxygenase [Secondary metabolites biosynthesis, transport, and catabolism]. 33600 COG3806: Transcriptional activator [Transcription]. 33601 COG3807: Uncharacterized protein conserved in bacteria [Function unknown]. 33602 COG3808: Inorganic pyrophosphatase [Energy production and conversion]. 33603 COG3809: Uncharacterized protein conserved in bacteria [Function unknown]. 33604 COG3811: Uncharacterized protein conserved in bacteria [Function unknown]. 33605 COG3812: Uncharacterized protein conserved in bacteria [Function unknown]. 33606 COG3813: Uncharacterized protein conserved in bacteria [Function unknown]. 33607 COG3814: Uncharacterized protein conserved in bacteria [Function unknown]. 33608 COG3815: Predicted membrane protein [Function unknown]. 33609 COG3816: Uncharacterized protein conserved in bacteria [Function unknown]. 33610 COG3817: Predicted membrane protein [Function unknown]. 33611 COG3818: Predicted acetyltransferase, GNAT superfamily [General function prediction only]. 33612 COG3819: Predicted membrane protein [Function unknown]. 33613 COG3820: Uncharacterized protein conserved in bacteria [Function unknown]. 33614 COG3821: Predicted membrane protein [Function unknown]. 33615 COG3822: ABC-type sugar transport system, auxiliary component [General function prediction only]. 33616 COG3823: Glutamine cyclotransferase [Posttranslational modification, protein turnover, chaperones]. 33617 COG3824: Uncharacterized protein conserved in bacteria [Function unknown]. 33618 COG3825: Uncharacterized protein conserved in bacteria [Function unknown]. 33619 COG3826: Uncharacterized protein conserved in bacteria [Function unknown]. 33620 COG3827: Uncharacterized protein conserved in bacteria [Function unknown]. 
33621 COG3828: Uncharacterized protein conserved in bacteria [Function unknown]. 33622 COG3829: Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains [Transcription / Signal transduction mechanisms]. 33623 COG3830: ACT domain-containing protein [Signal transduction mechanisms]. 33624 COG3831: Uncharacterized conserved protein [Function unknown]. 33625 COG3832: Uncharacterized conserved protein [Function unknown]. 33626 COG3833: ABC-type maltose transport systems, permease component [Carbohydrate transport and metabolism]. 33627 COG3835: Sugar diacid utilization regulator [Transcription / Signal transduction mechanisms]. 33628 COG3836: 2,4-dihydroxyhept-2-ene-1,7-dioic acid aldolase [Carbohydrate transport and metabolism]. 33629 COG3837: Uncharacterized conserved protein, contains double-stranded beta-helix domain [Function unknown]. 33630 COG3838: Type IV secretory pathway, VirB2 components (pilins) [Intracellular trafficking and secretion]. 33631 COG3839: ABC-type sugar transport systems, ATPase components [Carbohydrate transport and metabolism]. 33632 COG3840: ABC-type thiamine transport system, ATPase component [Coenzyme metabolism]. 33633 COG3842: ABC-type spermidine/putrescine transport systems, ATPase components [Amino acid transport and metabolism]. 33634 COG3843: Type IV secretory pathway, VirD2 components (relaxase) [Intracellular trafficking and secretion]. 33635 COG3844: Kynureninase [Amino acid transport and metabolism]. 33636 COG3845: ABC-type uncharacterized transport systems, ATPase components [General function prediction only]. 33637 COG3846: Type IV secretory pathway, TrbL components [Intracellular trafficking and secretion]. 33638 COG3847: Flp pilus assembly protein, pilin Flp [Intracellular trafficking and secretion]. 33639 COG3848: Phosphohistidine swiveling domain [Signal transduction mechanisms]. 33640 COG3850: Signal transduction histidine kinase, nitrate/nitrite-specific [Signal transduction mechanisms]. 
33641 COG3851: Signal transduction histidine kinase, glucose-6-phosphate specific [Signal transduction mechanisms]. 33642 COG3852: Signal transduction histidine kinase, nitrogen specific [Signal transduction mechanisms]. 33643 COG3853: Uncharacterized protein involved in tellurite resistance [Inorganic ion transport and metabolism]. 33644 COG3854: Uncharacterized protein conserved in bacteria [Function unknown]. 33645 COG3855: Uncharacterized protein conserved in bacteria [Carbohydrate transport and metabolism]. 33646 COG3856: Uncharacterized conserved protein (small basic protein) [Function unknown]. 33647 COG3857: ATP-dependent nuclease, subunit B [DNA replication, recombination, and repair]. 33648 COG3858: Predicted glycosyl hydrolase [General function prediction only]. 33649 COG3859: Predicted membrane protein [Function unknown]. 33650 COG3860: Uncharacterized protein conserved in bacteria [Function unknown]. 33651 COG3861: Uncharacterized protein conserved in bacteria [Function unknown]. 33652 COG3862: Uncharacterized protein with conserved CXXC pairs [Function unknown]. 33653 COG3863: Uncharacterized distant relative of cell wall-associated hydrolases [Function unknown]. 33654 COG3864: Uncharacterized protein conserved in bacteria [Function unknown]. 33655 COG3865: Uncharacterized protein conserved in bacteria [Function unknown]. 33656 COG3866: Pectate lyase [Carbohydrate transport and metabolism]. 33657 COG3867: Arabinogalactan endo-1,4-beta-galactosidase [Carbohydrate transport and metabolism]. 33658 COG3868: Uncharacterized conserved protein [Function unknown]. 33659 COG3869: Arginine kinase [Amino acid transport and metabolism]. 33660 COG3870: Uncharacterized protein conserved in bacteria [Function unknown]. 33661 COG3871: Uncharacterized stress protein (general stress protein 26) [General function prediction only]. 33662 COG3872: Predicted metal-dependent enzyme [General function prediction only]. 
33663 COG3874: Uncharacterized conserved protein [Function unknown]. 33664 COG3875: Uncharacterized conserved protein [Function unknown]. 33665 COG3876: Uncharacterized protein conserved in bacteria [Function unknown]. 33666 COG3877: Uncharacterized protein conserved in bacteria [Function unknown]. 33667 COG3878: Uncharacterized protein conserved in bacteria [Function unknown]. 33668 COG3879: Uncharacterized protein conserved in bacteria [Function unknown]. 33669 COG3880: Uncharacterized protein with conserved CXXC pairs [Function unknown]. 33670 COG3881: Uncharacterized protein conserved in bacteria [Function unknown]. 33671 COG3882: Predicted enzyme involved in methoxymalonyl-ACP biosynthesis [Secondary metabolites biosynthesis, transport, and catabolism]. 33672 COG3883: Uncharacterized protein conserved in bacteria [Function unknown]. 33673 COG3884: Acyl-ACP thioesterase [Lipid metabolism]. 33674 COG3885: Uncharacterized conserved protein [Function unknown]. 33675 COG3886: Predicted HKD family nuclease [DNA replication, recombination, and repair]. 33676 COG3887: Predicted signaling protein consisting of a modified GGDEF domain and a DHH domain [Signal transduction mechanisms]. 33677 COG3888: Predicted transcriptional regulator [Transcription]. 33678 COG3889: Predicted solute binding protein [General function prediction only]. 33679 COG3890: Phosphomevalonate kinase [Lipid metabolism]. 33680 COG3892: Uncharacterized protein conserved in bacteria [Function unknown]. 33681 COG3893: Inactivated superfamily I helicase [DNA replication, recombination, and repair]. 33682 COG3894: Uncharacterized metal-binding protein [General function prediction only]. 33683 COG3895: Predicted periplasmic protein [General function prediction only]. 33684 COG3896: Chloramphenicol 3-O-phosphotransferase [Defense mechanisms]. 33685 COG3897: Predicted methyltransferase [General function prediction only]. 33686 COG3898: Uncharacterized membrane-bound protein [Function unknown]. 
33687 COG3899: Predicted ATPase [General function prediction only]. 33688 COG3900: Predicted periplasmic protein [Function unknown]. 33689 COG3901: Regulator of nitric oxide reductase transcription [Transcription]. 33690 COG3903: Predicted ATPase [General function prediction only]. 33691 COG3904: Predicted periplasmic protein [Function unknown]. 33692 COG3905: Predicted transcriptional regulator [Transcription]. 33693 COG3906: Uncharacterized protein conserved in bacteria [Function unknown]. 33694 COG3907: PAP2 (acid phosphatase) superfamily protein [General function prediction only]. 33695 COG3908: Uncharacterized protein conserved in bacteria [Function unknown]. 33696 COG3909: Cytochrome c556 [Energy production and conversion]. 33697 COG3910: Predicted ATPase [General function prediction only]. 33698 COG3911: Predicted ATPase [General function prediction only]. 33699 COG3913: Uncharacterized protein conserved in bacteria [Function unknown]. 33700 COG3914: Predicted O-linked N-acetylglucosamine transferase, SPINDLY family [Posttranslational modification, protein turnover, chaperones]. 33701 COG3915: Uncharacterized protein conserved in bacteria [Function unknown]. 33702 COG3916: N-acyl-L-homoserine lactone synthetase [Signal transduction mechanisms / Secondary metabolites biosynthesis, transport, and catabolism]. 33703 COG3917: 2-hydroxychromene-2-carboxylate isomerase [Secondary metabolites biosynthesis, transport, and catabolism]. 33704 COG3918: Predicted membrane protein [Function unknown]. 33705 COG3919: Predicted ATP-grasp enzyme [General function prediction only]. 33706 COG3920: Signal transduction histidine kinase [Signal transduction mechanisms]. 33707 COG3921: Uncharacterized protein conserved in bacteria [Function unknown]. 33708 COG3923: Primosomal replication protein N'' [DNA replication, recombination, and repair]. 33709 COG3924: Predicted membrane protein [Function unknown]. 
33710 COG3925: N-terminal domain of the phosphotransferase system fructose-specific component IIB [Carbohydrate transport and metabolism]. 33711 COG3926: Putative secretion activating protein [General function prediction only]. 33712 COG3930: Uncharacterized protein conserved in bacteria [Function unknown]. 33713 COG3931: Predicted N-formylglutamate amidohydrolase [Amino acid transport and metabolism]. 33714 COG3932: Uncharacterized ABC-type transport system, permease components [General function prediction only]. 33715 COG3933: Transcriptional antiterminator [Transcription]. 33716 COG3934: Endo-beta-mannanase [Carbohydrate transport and metabolism]. 33717 COG3935: Putative primosome component and related proteins [DNA replication, recombination, and repair]. 33718 COG3936: Protein involved in polysaccharide intercellular adhesin (PIA) synthesis/biofilm formation [Carbohydrate transport and metabolism]. 33719 COG3937: Uncharacterized conserved protein [Function unknown]. 33720 COG3938: Proline racemase [Amino acid transport and metabolism]. 33721 COG3940: Predicted beta-xylosidase [General function prediction only]. 33722 COG3941: Mu-like prophage protein [General function prediction only]. 33723 COG3942: Surface antigen [General function prediction only]. 33724 COG3943: Virulence protein [General function prediction only]. 33725 COG3944: Capsular polysaccharide biosynthesis protein [Cell envelope biogenesis, outer membrane]. 33726 COG3945: Uncharacterized conserved protein [Function unknown]. 33727 COG3946: Type IV secretory pathway, VirJ component [Intracellular trafficking and secretion]. 33728 COG3947: Response regulator containing CheY-like receiver and SARP domains [Signal transduction mechanisms]. 33729 COG3948: Phage-related baseplate assembly protein [General function prediction only]. 33730 COG3949: Uncharacterized membrane protein [Function unknown]. 33731 COG3950: Predicted ATP-binding protein involved in virulence [General function prediction only]. 
33732 COG3951: Rod binding protein [Cell envelope biogenesis, outer membrane / Cell motility and secretion / Posttranslational modification, protein turnover, chaperones]. 33733 COG3952: Predicted membrane protein [Function unknown]. 33734 COG3953: SLT domain proteins [General function prediction only]. 33735 COG3954: Phosphoribulokinase [Energy production and conversion]. 33736 COG3955: Exopolysaccharide biosynthesis protein [Cell envelope biogenesis, outer membrane]. 33737 COG3956: Protein containing tetrapyrrole methyltransferase domain and MazG-like (predicted pyrophosphatase) domain [General function prediction only]. 33738 COG3957: Phosphoketolase [Carbohydrate transport and metabolism]. 33739 COG3958: Transketolase, C-terminal subunit [Carbohydrate transport and metabolism]. 33740 COG3959: Transketolase, N-terminal subunit [Carbohydrate transport and metabolism]. 33741 COG3960: Glyoxylate carboligase [General function prediction only]. 33742 COG3961: Pyruvate decarboxylase and related thiamine pyrophosphate-requiring enzymes [Carbohydrate transport and metabolism / Coenzyme metabolism / General function prediction only]. 33743 COG3962: Acetolactate synthase [Amino acid transport and metabolism]. 33744 COG3963: Phospholipid N-methyltransferase [Lipid metabolism]. 33745 COG3964: Predicted amidohydrolase [General function prediction only]. 33746 COG3965: Predicted Co/Zn/Cd cation transporters [Inorganic ion transport and metabolism]. 33747 COG3966: Protein involved in D-alanine esterification of lipoteichoic acid and wall teichoic acid (D-alanine transfer protein) [Cell envelope biogenesis, outer membrane]. 33748 COG3967: Short-chain dehydrogenase involved in D-alanine esterification of lipoteichoic acid and wall teichoic acid (D-alanine transfer protein) [Cell envelope biogenesis, outer membrane]. 33749 COG3968: Uncharacterized protein related to glutamine synthetase [General function prediction only]. 
33750 COG3969: Predicted phosphoadenosine phosphosulfate sulfotransferase [General function prediction only]. 33751 COG3970: Fumarylacetoacetate (FAA) hydrolase family protein [General function prediction only]. 33752 COG3971: 2-keto-4-pentenoate hydratase [Secondary metabolites biosynthesis, transport, and catabolism]. 33753 COG3972: Superfamily I DNA and RNA helicases [General function prediction only]. 33754 COG3973: Superfamily I DNA and RNA helicases [General function prediction only]. 33755 COG3975: Predicted protease with the C-terminal PDZ domain [General function prediction only]. 33756 COG3976: Uncharacterized protein conserved in bacteria [Function unknown]. 33757 COG3977: Alanine-alpha-ketoisovalerate (or valine-pyruvate) aminotransferase [Amino acid transport and metabolism]. 33758 COG3978: Acetolactate synthase (isozyme II), small (regulatory) subunit [Function unknown]. 33759 COG3979: Uncharacterized protein containing chitin-binding domain type 3 [General function prediction only]. 33760 COG3980: Spore coat polysaccharide biosynthesis protein, predicted glycosyltransferase [Cell envelope biogenesis, outer membrane]. 33761 COG3981: Predicted acetyltransferase [General function prediction only]. 33762 COG4001: Predicted metal-binding protein [General function prediction only]. 33763 COG4002: Predicted phosphotransacetylase [General function prediction only]. 33764 COG4003: Uncharacterized protein conserved in archaea [Function unknown]. 33765 COG4004: Uncharacterized protein conserved in archaea [Function unknown]. 33766 COG4006: Uncharacterized protein conserved in archaea [Function unknown]. 33767 COG4007: Predicted dehydrogenase related to H2-forming N5,N10-methylenetetrahydromethanopterin dehydrogenase [General function prediction only]. 33768 COG4008: Predicted metal-binding transcription factor [Transcription]. 33769 COG4009: Uncharacterized protein conserved in archaea [Function unknown]. 
33770 COG4010: Uncharacterized protein conserved in archaea [Function unknown]. 33771 COG4012: Uncharacterized protein conserved in archaea [Function unknown]. 33772 COG4013: Uncharacterized protein conserved in archaea [Function unknown]. 33773 COG4014: Uncharacterized protein conserved in archaea [Function unknown]. 33774 COG4015: Predicted dinucleotide-utilizing enzyme of the ThiF/HesA family [General function prediction only]. 33775 COG4016: Uncharacterized protein conserved in archaea [Function unknown]. 33776 COG4017: Uncharacterized protein conserved in archaea [Function unknown]. 33777 COG4018: Uncharacterized protein conserved in archaea [Function unknown]. 33778 COG4019: Uncharacterized protein conserved in archaea [Function unknown]. 33779 COG4020: Uncharacterized protein conserved in archaea [Function unknown]. 33780 COG4021: Uncharacterized conserved protein [Function unknown]. 33781 COG4022: Uncharacterized protein conserved in archaea [Function unknown]. 33782 COG4023: Preprotein translocase subunit Sec61beta [Intracellular trafficking and secretion]. 33783 COG4024: Uncharacterized protein conserved in archaea [Function unknown]. 33784 COG4025: Predicted membrane protein [Function unknown]. 33785 COG4026: Uncharacterized protein containing TOPRIM domain, potential nuclease [General function prediction only]. 33786 COG4027: Uncharacterized protein conserved in archaea [Function unknown]. 33787 COG4028: Predicted P-loop ATPase/GTPase [General function prediction only]. 33788 COG4029: Uncharacterized protein conserved in archaea [Function unknown]. 33789 COG4030: Uncharacterized protein conserved in archaea [Function unknown]. 33790 COG4031: Predicted metal-binding protein [General function prediction only]. 33791 COG4032: Predicted thiamine-pyrophosphate-binding protein [General function prediction only]. 33792 COG4033: Uncharacterized protein conserved in archaea [Function unknown]. 
33793 COG4034: Uncharacterized protein conserved in archaea [Function unknown]. 33794 COG4035: Predicted membrane protein [Function unknown]. 33795 COG4036: Predicted membrane protein [Function unknown]. 33796 COG4037: Predicted membrane protein [Function unknown]. 33797 COG4038: Predicted membrane protein [Function unknown]. 33798 COG4039: Predicted membrane protein [Function unknown]. 33799 COG4040: Predicted membrane protein [Function unknown]. 33800 COG4041: Predicted membrane protein [Function unknown]. 33801 COG4042: Predicted membrane protein [Function unknown]. 33802 COG4043: Uncharacterized conserved protein [Function unknown]. 33803 COG4044: Uncharacterized protein conserved in archaea [Function unknown]. 33804 COG4046: Uncharacterized protein conserved in archaea [Function unknown]. 33805 COG4047: Uncharacterized protein conserved in archaea [Function unknown]. 33806 COG4048: Uncharacterized protein conserved in archaea [Function unknown]. 33807 COG4049: Uncharacterized protein containing archaeal-type C2H2 Zn-finger [General function prediction only]. 33808 COG4050: Uncharacterized protein conserved in archaea [Function unknown]. 33809 COG4051: Uncharacterized protein conserved in archaea [Function unknown]. 33810 COG4052: Uncharacterized protein related to methyl coenzyme M reductase subunit C [General function prediction only]. 33811 COG4053: Uncharacterized protein conserved in archaea [Function unknown]. 33812 COG4054: Methyl coenzyme M reductase, beta subunit [Coenzyme metabolism]. 33813 COG4055: Methyl coenzyme M reductase, subunit D [Coenzyme metabolism]. 33814 COG4056: Methyl coenzyme M reductase, subunit C [Coenzyme metabolism]. 33815 COG4057: Methyl coenzyme M reductase, gamma subunit [Coenzyme metabolism]. 33816 COG4058: Methyl coenzyme M reductase, alpha subunit [Coenzyme metabolism]. 33817 COG4059: Tetrahydromethanopterin S-methyltransferase, subunit E [Coenzyme metabolism]. 
33818 COG4060: Tetrahydromethanopterin S-methyltransferase, subunit D [Coenzyme metabolism]. 33819 COG4061: Tetrahydromethanopterin S-methyltransferase, subunit C [Coenzyme metabolism]. 33820 COG4062: Tetrahydromethanopterin S-methyltransferase, subunit B [Coenzyme metabolism]. 33821 COG4063: Tetrahydromethanopterin S-methyltransferase, subunit A [Coenzyme metabolism]. 33822 COG4064: Tetrahydromethanopterin S-methyltransferase, subunit G [Coenzyme metabolism]. 33823 COG4065: Uncharacterized protein conserved in archaea [Function unknown]. 33824 COG4066: Uncharacterized protein conserved in archaea [Function unknown]. 33825 COG4067: Uncharacterized protein conserved in archaea [Posttranslational modification, protein turnover, chaperones]. 33826 COG4068: Uncharacterized protein containing a Zn-ribbon [Function unknown]. 33827 COG4069: Uncharacterized protein conserved in archaea [Function unknown]. 33828 COG4070: Predicted peptidyl-prolyl cis-trans isomerase (rotamase), cyclophilin family [Posttranslational modification, protein turnover, chaperones]. 33829 COG4071: Uncharacterized protein conserved in archaea [Function unknown]. 33830 COG4072: Uncharacterized protein conserved in archaea [Function unknown]. 33831 COG4073: Uncharacterized protein conserved in archaea [Function unknown]. 33832 COG4074: H2-forming N5,N10-methylenetetrahydromethanopterin dehydrogenase [Energy production and conversion]. 33833 COG4075: Uncharacterized conserved protein, homolog of nitrogen regulatory protein PII [Function unknown]. 33834 COG4076: Predicted RNA methylase [General function prediction only]. 33835 COG4077: Uncharacterized protein conserved in archaea [Function unknown]. 33836 COG4078: Predicted membrane protein [Function unknown]. 33837 COG4079: Uncharacterized protein conserved in archaea [Function unknown]. 33838 COG4080: RecB-family nuclease [DNA replication, recombination, and repair]. 33839 COG4081: Uncharacterized protein conserved in archaea [Function unknown]. 
33840 COG4083: Predicted membrane protein [Function unknown]. 33841 COG4084: Uncharacterized protein conserved in archaea [Function unknown]. 33842 COG4085: Predicted RNA-binding protein, contains TRAM domain [General function prediction only]. 33843 COG4086: Predicted secreted protein [Function unknown]. 33844 COG4087: Soluble P-type ATPase [General function prediction only]. 33845 COG4088: Predicted nucleotide kinase [Nucleotide transport and metabolism]. 33846 COG4089: Predicted membrane protein [Function unknown]. 33847 COG4090: Uncharacterized protein conserved in archaea [Function unknown]. 33848 COG4091: Predicted homoserine dehydrogenase [Amino acid transport and metabolism]. 33849 COG4092: Predicted glycosyltransferase involved in capsule biosynthesis [Cell envelope biogenesis, outer membrane]. 33850 COG4093: Uncharacterized protein conserved in bacteria [Function unknown]. 33851 COG4094: Predicted membrane protein [Function unknown]. 33852 COG4095: Uncharacterized conserved protein [Function unknown]. 33853 COG4096: Type I site-specific restriction-modification system, R (restriction) subunit and related helicases [Defense mechanisms]. 33854 COG4097: Predicted ferric reductase [Inorganic ion transport and metabolism]. 33855 COG4098: Superfamily II DNA/RNA helicase required for DNA uptake (late competence protein) [DNA replication, recombination, and repair]. 33856 COG4099: Predicted peptidase [General function prediction only]. 33857 COG4100: Cystathionine beta-lyase family protein involved in aluminum resistance [Inorganic ion transport and metabolism]. 33858 COG4101: Predicted mannose-6-phosphate isomerase [Carbohydrate transport and metabolism]. 33859 COG4102: Uncharacterized protein conserved in bacteria [Function unknown]. 33860 COG4103: Uncharacterized protein conserved in bacteria [Function unknown]. 33861 COG4104: Uncharacterized conserved protein [Function unknown]. 33862 COG4105: DNA uptake lipoprotein [General function prediction only]. 
33863 COG4106: Trans-aconitate methyltransferase [General function prediction only]. 33864 COG4107: ABC-type phosphonate transport system, ATPase component [Inorganic ion transport and metabolism]. 33865 COG4108: Peptide chain release factor RF-3 [Translation, ribosomal structure and biogenesis]. 33866 COG4109: Predicted transcriptional regulator containing CBS domains [Transcription]. 33867 COG4110: Uncharacterized protein involved in stress response [General function prediction only]. 33868 COG4111: Uncharacterized conserved protein [General function prediction only]. 33869 COG4112: Predicted phosphoesterase (MutT family) [General function prediction only]. 33870 COG4113: Predicted nucleic acid-binding protein, contains PIN domain [General function prediction only]. 33871 COG4114: Uncharacterized Fe-S protein [General function prediction only]. 33872 COG4115: Uncharacterized protein conserved in bacteria [Function unknown]. 33873 COG4116: Uncharacterized protein conserved in bacteria [Function unknown]. 33874 COG4117: Thiosulfate reductase cytochrome B subunit (membrane anchoring protein) [Energy production and conversion]. 33875 COG4118: Antitoxin of toxin-antitoxin stability system [Cell division and chromosome partitioning]. 33876 COG4119: Predicted NTP pyrophosphohydrolase [DNA replication, recombination, and repair / General function prediction only]. 33877 COG4120: ABC-type uncharacterized transport system, permease component [General function prediction only]. 33878 COG4121: Uncharacterized conserved protein [Function unknown]. 33879 COG4122: Predicted O-methyltransferase [General function prediction only]. 33880 COG4123: Predicted O-methyltransferase [General function prediction only]. 33881 COG4124: Beta-mannanase [Carbohydrate transport and metabolism]. 33882 COG4125: Predicted membrane protein [Function unknown]. 33883 COG4126: Hydantoin racemase [Amino acid transport and metabolism]. 33884 COG4127: Uncharacterized conserved protein [Function unknown]. 
33885 COG4128: Zonula occludens toxin [General function prediction only]. 33886 COG4129: Predicted membrane protein [Function unknown]. 33887 COG4130: Predicted sugar epimerase [Carbohydrate transport and metabolism]. 33888 COG4132: ABC-type uncharacterized transport system, permease component [General function prediction only]. 33889 COG4133: ABC-type transport system involved in cytochrome c biogenesis, ATPase component [Posttranslational modification, protein turnover, chaperones]. 33890 COG4134: ABC-type uncharacterized transport system, periplasmic component [General function prediction only]. 33891 COG4135: ABC-type uncharacterized transport system, permease component [General function prediction only]. 33892 COG4136: ABC-type uncharacterized transport system, ATPase component [General function prediction only]. 33893 COG4137: ABC-type uncharacterized transport system, permease component [General function prediction only]. 33894 COG4138: ABC-type cobalamin transport system, ATPase component [Coenzyme metabolism]. 33895 COG4139: ABC-type cobalamin transport system, permease component [Coenzyme metabolism]. 33896 COG4143: ABC-type thiamine transport system, periplasmic component [Coenzyme metabolism]. 33897 COG4145: Na+/pantothenate symporter [Coenzyme metabolism]. 33898 COG4146: Predicted symporter [General function prediction only]. 33899 COG4147: Predicted symporter [General function prediction only]. 33900 COG4148: ABC-type molybdate transport system, ATPase component [Inorganic ion transport and metabolism]. 33901 COG4149: ABC-type molybdate transport system, permease component [Inorganic ion transport and metabolism]. 33902 COG4150: ABC-type sulfate transport system, periplasmic component [Inorganic ion transport and metabolism]. 33903 COG4152: ABC-type uncharacterized transport system, ATPase component [General function prediction only]. 33904 COG4154: Fucose dissimilation pathway protein FucU [Carbohydrate transport and metabolism]. 
33905 COG4158: Predicted ABC-type sugar transport system, permease component [General function prediction only]. 33906 COG4160: ABC-type arginine/histidine transport system, permease component [Amino acid transport and metabolism]. 33907 COG4161: ABC-type arginine transport system, ATPase component [Amino acid transport and metabolism]. 33908 COG4166: ABC-type oligopeptide transport system, periplasmic component [Amino acid transport and metabolism]. 33909 COG4167: ABC-type antimicrobial peptide transport system, ATPase component [Defense mechanisms]. 33910 COG4168: ABC-type antimicrobial peptide transport system, permease component [Defense mechanisms]. 33911 COG4170: ABC-type antimicrobial peptide transport system, ATPase component [Defense mechanisms]. 33912 COG4171: ABC-type antimicrobial peptide transport system, permease component [Defense mechanisms]. 33913 COG4172: ABC-type uncharacterized transport system, duplicated ATPase component [General function prediction only]. 33914 COG4174: ABC-type uncharacterized transport system, permease component [General function prediction only]. 33915 COG4175: ABC-type proline/glycine betaine transport system, ATPase component [Amino acid transport and metabolism]. 33916 COG4176: ABC-type proline/glycine betaine transport system, permease component [Amino acid transport and metabolism]. 33917 COG4177: ABC-type branched-chain amino acid transport system, permease component [Amino acid transport and metabolism]. 33918 COG4178: ABC-type uncharacterized transport system, permease and ATPase components [General function prediction only]. 33919 COG4181: Predicted ABC-type transport system involved in lysophospholipase L1 biosynthesis, ATPase component [Secondary metabolites biosynthesis, transport, and catabolism]. 33920 COG4185: Uncharacterized protein conserved in bacteria [Function unknown]. 33921 COG4186: Predicted phosphoesterase or phosphohydrolase [General function prediction only]. 
33922 COG4187: Arginine degradation protein (predicted deacylase) [Amino acid transport and metabolism]. 33923 COG4188: Predicted dienelactone hydrolase [General function prediction only]. 33924 COG4189: Predicted transcriptional regulator [Transcription]. 33925 COG4190: Predicted transcriptional regulator [Transcription]. 33926 COG4191: Signal transduction histidine kinase regulating C4-dicarboxylate transport system [Signal transduction mechanisms]. 33927 COG4192: Signal transduction histidine kinase regulating phosphoglycerate transport system [Signal transduction mechanisms]. 33928 COG4193: Beta-N-acetylglucosaminidase [Carbohydrate transport and metabolism]. 33929 COG4194: Predicted membrane protein [General function prediction only]. 33930 COG4195: Phage-related replication protein [General function prediction only]. 33931 COG4196: Uncharacterized protein conserved in bacteria [Function unknown]. 33932 COG4197: Uncharacterized protein conserved in bacteria, prophage-related [Function unknown]. 33933 COG4198: Uncharacterized conserved protein [Function unknown]. 33934 COG4199: Uncharacterized protein conserved in bacteria [Function unknown]. 33935 COG4200: Uncharacterized protein conserved in bacteria [Function unknown]. 33936 COG4206: Outer membrane cobalamin receptor protein [Coenzyme metabolism]. 33937 COG4208: ABC-type sulfate transport system, permease component [Inorganic ion transport and metabolism]. 33938 COG4209: ABC-type polysaccharide transport system, permease component [Carbohydrate transport and metabolism]. 33939 COG4211: ABC-type glucose/galactose transport system, permease component [Carbohydrate transport and metabolism]. 33940 COG4213: ABC-type xylose transport system, periplasmic component [Carbohydrate transport and metabolism]. 33941 COG4214: ABC-type xylose transport system, permease component [Carbohydrate transport and metabolism]. 
33942 COG4215: ABC-type arginine transport system, permease component [Amino acid transport and metabolism]. 33943 COG4218: Tetrahydromethanopterin S-methyltransferase, subunit F [Coenzyme metabolism]. 33944 COG4219: Antirepressor regulating drug resistance, predicted signal transduction N-terminal membrane component [Transcription / Signal transduction mechanisms]. 33945 COG4220: Phage DNA packaging protein, Nu1 subunit of terminase [DNA replication, recombination, and repair]. 33946 COG4221: Short-chain alcohol dehydrogenase of unknown specificity [General function prediction only]. 33947 COG4222: Uncharacterized protein conserved in bacteria [Function unknown]. 33948 COG4223: Uncharacterized protein conserved in bacteria [Function unknown]. 33949 COG4224: Uncharacterized protein conserved in bacteria [Function unknown]. 33950 COG4225: Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins [General function prediction only]. 33951 COG4226: Uncharacterized protein encoded in hypervariable junctions of pilus gene clusters [Function unknown]. 33952 COG4227: Antirestriction protein [DNA replication, recombination, and repair]. 33953 COG4228: Mu-like prophage DNA circulation protein [General function prediction only]. 33954 COG4229: Predicted enolase-phosphatase [Energy production and conversion]. 33955 COG4230: Delta 1-pyrroline-5-carboxylate dehydrogenase [Energy production and conversion]. 33956 COG4231: Indolepyruvate ferredoxin oxidoreductase, alpha and beta subunits [Energy production and conversion]. 33957 COG4232: Thiol:disulfide interchange protein [Posttranslational modification, protein turnover, chaperones / Energy production and conversion]. 33958 COG4233: Uncharacterized protein predicted to be involved in C-type cytochrome biogenesis [Posttranslational modification, protein turnover, chaperones / Energy production and conversion]. 
33959 COG4235: Cytochrome c biogenesis factor [Posttranslational modification, protein turnover, chaperones]. 33960 COG4237: Hydrogenase 4 membrane component (E) [Energy production and conversion]. 33961 COG4238: Murein lipoprotein [Cell envelope biogenesis, outer membrane]. 33962 COG4239: ABC-type uncharacterized transport system, permease component [General function prediction only]. 33963 COG4240: Predicted kinase [General function prediction only]. 33964 COG4241: Predicted membrane protein [Function unknown]. 33965 COG4242: Cyanophycinase and related exopeptidases [Secondary metabolites biosynthesis, transport, and catabolism / Inorganic ion transport and metabolism]. 33966 COG4243: Predicted membrane protein [Function unknown]. 33967 COG4244: Predicted membrane protein [Function unknown]. 33968 COG4245: Uncharacterized protein encoded in toxicity protection region of plasmid R478, contains von Willebrand factor (vWF) domain [General function prediction only]. 33969 COG4246: Uncharacterized protein conserved in bacteria [Function unknown]. 33970 COG4247: 3-phytase (myo-inositol-hexaphosphate 3-phosphohydrolase) [Lipid metabolism]. 33971 COG4248: Uncharacterized protein with protein kinase and helix-hairpin-helix DNA-binding domains [General function prediction only]. 33972 COG4249: Uncharacterized protein containing caspase domain [General function prediction only]. 33973 COG4250: Predicted sensor protein/domain [Signal transduction mechanisms]. 33974 COG4251: Bacteriophytochrome (light-regulated signal transduction histidine kinase) [Signal transduction mechanisms]. 33975 COG4252: Predicted transmembrane sensor domain [Signal transduction mechanisms]. 33976 COG4253: Uncharacterized protein conserved in bacteria [Function unknown]. 33977 COG4254: Uncharacterized protein conserved in bacteria [Function unknown]. 33978 COG4255: Uncharacterized protein conserved in bacteria [Function unknown]. 
33979 COG4256: Hemin uptake protein [Inorganic ion transport and metabolism]. 33980 COG4257: Streptogramin lyase [Defense mechanisms]. 33981 COG4258: Predicted exporter [General function prediction only]. 33982 COG4259: Uncharacterized protein conserved in bacteria [Function unknown]. 33983 COG4260: Putative virion core protein (lumpy skin disease virus) [Function unknown]. 33984 COG4261: Predicted acyltransferase [General function prediction only]. 33985 COG4262: Predicted spermidine synthase with an N-terminal membrane domain [General function prediction only]. 33986 COG4263: Nitrous oxide reductase [Energy production and conversion]. 33987 COG4264: Siderophore synthetase component [Secondary metabolites biosynthesis, transport, and catabolism]. 33988 COG4266: Allantoicase [Nucleotide transport and metabolism]. 33989 COG4267: Predicted membrane protein [Function unknown]. 33990 COG4268: McrBC 5-methylcytosine restriction system component [Defense mechanisms]. 33991 COG4269: Predicted membrane protein [Function unknown]. 33992 COG4270: Predicted membrane protein [Function unknown]. 33993 COG4271: Predicted nucleotide-binding protein containing TIR-like domain [Transcription]. 33994 COG4272: Predicted membrane protein [Function unknown]. 33995 COG4273: Uncharacterized conserved protein [Function unknown]. 33996 COG4274: Uncharacterized conserved protein [Function unknown]. 33997 COG4275: Uncharacterized conserved protein [Function unknown]. 33998 COG4276: Uncharacterized conserved protein [Function unknown]. 33999 COG4277: Predicted DNA-binding protein with the Helix-hairpin-helix motif [General function prediction only]. 34000 COG4278: Uncharacterized conserved protein [Function unknown]. 34001 COG4279: Uncharacterized conserved protein [Function unknown]. 34002 COG4280: Predicted membrane protein [Function unknown]. 34003 COG4281: Acyl-CoA-binding protein [Lipid metabolism]. 
34004 COG4282: Protein involved in beta-1,3-glucan synthesis [Carbohydrate transport and metabolism]. 34005 COG4283: Uncharacterized conserved protein [Function unknown]. 34006 COG4284: UDP-glucose pyrophosphorylase [Carbohydrate transport and metabolism]. 34007 COG4285: Uncharacterized conserved protein [Function unknown]. 34008 COG4286: Uncharacterized conserved protein related to MYG1 family [Function unknown]. 34009 COG4287: PhoPQ-activated pathogenicity-related protein [General function prediction only]. 34010 COG4288: Uncharacterized protein conserved in bacteria [Function unknown]. 34011 COG4289: Uncharacterized protein conserved in bacteria [Function unknown]. 34012 COG4290: Guanyl-specific ribonuclease Sa [Nucleotide transport and metabolism]. 34013 COG4291: Predicted membrane protein [Function unknown]. 34014 COG4292: Predicted membrane protein [Function unknown]. 34015 COG4293: Uncharacterized protein conserved in bacteria [Function unknown]. 34016 COG4294: UV damage repair endonuclease [DNA replication, recombination, and repair]. 34017 COG4295: Uncharacterized protein conserved in bacteria [Function unknown]. 34018 COG4296: Uncharacterized protein conserved in bacteria [Function unknown]. 34019 COG4297: Uncharacterized protein containing double-stranded beta helix domain [Function unknown]. 34020 COG4298: Uncharacterized protein conserved in bacteria [Function unknown]. 34021 COG4299: Uncharacterized protein conserved in bacteria [Function unknown]. 34022 COG4300: Predicted permease, cadmium resistance protein [Inorganic ion transport and metabolism]. 34023 COG4301: Uncharacterized conserved protein [Function unknown]. 34024 COG4302: Ethanolamine ammonia-lyase, small subunit [Amino acid transport and metabolism]. 34025 COG4303: Ethanolamine ammonia-lyase, large subunit [Amino acid transport and metabolism]. 34026 COG4304: Uncharacterized protein conserved in bacteria [Function unknown]. 
34027 COG4305: Endoglucanase C-terminal domain/subunit and related proteins [Carbohydrate transport and metabolism]. 34028 COG4306: Uncharacterized protein conserved in bacteria [Function unknown]. 34029 COG4307: Uncharacterized protein conserved in bacteria [Function unknown]. 34030 COG4308: Limonene-1,2-epoxide hydrolase [Secondary metabolites biosynthesis, transport, and catabolism]. 34031 COG4309: Uncharacterized conserved protein [Function unknown]. 34032 COG4310: Uncharacterized protein conserved in bacteria with an aminopeptidase-like domain [General function prediction only]. 34033 COG4311: Sarcosine oxidase delta subunit [Amino acid transport and metabolism]. 34034 COG4312: Uncharacterized protein conserved in bacteria [Function unknown]. 34035 COG4313: Protein involved in meta-pathway of phenol degradation [Energy production and conversion]. 34036 COG4314: Predicted lipoprotein involved in nitrous oxide reduction [Energy production and conversion]. 34037 COG4315: Uncharacterized protein conserved in bacteria [Function unknown]. 34038 COG4316: Uncharacterized protein conserved in bacteria [Function unknown]. 34039 COG4317: Uncharacterized protein conserved in bacteria [Function unknown]. 34040 COG4318: Uncharacterized protein conserved in bacteria [Function unknown]. 34041 COG4319: Ketosteroid isomerase homolog [Function unknown]. 34042 COG4320: Uncharacterized protein conserved in bacteria [Function unknown]. 34043 COG4321: Uncharacterized protein related to arylsulfate sulfotransferase involved in siderophore biosynthesis [General function prediction only]. 34044 COG4322: Uncharacterized protein conserved in bacteria [Function unknown]. 34045 COG4323: Predicted membrane protein [Function unknown]. 34046 COG4324: Predicted aminopeptidase [General function prediction only]. 34047 COG4325: Predicted membrane protein [Function unknown]. 34048 COG4326: Sporulation control protein [General function prediction only]. 
34049 COG4327: Predicted membrane protein [Function unknown]. 34050 COG4328: Uncharacterized protein conserved in bacteria [Function unknown]. 34051 COG4329: Predicted membrane protein [Function unknown]. 34052 COG4330: Predicted membrane protein [Function unknown]. 34053 COG4331: Predicted membrane protein [Function unknown]. 34054 COG4332: Uncharacterized protein conserved in bacteria [Function unknown]. 34055 COG4333: Uncharacterized protein conserved in bacteria [Function unknown]. 34056 COG4334: Uncharacterized protein conserved in bacteria [Function unknown]. 34057 COG4335: DNA alkylation repair enzyme [DNA replication, recombination, and repair]. 34058 COG4336: Uncharacterized conserved protein [Function unknown]. 34059 COG4337: Uncharacterized protein conserved in bacteria [Function unknown]. 34060 COG4338: Uncharacterized protein conserved in bacteria [Function unknown]. 34061 COG4339: Uncharacterized protein conserved in bacteria [Function unknown]. 34062 COG4340: Uncharacterized protein conserved in bacteria [Function unknown]. 34063 COG4341: Predicted HD phosphohydrolase [General function prediction only]. 34064 COG4342: Uncharacterized protein conserved in archaea [Function unknown]. 34065 COG4343: Uncharacterized protein conserved in archaea [Function unknown]. 34066 COG4344: Uncharacterized protein conserved in archaea [Function unknown]. 34067 COG4345: Uncharacterized protein conserved in archaea [Function unknown]. 34068 COG4346: Predicted membrane-bound dolichyl-phosphate-mannose-protein mannosyltransferase [Posttranslational modification, protein turnover, chaperones]. 34069 COG4347: Predicted membrane protein [Function unknown]. 34070 COG4352: Ribosomal protein L13E [Translation, ribosomal structure and biogenesis]. 34071 COG4353: Uncharacterized conserved protein [Function unknown]. 34072 COG4354: Predicted bile acid beta-glucosidase [Carbohydrate transport and metabolism]. 34073 COG4357: Uncharacterized conserved protein [Function unknown]. 
34074 COG4359: Uncharacterized conserved protein [Function unknown]. 34075 COG4360: ATP adenylyltransferase (5',5'''-P-1,P-4-tetraphosphate phosphorylase II) [Nucleotide transport and metabolism]. 34076 COG4362: Nitric oxide synthase, oxygenase domain [Inorganic ion transport and metabolism / Amino acid transport and metabolism]. 34077 COG4365: Uncharacterized protein conserved in bacteria [Function unknown]. 34078 COG4367: Uncharacterized protein conserved in bacteria [Function unknown]. 34079 COG4370: Uncharacterized protein conserved in bacteria [Function unknown]. 34080 COG4371: Predicted membrane protein [Function unknown]. 34081 COG4372: Uncharacterized protein conserved in bacteria with the myosin-like domain [Function unknown]. 34082 COG4373: Mu-like prophage FluMu protein gp28 [General function prediction only]. 34083 COG4374: Uncharacterized protein conserved in bacteria [Function unknown]. 34084 COG4377: Predicted membrane protein [Function unknown]. 34085 COG4378: Uncharacterized protein conserved in bacteria [Function unknown]. 34086 COG4379: Mu-like prophage tail protein gpP [General function prediction only]. 34087 COG4380: Uncharacterized protein conserved in bacteria [Function unknown]. 34088 COG4381: Mu-like prophage protein gp46 [Function unknown]. 34089 COG4382: Mu-like prophage protein gp16 [Function unknown]. 34090 COG4383: Mu-like prophage protein gp29 [Function unknown]. 34091 COG4384: Mu-like prophage protein gp45 [Function unknown]. 34092 COG4385: Bacteriophage P2-related tail formation protein [General function prediction only]. 34093 COG4386: Mu-like prophage tail sheath protein gpL [General function prediction only]. 34094 COG4387: Mu-like prophage protein gp36 [Function unknown]. 34095 COG4388: Mu-like prophage I protein [General function prediction only]. 34096 COG4389: Site-specific recombinase [DNA replication, recombination, and repair]. 34097 COG4390: Uncharacterized protein conserved in bacteria [Function unknown]. 
34098 COG4391: Uncharacterized protein conserved in bacteria [Function unknown]. 34099 COG4392: Predicted membrane protein [Function unknown]. 34100 COG4393: Predicted membrane protein [Function unknown]. 34101 COG4394: Uncharacterized protein conserved in bacteria [Function unknown]. 34102 COG4395: Uncharacterized protein conserved in bacteria [Function unknown]. 34103 COG4396: Mu-like prophage host-nuclease inhibitor protein Gam [General function prediction only]. 34104 COG4397: Mu-like prophage major head subunit gpT [General function prediction only]. 34105 COG4398: Uncharacterized protein conserved in bacteria [Function unknown]. 34106 COG4399: Uncharacterized protein conserved in bacteria [Function unknown]. 34107 COG4401: Chorismate mutase [Amino acid transport and metabolism]. 34108 COG4402: Uncharacterized protein conserved in bacteria [Function unknown]. 34109 COG4403: Lantibiotic modifying enzyme [Defense mechanisms]. 34110 COG4405: Uncharacterized protein conserved in bacteria [Function unknown]. 34111 COG4408: Uncharacterized protein conserved in bacteria [Function unknown]. 34112 COG4409: Neuraminidase (sialidase) [Carbohydrate transport and metabolism]. 34113 COG4412: Uncharacterized protein conserved in bacteria [Function unknown]. 34114 COG4413: Urea transporter [Amino acid transport and metabolism]. 34115 COG4416: Mu-like prophage protein Com [General function prediction only]. 34116 COG4420: Predicted membrane protein [Function unknown]. 34117 COG4421: Capsular polysaccharide biosynthesis protein [Carbohydrate transport and metabolism]. 34118 COG4422: Bacteriophage protein gp37 [Function unknown]. 34119 COG4423: Uncharacterized protein conserved in bacteria [Function unknown]. 34120 COG4424: Uncharacterized protein conserved in bacteria [Function unknown]. 34121 COG4425: Predicted membrane protein [Function unknown]. 34122 COG4427: Uncharacterized protein conserved in bacteria [Function unknown]. 
34123 COG4430: Uncharacterized protein conserved in bacteria [Function unknown]. 34124 COG4443: Uncharacterized protein conserved in bacteria [Function unknown]. 34125 COG4445: Hydroxylase for synthesis of 2-methylthio-cis-ribozeatin in tRNA [Nucleotide transport and metabolism / Translation, ribosomal structure and biogenesis]. 34126 COG4446: Uncharacterized protein conserved in bacteria [Function unknown]. 34127 COG4447: Uncharacterized protein related to plant photosystem II stability/assembly factor [General function prediction only]. 34128 COG4448: L-asparaginase II [Amino acid transport and metabolism]. 34129 COG4449: Predicted protease of the Abi (CAAX) family [General function prediction only]. 34130 COG4451: Ribulose bisphosphate carboxylase small subunit [Energy production and conversion]. 34131 COG4452: Inner membrane protein involved in colicin E2 resistance [Defense mechanisms]. 34132 COG4453: Uncharacterized protein conserved in bacteria [Function unknown]. 34133 COG4454: Uncharacterized copper-binding protein [Inorganic ion transport and metabolism]. 34134 COG4455: Protein of avirulence locus involved in temperature-dependent protein secretion [General function prediction only]. 34135 COG4456: Virulence-associated protein and related proteins [Function unknown]. 34136 COG4457: Uncharacterized protein conserved in bacteria, putative virulence factor [Function unknown]. 34137 COG4458: Uncharacterized protein conserved in bacteria, putative virulence factor [Function unknown]. 34138 COG4459: Periplasmic nitrate reductase system, NapE component [Energy production and conversion]. 34139 COG4460: Uncharacterized protein conserved in bacteria [Function unknown]. 34140 COG4461: Uncharacterized protein conserved in bacteria, putative lipoprotein [Function unknown]. 34141 COG4463: Transcriptional repressor of class III stress genes [Transcription]. 
34142 COG4464: Capsular polysaccharide biosynthesis protein [Carbohydrate transport and metabolism / Cell envelope biogenesis, outer membrane]. 34143 COG4465: Pleiotropic transcriptional repressor [Transcription]. 34144 COG4466: Uncharacterized protein conserved in bacteria [Function unknown]. 34145 COG4467: Uncharacterized protein conserved in bacteria [Function unknown]. 34146 COG4468: Galactose-1-phosphate uridyltransferase [Carbohydrate transport and metabolism]. 34147 COG4469: Competence protein [General function prediction only]. 34148 COG4470: Uncharacterized protein conserved in bacteria [Function unknown]. 34149 COG4471: Uncharacterized protein conserved in bacteria [Function unknown]. 34150 COG4472: Uncharacterized protein conserved in bacteria [Function unknown]. 34151 COG4473: Predicted ABC-type exoprotein transport system, permease component [Intracellular trafficking and secretion]. 34152 COG4474: Uncharacterized protein conserved in bacteria [Function unknown]. 34153 COG4475: Uncharacterized protein conserved in bacteria [Function unknown]. 34154 COG4476: Uncharacterized protein conserved in bacteria [Function unknown]. 34155 COG4477: Negative regulator of septation ring formation [Cell division and chromosome partitioning]. 34156 COG4478: Predicted membrane protein [Function unknown]. 34157 COG4479: Uncharacterized protein conserved in bacteria [Function unknown]. 34158 COG4481: Uncharacterized protein conserved in bacteria [Function unknown]. 34159 COG4483: Uncharacterized protein conserved in bacteria [Function unknown]. 34160 COG4485: Predicted membrane protein [Function unknown]. 34161 COG4487: Uncharacterized protein conserved in bacteria [Function unknown]. 34162 COG4492: ACT domain-containing protein [General function prediction only]. 34163 COG4493: Uncharacterized protein conserved in bacteria [Function unknown]. 34164 COG4495: Uncharacterized protein conserved in bacteria [Function unknown]. 
34165 COG4496: Uncharacterized protein conserved in bacteria [Function unknown]. 34166 COG4499: Predicted membrane protein [Function unknown]. 34167 COG4502: Uncharacterized protein conserved in bacteria [Function unknown]. 34168 COG4506: Uncharacterized protein conserved in bacteria [Function unknown]. 34169 COG4508: Uncharacterized protein conserved in bacteria [Function unknown]. 34170 COG4509: Uncharacterized protein conserved in bacteria [Function unknown]. 34171 COG4512: Membrane protein putatively involved in post-translational modification of the autoinducing quorum-sensing peptide [Posttranslational modification, protein turnover, chaperones / Signal transduction mechanisms / Transcription]. 34172 COG4517: Uncharacterized protein conserved in bacteria [Function unknown]. 34173 COG4518: Mu-like prophage FluMu protein gp41 [Function unknown]. 34174 COG4519: Uncharacterized protein conserved in bacteria [Function unknown]. 34175 COG4520: Surface antigen [Cell envelope biogenesis, outer membrane]. 34176 COG4521: ABC-type taurine transport system, periplasmic component [Inorganic ion transport and metabolism]. 34177 COG4525: ABC-type taurine transport system, ATPase component [Inorganic ion transport and metabolism]. 34178 COG4529: Uncharacterized protein conserved in bacteria [Function unknown]. 34179 COG4530: Uncharacterized protein conserved in bacteria [Function unknown]. 34180 COG4531: ABC-type Zn2+ transport system, periplasmic component/surface adhesin [Inorganic ion transport and metabolism]. 34181 COG4533: ABC-type uncharacterized transport system, periplasmic component [General function prediction only]. 34182 COG4535: Putative Mg2+ and Co2+ transporter CorC [Inorganic ion transport and metabolism]. 34183 COG4536: Putative Mg2+ and Co2+ transporter CorB [Inorganic ion transport and metabolism]. 34184 COG4537: Competence protein ComGC [Intracellular trafficking and secretion]. 34185 COG4538: Uncharacterized conserved protein [Function unknown]. 
34186 COG4539: Predicted membrane protein [Function unknown]. 34187 COG4540: Phage P2 baseplate assembly protein gpV [General function prediction only]. 34188 COG4541: Predicted membrane protein [Function unknown]. 34189 COG4542: Protein involved in propanediol utilization, and related proteins (includes coumermycin biosynthetic protein), possible kinase [Secondary metabolites biosynthesis, transport, and catabolism]. 34190 COG4544: Uncharacterized conserved protein [Function unknown]. 34191 COG4545: Glutaredoxin-related protein [Posttranslational modification, protein turnover, chaperones]. 34192 COG4547: Cobalamin biosynthesis protein CobT (nicotinate-mononucleotide:5,6-dimethylbenzimidazole phosphoribosyltransferase) [Coenzyme metabolism]. 34193 COG4548: Nitric oxide reductase activation protein [Inorganic ion transport and metabolism]. 34194 COG4549: Uncharacterized protein conserved in bacteria [Function unknown]. 34195 COG4550: Predicted membrane protein [Function unknown]. 34196 COG4551: Predicted protein tyrosine phosphatase [General function prediction only]. 34197 COG4552: Predicted acetyltransferase involved in intracellular survival and related acetyltransferases [General function prediction only]. 34198 COG4553: Poly-beta-hydroxyalkanoate depolymerase [Lipid metabolism]. 34199 COG4555: ABC-type Na+ transport system, ATPase component [Energy production and conversion / Inorganic ion transport and metabolism]. 34200 COG4558: ABC-type hemin transport system, periplasmic component [Inorganic ion transport and metabolism]. 34201 COG4559: ABC-type hemin transport system, ATPase component [Inorganic ion transport and metabolism]. 34202 COG4564: Signal transduction histidine kinase [Signal transduction mechanisms]. 34203 COG4565: Response regulator of citrate/malate metabolism [Transcription / Signal transduction mechanisms]. 34204 COG4566: Response regulator [Signal transduction mechanisms]. 
34205 COG4567: Response regulator consisting of a CheY-like receiver domain and a Fis-type HTH domain [Signal transduction mechanisms / Transcription]. 34206 COG4568: Transcriptional antiterminator [Transcription]. 34207 COG4569: Acetaldehyde dehydrogenase (acetylating) [Secondary metabolites biosynthesis, transport, and catabolism]. 34208 COG4570: Holliday junction resolvase [DNA replication, recombination, and repair]. 34209 COG4571: Outer membrane protease [Cell envelope biogenesis, outer membrane]. 34210 COG4572: Putative cation transport regulator [General function prediction only]. 34211 COG4573: Predicted tagatose 6-phosphate kinase [Carbohydrate transport and metabolism]. 34212 COG4574: Serine protease inhibitor ecotin [General function prediction only]. 34213 COG4575: Uncharacterized conserved protein [Function unknown]. 34214 COG4576: Carbon dioxide concentrating mechanism/carboxysome shell protein [Secondary metabolites biosynthesis, transport, and catabolism / Energy production and conversion]. 34215 COG4577: Carbon dioxide concentrating mechanism/carboxysome shell protein [Secondary metabolites biosynthesis, transport, and catabolism / Energy production and conversion]. 34216 COG4578: Glucitol operon activator [Transcription]. 34217 COG4579: Isocitrate dehydrogenase kinase/phosphatase [Signal transduction mechanisms]. 34218 COG4580: Maltoporin (phage lambda and maltose receptor) [Carbohydrate transport and metabolism]. 34219 COG4581: Superfamily II RNA helicase [DNA replication, recombination, and repair]. 34220 COG4582: Uncharacterized protein conserved in bacteria [Function unknown]. 34221 COG4583: Sarcosine oxidase gamma subunit [Amino acid transport and metabolism]. 34222 COG4584: Transposase and inactivated derivatives [DNA replication, recombination, and repair]. 34223 COG4585: Signal transduction histidine kinase [Signal transduction mechanisms]. 
34224 COG4586: ABC-type uncharacterized transport system, ATPase component [General function prediction only]. 34225 COG4587: ABC-type uncharacterized transport system, permease component [General function prediction only]. 34226 COG4588: Accessory colonization factor AcfC, contains ABC-type periplasmic domain [General function prediction only]. 34227 COG4589: Predicted CDP-diglyceride synthetase/phosphatidate cytidylyltransferase [General function prediction only]. 34228 COG4590: ABC-type uncharacterized transport system, permease component [General function prediction only]. 34229 COG4591: ABC-type transport system, involved in lipoprotein release, permease component [Cell envelope biogenesis, outer membrane]. 34230 COG4592: ABC-type Fe2+-enterobactin transport system, periplasmic component [Inorganic ion transport and metabolism]. 34231 COG4594: ABC-type Fe3+-citrate transport system, periplasmic component [Inorganic ion transport and metabolism]. 34232 COG4597: ABC-type amino acid transport system, permease component [Amino acid transport and metabolism]. 34233 COG4598: ABC-type histidine transport system, ATPase component [Amino acid transport and metabolism]. 34234 COG4603: ABC-type uncharacterized transport system, permease component [General function prediction only]. 34235 COG4604: ABC-type enterochelin transport system, ATPase component [Inorganic ion transport and metabolism]. 34236 COG4605: ABC-type enterochelin transport system, permease component [Inorganic ion transport and metabolism]. 34237 COG4606: ABC-type enterochelin transport system, permease component [Inorganic ion transport and metabolism]. 34238 COG4607: ABC-type enterochelin transport system, periplasmic component [Inorganic ion transport and metabolism]. 34240 COG4615: ABC-type siderophore export system, fused ATPase and permease components [Secondary metabolites biosynthesis, transport, and catabolism / Inorganic ion transport and metabolism]. 
34241 COG4618: ABC-type protease/lipase transport system, ATPase and permease components [General function prediction only]. 34242 COG4619: ABC-type uncharacterized transport system, ATPase component [General function prediction only]. 34243 COG4623: Predicted soluble lytic transglycosylase fused to an ABC-type amino acid-binding protein [Cell envelope biogenesis, outer membrane]. 34244 COG4624: Iron only hydrogenase large subunit, C-terminal domain [General function prediction only]. 34245 COG4625: Uncharacterized protein with a C-terminal OMP (outer membrane protein) domain [Function unknown]. 34246 COG4626: Phage terminase-like protein, large subunit [General function prediction only]. 34247 COG4627: Uncharacterized protein conserved in bacteria [Function unknown]. 34248 COG4628: Uncharacterized conserved protein [Function unknown]. 34249 COG4630: Xanthine dehydrogenase, iron-sulfur cluster and FAD-binding subunit A [Nucleotide transport and metabolism]. 34250 COG4631: Xanthine dehydrogenase, molybdopterin-binding subunit B [Nucleotide transport and metabolism]. 34251 COG4632: Exopolysaccharide biosynthesis protein related to N-acetylglucosamine-1-phosphodiester alpha-N-acetylglucosaminidase [Carbohydrate transport and metabolism]. 34252 COG4633: Uncharacterized protein conserved in bacteria [Function unknown]. 34253 COG4634: Uncharacterized protein conserved in bacteria [Function unknown]. 34254 COG4635: Flavodoxin [Energy production and conversion / Coenzyme metabolism]. 34255 COG4636: Uncharacterized protein conserved in cyanobacteria [Function unknown]. 34256 COG4637: Predicted ATPase [General function prediction only]. 34257 COG4638: Phenylpropionate dioxygenase and related ring-hydroxylating dioxygenases, large terminal subunit [Inorganic ion transport and metabolism / General function prediction only]. 34258 COG4639: Predicted kinase [General function prediction only]. 34259 COG4640: Predicted membrane protein [Function unknown]. 
34260 COG4641: Uncharacterized protein conserved in bacteria [Function unknown]. 34261 COG4642: Uncharacterized protein conserved in bacteria [Function unknown]. 34262 COG4643: Uncharacterized protein conserved in bacteria [Function unknown]. 34263 COG4644: Transposase and inactivated derivatives, TnpA family [DNA replication, recombination, and repair]. 34264 COG4645: Uncharacterized protein conserved in bacteria [Function unknown]. 34265 COG4646: DNA methylase [Transcription / DNA replication, recombination, and repair]. 34266 COG4647: Acetone carboxylase, gamma subunit [Secondary metabolites biosynthesis, transport, and catabolism]. 34267 COG4648: Predicted membrane protein [Function unknown]. 34268 COG4649: Uncharacterized protein conserved in bacteria [Function unknown]. 34269 COG4650: Sigma54-dependent transcription regulator containing an AAA-type ATPase domain and a DNA-binding domain [Transcription / Signal transduction mechanisms]. 34270 COG4651: Kef-type K+ transport system, predicted NAD-binding component [Inorganic ion transport and metabolism]. 34271 COG4652: Uncharacterized protein conserved in bacteria [Function unknown]. 34272 COG4653: Predicted phage phi-C31 gp36 major capsid-like protein [General function prediction only]. 34273 COG4654: Cytochrome c551/c552 [Energy production and conversion]. 34274 COG4655: Predicted membrane protein [Function unknown]. 34275 COG4656: Predicted NADH:ubiquinone oxidoreductase, subunit RnfC [Energy production and conversion]. 34276 COG4657: Predicted NADH:ubiquinone oxidoreductase, subunit RnfA [Energy production and conversion]. 34277 COG4658: Predicted NADH:ubiquinone oxidoreductase, subunit RnfD [Energy production and conversion]. 34278 COG4659: Predicted NADH:ubiquinone oxidoreductase, subunit RnfG [Energy production and conversion]. 34279 COG4660: Predicted NADH:ubiquinone oxidoreductase, subunit RnfE [Energy production and conversion]. 
34280 COG4662: ABC-type tungstate transport system, periplasmic component [Coenzyme metabolism]. 34281 COG4663: TRAP-type mannitol/chloroaromatic compound transport system, periplasmic component [Secondary metabolites biosynthesis, transport, and catabolism]. 34282 COG4664: TRAP-type mannitol/chloroaromatic compound transport system, large permease component [Secondary metabolites biosynthesis, transport, and catabolism]. 34283 COG4665: TRAP-type mannitol/chloroaromatic compound transport system, small permease component [Secondary metabolites biosynthesis, transport, and catabolism]. 34284 COG4666: TRAP-type uncharacterized transport system, fused permease components [General function prediction only]. 34285 COG4667: Predicted esterase of the alpha-beta hydrolase superfamily [General function prediction only]. 34286 COG4668: Mannitol/fructose-specific phosphotransferase system, IIA domain [Carbohydrate transport and metabolism]. 34287 COG4669: Type III secretory pathway, lipoprotein EscJ [Intracellular trafficking and secretion]. 34288 COG4670: Acyl CoA:acetate/3-ketoacid CoA transferase [Lipid metabolism]. 34289 COG4671: Predicted glycosyl transferase [General function prediction only]. 34290 COG4672: Phage-related protein [Function unknown]. 34291 COG4674: Uncharacterized ABC-type transport system, ATPase component [General function prediction only]. 34292 COG4675: Microcystin-dependent protein [Function unknown]. 34293 COG4676: Uncharacterized protein conserved in bacteria [Function unknown]. 34294 COG4677: Pectin methylesterase [Carbohydrate transport and metabolism]. 34295 COG4678: Muramidase (phage lambda lysozyme) [Carbohydrate transport and metabolism]. 34296 COG4679: Phage-related protein [Function unknown]. 34297 COG4680: Uncharacterized protein conserved in bacteria [Function unknown]. 34298 COG4681: Uncharacterized protein conserved in bacteria [Function unknown]. 34299 COG4682: Predicted membrane protein [Function unknown]. 
34300 COG4683: Uncharacterized protein conserved in bacteria [Function unknown]. 34301 COG4684: Predicted membrane protein [Function unknown]. 34302 COG4685: Uncharacterized protein conserved in bacteria [Function unknown]. 34303 COG4687: Uncharacterized protein conserved in bacteria [Function unknown]. 34304 COG4688: Uncharacterized protein conserved in bacteria [Function unknown]. 34305 COG4689: Acetoacetate decarboxylase [Secondary metabolites biosynthesis, transport, and catabolism]. 34306 COG4690: Dipeptidase [Amino acid transport and metabolism]. 34307 COG4691: Plasmid stability protein [General function prediction only]. 34308 COG4692: Predicted neuraminidase (sialidase) [Carbohydrate transport and metabolism]. 34309 COG4693: Oxidoreductase (NAD-binding), involved in siderophore biosynthesis [Secondary metabolites biosynthesis, transport, and catabolism]. 34310 COG4694: Uncharacterized protein conserved in bacteria [Function unknown]. 34311 COG4695: Phage-related protein [Function unknown]. 34312 COG4696: Uncharacterized protein conserved in bacteria [Function unknown]. 34313 COG4697: Uncharacterized protein conserved in archaea [Function unknown]. 34314 COG4698: Uncharacterized protein conserved in bacteria [Function unknown]. 34315 COG4699: Uncharacterized protein conserved in bacteria [Function unknown]. 34316 COG4700: Uncharacterized protein conserved in bacteria containing a divergent form of TPR repeats [Function unknown]. 34317 COG4701: Uncharacterized protein conserved in bacteria [Function unknown]. 34318 COG4702: Uncharacterized conserved protein [Function unknown]. 34319 COG4703: Uncharacterized protein conserved in bacteria [Function unknown]. 34320 COG4704: Uncharacterized protein conserved in bacteria [Function unknown]. 34321 COG4705: Uncharacterized membrane-anchored protein conserved in bacteria [Function unknown]. 34322 COG4706: Predicted 3-hydroxyacyl-(acyl carrier protein) dehydratase [Lipid metabolism]. 
34323 COG4707: Uncharacterized protein conserved in bacteria [Function unknown]. 34324 COG4708: Predicted membrane protein [Function unknown]. 34325 COG4709: Predicted membrane protein [Function unknown]. 34326 COG4710: Predicted DNA-binding protein with an HTH domain [General function prediction only]. 34327 COG4711: Predicted membrane protein [Function unknown]. 34328 COG4712: Uncharacterized protein conserved in bacteria [Function unknown]. 34329 COG4713: Predicted membrane protein [Function unknown]. 34330 COG4714: Uncharacterized membrane-anchored protein conserved in bacteria [Function unknown]. 34331 COG4715: Uncharacterized conserved protein [Function unknown]. 34332 COG4716: Myosin-crossreactive antigen [Function unknown]. 34333 COG4717: Uncharacterized conserved protein [Function unknown]. 34334 COG4718: Phage-related protein [Function unknown]. 34335 COG4719: Uncharacterized protein conserved in bacteria [Function unknown]. 34336 COG4720: Predicted membrane protein [Function unknown]. 34337 COG4721: Predicted membrane protein [Function unknown]. 34338 COG4722: Phage-related protein [Function unknown]. 34339 COG4723: Phage-related protein, tail component [Function unknown]. 34340 COG4724: Endo-beta-N-acetylglucosaminidase D [Carbohydrate transport and metabolism]. 34341 COG4725: Transcriptional activator, adenine-specific DNA methyltransferase [Signal transduction mechanisms / Transcription]. 34342 COG4726: Tfp pilus assembly protein PilX [Cell motility and secretion / Intracellular trafficking and secretion]. 34343 COG4727: Uncharacterized protein conserved in bacteria [Function unknown]. 34344 COG4728: Uncharacterized protein conserved in bacteria [Function unknown]. 34345 COG4729: Uncharacterized conserved protein [Function unknown]. 34346 COG4731: Uncharacterized protein conserved in bacteria [Function unknown]. 34347 COG4732: Predicted membrane protein [Function unknown]. 34348 COG4733: Phage-related protein, tail component [Function unknown]. 
34349 COG4734: Antirestriction protein [General function prediction only]. 34350 COG4735: Uncharacterized protein conserved in bacteria [Function unknown]. 34351 COG4736: Cbb3-type cytochrome oxidase, subunit 3 [Posttranslational modification, protein turnover, chaperones]. 34352 COG4737: Uncharacterized protein conserved in bacteria [Function unknown]. 34353 COG4738: Predicted transcriptional regulator [Transcription]. 34354 COG4739: Uncharacterized protein containing a ferredoxin domain [Function unknown]. 34355 COG4740: Predicted metalloprotease [General function prediction only]. 34356 COG4741: Predicted secreted endonuclease distantly related to archaeal Holliday junction resolvase [Nucleotide transport and metabolism]. 34357 COG4742: Predicted transcriptional regulator [Transcription]. 34358 COG4743: Predicted membrane protein [Function unknown]. 34359 COG4744: Uncharacterized conserved protein [Function unknown]. 34360 COG4745: Predicted membrane-bound mannosyltransferase [Posttranslational modification, protein turnover, chaperones]. 34361 COG4746: Uncharacterized protein conserved in archaea [Function unknown]. 34362 COG4747: ACT domain-containing protein [General function prediction only]. 34363 COG4748: Uncharacterized conserved protein [Function unknown]. 34364 COG4749: Uncharacterized protein conserved in archaea [Function unknown]. 34365 COG4750: CTP:phosphocholine cytidylyltransferase involved in choline phosphorylation for cell surface LPS epitopes [Cell envelope biogenesis, outer membrane]. 34366 COG4752: Uncharacterized protein conserved in bacteria [Function unknown]. 34367 COG4753: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain [Signal transduction mechanisms]. 34368 COG4754: Uncharacterized conserved protein [Function unknown]. 34369 COG4755: Uncharacterized protein conserved in archaea [Function unknown]. 34370 COG4756: Predicted cation transporter [General function prediction only]. 
34371 COG4757: Predicted alpha/beta hydrolase [General function prediction only]. 34372 COG4758: Predicted membrane protein [Function unknown]. 34373 COG4759: Uncharacterized protein conserved in bacteria containing thioredoxin-like domain [Posttranslational modification, protein turnover, chaperones]. 34374 COG4760: Predicted membrane protein [Function unknown]. 34375 COG4762: Uncharacterized protein conserved in bacteria [Function unknown]. 34376 COG4763: Predicted membrane protein [Function unknown]. 34377 COG4764: Uncharacterized protein conserved in bacteria [Function unknown]. 34378 COG4765: Uncharacterized protein conserved in bacteria [Function unknown]. 34379 COG4766: Ethanolamine utilization protein [Amino acid transport and metabolism]. 34380 COG4767: Glycopeptide antibiotics resistance protein [Defense mechanisms]. 34381 COG4768: Uncharacterized protein containing a divergent version of the methyl-accepting chemotaxis-like domain [General function prediction only]. 34382 COG4769: Predicted membrane protein [Function unknown]. 34383 COG4770: Acetyl/propionyl-CoA carboxylase, alpha subunit [Lipid metabolism]. 34384 COG4771: Outer membrane receptor for ferrienterochelin and colicins [Inorganic ion transport and metabolism]. 34385 COG4772: Outer membrane receptor for Fe3+-dicitrate [Inorganic ion transport and metabolism]. 34386 COG4773: Outer membrane receptor for ferric coprogen and ferric-rhodotorulic acid [Inorganic ion transport and metabolism]. 34387 COG4774: Outer membrane receptor for monomeric catechols [Inorganic ion transport and metabolism]. 34388 COG4775: Outer membrane protein/protective antigen OMA87 [Cell envelope biogenesis, outer membrane]. 34389 COG4776: Exoribonuclease II [Transcription]. 34390 COG4778: ABC-type phosphonate transport system, ATPase component [Inorganic ion transport and metabolism]. 34391 COG4779: ABC-type enterobactin transport system, permease component [Inorganic ion transport and metabolism]. 
34392 COG4781: Membrane domain of membrane-anchored glycerophosphoryl diester phosphodiesterase [Energy production and conversion]. 34393 COG4782: Uncharacterized protein conserved in bacteria [Function unknown]. 34394 COG4783: Putative Zn-dependent protease, contains TPR repeats [General function prediction only]. 34395 COG4784: Putative Zn-dependent protease [General function prediction only]. 34396 COG4785: Lipoprotein NlpI, contains TPR repeats [General function prediction only]. 34397 COG4786: Flagellar basal body rod protein [Cell motility and secretion]. 34398 COG4787: Flagellar basal body rod protein [Cell motility and secretion]. 34399 COG4789: Type III secretory pathway, component EscV [Intracellular trafficking and secretion]. 34400 COG4790: Type III secretory pathway, component EscR [Intracellular trafficking and secretion]. 34401 COG4791: Type III secretory pathway, component EscT [Intracellular trafficking and secretion]. 34402 COG4792: Type III secretory pathway, component EscU [Intracellular trafficking and secretion]. 34403 COG4794: Type III secretory pathway, component EscS [Intracellular trafficking and secretion]. 34404 COG4795: Type II secretory pathway, component PulJ [Intracellular trafficking and secretion]. 34405 COG4796: Type II secretory pathway, component HofQ [Intracellular trafficking and secretion]. 34406 COG4797: Predicted regulatory domain of a methyltransferase [General function prediction only]. 34407 COG4798: Predicted methyltransferase [General function prediction only]. 34408 COG4799: Acetyl-CoA carboxylase, carboxyltransferase component (subunits alpha and beta) [Lipid metabolism]. 34409 COG4800: Predicted transcriptional regulator with an HTH domain [Transcription]. 34410 COG4801: Predicted acyltransferase [General function prediction only]. 34411 COG4802: Ferredoxin-thioredoxin reductase, catalytic subunit [Energy production and conversion]. 34412 COG4803: Predicted membrane protein [Function unknown]. 
34413 COG4804: Uncharacterized conserved protein [Function unknown]. 34414 COG4805: Uncharacterized protein conserved in bacteria [Function unknown]. 34415 COG4806: L-rhamnose isomerase [Carbohydrate transport and metabolism]. 34416 COG4807: Uncharacterized protein conserved in bacteria [Function unknown]. 34417 COG4808: Uncharacterized protein conserved in bacteria [Function unknown]. 34418 COG4809: Archaeal ADP-dependent phosphofructokinase/glucokinase [Carbohydrate transport and metabolism]. 34419 COG4810: Ethanolamine utilization protein [Amino acid transport and metabolism]. 34420 COG4811: Predicted membrane protein [Function unknown]. 34421 COG4812: Ethanolamine utilization cobalamin adenosyltransferase [Amino acid transport and metabolism]. 34422 COG4813: Trehalose utilization protein [Carbohydrate transport and metabolism]. 34423 COG4814: Uncharacterized protein with an alpha/beta hydrolase fold [General function prediction only]. 34424 COG4815: Uncharacterized protein conserved in bacteria [Function unknown]. 34425 COG4816: Ethanolamine utilization protein [Amino acid transport and metabolism]. 34426 COG4817: Uncharacterized protein conserved in bacteria [Function unknown]. 34427 COG4818: Predicted membrane protein [Function unknown]. 34428 COG4819: Ethanolamine utilization protein, possible chaperonin protecting lyase from inhibition [Amino acid transport and metabolism]. 34429 COG4820: Ethanolamine utilization protein, possible chaperonin [Amino acid transport and metabolism]. 34430 COG4821: Uncharacterized protein containing SIS (Sugar ISomerase) phosphosugar binding domain [General function prediction only]. 34431 COG4822: Cobalamin biosynthesis protein CbiK, Co2+ chelatase [Coenzyme metabolism]. 34432 COG4823: Abortive infection bacteriophage resistance protein [Defense mechanisms]. 34433 COG4824: Phage-related holin (Lysis protein) [General function prediction only]. 
34434 COG4825: Uncharacterized membrane-anchored protein conserved in bacteria [Function unknown]. 34435 COG4826: Serine protease inhibitor [Posttranslational modification, protein turnover, chaperones]. 34436 COG4827: Predicted transporter [General function prediction only]. 34437 COG4828: Predicted membrane protein [Function unknown]. 34438 COG4829: Muconolactone delta-isomerase [Secondary metabolites biosynthesis, transport, and catabolism]. 34439 COG4830: Ribosomal protein S26 [Translation, ribosomal structure and biogenesis]. 34440 COG4831: Uncharacterized conserved protein [Function unknown]. 34441 COG4832: Uncharacterized conserved protein [Function unknown]. 34442 COG4833: Predicted glycosyl hydrolase [Carbohydrate transport and metabolism]. 34443 COG4834: Uncharacterized protein conserved in bacteria [Function unknown]. 34444 COG4835: Uncharacterized protein conserved in bacteria [Function unknown]. 34445 COG4836: Predicted membrane protein [Function unknown]. 34446 COG4837: Uncharacterized protein conserved in bacteria [Function unknown]. 34447 COG4838: Uncharacterized protein conserved in bacteria [Function unknown]. 34448 COG4839: Protein required for the initiation of cell division [Cell division and chromosome partitioning]. 34449 COG4840: Uncharacterized protein conserved in bacteria [Function unknown]. 34450 COG4841: Uncharacterized protein conserved in bacteria [Function unknown]. 34451 COG4842: Uncharacterized protein conserved in bacteria [Function unknown]. 34452 COG4843: Uncharacterized protein conserved in bacteria [Function unknown]. 34453 COG4844: Uncharacterized protein conserved in bacteria [Function unknown]. 34454 COG4845: Chloramphenicol O-acetyltransferase [Defense mechanisms]. 34455 COG4846: Membrane protein involved in cytochrome C biogenesis [Posttranslational modification, protein turnover, chaperones]. 34456 COG4847: Uncharacterized protein conserved in archaea [Function unknown]. 
34457 COG4848: Uncharacterized protein conserved in bacteria [Function unknown]. 34458 COG4849: Uncharacterized protein conserved in bacteria [Function unknown]. 34459 COG4850: Uncharacterized conserved protein [Function unknown]. 34460 COG4851: Protein involved in sex pheromone biosynthesis [General function prediction only]. 34461 COG4852: Predicted membrane protein [Function unknown]. 34462 COG4853: Uncharacterized protein conserved in bacteria [Function unknown]. 34463 COG4854: Predicted membrane protein [Function unknown]. 34464 COG4855: Uncharacterized protein conserved in archaea [Function unknown]. 34465 COG4856: Uncharacterized protein conserved in bacteria [Function unknown]. 34466 COG4857: Predicted kinase [General function prediction only]. 34467 COG4858: Uncharacterized membrane-bound protein conserved in bacteria [Function unknown]. 34468 COG4859: Uncharacterized protein conserved in bacteria [Function unknown]. 34469 COG4860: Uncharacterized protein conserved in archaea [Function unknown]. 34470 COG4861: Uncharacterized protein conserved in bacteria [Function unknown]. 34471 COG4862: Negative regulator of genetic competence, sporulation and motility [Posttranslational modification, protein turnover, chaperones / Signal transduction mechanisms / Cell motility and secretion]. 34472 COG4863: Uncharacterized protein conserved in bacteria [Function unknown]. 34473 COG4864: Uncharacterized protein conserved in bacteria [Function unknown]. 34474 COG4865: Glutamate mutase epsilon subunit [Amino acid transport and metabolism]. 34475 COG4866: Uncharacterized conserved protein [Function unknown]. 34476 COG4867: Uncharacterized protein with a von Willebrand factor type A (vWA) domain [General function prediction only]. 34477 COG4868: Uncharacterized protein conserved in bacteria [Function unknown]. 34478 COG4869: Propanediol utilization protein [Secondary metabolites biosynthesis, transport, and catabolism]. 
34479 COG4870: Cysteine protease [Posttranslational modification, protein turnover, chaperones]. 34480 COG4871: Uncharacterized protein conserved in archaea [Function unknown]. 34481 COG4872: Predicted membrane protein [Function unknown]. 34482 COG4873: Uncharacterized protein conserved in bacteria [Function unknown]. 34483 COG4874: Uncharacterized protein conserved in bacteria containing a pentein-type domain [Function unknown]. 34484 COG4875: Uncharacterized protein conserved in bacteria with a cystatin-like fold [Function unknown]. 34485 COG4876: Uncharacterized protein conserved in bacteria [Function unknown]. 34486 COG4877: Uncharacterized protein conserved in bacteria [Function unknown]. 34487 COG4878: Uncharacterized protein conserved in bacteria [Function unknown]. 34488 COG4879: Uncharacterized protein conserved in archaea [Function unknown]. 34489 COG4880: Secreted protein containing C-terminal beta-propeller domain distantly related to WD-40 repeats [General function prediction only]. 34490 COG4881: Predicted membrane protein [Function unknown]. 34491 COG4882: Predicted aminopeptidase, Iap family [General function prediction only]. 34492 COG4883: Uncharacterized protein conserved in archaea [Function unknown]. 34493 COG4884: Uncharacterized protein conserved in bacteria [Function unknown]. 34494 COG4885: Uncharacterized protein conserved in archaea [Function unknown]. 34495 COG4886: Leucine-rich repeat (LRR) protein [Function unknown]. 34496 COG4887: Uncharacterized metal-binding protein conserved in archaea [General function prediction only]. 34497 COG4888: Uncharacterized Zn ribbon-containing protein [General function prediction only]. 34498 COG4889: Predicted helicase [General function prediction only]. 34499 COG4890: Predicted outer membrane lipoprotein [Function unknown]. 34500 COG4891: Uncharacterized conserved protein [Function unknown]. 34501 COG4892: Predicted heme/steroid binding protein [General function prediction only]. 
34502 COG4893: Uncharacterized protein conserved in bacteria [Function unknown]. 34503 COG4894: Uncharacterized conserved protein [Function unknown]. 34504 COG4895: Uncharacterized conserved protein [Function unknown]. 34505 COG4896: Uncharacterized protein conserved in bacteria [Function unknown]. 34506 COG4897: Uncharacterized protein conserved in bacteria [Function unknown]. 34507 COG4898: Uncharacterized protein conserved in bacteria [Function unknown]. 34508 COG4899: Uncharacterized protein conserved in bacteria [Function unknown]. 34509 COG4900: Predicted metallopeptidase [General function prediction only]. 34510 COG4901: Ribosomal protein S25 [Translation, ribosomal structure and biogenesis]. 34511 COG4902: Uncharacterized protein conserved in archaea [Function unknown]. 34512 COG4903: Genetic competence transcription factor [Transcription]. 34513 COG4904: Uncharacterized protein conserved in archaea [Function unknown]. 34514 COG4905: Predicted membrane protein [Function unknown]. 34515 COG4906: Predicted membrane protein [Function unknown]. 34516 COG4907: Predicted membrane protein [Function unknown]. 34517 COG4908: Uncharacterized protein containing a NRPS condensation (elongation) domain [General function prediction only]. 34518 COG4909: Propanediol dehydratase, large subunit [Secondary metabolites biosynthesis, transport, and catabolism]. 34519 COG4910: Propanediol dehydratase, small subunit [Secondary metabolites biosynthesis, transport, and catabolism]. 34520 COG4911: Uncharacterized conserved protein [Function unknown]. 34521 COG4912: Predicted DNA alkylation repair enzyme [DNA replication, recombination, and repair]. 34522 COG4913: Uncharacterized protein conserved in bacteria [Function unknown]. 34523 COG4914: Predicted nucleotidyltransferase [General function prediction only]. 34524 COG4915: 5-bromo-4-chloroindolyl phosphate hydrolysis protein [General function prediction only]. 
34525 COG4916: Uncharacterized protein containing a TIR (Toll-Interleukin 1-resistance) domain [Function unknown]. 34526 COG4917: Ethanolamine utilization protein [Amino acid transport and metabolism]. 34527 COG4918: Uncharacterized protein conserved in bacteria [Function unknown]. 34528 COG4919: Ribosomal protein S30 [Translation, ribosomal structure and biogenesis]. 34529 COG4920: Predicted membrane protein [Function unknown]. 34530 COG4921: Uncharacterized protein conserved in archaea [Function unknown]. 34531 COG4922: Uncharacterized protein conserved in bacteria [Function unknown]. 34532 COG4923: Uncharacterized conserved protein [Function unknown]. 34533 COG4924: Uncharacterized protein conserved in bacteria [Function unknown]. 34534 COG4925: Uncharacterized conserved protein [Function unknown]. 34535 COG4926: Phage-related protein [Function unknown]. 34536 COG4927: Predicted choloylglycine hydrolase [General function prediction only]. 34537 COG4928: Predicted P-loop ATPase [General function prediction only]. 34538 COG4929: Uncharacterized membrane-anchored protein [Function unknown]. 34539 COG4930: Predicted ATP-dependent Lon-type protease [Posttranslational modification, protein turnover, chaperones]. 34540 COG4932: Predicted outer membrane protein [Cell envelope biogenesis, outer membrane]. 34541 COG4933: Uncharacterized conserved protein [Function unknown]. 34542 COG4934: Predicted protease [Posttranslational modification, protein turnover, chaperones]. 34543 COG4935: Regulatory P domain of the subtilisin-like proprotein convertases and other proteases [Posttranslational modification, protein turnover, chaperones]. 34544 COG4936: Predicted sensor domain [Signal transduction mechanisms / Transcription]. 34545 COG4937: Predicted regulatory domain of prephenate dehydrogenase [Translation, ribosomal structure and biogenesis]. 34546 COG4938: Uncharacterized conserved protein [Function unknown]. 
34547 COG4939: Major membrane immunogen, membrane-anchored lipoprotein [Function unknown]. 34548 COG4940: Competence protein ComGF [Intracellular trafficking and secretion]. 34549 COG4941: Predicted RNA polymerase sigma factor containing a TPR repeat domain [Transcription]. 34550 COG4942: Membrane-bound metallopeptidase [Cell division and chromosome partitioning]. 34551 COG4943: Predicted signal transduction protein containing sensor and EAL domains [Signal transduction mechanisms]. 34552 COG4944: Uncharacterized protein conserved in bacteria [Function unknown]. 34553 COG4945: Membrane-anchored protein predicted to be involved in regulation of amylopullulanase [Carbohydrate transport and metabolism]. 34554 COG4946: Uncharacterized protein related to the periplasmic component of the Tol biopolymer transport system [Function unknown]. 34555 COG4947: Uncharacterized protein conserved in bacteria [Function unknown]. 34556 COG4948: L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily [Cell envelope biogenesis, outer membrane / General function prediction only]. 34557 COG4949: Uncharacterized membrane-anchored protein conserved in bacteria [Function unknown]. 34558 COG4950: Uncharacterized protein conserved in bacteria [Function unknown]. 34559 COG4951: Uncharacterized protein conserved in bacteria [Function unknown]. 34560 COG4952: Predicted sugar isomerase [Cell envelope biogenesis, outer membrane]. 34561 COG4953: Membrane carboxypeptidase/penicillin-binding protein PbpC [Cell envelope biogenesis, outer membrane]. 34562 COG4954: Uncharacterized protein conserved in bacteria [Function unknown]. 34563 COG4955: Uncharacterized protein conserved in bacteria [Function unknown]. 34564 COG4956: Integral membrane protein (PIN domain superfamily) [General function prediction only]. 34565 COG4957: Predicted transcriptional regulator [Transcription]. 
34566 COG4959: Type IV secretory pathway, protease TraF [Posttranslational modification, protein turnover, chaperones / Intracellular trafficking and secretion]. 34567 COG4960: Flp pilus assembly protein, protease CpaA [Posttranslational modification, protein turnover, chaperones / Intracellular trafficking and secretion]. 34568 COG4961: Flp pilus assembly protein TadG [Intracellular trafficking and secretion]. 34569 COG4962: Flp pilus assembly protein, ATPase CpaF [Intracellular trafficking and secretion]. 34570 COG4963: Flp pilus assembly protein, ATPase CpaE [Intracellular trafficking and secretion]. 34571 COG4964: Flp pilus assembly protein, secretin CpaC [Intracellular trafficking and secretion]. 34572 COG4965: Flp pilus assembly protein TadB [Intracellular trafficking and secretion]. 34573 COG4966: Tfp pilus assembly protein PilW [Cell motility and secretion / Intracellular trafficking and secretion]. 34574 COG4967: Tfp pilus assembly protein PilV [Cell motility and secretion / Intracellular trafficking and secretion]. 34575 COG4968: Tfp pilus assembly protein PilE [Cell motility and secretion / Intracellular trafficking and secretion]. 34576 COG4969: Tfp pilus assembly protein, major pilin PilA [Cell motility and secretion / Intracellular trafficking and secretion]. 34577 COG4970: Tfp pilus assembly protein FimT [Cell motility and secretion / Intracellular trafficking and secretion]. 34578 COG4972: Tfp pilus assembly protein, ATPase PilM [Cell motility and secretion / Intracellular trafficking and secretion]. 34579 COG4973: Site-specific recombinase XerC [DNA replication, recombination, and repair]. 34580 COG4974: Site-specific recombinase XerD [DNA replication, recombination, and repair]. 34581 COG4975: Putative glucose uptake permease [Carbohydrate transport and metabolism]. 34582 COG4976: Predicted methyltransferase (contains TPR repeat) [General function prediction only]. 
34583 COG4977: Transcriptional regulator containing an amidase domain and an AraC-type DNA-binding HTH domain [Transcription]. 34584 COG4978: Transcriptional regulator, effector-binding domain/component [Transcription / Signal transduction mechanisms]. 34585 COG4980: Gas vesicle protein [General function prediction only]. 34586 COG4981: Enoyl reductase domain of yeast-type FAS1 [Lipid metabolism]. 34587 COG4982: 3-oxoacyl-[acyl-carrier protein]. 34588 COG4983: Uncharacterized conserved protein [Function unknown]. 34589 COG4984: Predicted membrane protein [Function unknown]. 34590 COG4985: ABC-type phosphate transport system, auxiliary component [Inorganic ion transport and metabolism]. 34591 COG4986: ABC-type anion transport system, duplicated permease component [Inorganic ion transport and metabolism]. 34592 COG4987: ABC-type transport system involved in cytochrome bd biosynthesis, fused ATPase and permease components [Energy production and conversion / Posttranslational modification, protein turnover, chaperones]. 34593 COG4988: ABC-type transport system involved in cytochrome bd biosynthesis, ATPase and permease components [Energy production and conversion / Posttranslational modification, protein turnover, chaperones]. 34594 COG4989: Predicted oxidoreductase [General function prediction only]. 34595 COG4990: Uncharacterized protein conserved in bacteria [Function unknown]. 34596 COG4991: Uncharacterized protein with a bacterial SH3 domain homologue [Function unknown]. 34597 COG4992: Ornithine/acetylornithine aminotransferase [Amino acid transport and metabolism]. 34598 COG4993: Glucose dehydrogenase [Carbohydrate transport and metabolism]. 34599 COG4994: Uncharacterized protein conserved in bacteria [Function unknown]. 34600 COG4995: Uncharacterized protein conserved in bacteria [Function unknown]. 34601 COG4996: Predicted phosphatase [General function prediction only]. 34602 COG4997: Uncharacterized conserved protein [Function unknown]. 
34603 COG4998: Predicted endonuclease (RecB family) [DNA replication, recombination, and repair]. 34604 COG4999: Uncharacterized domain of BarA-like signal transduction histidine kinases [Signal transduction mechanisms]. 34605 COG5000: Signal transduction histidine kinase involved in nitrogen fixation and metabolism regulation [Signal transduction mechanisms]. 34606 COG5001: Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain [Signal transduction mechanisms]. 34607 COG5002: Signal transduction histidine kinase [Signal transduction mechanisms]. 34608 COG5003: Mu-like prophage protein gp37 [General function prediction only]. 34609 COG5004: P2-like prophage tail protein X [General function prediction only]. 34610 COG5005: Mu-like prophage protein gpG [General function prediction only]. 34611 COG5006: Predicted permease, DMT superfamily [General function prediction only]. 34612 COG5007: Predicted transcriptional regulator, BolA superfamily [Transcription]. 34613 COG5008: Tfp pilus assembly protein, ATPase PilU [Cell motility and secretion / Intracellular trafficking and secretion]. 34614 COG5009: Membrane carboxypeptidase/penicillin-binding protein [Cell envelope biogenesis, outer membrane]. 34615 COG5010: Flp pilus assembly protein TadD, contains TPR repeats [Intracellular trafficking and secretion]. 34616 COG5011: Uncharacterized protein conserved in bacteria [Function unknown]. 34617 COG5012: Predicted cobalamin binding protein [General function prediction only]. 34618 COG5013: Nitrate reductase alpha subunit [Energy production and conversion]. 34619 COG5014: Predicted Fe-S oxidoreductase [General function prediction only]. 34620 COG5015: Uncharacterized conserved protein [Function unknown]. 34621 COG5016: Pyruvate/oxaloacetate carboxyltransferase [Energy production and conversion]. 34622 COG5017: Uncharacterized conserved protein [Function unknown]. 
34623 COG5018: Inhibitor of the KinA pathway to sporulation, predicted exonuclease [General function prediction only]. 34624 COG5019: Septin family protein [Cell division and chromosome partitioning / Cytoskeleton]. 34625 COG5020: Mannosyltransferase [Carbohydrate transport and metabolism]. 34626 COG5021: Ubiquitin-protein ligase [Posttranslational modification, protein turnover, chaperones]. 34627 COG5022: Myosin heavy chain [Cytoskeleton]. 34628 COG5023: Tubulin [Cytoskeleton]. 34629 COG5024: Cyclin [Cell division and chromosome partitioning]. 34630 COG5025: Transcription factor of the Forkhead/HNF3 family [Transcription]. 34631 COG5026: Hexokinase [Carbohydrate transport and metabolism]. 34632 COG5027: Histone acetyltransferase (MYST family) [Chromatin structure and dynamics]. 34633 COG5028: Vesicle coat complex COPII, subunit SEC24/subunit SFB2/subunit SFB3 [Intracellular trafficking and secretion]. 34634 COG5029: Prenyltransferase, beta subunit [Posttranslational modification, protein turnover, chaperones]. 34635 COG5030: Clathrin adaptor complex, small subunit [Intracellular trafficking and secretion]. 34636 COG5031: Uncharacterized protein involved in ubiquinone biosynthesis [Coenzyme metabolism]. 34637 COG5032: Phosphatidylinositol kinase and protein kinases of the PI-3 kinase family [Signal transduction mechanisms / Cell division and chromosome partitioning / Chromatin structure and dynamics / DNA replication, recombination, and repair / Intracellular trafficking and secretion]. 34638 COG5033: Transcription initiation factor IIF, auxiliary subunit [Transcription]. 34639 COG5034: Chromatin remodeling protein, contains PHD zinc finger [Chromatin structure and dynamics]. 34640 COG5035: Cell cycle control protein [Cell division and chromosome partitioning / Transcription / Signal transduction mechanisms]. 34641 COG5036: SPX domain-containing protein involved in vacuolar polyphosphate accumulation [Inorganic ion transport and metabolism]. 
34642 COG5037: Gluconate transport-inducing protein [Signal transduction mechanisms / Carbohydrate transport and metabolism]. 34643 COG5038: Ca2+-dependent lipid-binding protein, contains C2 domain [General function prediction only]. 34644 COG5039: Exopolysaccharide biosynthesis protein [Carbohydrate transport and metabolism / Cell envelope biogenesis, outer membrane]. 34645 COG5040: 14-3-3 family protein [Signal transduction mechanisms]. 34646 COG5041: Casein kinase II, beta subunit [Signal transduction mechanisms / Cell division and chromosome partitioning / Transcription]. 34647 COG5042: Purine nucleoside permease [Nucleotide transport and metabolism]. 34648 COG5043: Vacuolar protein sorting-associated protein [Intracellular trafficking and secretion]. 34649 COG5044: RAB proteins geranylgeranyltransferase component A (RAB escort protein) [Posttranslational modification, protein turnover, chaperones]. 34650 COG5045: Ribosomal protein S10E [Translation, ribosomal structure and biogenesis]. 34651 COG5046: Protein involved in Mod5 protein sorting [Posttranslational modification, protein turnover, chaperones]. 34652 COG5047: Vesicle coat complex COPII, subunit SEC23 [Intracellular trafficking and secretion]. 34653 COG5048: FOG: Zn-finger [General function prediction only]. 34654 COG5049: 5'-3' exonuclease [DNA replication, recombination, and repair / Cell division and chromosome partitioning / Translation]. 34655 COG5050: sn-1,2-diacylglycerol ethanolamine- and cholinephosphotransferases [Lipid metabolism]. 34656 COG5051: Ribosomal protein L36E [Translation, ribosomal structure and biogenesis]. 34657 COG5052: Protein involved in membrane traffic [Intracellular trafficking and secretion]. 34658 COG5053: Translation initiation factor 4E (eIF-4E) [Translation, ribosomal structure and biogenesis]. 34659 COG5054: Mitochondrial sulfhydryl oxidase involved in the biogenesis of cytosolic Fe/S proteins [Posttranslational modification, protein turnover, chaperones]. 
34660 COG5055: Recombination DNA repair protein (RAD52 pathway) [DNA replication, recombination, and repair]. 34661 COG5056: Acyl-CoA cholesterol acyltransferase [Lipid metabolism]. 34662 COG5057: Phosphotyrosyl phosphatase activator [Cell division and chromosome partitioning / Signal transduction mechanisms]. 34663 COG5058: Protein transporter of the TRAM (translocating chain-associating membrane) superfamily, longevity assurance factor [Intracellular trafficking and secretion]. 34664 COG5059: Kinesin-like protein [Cytoskeleton]. 34665 COG5061: Oxidoreductin, endoplasmic reticulum membrane-associated protein involved in disulfide bond formation [Posttranslational modification, protein turnover, chaperones / Intracellular trafficking and secretion]. 34666 COG5062: Uncharacterized membrane protein [Function unknown]. 34667 COG5063: CCCH-type Zn-finger protein [General function prediction only]. 34668 COG5064: Karyopherin (importin) alpha [Intracellular trafficking and secretion]. 34669 COG5065: Protein involved in inorganic phosphate transport [Inorganic ion transport and metabolism]. 34670 COG5066: VAMP-associated protein involved in inositol metabolism [Intracellular trafficking and secretion]. 34671 COG5067: Protein kinase essential for the initiation of DNA replication [DNA replication, recombination, and repair / Cell division and chromosome partitioning]. 34672 COG5068: Regulator of arginine metabolism and related MADS box-containing transcription factors [Transcription]. 34673 COG5069: Ca2+-binding actin-bundling protein fimbrin/plastin (EF-Hand superfamily) [Cytoskeleton]. 34674 COG5070: Nucleotide-sugar transporter [Carbohydrate transport and metabolism / Posttranslational modification, protein turnover, chaperones / Intracellular trafficking and secretion]. 34675 COG5071: 26S proteasome regulatory complex component [Posttranslational modification, protein turnover, chaperones]. 
34676 COG5072: Serine/threonine kinase of the haspin family [Cell division and chromosome partitioning]. 34677 COG5073: Vacuolar import and degradation protein [Intracellular trafficking and secretion]. 34678 COG5074: t-SNARE complex subunit, syntaxin [Intracellular trafficking and secretion]. 34679 COG5075: Uncharacterized conserved protein [Function unknown]. 34680 COG5076: Transcription factor involved in chromatin remodeling, contains bromodomain [Chromatin structure and dynamics / Transcription]. 34681 COG5077: Ubiquitin carboxyl-terminal hydrolase [Posttranslational modification, protein turnover, chaperones]. 34682 COG5078: Ubiquitin-protein ligase [Posttranslational modification, protein turnover, chaperones]. 34683 COG5079: Nuclear protein export factor [Intracellular trafficking and secretion / Cell division and chromosome partitioning]. 34684 COG5080: Rab GTPase interacting factor, Golgi membrane protein [Intracellular trafficking and secretion]. 34685 COG5081: Predicted membrane protein [Function unknown]. 34686 COG5082: Arginine methyltransferase-interacting protein, contains RING Zn-finger [Posttranslational modification, protein turnover, chaperones / Intracellular trafficking and secretion]. 34687 COG5083: Uncharacterized protein involved in plasmid maintenance [General function prediction only]. 34688 COG5084: Cleavage and polyadenylation specificity factor (CPSF) Clipper subunit and related makorin family Zn-finger proteins [General function prediction only]. 34689 COG5085: Predicted membrane protein [Function unknown]. 34690 COG5086: Uncharacterized conserved protein [Function unknown]. 34691 COG5087: Uncharacterized conserved protein [Function unknown]. 34692 COG5088: Rad5p-binding protein [General function prediction only]. 34693 COG5090: Transcription initiation factor IIF, small subunit (RAP30) [Transcription]. 34694 COG5091: Suppressor of G2 allele of skp1 and related proteins [General function prediction only]. 
34695 COG5092: N-myristoyl transferase [Lipid metabolism]. 34696 COG5093: Uncharacterized conserved protein [Function unknown]. 34697 COG5094: Transcription initiation factor TFIID, subunit TAF9 (also component of histone acetyltransferase SAGA) [Transcription]. 34698 COG5095: Transcription initiation factor TFIID, subunit TAF6 (also component of histone acetyltransferase SAGA) [Transcription]. 34699 COG5096: Vesicle coat complex, various subunits [Intracellular trafficking and secretion]. 34700 COG5097: RNA polymerase II transcriptional regulation mediator [Transcription]. 34701 COG5098: Chromosome condensation complex Condensin, subunit D2 [Chromatin structure and dynamics / Cell division and chromosome partitioning]. 34702 COG5099: RNA-binding protein of the Puf family, translational repressor [Translation, ribosomal structure and biogenesis]. 34703 COG5100: Nuclear pore protein [Nuclear structure]. 34704 COG5101: Importin beta-related nuclear transport receptor [Nuclear structure / Intracellular trafficking and secretion]. 34705 COG5102: Membrane protein involved in ER to Golgi transport [Intracellular trafficking and secretion]. 34706 COG5103: Cell division control protein, negative regulator of transcription [Cell division and chromosome partitioning / Transcription]. 34707 COG5104: Splicing factor [RNA processing and modification]. 34708 COG5105: Mitotic inducer, protein phosphatase [Cell division and chromosome partitioning]. 34709 COG5106: Uncharacterized conserved protein [Function unknown]. 34710 COG5107: Pre-mRNA 3'-end processing (cleavage and polyadenylation) factor [RNA processing and modification]. 34711 COG5108: Mitochondrial DNA-directed RNA polymerase [Transcription]. 34712 COG5109: Uncharacterized conserved protein, contains RING Zn-finger [General function prediction only]. 34713 COG5110: 26S proteasome regulatory complex component [Posttranslational modification, protein turnover, chaperones]. 
34714 COG5111: DNA-directed RNA polymerase III, subunit C34 [Transcription]. 34715 COG5112: U1-like Zn-finger-containing protein [General function prediction only]. 34716 COG5113: Ubiquitin fusion degradation protein 2 [Posttranslational modification, protein turnover, chaperones]. 34717 COG5114: Histone acetyltransferase complex SAGA/ADA, subunit ADA2 [Chromatin structure and dynamics]. 34718 COG5116: 26S proteasome regulatory complex component [Posttranslational modification, protein turnover, chaperones]. 34719 COG5117: Protein involved in the nuclear export of pre-ribosomes [Translation, ribosomal structure and biogenesis / Intracellular trafficking and secretion]. 34720 COG5118: Transcription initiation factor TFIIIB, Bdp1 subunit [Transcription]. 34721 COG5119: Uncharacterized protein, contains ParB-like nuclease domain [General function prediction only]. 34722 COG5120: Membrane protein involved in Golgi transport [Intracellular trafficking and secretion]. 34723 COG5122: Transport protein particle (TRAPP) complex subunit [Intracellular trafficking and secretion]. 34724 COG5123: Transcription initiation factor IIA, gamma subunit [Transcription]. 34725 COG5124: Protein predicted to be involved in meiotic recombination [Cell division and chromosome partitioning / General function prediction only]. 34726 COG5125: Uncharacterized conserved protein [Function unknown]. 34727 COG5126: Ca2+-binding protein (EF-Hand superfamily) [Signal transduction mechanisms / Cytoskeleton / Cell division and chromosome partitioning / General function prediction only]. 34728 COG5127: Vacuolar H+-ATPase V1 sector, subunit C [Energy production and conversion]. 34729 COG5128: Transport protein particle (TRAPP) complex subunit [Intracellular trafficking and secretion]. 34730 COG5129: Nuclear protein with HMG-like acidic region [General function prediction only]. 
34731 COG5130: Prenylated rab acceptor 1 and related proteins [Intracellular trafficking and secretion / Signal transduction mechanisms]. 34732 COG5131: Ubiquitin-like protein [Posttranslational modification, protein turnover, chaperones]. 34733 COG5132: Cell cycle control protein, G10 family [Transcription / Cell division and chromosome partitioning]. 34734 COG5133: Uncharacterized conserved protein [Function unknown]. 34735 COG5134: Uncharacterized conserved protein [Function unknown]. 34736 COG5135: Uncharacterized conserved protein [Function unknown]. 34737 COG5136: U1 snRNP-specific protein C [RNA processing and modification]. 34738 COG5137: Histone chaperone involved in gene silencing [Transcription / Chromatin structure and dynamics]. 34739 COG5138: Uncharacterized conserved protein [Function unknown]. 34740 COG5139: Uncharacterized conserved protein [Function unknown]. 34741 COG5140: Ubiquitin fusion-degradation protein [Posttranslational modification, protein turnover, chaperones]. 34742 COG5141: PHD zinc finger-containing protein [General function prediction only]. 34743 COG5142: Oxidation resistance protein [DNA replication, recombination, and repair]. 34744 COG5143: Synaptobrevin/VAMP-like protein [Intracellular trafficking and secretion]. 34745 COG5144: RNA polymerase II transcription initiation/nucleotide excision repair factor TFIIH, subunit TFB2 [Transcription / DNA replication, recombination, and repair]. 34746 COG5145: DNA excision repair protein [DNA replication, recombination, and repair]. 34747 COG5146: Pantothenate kinase, acetyl-CoA regulated [Coenzyme metabolism]. 34748 COG5147: Myb superfamily proteins, including transcription factors and mRNA splicing factors [Transcription / RNA processing and modification / Cell division and chromosome partitioning]. 34749 COG5148: 26S proteasome regulatory complex, subunit RPN10/PSMD4 [Posttranslational modification, protein turnover, chaperones]. 
34750 COG5149: Transcription initiation factor IIA, large chain [Transcription]. 34751 COG5150: Class 2 transcription repressor NC2, beta subunit (Dr1) [Transcription]. 34752 COG5151: RNA polymerase II transcription initiation/nucleotide excision repair factor TFIIH, subunit SSL1 [Transcription / DNA replication, recombination, and repair]. 34753 COG5152: Uncharacterized conserved protein, contains RING and CCCH-type Zn-fingers [General function prediction only]. 34754 COG5153: Putative lipase essential for disintegration of autophagic bodies inside the vacuole [Intracellular trafficking and secretion / Lipid metabolism]. 34755 COG5154: RNA-binding protein required for 60S ribosomal subunit biogenesis [Translation, ribosomal structure and biogenesis]. 34756 COG5155: Separase, a protease involved in sister chromatid separation [Cell division and chromosome partitioning / Posttranslational modification, protein turnover, chaperones]. 34757 COG5156: Anaphase-promoting complex (APC), subunit 10 [Cell division and chromosome partitioning / Posttranslational modification, protein turnover, chaperones]. 34758 COG5157: RNA polymerase II accessory factor [Transcription]. 34759 COG5158: Proteins involved in synaptic transmission and general secretion, Sec1 family [Intracellular trafficking and secretion]. 34760 COG5159: 26S proteasome regulatory complex component [Posttranslational modification, protein turnover, chaperones]. 34761 COG5160: Protease, Ulp1 family [Posttranslational modification, protein turnover, chaperones]. 34762 COG5161: Pre-mRNA cleavage and polyadenylation specificity factor [RNA processing and modification]. 34763 COG5162: Transcription initiation factor TFIID, subunit TAF10 (also component of histone acetyltransferase SAGA) [Transcription]. 34764 COG5163: Protein required for biogenesis of the 60S ribosomal subunit [Translation, ribosomal structure and biogenesis]. 34765 COG5164: Transcription elongation factor [Transcription]. 
34766 COG5165: Nucleosome-binding factor SPN, POB3 subunit [Transcription / DNA replication, recombination, and repair / Chromatin structure and dynamics]. 34767 COG5166: Uncharacterized conserved protein [Function unknown]. 34768 COG5167: Protein involved in vacuole import and degradation [Intracellular trafficking and secretion]. 34769 COG5169: Heat shock transcription factor [Transcription]. 34770 COG5170: Serine/threonine protein phosphatase 2A, regulatory subunit [Signal transduction mechanisms]. 34771 COG5171: Ran GTPase-activating protein (Ran-binding protein) [Intracellular trafficking and secretion]. 34772 COG5173: Exocyst complex subunit SEC6 [Intracellular trafficking and secretion]. 34773 COG5174: Transcription initiation factor IIE, beta subunit [Transcription]. 34774 COG5175: Transcriptional repressor [Transcription]. 34775 COG5176: Splicing factor (branch point binding protein) [RNA processing and modification]. 34776 COG5177: Uncharacterized conserved protein [Function unknown]. 34777 COG5178: U5 snRNP spliceosome subunit [RNA processing and modification]. 34778 COG5179: Transcription initiation factor TFIID, subunit TAF1 [Transcription]. 34779 COG5180: Protein interacting with poly(A)-binding protein [RNA processing and modification]. 34780 COG5181: U2 snRNP spliceosome subunit [RNA processing and modification]. 34781 COG5182: Splicing factor 3b, subunit 2 [RNA processing and modification]. 34782 COG5183: Protein involved in mRNA turnover and stability [RNA processing and modification]. 34783 COG5184: Alpha-tubulin suppressor and related RCC1 domain-containing proteins [Cell division and chromosome partitioning / Cytoskeleton]. 34784 COG5185: Protein involved in chromosome segregation, interacts with SMC proteins [Cell division and chromosome partitioning]. 34785 COG5186: Poly(A) polymerase [RNA processing and modification]. 
34786 COG5187: 26S proteasome regulatory complex component, contains PCI domain [Posttranslational modification, protein turnover, chaperones]. 34787 COG5188: Splicing factor 3a, subunit 3 [RNA processing and modification]. 34788 COG5189: Putative transcriptional repressor regulating G2/M transition [Transcription / Cell division and chromosome partitioning]. 34789 COG5190: TFIIF-interacting CTD phosphatases, including NLI-interacting factor [Transcription]. 34790 COG5191: Uncharacterized conserved protein, contains HAT (Half-A-TPR) repeat [General function prediction only]. 34791 COG5192: GTP-binding protein required for 40S ribosome biogenesis [Translation, ribosomal structure and biogenesis]. 34792 COG5193: La protein, small RNA-binding pol III transcript stabilizing protein and related La-motif-containing proteins involved in translation [Posttranslational modification, protein turnover, chaperones / Translation, ribosomal structure and biogenesis]. 34793 COG5194: Component of SCF ubiquitin ligase and anaphase-promoting complex [Posttranslational modification, protein turnover, chaperones / Cell division and chromosome partitioning]. 34794 COG5195: Uncharacterized conserved protein [Function unknown]. 34795 COG5196: ER lumen protein retaining receptor [Intracellular trafficking and secretion]. 34796 COG5197: Predicted membrane protein [Function unknown]. 34797 COG5198: Protein tyrosine phosphatase-like protein (contains Pro instead of catalytic Arg) [General function prediction only]. 34798 COG5199: Calponin [Cytoskeleton]. 34799 COG5200: U1 snRNP component, mediates U1 snRNP association with cap-binding complex [RNA processing and modification]. 34800 COG5201: SCF ubiquitin ligase, SKP1 component [Posttranslational modification, protein turnover, chaperones]. 34801 COG5202: Predicted membrane protein [Function unknown]. 34802 COG5204: Transcription elongation factor SPT4 [Transcription]. 
34803 COG5206: Glycosylphosphatidylinositol transamidase (GPIT), subunit GPI8 [Posttranslational modification, protein turnover, chaperones]. 34804 COG5207: Isopeptidase T [Posttranslational modification, protein turnover, chaperones]. 34805 COG5208: CCAAT-binding factor, subunit C [Transcription]. 34806 COG5209: Uncharacterized protein involved in cell differentiation/sexual development [General function prediction only]. 34807 COG5210: GTPase-activating protein [General function prediction only]. 34808 COG5211: RNA polymerase II-interacting protein involved in transcription start site selection [Transcription]. 34809 COG5212: Low-affinity cAMP phosphodiesterase [Signal transduction mechanisms]. 34810 COG5213: Polyadenylation factor I complex, subunit FIP1 [RNA processing and modification]. 34811 COG5214: DNA polymerase alpha-primase complex, polymerase-associated subunit B [DNA replication, recombination, and repair]. 34812 COG5215: Karyopherin (importin) beta [Intracellular trafficking and secretion]. 34813 COG5216: Uncharacterized conserved protein [Function unknown]. 34814 COG5217: Microtubule-binding protein involved in cell cycle control [Cell division and chromosome partitioning / Cytoskeleton]. 34815 COG5218: Chromosome condensation complex Condensin, subunit G [Chromatin structure and dynamics / Cell division and chromosome partitioning]. 34816 COG5219: Uncharacterized conserved protein, contains RING Zn-finger [General function prediction only]. 34817 COG5220: Cdk activating kinase (CAK)/RNA polymerase II transcription initiation/nucleotide excision repair factor TFIIH, subunit TFB3 [Cell division and chromosome partitioning / Transcription / DNA replication, recombination, and repair]. 34818 COG5221: Dopey and related predicted leucine zipper transcription factors [Transcription]. 34819 COG5222: Uncharacterized conserved protein, contains RING Zn-finger [General function prediction only]. 
34820 COG5223: Uncharacterized conserved protein [Function unknown]. 34821 COG5224: CCAAT-binding factor, subunit B [Transcription]. 34822 COG5225: Uncharacterized protein involved in ribosome biogenesis [Translation, ribosomal structure and biogenesis]. 34823 COG5226: mRNA capping enzyme, guanylyltransferase (alpha) subunit [RNA processing and modification]. 34824 COG5227: Ubiquitin-like protein (sentrin) [Posttranslational modification, protein turnover, chaperones]. 34825 COG5228: mRNA deadenylase subunit [RNA processing and modification]. 34826 COG5229: Chromosome condensation complex Condensin, subunit H [Chromatin structure and dynamics / Cell division and chromosome partitioning]. 34827 COG5230: Uncharacterized conserved protein [Function unknown]. 34828 COG5231: Vacuolar H+-ATPase V1 sector, subunit H [Energy production and conversion]. 34829 COG5232: Preprotein translocase subunit Sec62 [Intracellular trafficking and secretion]. 34830 COG5233: Peripheral Golgi membrane protein [Intracellular trafficking and secretion]. 34831 COG5234: Beta-tubulin folding cofactor D [Posttranslational modification, protein turnover, chaperones / Cytoskeleton]. 34832 COG5235: Single-stranded DNA-binding replication protein A (RPA), medium (30 kD) subunit [DNA replication, recombination, and repair]. 34833 COG5236: Uncharacterized conserved protein, contains RING Zn-finger [General function prediction only]. 34834 COG5237: Predicted membrane protein [Function unknown]. 34835 COG5238: Ran GTPase-activating protein (RanGAP) involved in mRNA processing and transport [Signal transduction mechanisms / RNA processing and modification]. 34836 COG5239: mRNA deadenylase, exonuclease subunit and related nucleases [RNA processing and modification]. 34837 COG5240: Vesicle coat complex COPI, gamma subunit [Intracellular trafficking and secretion]. 34838 COG5241: Nucleotide excision repair endonuclease NEF1, RAD10 subunit [DNA replication, recombination, and repair]. 
34839 COG5242: RNA polymerase II transcription initiation/nucleotide excision repair factor TFIIH, subunit TFB4 [Transcription / DNA replication, recombination, and repair]. 34840 COG5243: HRD ubiquitin ligase complex, ER membrane component [Posttranslational modification, protein turnover, chaperones]. 34841 COG5244: Dynactin complex subunit involved in mitotic spindle partitioning in anaphase B [Cell division and chromosome partitioning]. 34842 COG5245: Dynein, heavy chain [Cytoskeleton]. 34843 COG5246: Splicing factor 3a, subunit 2 [RNA processing and modification]. 34844 COG5247: Class 2 transcription repressor NC2, alpha subunit (DRAP1 homolog) [Transcription]. 34845 COG5248: Transcription initiation factor TFIID, subunit TAF13 [Transcription]. 34846 COG5249: Golgi protein involved in Golgi-to-ER retrieval [Intracellular trafficking and secretion]. 34847 COG5250: RNA polymerase II, fourth largest subunit [Transcription]. 34848 COG5251: Transcription initiation factor TFIID, subunit TAF11 [Transcription]. 34849 COG5252: Uncharacterized conserved protein, contains CCCH-type Zn-finger [General function prediction only]. 34850 COG5253: Phosphatidylinositol-4-phosphate 5-kinase [Signal transduction mechanisms]. 34851 COG5254: Predicted membrane protein [Function unknown]. 34852 COG5255: Uncharacterized protein conserved in bacteria [Function unknown]. 34853 COG5256: Translation elongation factor EF-1alpha (GTPase) [Translation, ribosomal structure and biogenesis]. 34854 COG5257: Translation initiation factor 2, gamma subunit (eIF-2gamma; GTPase) [Translation, ribosomal structure and biogenesis]. 34855 COG5258: GTPase [General function prediction only]. 34856 COG5259: RSC chromatin remodeling complex subunit RSC8 [Chromatin structure and dynamics / Transcription]. 34857 COG5260: DNA polymerase sigma [DNA replication, recombination, and repair]. 
34858 COG5261: Protein involved in regulation of cellular morphogenesis/cytokinesis [Cell division and chromosome partitioning / Signal transduction mechanisms]. 34859 COG5262: Histone H2A [Chromatin structure and dynamics]. 34860 COG5263: FOG: Glucan-binding domain (YG repeat) [General function prediction only]. 34861 COG5264: Vacuolar transporter chaperone [Posttranslational modification, protein turnover, chaperones]. 34862 COG5265: ABC-type transport system involved in Fe-S cluster assembly, permease and ATPase components [Posttranslational modification, protein turnover, chaperones]. 34863 COG5266: ABC-type Co2+ transport system, periplasmic component [Inorganic ion transport and metabolism]. 34864 COG5267: Uncharacterized protein conserved in bacteria [Function unknown]. 34865 COG5268: Type IV secretory pathway, TrbD component [Cell motility and secretion / Intracellular trafficking and secretion]. 34866 COG5269: Ribosome-associated chaperone zuotin [Translation, ribosomal structure and biogenesis / Posttranslational modification, protein turnover, chaperones]. 34867 COG5270: PUA domain (predicted RNA-binding domain) [Translation, ribosomal structure and biogenesis]. 34868 COG5271: AAA ATPase containing von Willebrand factor type A (vWA) domain [General function prediction only]. 34869 COG5272: Ubiquitin [Posttranslational modification, protein turnover, chaperones]. 34870 COG5273: Uncharacterized protein containing DHHC-type Zn finger [General function prediction only]. 34871 COG5274: Cytochrome b involved in lipid metabolism [Energy production and conversion / Lipid metabolism]. 34872 COG5275: BRCT domain type II [General function prediction only]. 34873 COG5276: Uncharacterized conserved protein [Function unknown]. 34874 COG5277: Actin and related proteins [Cytoskeleton]. 34875 COG5278: Predicted periplasmic ligand-binding sensor domain [Signal transduction mechanisms]. 
34876 COG5279: Uncharacterized protein involved in cytokinesis, contains TGc (transglutaminase/protease-like) domain [Cell division and chromosome partitioning]. 34877 COG5280: Phage-related minor tail protein [Function unknown]. 34878 COG5281: Phage-related minor tail protein [Function unknown]. 34879 COG5282: Uncharacterized conserved protein [Function unknown]. 34880 COG5283: Phage-related tail protein [Function unknown]. 34881 COG5285: Protein involved in biosynthesis of mitomycin antibiotics/polyketide fumonisin [Secondary metabolites biosynthesis, transport, and catabolism]. 34882 COG5290: IkappaB kinase complex, IKAP component [Transcription]. 34883 COG5291: Predicted membrane protein [Function unknown]. 34884 COG5293: Uncharacterized protein conserved in bacteria [Function unknown]. 34885 COG5294: Uncharacterized protein conserved in bacteria [Function unknown]. 34886 COG5295: Autotransporter adhesin [Intracellular trafficking and secretion / Extracellular structures]. 34887 COG5296: Transcription factor involved in TATA site selection and in elongation by RNA polymerase II [Transcription]. 34888 COG5297: Cellobiohydrolase A (1,4-beta-cellobiosidase A) [Carbohydrate transport and metabolism]. 34889 COG5298: Uncharacterized protein conserved in bacteria [Function unknown]. 34890 COG5301: Phage-related tail fibre protein [General function prediction only]. 34891 COG5302: Post-segregation antitoxin (ccd killing mechanism protein) encoded by the F plasmid [General function prediction only]. 34892 COG5304: Uncharacterized protein conserved in bacteria [Function unknown]. 34893 COG5305: Predicted membrane protein [Function unknown]. 34894 COG5306: Uncharacterized conserved protein [Function unknown]. 34895 COG5307: SEC7 domain proteins [General function prediction only]. 34896 COG5308: Nuclear pore complex subunit [Intracellular trafficking and secretion]. 34897 COG5309: Exo-beta-1,3-glucanase [Carbohydrate transport and metabolism]. 
34898 COG5310: Homospermidine synthase [Secondary metabolites biosynthesis, transport, and catabolism]. 34899 COG5314: Conjugal transfer/entry exclusion protein [Intracellular trafficking and secretion]. 34900 COG5316: Uncharacterized conserved protein [Function unknown]. 34901 COG5317: Uncharacterized protein conserved in bacteria [Function unknown]. 34902 COG5319: Uncharacterized protein conserved in bacteria [Function unknown]. 34903 COG5321: Uncharacterized protein conserved in bacteria [Function unknown]. 34904 COG5322: Predicted dehydrogenase [General function prediction only]. 34905 COG5323: Uncharacterized conserved protein [Function unknown]. 34906 COG5324: Uncharacterized conserved protein [Function unknown]. 34907 COG5325: t-SNARE complex subunit, syntaxin [Intracellular trafficking and secretion]. 34908 COG5328: Uncharacterized protein conserved in bacteria [Function unknown]. 34909 COG5329: Phosphoinositide polyphosphatase (Sac family) [Signal transduction mechanisms]. 34910 COG5330: Uncharacterized protein conserved in bacteria [Function unknown]. 34911 COG5331: Uncharacterized protein conserved in bacteria [Function unknown]. 34912 COG5333: Cdk activating kinase (CAK)/RNA polymerase II transcription initiation/nucleotide excision repair factor TFIIH/TFIIK, cyclin H subunit [Cell division and chromosome partitioning / Transcription / DNA replication, recombination, and repair]. 34913 COG5336: Uncharacterized protein conserved in bacteria [Function unknown]. 34914 COG5337: Spore coat assembly protein [Cell envelope biogenesis, outer membrane]. 34915 COG5338: Uncharacterized protein conserved in bacteria [Function unknown]. 34916 COG5339: Uncharacterized protein conserved in bacteria [Function unknown]. 34917 COG5340: Predicted transcriptional regulator [Transcription]. 34918 COG5341: Uncharacterized protein conserved in bacteria [Function unknown]. 34919 COG5342: Invasion protein B, involved in pathogenesis [General function prediction only]. 
34920 COG5343: Uncharacterized protein conserved in bacteria [Function unknown]. 34921 COG5345: Uncharacterized protein conserved in bacteria [Function unknown]. 34922 COG5346: Predicted membrane protein [Function unknown]. 34923 COG5347: GTPase-activating protein that regulates ARFs (ADP-ribosylation factors), involved in ARF-mediated vesicular transport [Intracellular trafficking and secretion]. 34924 COG5349: Uncharacterized protein conserved in bacteria [Function unknown]. 34925 COG5350: Predicted protein tyrosine phosphatase [General function prediction only]. 34926 COG5351: Uncharacterized protein conserved in bacteria [Function unknown]. 34927 COG5352: Uncharacterized protein conserved in bacteria [Function unknown]. 34928 COG5353: Uncharacterized protein conserved in bacteria [Function unknown]. 34929 COG5354: Uncharacterized protein, contains Trp-Asp (WD) repeat [General function prediction only]. 34930 COG5360: Uncharacterized protein conserved in bacteria [Function unknown]. 34931 COG5361: Uncharacterized conserved protein [Function unknown]. 34932 COG5362: Phage-related terminase [General function prediction only]. 34933 COG5366: Protein involved in propagation of M2 dsRNA satellite of L-A virus [General function prediction only]. 34934 COG5368: Uncharacterized protein conserved in bacteria [Function unknown]. 34935 COG5369: Uncharacterized conserved protein [Function unknown]. 34936 COG5371: Golgi nucleoside diphosphatase [Carbohydrate transport and metabolism / Posttranslational modification, protein turnover, chaperones]. 34937 COG5373: Predicted membrane protein [Function unknown]. 34938 COG5374: Uncharacterized conserved protein [Function unknown]. 34939 COG5375: Uncharacterized protein conserved in bacteria [Function unknown]. 34940 COG5377: Phage-related protein, predicted endonuclease [DNA replication, recombination, and repair]. 34941 COG5378: Predicted nucleotide-binding protein [General function prediction only]. 
34942 COG5379: S-adenosylmethionine:diacylglycerol 3-amino-3-carboxypropyl transferase [Lipid metabolism]. 34943 COG5380: Lipase chaperone [Posttranslational modification, protein turnover, chaperones]. 34944 COG5381: Uncharacterized protein conserved in bacteria [Function unknown]. 34945 COG5383: Uncharacterized protein conserved in bacteria [Function unknown]. 34946 COG5384: U3 small nucleolar ribonucleoprotein component [Translation, ribosomal structure and biogenesis]. 34947 COG5385: Uncharacterized protein conserved in bacteria [Function unknown]. 34948 COG5386: Cell surface protein [Cell envelope biogenesis, outer membrane]. 34949 COG5387: Chaperone required for the assembly of the mitochondrial F1-ATPase [Posttranslational modification, protein turnover, chaperones]. 34950 COG5388: Uncharacterized protein conserved in bacteria [Function unknown]. 34951 COG5389: Uncharacterized protein conserved in bacteria [Function unknown]. 34952 COG5391: Phox homology (PX) domain protein [Intracellular trafficking and secretion / General function prediction only]. 34953 COG5393: Predicted membrane protein [Function unknown]. 34954 COG5394: Uncharacterized protein conserved in bacteria [Function unknown]. 34955 COG5395: Predicted membrane protein [Function unknown]. 34956 COG5397: Uncharacterized conserved protein [Function unknown]. 34957 COG5398: Heme oxygenase [Inorganic ion transport and metabolism]. 34958 COG5399: Uncharacterized protein conserved in archaea [Function unknown]. 34959 COG5400: Uncharacterized protein conserved in bacteria [Function unknown]. 34960 COG5401: Spore germination protein [General function prediction only]. 34961 COG5402: Uncharacterized conserved protein [Function unknown]. 34962 COG5403: Uncharacterized conserved protein [Function unknown]. 34963 COG5404: SOS-response cell division inhibitor, blocks FtsZ ring formation [Cell division and chromosome partitioning]. 
34964 COG5405: ATP-dependent protease HslVU (ClpYQ), peptidase subunit [Posttranslational modification, protein turnover, chaperones]. 34965 COG5406: Nucleosome binding factor SPN, SPT16 subunit [Transcription / DNA replication, recombination, and repair / Chromatin structure and dynamics]. 34966 COG5407: Preprotein translocase subunit Sec63 [Intracellular trafficking and secretion]. 34967 COG5408: SPX domain-containing protein [Signal transduction mechanisms]. 34968 COG5409: EXS domain-containing protein [Signal transduction mechanisms]. 34969 COG5410: Uncharacterized protein conserved in bacteria [Function unknown]. 34970 COG5411: Phosphatidylinositol 5-phosphate phosphatase [Signal transduction mechanisms]. 34971 COG5412: Phage-related protein [Function unknown]. 34972 COG5413: Uncharacterized integral membrane protein [Function unknown]. 34973 COG5414: TATA-binding protein-associated factor [Transcription]. 34974 COG5415: Predicted integral membrane metal-binding protein [General function prediction only]. 34975 COG5416: Uncharacterized integral membrane protein [Function unknown]. 34976 COG5417: Uncharacterized small protein [Function unknown]. 34977 COG5418: Predicted secreted protein [Function unknown]. 34978 COG5419: Uncharacterized conserved protein [Function unknown]. 34979 COG5420: Uncharacterized conserved small protein containing a coiled-coil domain [Function unknown]. 34980 COG5421: Transposase [DNA replication, recombination, and repair]. 34981 COG5422: RhoGEF, Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases [Signal transduction mechanisms]. 34982 COG5423: Predicted metal-binding protein [Function unknown]. 34983 COG5424: Pyrroloquinoline quinone (Coenzyme PQQ) biosynthesis protein C [Coenzyme metabolism]. 34984 COG5425: Usg protein, probable subunit of phosphoribosylanthranilate isomerase [Amino acid transport and metabolism]. 34985 COG5426: Uncharacterized membrane protein [Function unknown]. 
34986 COG5427: Uncharacterized membrane protein [Function unknown]. 34987 COG5428: Uncharacterized conserved small protein [Function unknown]. 34988 COG5429: Uncharacterized secreted protein [Function unknown]. 34989 COG5430: Uncharacterized secreted protein [Function unknown]. 34990 COG5431: Uncharacterized metal-binding protein [Function unknown]. 34991 COG5432: RING-finger-containing E3 ubiquitin ligase [Signal transduction mechanisms]. 34992 COG5433: Transposase [DNA replication, recombination, and repair]. 34993 COG5434: Endopolygalacturonase [Cell envelope biogenesis, outer membrane]. 34994 COG5435: Uncharacterized conserved protein [Function unknown]. 34995 COG5436: Predicted integral membrane protein [Function unknown]. 34996 COG5437: Predicted secreted protein [Function unknown]. 34997 COG5438: Predicted multitransmembrane protein [Function unknown]. 34998 COG5439: Uncharacterized conserved protein [Function unknown]. 34999 COG5440: Uncharacterized conserved protein [Function unknown]. 35000 COG5441: Uncharacterized conserved protein [Function unknown]. 35001 COG5442: Flagellar biosynthesis regulator FlaF [Cell motility and secretion]. 35002 COG5443: Flagellar biosynthesis regulator FlbT [Cell motility and secretion]. 35003 COG5444: Uncharacterized conserved protein [Function unknown]. 35004 COG5445: Predicted secreted protein [Function unknown]. 35005 COG5446: Predicted integral membrane protein [Function unknown]. 35006 COG5447: Uncharacterized conserved protein [Function unknown]. 35007 COG5448: Uncharacterized conserved protein [Function unknown]. 35008 COG5449: Uncharacterized conserved protein [Function unknown]. 35009 COG5450: Transcription regulator of the Arc/MetJ class [Transcription]. 35010 COG5451: Predicted secreted protein [Function unknown]. 35011 COG5452: Uncharacterized conserved protein [Function unknown]. 35012 COG5453: Uncharacterized conserved protein [Function unknown]. 35013 COG5454: Predicted secreted protein [Function unknown]. 
35014 COG5455: Predicted integral membrane protein [Function unknown]. 35015 COG5456: Predicted integral membrane protein linked to a cation pump [Inorganic ion transport and metabolism]. 35016 COG5457: Uncharacterized conserved small protein [Function unknown]. 35017 COG5458: Uncharacterized conserved protein [Function unknown]. 35018 COG5459: Predicted rRNA methylase [Translation, ribosomal structure and biogenesis]. 35019 COG5460: Uncharacterized conserved protein [Function unknown]. 35020 COG5461: Type IV pili component [Cell motility and secretion]. 35021 COG5462: Predicted secreted (periplasmic) protein [Function unknown]. 35022 COG5463: Predicted integral membrane protein [Function unknown]. 35023 COG5464: Uncharacterized conserved protein [Function unknown]. 35024 COG5465: Uncharacterized conserved protein [Function unknown]. 35025 COG5466: Predicted small metal-binding protein [Function unknown]. 35026 COG5467: Uncharacterized conserved protein [Function unknown]. 35027 COG5468: Predicted secreted (periplasmic) protein [Function unknown]. 35028 COG5469: Predicted metal-binding protein [Function unknown]. 35029 COG5470: Uncharacterized conserved protein [Function unknown]. 35030 COG5471: Uncharacterized conserved protein [Function unknown]. 35031 COG5472: Predicted small integral membrane protein [Function unknown]. 35032 COG5473: Predicted integral membrane protein [Function unknown]. 35033 COG5474: Uncharacterized conserved protein [Function unknown]. 35034 COG5475: Uncharacterized small protein [Function unknown]. 35035 COG5476: Uncharacterized conserved protein [Function unknown]. 35036 COG5477: Predicted small integral membrane protein [Function unknown]. 35037 COG5478: Predicted small integral membrane protein [Function unknown]. 35038 COG5479: Uncharacterized protein potentially involved in peptidoglycan biosynthesis [Cell envelope biogenesis, outer membrane]. 35039 COG5480: Predicted integral membrane protein [Function unknown]. 
35040 COG5481: Uncharacterized conserved small protein containing a coiled-coil domain [Function unknown]. 35041 COG5482: Uncharacterized conserved protein [Function unknown]. 35042 COG5483: Uncharacterized conserved protein [Function unknown]. 35043 COG5484: Uncharacterized conserved protein [Function unknown]. 35044 COG5485: Predicted ester cyclase [General function prediction only]. 35045 COG5486: Predicted metal-binding integral membrane protein [Function unknown]. 35046 COG5487: Small integral membrane protein [Function unknown]. 35047 COG5488: Integral membrane protein [Function unknown]. 35048 COG5489: Uncharacterized conserved protein [Function unknown]. 35049 COG5490: Uncharacterized conserved protein [Function unknown]. 35050 COG5491: Conserved protein implicated in secretion [Cell motility and secretion]. 35051 COG5492: Bacterial surface proteins containing Ig-like domains [Cell motility and secretion]. 35052 COG5493: Uncharacterized conserved protein containing a coiled-coil domain [Function unknown]. 35053 COG5494: Predicted thioredoxin/glutaredoxin [Posttranslational modification, protein turnover, chaperones]. 35054 COG5495: Uncharacterized conserved protein [Function unknown]. 35055 COG5496: Predicted thioesterase [General function prediction only]. 35056 COG5497: Predicted secreted protein [Function unknown]. 35057 COG5498: Predicted glycosyl hydrolase [Cell envelope biogenesis, outer membrane]. 35058 COG5499: Predicted transcription regulator containing HTH domain [Transcription]. 35059 COG5500: Predicted integral membrane protein [Function unknown]. 35060 COG5501: Predicted secreted protein [Function unknown]. 35061 COG5502: Uncharacterized conserved protein [Function unknown]. 35062 COG5503: Uncharacterized conserved small protein [Function unknown]. 35063 COG5504: Predicted Zn-dependent protease [Posttranslational modification, protein turnover, chaperones]. 35064 COG5505: Predicted integral membrane protein [Function unknown]. 
35065 COG5506: Uncharacterized conserved protein [Function unknown]. 35066 COG5507: Uncharacterized conserved protein [Function unknown]. 35067 COG5508: Uncharacterized conserved small protein [Function unknown]. 35068 COG5509: Uncharacterized small protein containing a coiled-coil domain [Function unknown]. 35069 COG5510: Predicted small secreted protein [Function unknown]. 35070 COG5511: Bacteriophage capsid protein [General function prediction only]. 35071 COG5512: Zn-ribbon-containing, possibly RNA-binding protein and truncated derivatives [General function prediction only]. 35072 COG5513: Predicted secreted protein [Function unknown]. 35073 COG5514: Uncharacterized conserved protein [Function unknown]. 35074 COG5515: Uncharacterized conserved small protein [Function unknown]. 35075 COG5516: Conserved protein containing a Zn-ribbon-like motif, possibly RNA-binding [General function prediction only]. 35076 COG5517: Small subunit of phenylpropionate dioxygenase [Secondary metabolites biosynthesis, transport, and catabolism]. 35077 COG5518: Bacteriophage capsid portal protein [General function prediction only]. 35078 COG5519: Superfamily II helicase and inactivated derivatives [DNA replication, recombination, and repair]. 35079 COG5520: O-Glycosyl hydrolase [Cell envelope biogenesis, outer membrane]. 35080 COG5521: Predicted integral membrane protein [Function unknown]. 35081 COG5522: Predicted integral membrane protein [Function unknown]. 35082 COG5523: Predicted integral membrane protein [Function unknown]. 35083 COG5524: Bacteriorhodopsin [General function prediction only]. 35084 COG5525: Bacteriophage tail assembly protein [General function prediction only]. 35085 COG5526: Uncharacterized conserved protein [Function unknown]. 35086 COG5527: Protein involved in initiation of plasmid replication [DNA replication, recombination, and repair]. 35087 COG5528: Predicted integral membrane protein [Function unknown]. 
35088 COG5529: Pyocin large subunit [General function prediction only]. 35089 COG5530: Predicted integral membrane protein [Function unknown]. 35090 COG5531: SWIB-domain-containing proteins implicated in chromatin remodeling [Chromatin structure and dynamics]. 35091 COG5532: Uncharacterized conserved protein [Function unknown]. 35092 COG5533: Ubiquitin C-terminal hydrolase [Posttranslational modification, protein turnover, chaperones]. 35093 COG5534: Plasmid replication initiator protein [DNA replication, recombination, and repair]. 35094 COG5535: DNA repair protein RAD4 [DNA replication, recombination, and repair]. 35095 COG5536: Protein prenyltransferase, alpha subunit [Posttranslational modification, protein turnover, chaperones]. 35096 COG5537: Cohesin [Cell division and chromosome partitioning]. 35097 COG5538: Endoplasmic reticulum translocation complex, subunit SEC66 [Cell motility and secretion]. 35098 COG5539: Predicted cysteine protease (OTU family) [Posttranslational modification, protein turnover, chaperones]. 35099 COG5540: RING-finger-containing ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 35100 COG5541: Vesicle coat complex COPI, zeta subunit [Posttranslational modification, protein turnover, chaperones]. 35101 COG5542: Predicted integral membrane protein [Function unknown]. 35102 COG5543: Uncharacterized conserved protein [Function unknown]. 35103 COG5544: Predicted periplasmic lipoprotein [General function prediction only]. 35104 COG5545: Predicted P-loop ATPase and inactivated derivatives [General function prediction only]. 35105 COG5546: Small integral membrane protein [Function unknown]. 35106 COG5547: Small integral membrane protein [Function unknown]. 35107 COG5548: Small integral membrane protein [Function unknown]. 35108 COG5549: Predicted Zn-dependent protease [Posttranslational modification, protein turnover, chaperones]. 
35109 COG5550: Predicted aspartyl protease [Posttranslational modification, protein turnover, chaperones]. 35110 COG5551: Uncharacterized conserved protein [Function unknown]. 35111 COG5552: Uncharacterized conserved protein [Function unknown]. 35112 COG5553: Predicted metal-dependent enzyme of the double-stranded beta helix superfamily [General function prediction only]. 35113 COG5554: Nitrogen fixation protein [Secondary metabolites biosynthesis, transport, and catabolism]. 35114 COG5555: Cytolysin, a secreted calcineurin-like phosphatase [Cell motility and secretion]. 35115 COG5556: Uncharacterized conserved protein [Function unknown]. 35116 COG5557: Polysulphide reductase [Energy production and conversion]. 35117 COG5558: Transposase [DNA replication, recombination, and repair]. 35118 COG5559: Uncharacterized conserved small protein [Function unknown]. 35119 COG5560: Ubiquitin C-terminal hydrolase [Posttranslational modification, protein turnover, chaperones]. 35120 COG5561: Predicted metal-binding protein [Function unknown]. 35121 COG5562: Phage envelope protein [General function prediction only]. 35122 COG5563: Predicted integral membrane proteins containing uncharacterized repeats [Function unknown]. 35123 COG5564: Predicted TIM-barrel enzyme, possibly a dioxygenase [General function prediction only]. 35124 COG5565: Bacteriophage terminase large (ATPase) subunit and inactivated derivatives [General function prediction only]. 35125 COG5566: Uncharacterized conserved protein [Function unknown]. 35126 COG5567: Predicted small periplasmic lipoprotein [Cell motility and secretion]. 35127 COG5568: Uncharacterized small protein [Function unknown]. 35128 COG5569: Uncharacterized conserved protein [Function unknown]. 35129 COG5570: Uncharacterized small protein [Function unknown]. 35130 COG5571: Autotransporter protein or domain, integral membrane beta-barrel involved in protein secretion [Cell motility and secretion]. 
35131 COG5572: Predicted integral membrane protein [Function unknown]. 35132 COG5573: Predicted nucleic-acid-binding protein, contains PIN domain [General function prediction only]. 35133 COG5574: RING-finger-containing E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 35134 COG5575: Origin recognition complex, subunit 2 [DNA replication, recombination, and repair]. 35135 COG5576: Homeodomain-containing transcription factor [Transcription]. 35136 COG5577: Spore coat protein [Cell envelope biogenesis, outer membrane]. 35137 COG5578: Predicted integral membrane protein [Function unknown]. 35138 COG5579: Uncharacterized conserved protein [Function unknown]. 35139 COG5580: Activator of HSP90 ATPase [Posttranslational modification, protein turnover, chaperones]. 35140 COG5581: Predicted glycosyltransferase [Cell envelope biogenesis, outer membrane]. 35141 COG5582: Uncharacterized conserved protein [Function unknown]. 35142 COG5583: Uncharacterized small protein [Function unknown]. 35143 COG5584: Predicted small secreted protein [Function unknown]. 35144 COG5585: NAD+--asparagine ADP-ribosyltransferase [Signal transduction mechanisms]. 35145 COG5586: Uncharacterized conserved protein [Function unknown]. 35146 COG5587: Uncharacterized conserved protein [Function unknown]. 35147 COG5588: Uncharacterized conserved protein [Function unknown]. 35148 COG5589: Uncharacterized conserved protein [Function unknown]. 35149 COG5590: Uncharacterized conserved protein [Function unknown]. 35150 COG5591: Uncharacterized conserved protein [Function unknown]. 35151 COG5592: Uncharacterized conserved protein [Function unknown]. 35152 COG5593: Nucleic-acid-binding protein possibly involved in ribosomal biogenesis [Translation, ribosomal structure and biogenesis]. 35153 COG5594: Uncharacterized integral membrane protein [Function unknown]. 35154 COG5595: Zn-ribbon-containing, possibly nucleic-acid-binding protein [General function prediction only]. 
35155 COG5596: Mitochondrial import inner membrane translocase, subunit TIM22 [Posttranslational modification, protein turnover, chaperones]. 35156 COG5597: Alpha-N-acetylglucosamine transferase [Cell envelope biogenesis, outer membrane]. 35157 COG5598: Trimethylamine:corrinoid methyltransferase [Coenzyme metabolism]. 35158 COG5599: Protein tyrosine phosphatase [Signal transduction mechanisms]. 35159 COG5600: Transcription-associated recombination protein [DNA replication, recombination, and repair]. 35160 COG5601: General negative regulator of transcription subunit [Transcription]. 35161 COG5602: Histone deacetylase complex, SIN3 component [Chromatin structure and dynamics]. 35162 COG5603: Subunit of TRAPP, an ER-Golgi tethering complex [Cell motility and secretion]. 35163 COG5604: Uncharacterized conserved protein [Function unknown]. 35164 COG5605: Predicted small integral membrane protein [Function unknown]. 35165 COG5606: Uncharacterized conserved small protein [Function unknown]. 35166 COG5607: Uncharacterized conserved protein [Function unknown]. 35167 COG5608: Conserved secreted protein [Function unknown]. 35168 COG5609: Uncharacterized conserved protein [Function unknown]. 35169 COG5610: Predicted hydrolase (HAD superfamily) [General function prediction only]. 35170 COG5611: Predicted nucleic-acid-binding protein, contains PIN domain [General function prediction only]. 35171 COG5612: Predicted integral membrane protein [Function unknown]. 35172 COG5613: Uncharacterized conserved protein [Function unknown]. 35173 COG5614: Bacteriophage head-tail adaptor [General function prediction only]. 35174 COG5615: Predicted integral membrane protein [Function unknown]. 35175 COG5616: Predicted integral membrane protein [Function unknown]. 35176 COG5617: Predicted integral membrane protein [Function unknown]. 35177 COG5618: Predicted periplasmic lipoprotein [General function prediction only]. 35178 COG5619: Uncharacterized conserved protein [Function unknown]. 
35179 COG5620: Uncharacterized conserved protein [Function unknown]. 35180 COG5621: Predicted secreted hydrolase [General function prediction only]. 35181 COG5622: Protein required for attachment to host cells [Cell motility and secretion]. 35182 COG5623: Predicted GTPase subunit of the pre-mRNA cleavage complex [Translation, ribosomal structure and biogenesis]. 35183 COG5624: Transcription initiation factor TFIID, subunit TAF12 (also component of histone acetyltransferase SAGA) [Transcription]. 35184 COG5625: Predicted transcription regulator containing HTH domain [Transcription]. 35185 COG5626: Uncharacterized small conserved protein [Function unknown]. 35186 COG5627: DNA repair protein MMS21 [DNA replication, recombination, and repair]. 35187 COG5628: Predicted acetyltransferase [General function prediction only]. 35188 COG5629: Predicted metal-binding protein [Function unknown]. 35189 COG5630: Acetylglutamate synthase [Amino acid transport and metabolism]. 35190 COG5631: Predicted transcription regulator, contains HTH domain (MarR family) [Transcription]. 35191 COG5632: N-acetylmuramoyl-L-alanine amidase [Cell envelope biogenesis, outer membrane]. 35192 COG5633: Predicted periplasmic lipoprotein [General function prediction only]. 35193 COG5634: Uncharacterized conserved protein [Function unknown]. 35194 COG5635: Predicted NTPase (NACHT family) [Signal transduction mechanisms]. 35195 COG5636: Uncharacterized conserved protein, contains Zn-ribbon-like motif [Function unknown]. 35196 COG5637: Predicted integral membrane protein [Function unknown]. 35197 COG5638: Uncharacterized conserved protein [Function unknown]. 35198 COG5639: Uncharacterized conserved small protein [Function unknown]. 35199 COG5640: Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]. 35200 COG5641: GATA Zn-finger-containing transcription factor [Transcription]. 35201 COG5642: Uncharacterized conserved protein [Function unknown]. 
35202 COG5643: Protein containing a metal-binding domain shared with formylmethanofuran dehydrogenase subunit E [General function prediction only]. 35203 COG5644: Uncharacterized conserved protein [Function unknown]. 35204 COG5645: Predicted periplasmic lipoprotein [General function prediction only]. 35205 COG5646: Uncharacterized conserved protein [Function unknown]. 35206 COG5647: Cullin, a subunit of E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 35207 COG5648: Chromatin-associated proteins containing the HMG domain [Chromatin structure and dynamics]. 35208 COG5649: Uncharacterized conserved protein [Function unknown]. 35209 COG5650: Predicted integral membrane protein [Function unknown]. 35210 COG5651: PPE-repeat proteins [Cell motility and secretion]. 35211 COG5652: Predicted integral membrane protein [Function unknown]. 35212 COG5653: Protein involved in cellulose biosynthesis (CelD) [Cell envelope biogenesis, outer membrane]. 35213 COG5654: Uncharacterized conserved protein [Function unknown]. 35214 COG5655: Plasmid rolling circle replication initiator protein and truncated derivatives [DNA replication, recombination, and repair]. 35215 COG5656: Importin, protein involved in nuclear import [Posttranslational modification, protein turnover, chaperones]. 35216 COG5657: CAS/CSE protein involved in chromosome segregation [Cell division and chromosome partitioning]. 35217 COG5658: Predicted integral membrane protein [Function unknown]. 35218 COG5659: FOG: Transposase [DNA replication, recombination, and repair]. 35219 COG5660: Predicted integral membrane protein [Function unknown]. 35220 COG5661: Predicted secreted Zn-dependent protease [Posttranslational modification, protein turnover, chaperones]. 35221 COG5662: Predicted transmembrane transcriptional regulator (anti-sigma factor) [Transcription]. 35222 COG5663: Uncharacterized conserved protein [Function unknown]. 
35223 COG5664: Predicted secreted Zn-dependent protease [Posttranslational modification, protein turnover, chaperones]. 35224 COG5665: CCR4-NOT transcriptional regulation complex, NOT5 subunit [Transcription]. 35225 KOG0001: Ubiquitin and ubiquitin-like proteins [Posttranslational modification, protein turnover, chaperones, General function prediction only]. 35226 KOG0002: 60S ribosomal protein L39 [Translation, ribosomal structure and biogenesis]. 35227 KOG0003: Ubiquitin/60S ribosomal protein L40 fusion [Translation, ribosomal structure and biogenesis]. 35228 KOG0004: Ubiquitin/40S ribosomal protein S27a fusion [Translation, ribosomal structure and biogenesis]. 35229 KOG0005: Ubiquitin-like protein [Cell cycle control, cell division, chromosome partitioning, Posttranslational modification, protein turnover, chaperones]. 35230 KOG0006: E3 ubiquitin-protein ligase (Parkin protein) [Posttranslational modification, protein turnover, chaperones]. 35231 KOG0007: Splicing factor 3a, subunit 1 [RNA processing and modification]. 35232 KOG0008: Transcription initiation factor TFIID, subunit TAF1 [Transcription]. 35233 KOG0009: Ubiquitin-like/40S ribosomal S30 protein fusion [Translation, ribosomal structure and biogenesis, Posttranslational modification, protein turnover, chaperones]. 35234 KOG0010: Ubiquitin-like protein [Posttranslational modification, protein turnover, chaperones, General function prediction only]. 35235 KOG0011: Nucleotide excision repair factor NEF2, RAD23 component [Replication, recombination and repair]. 35236 KOG0012: DNA damage inducible protein [Replication, recombination and repair]. 35237 KOG0013: Uncharacterized conserved protein [Function unknown]. 35238 KOG0014: MADS box transcription factor [Transcription]. 35239 KOG0015: Regulator of arginine metabolism and related MADS box-containing transcription factors [Transcription]. 35240 KOG0016: Enoyl-CoA hydratase/isomerase [Lipid transport and metabolism]. 
35241 KOG0018: Structural maintenance of chromosome protein 1 (sister chromatid cohesion complex Cohesin, subunit SMC1) [Cell cycle control, cell division, chromosome partitioning]. 35242 KOG0019: Molecular chaperone (HSP90 family) [Posttranslational modification, protein turnover, chaperones]. 35243 KOG0020: Endoplasmic reticulum glucose-regulated protein (GRP94/endoplasmin), HSP90 family [Posttranslational modification, protein turnover, chaperones]. 35244 KOG0021: Glutathione synthetase [Secondary metabolites biosynthesis, transport and catabolism]. 35245 KOG0022: Alcohol dehydrogenase, class III [Secondary metabolites biosynthesis, transport and catabolism]. 35246 KOG0023: Alcohol dehydrogenase, class V [Secondary metabolites biosynthesis, transport and catabolism]. 35247 KOG0024: Sorbitol dehydrogenase [Secondary metabolites biosynthesis, transport and catabolism]. 35248 KOG0025: Zn2+-binding dehydrogenase (nuclear receptor binding factor-1) [Transcription, Energy production and conversion]. 35249 KOG0026: Anthranilate synthase, beta chain [Amino acid transport and metabolism]. 35250 KOG0027: Calmodulin and related proteins (EF-Hand superfamily) [Signal transduction mechanisms]. 35251 KOG0028: Ca2+-binding protein (centrin/caltractin), EF-Hand superfamily protein [Cytoskeleton, Cell cycle control, cell division, chromosome partitioning]. 35252 KOG0029: Amine oxidase [Secondary metabolites biosynthesis, transport and catabolism]. 35253 KOG0030: Myosin essential light chain, EF-Hand protein superfamily [Cytoskeleton]. 35254 KOG0031: Myosin regulatory light chain, EF-Hand protein superfamily [Cytoskeleton]. 35255 KOG0032: Ca2+/calmodulin-dependent protein kinase, EF-Hand protein superfamily [Signal transduction mechanisms]. 35256 KOG0033: Ca2+/calmodulin-dependent protein kinase, EF-Hand protein superfamily [Signal transduction mechanisms]. 
35257 KOG0034: Ca2+/calmodulin-dependent protein phosphatase (calcineurin subunit B), EF-Hand superfamily protein [Signal transduction mechanisms]. 35258 KOG0035: Ca2+-binding actin-bundling protein (actinin), alpha chain (EF-Hand protein superfamily) [Cytoskeleton]. 35259 KOG0036: Predicted mitochondrial carrier protein [Nucleotide transport and metabolism]. 35260 KOG0037: Ca2+-binding protein, EF-Hand protein superfamily [Signal transduction mechanisms]. 35261 KOG0038: Ca2+-binding kinase interacting protein (KIP) (EF-Hand protein superfamily) [General function prediction only]. 35262 KOG0039: Ferric reductase, NADH/NADPH oxidase and related proteins [Inorganic ion transport and metabolism, Secondary metabolites biosynthesis, transport and catabolism]. 35263 KOG0040: Ca2+-binding actin-bundling protein (spectrin), alpha chain (EF-Hand protein superfamily) [Cytoskeleton]. 35264 KOG0041: Predicted Ca2+-binding protein, EF-Hand protein superfamily [General function prediction only]. 35265 KOG0042: Glycerol-3-phosphate dehydrogenase [Energy production and conversion]. 35266 KOG0043: Uncharacterized conserved protein, contains DM10 domain [Function unknown]. 35267 KOG0044: Ca2+ sensor (EF-Hand superfamily) [Signal transduction mechanisms]. 35268 KOG0045: Cytosolic Ca2+-dependent cysteine protease (calpain), large subunit (EF-Hand protein superfamily) [Posttranslational modification, protein turnover, chaperones, Signal transduction mechanisms]. 35269 KOG0046: Ca2+-binding actin-bundling protein (fimbrin/plastin), EF-Hand protein superfamily [Cytoskeleton]. 35270 KOG0047: Catalase [Inorganic ion transport and metabolism]. 35271 KOG0048: Transcription factor, Myb superfamily [Transcription]. 35272 KOG0049: Transcription factor, Myb superfamily [Transcription]. 35273 KOG0050: mRNA splicing protein CDC5 (Myb superfamily) [RNA processing and modification, Cell cycle control, cell division, chromosome partitioning]. 
35274 KOG0051: RNA polymerase I termination factor, Myb superfamily [Transcription]. 35275 KOG0052: Translation elongation factor EF-1 alpha/Tu [Translation, ribosomal structure and biogenesis]. 35276 KOG0053: Cystathionine beta-lyases/cystathionine gamma-synthases [Amino acid transport and metabolism]. 35277 KOG0054: Multidrug resistance-associated protein/mitoxantrone resistance protein, ABC superfamily [Secondary metabolites biosynthesis, transport and catabolism]. 35278 KOG0055: Multidrug/pheromone exporter, ABC superfamily [Secondary metabolites biosynthesis, transport and catabolism]. 35279 KOG0056: Heavy metal exporter HMT1, ABC superfamily [Inorganic ion transport and metabolism]. 35280 KOG0057: Mitochondrial Fe/S cluster exporter, ABC superfamily [Intracellular trafficking, secretion, and vesicular transport]. 35281 KOG0058: Peptide exporter, ABC superfamily [Intracellular trafficking, secretion, and vesicular transport]. 35282 KOG0059: Lipid exporter ABCA1 and related proteins, ABC superfamily [Lipid transport and metabolism, General function prediction only]. 35283 KOG0060: Long-chain acyl-CoA transporter, ABC superfamily (involved in peroxisome organization and biogenesis) [Lipid transport and metabolism, General function prediction only]. 35284 KOG0061: Transporter, ABC superfamily (Breast cancer resistance protein) [Secondary metabolites biosynthesis, transport and catabolism]. 35285 KOG0062: ATPase component of ABC transporters with duplicated ATPase domains/Translation elongation factor EF-3b [Amino acid transport and metabolism, Translation, ribosomal structure and biogenesis]. 35286 KOG0063: RNAse L inhibitor, ABC superfamily [RNA processing and modification]. 35287 KOG0064: Peroxisomal long-chain acyl-CoA transporter, ABC superfamily [Lipid transport and metabolism]. 35288 KOG0065: Pleiotropic drug resistance proteins (PDR1-15), ABC superfamily [Secondary metabolites biosynthesis, transport and catabolism]. 
35289 KOG0066: eIF2-interacting protein ABC50 (ABC superfamily) [Translation, ribosomal structure and biogenesis]. 35290 KOG0067: Transcription factor CtBP [Transcription]. 35291 KOG0068: D-3-phosphoglycerate dehydrogenase, D-isomer-specific 2-hydroxy acid dehydrogenase superfamily [Amino acid transport and metabolism]. 35292 KOG0069: Glyoxylate/hydroxypyruvate reductase (D-isomer-specific 2-hydroxy acid dehydrogenase superfamily) [Energy production and conversion]. 35293 KOG0070: GTP-binding ADP-ribosylation factor Arf1 [Intracellular trafficking, secretion, and vesicular transport]. 35294 KOG0071: GTP-binding ADP-ribosylation factor Arf6 (dArf3) [Intracellular trafficking, secretion, and vesicular transport]. 35295 KOG0072: GTP-binding ADP-ribosylation factor-like protein ARL1 [Intracellular trafficking, secretion, and vesicular transport]. 35296 KOG0073: GTP-binding ADP-ribosylation factor-like protein ARL2 [Intracellular trafficking, secretion, and vesicular transport, Cytoskeleton]. 35297 KOG0074: GTP-binding ADP-ribosylation factor-like protein ARL3 [General function prediction only]. 35298 KOG0075: GTP-binding ADP-ribosylation factor-like protein [General function prediction only]. 35299 KOG0076: GTP-binding ADP-ribosylation factor-like protein yARL3 [Intracellular trafficking, secretion, and vesicular transport]. 35300 KOG0077: Vesicle coat complex COPII, GTPase subunit SAR1 [Intracellular trafficking, secretion, and vesicular transport]. 35301 KOG0078: GTP-binding protein SEC4, small G protein superfamily, and related Ras family GTP-binding proteins [Signal transduction mechanisms, Intracellular trafficking, secretion, and vesicular transport]. 35302 KOG0079: GTP-binding protein H-ray, small G protein superfamily [General function prediction only]. 35303 KOG0080: GTPase Rab18, small G protein superfamily [General function prediction only]. 
35304 KOG0081: GTPase Rab27, small G protein superfamily [Intracellular trafficking, secretion, and vesicular transport]. 35305 KOG0082: G-protein alpha subunit (small G protein superfamily) [Cell cycle control, cell division, chromosome partitioning, Signal transduction mechanisms]. 35306 KOG0083: GTPase Rab26/Rab37, small G protein superfamily [General function prediction only]. 35307 KOG0084: GTPase Rab1/YPT1, small G protein superfamily, and related GTP-binding proteins [Signal transduction mechanisms, Intracellular trafficking, secretion, and vesicular transport]. 35308 KOG0085: G protein subunit Galphaq/Galphay, small G protein superfamily [Signal transduction mechanisms]. 35309 KOG0086: GTPase Rab4, small G protein superfamily [Intracellular trafficking, secretion, and vesicular transport]. 35310 KOG0087: GTPase Rab11/YPT3, small G protein superfamily [Intracellular trafficking, secretion, and vesicular transport]. 35311 KOG0088: GTPase Rab21, small G protein superfamily [General function prediction only]. 35312 KOG0089: Methylenetetrahydrofolate dehydrogenase/methylenetetrahydrofolate cyclohydrolase [Coenzyme transport and metabolism]. 35313 KOG0090: Signal recognition particle receptor, beta subunit (small G protein superfamily) [Intracellular trafficking, secretion, and vesicular transport]. 35314 KOG0091: GTPase Rab39, small G protein superfamily [General function prediction only]. 35315 KOG0092: GTPase Rab5/YPT51 and related small G protein superfamily GTPases [Intracellular trafficking, secretion, and vesicular transport]. 35316 KOG0093: GTPase Rab3, small G protein superfamily [Intracellular trafficking, secretion, and vesicular transport]. 35317 KOG0094: GTPase Rab6/YPT6/Ryh1, small G protein superfamily [Intracellular trafficking, secretion, and vesicular transport]. 35318 KOG0095: GTPase Rab30, small G protein superfamily [Intracellular trafficking, secretion, and vesicular transport]. 
35319 KOG0096: GTPase Ran/TC4/GSP1 (nuclear protein transport pathway), small G protein superfamily [Intracellular trafficking, secretion, and vesicular transport]. 35320 KOG0097: GTPase Rab14, small G protein superfamily [Intracellular trafficking, secretion, and vesicular transport]. 35321 KOG0098: GTPase Rab2, small G protein superfamily [Intracellular trafficking, secretion, and vesicular transport]. 35322 KOG0099: G protein subunit Galphas, small G protein superfamily [Signal transduction mechanisms]. 35323 KOG0100: Molecular chaperones GRP78/BiP/KAR2, HSP70 superfamily [Posttranslational modification, protein turnover, chaperones]. 35324 KOG0101: Molecular chaperones HSP70/HSC70, HSP70 superfamily [Posttranslational modification, protein turnover, chaperones]. 35325 KOG0102: Molecular chaperones mortalin/PBP74/GRP75, HSP70 superfamily [Posttranslational modification, protein turnover, chaperones]. 35326 KOG0103: Molecular chaperones HSP105/HSP110/SSE1, HSP70 superfamily [Posttranslational modification, protein turnover, chaperones]. 35327 KOG0104: Molecular chaperones GRP170/SIL1, HSP70 superfamily [Posttranslational modification, protein turnover, chaperones]. 35328 KOG0105: Alternative splicing factor ASF/SF2 (RRM superfamily) [RNA processing and modification]. 35329 KOG0106: Alternative splicing factor SRp55/B52/SRp75 (RRM superfamily) [RNA processing and modification]. 35330 KOG0107: Alternative splicing factor SRp20/9G8 (RRM superfamily) [RNA processing and modification]. 35331 KOG0108: mRNA cleavage and polyadenylation factor I complex, subunit RNA15 [RNA processing and modification]. 35332 KOG0109: RNA-binding protein LARK, contains RRM and retroviral-type Zn-finger domains [RNA processing and modification, General function prediction only]. 35333 KOG0110: RNA-binding protein (RRM superfamily) [General function prediction only]. 
35334 KOG0111: Cyclophilin-type peptidyl-prolyl cis-trans isomerase [Posttranslational modification, protein turnover, chaperones]. 35335 KOG0112: Large RNA-binding protein (RRM superfamily) [General function prediction only]. 35336 KOG0113: U1 small nuclear ribonucleoprotein (RRM superfamily) [RNA processing and modification]. 35337 KOG0114: Predicted RNA-binding protein (RRM superfamily) [General function prediction only]. 35338 KOG0115: RNA-binding protein p54nrb (RRM superfamily) [RNA processing and modification]. 35339 KOG0116: RasGAP SH3 binding protein rasputin, contains NTF2 and RRM domains [Signal transduction mechanisms]. 35340 KOG0117: Heterogeneous nuclear ribonucleoprotein R (RRM superfamily) [RNA processing and modification]. 35341 KOG0119: Splicing factor 1/branch point binding protein (RRM superfamily) [RNA processing and modification]. 35342 KOG0120: Splicing factor U2AF, large subunit (RRM superfamily) [RNA processing and modification]. 35343 KOG0121: Nuclear cap-binding protein complex, subunit CBP20 (RRM superfamily) [RNA processing and modification]. 35344 KOG0122: Translation initiation factor 3, subunit g (eIF-3g) [Translation, ribosomal structure and biogenesis]. 35345 KOG0123: Polyadenylate-binding protein (RRM superfamily) [RNA processing and modification, Translation, ribosomal structure and biogenesis]. 35346 KOG0124: Polypyrimidine tract-binding protein PUF60 (RRM superfamily) [RNA processing and modification]. 35347 KOG0125: Ataxin 2-binding protein (RRM superfamily) [General function prediction only]. 35348 KOG0126: Predicted RNA-binding protein (RRM superfamily) [General function prediction only]. 35349 KOG0127: Nucleolar protein fibrillarin NOP77 (RRM superfamily) [RNA processing and modification]. 35350 KOG0128: RNA-binding protein SART3 (RRM superfamily) [RNA processing and modification]. 35351 KOG0129: Predicted RNA-binding protein (RRM superfamily) [Translation, ribosomal structure and biogenesis]. 
35352 KOG0130: RNA-binding protein RBM8/Tsunagi (RRM superfamily) [General function prediction only]. 35353 KOG0131: Splicing factor 3b, subunit 4 [RNA processing and modification]. 35354 KOG0132: RNA polymerase II C-terminal domain-binding protein RA4, contains RPR and RRM domains [RNA processing and modification, Transcription]. 35355 KOG0133: Deoxyribodipyrimidine photolyase/cryptochrome [Replication, recombination and repair, Signal transduction mechanisms]. 35356 KOG0134: NADH:flavin oxidoreductase/12-oxophytodienoate reductase [Energy production and conversion, General function prediction only]. 35357 KOG0135: Pristanoyl-CoA/acyl-CoA oxidase [Lipid transport and metabolism, Secondary metabolites biosynthesis, transport and catabolism]. 35358 KOG0136: Acyl-CoA oxidase [Lipid transport and metabolism]. 35359 KOG0137: Very-long-chain acyl-CoA dehydrogenase [Lipid transport and metabolism]. 35360 KOG0138: Glutaryl-CoA dehydrogenase [Amino acid transport and metabolism]. 35361 KOG0139: Short-chain acyl-CoA dehydrogenase [Lipid transport and metabolism]. 35362 KOG0140: Medium-chain acyl-CoA dehydrogenase [Lipid transport and metabolism]. 35363 KOG0141: Isovaleryl-CoA dehydrogenase [Amino acid transport and metabolism, Lipid transport and metabolism]. 35364 KOG0142: Isopentenyl pyrophosphate:dimethylallyl pyrophosphate isomerase [Secondary metabolites biosynthesis, transport and catabolism]. 35365 KOG0143: Iron/ascorbate family oxidoreductases [Secondary metabolites biosynthesis, transport and catabolism, General function prediction only]. 35366 KOG0144: RNA-binding protein CUGBP1/BRUNO (RRM superfamily) [RNA processing and modification]. 35367 KOG0145: RNA-binding protein ELAV/HU (RRM superfamily) [RNA processing and modification]. 35368 KOG0146: RNA-binding protein ETR-3 (RRM superfamily) [RNA processing and modification]. 35369 KOG0147: Transcriptional coactivator CAPER (RRM superfamily) [Transcription]. 
35370 KOG0148: Apoptosis-promoting RNA-binding protein TIA-1/TIAR (RRM superfamily) [RNA processing and modification, Translation, ribosomal structure and biogenesis]. 35371 KOG0149: Predicted RNA-binding protein SEB4 (RRM superfamily) [General function prediction only]. 35372 KOG0150: Spliceosomal protein FBP21 [RNA processing and modification]. 35373 KOG0151: Predicted splicing regulator, contains RRM, SWAP and RPR domains [General function prediction only]. 35374 KOG0152: Spliceosomal protein FBP11/Splicing factor PRP40 [RNA processing and modification]. 35375 KOG0153: Predicted RNA-binding protein (RRM superfamily) [General function prediction only]. 35376 KOG0154: RNA-binding protein RBM5 and related proteins, contain G-patch and RRM domains [General function prediction only]. 35377 KOG0155: Transcription factor CA150 [Transcription]. 35378 KOG0156: Cytochrome P450 CYP2 subfamily [Secondary metabolites biosynthesis, transport and catabolism]. 35379 KOG0157: Cytochrome P450 CYP4/CYP19/CYP26 subfamilies [Secondary metabolites biosynthesis, transport and catabolism, Lipid transport and metabolism]. 35380 KOG0158: Cytochrome P450 CYP3/CYP5/CYP6/CYP9 subfamilies [Secondary metabolites biosynthesis, transport and catabolism]. 35381 KOG0159: Cytochrome P450 CYP11/CYP12/CYP24/CYP27 subfamilies [Secondary metabolites biosynthesis, transport and catabolism]. 35382 KOG0160: Myosin class V heavy chain [Cytoskeleton]. 35383 KOG0161: Myosin class II heavy chain [Cytoskeleton]. 35384 KOG0162: Myosin class I heavy chain [Cytoskeleton]. 35385 KOG0163: Myosin class VI heavy chain [Cytoskeleton]. 35386 KOG0164: Myosin class I heavy chain [Cytoskeleton]. 35387 KOG0165: Microtubule-associated protein Asp [Cytoskeleton]. 35388 KOG0166: Karyopherin (importin) alpha [Intracellular trafficking, secretion, and vesicular transport]. 35389 KOG0168: Putative ubiquitin fusion degradation protein [Posttranslational modification, protein turnover, chaperones]. 
35390 KOG0169: Phosphoinositide-specific phospholipase C [Signal transduction mechanisms]. 35391 KOG0170: E3 ubiquitin protein ligase [Posttranslational modification, protein turnover, chaperones]. 35392 KOG0171: Mitochondrial inner membrane protease, subunit IMP1 [Posttranslational modification, protein turnover, chaperones]. 35393 KOG0172: Lysine-ketoglutarate reductase/saccharopine dehydrogenase [Amino acid transport and metabolism]. 35394 KOG0173: 20S proteasome, regulatory subunit beta type PSMB7/PSMB10/PUP1 [Posttranslational modification, protein turnover, chaperones]. 35395 KOG0174: 20S proteasome, regulatory subunit beta type PSMB6/PSMB9/PRE3 [Posttranslational modification, protein turnover, chaperones]. 35396 KOG0175: 20S proteasome, regulatory subunit beta type PSMB5/PSMB8/PRE2 [Posttranslational modification, protein turnover, chaperones]. 35397 KOG0176: 20S proteasome, regulatory subunit alpha type PSMA5/PUP2 [Posttranslational modification, protein turnover, chaperones]. 35398 KOG0177: 20S proteasome, regulatory subunit beta type PSMB2/PRE1 [Posttranslational modification, protein turnover, chaperones]. 35399 KOG0178: 20S proteasome, regulatory subunit alpha type PSMA4/PRE9 [Posttranslational modification, protein turnover, chaperones]. 35400 KOG0179: 20S proteasome, regulatory subunit beta type PSMB1/PRE7 [Posttranslational modification, protein turnover, chaperones]. 35401 KOG0180: 20S proteasome, regulatory subunit beta type PSMB3/PUP3 [Posttranslational modification, protein turnover, chaperones]. 35402 KOG0181: 20S proteasome, regulatory subunit alpha type PSMA2/PRE8 [Posttranslational modification, protein turnover, chaperones]. 35403 KOG0182: 20S proteasome, regulatory subunit alpha type PSMA6/SCL1 [Posttranslational modification, protein turnover, chaperones]. 35404 KOG0183: 20S proteasome, regulatory subunit alpha type PSMA7/PRE6 [Posttranslational modification, protein turnover, chaperones]. 
35405 KOG0184: 20S proteasome, regulatory subunit alpha type PSMA3/PRE10 [Posttranslational modification, protein turnover, chaperones]. 35406 KOG0185: 20S proteasome, regulatory subunit beta type PSMB4/PRE4 [Posttranslational modification, protein turnover, chaperones]. 35407 KOG0186: Proline oxidase [Amino acid transport and metabolism]. 35408 KOG0187: 40S ribosomal protein S17 [Translation, ribosomal structure and biogenesis]. 35409 KOG0188: Alanyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 35410 KOG0189: Phosphoadenosine phosphosulfate reductase [Amino acid transport and metabolism]. 35411 KOG0190: Protein disulfide isomerase (prolyl 4-hydroxylase beta subunit) [Posttranslational modification, protein turnover, chaperones]. 35412 KOG0191: Thioredoxin/protein disulfide isomerase [Posttranslational modification, protein turnover, chaperones]. 35413 KOG0192: Tyrosine kinase specific for activated (GTP-bound) p21cdc42Hs [Signal transduction mechanisms]. 35414 KOG0193: Serine/threonine protein kinase RAF [Signal transduction mechanisms]. 35415 KOG0194: Protein tyrosine kinase [Signal transduction mechanisms]. 35416 KOG0195: Integrin-linked kinase [Signal transduction mechanisms]. 35417 KOG0196: Tyrosine kinase, EPH (ephrin) receptor family [Signal transduction mechanisms]. 35418 KOG0197: Tyrosine kinases [Signal transduction mechanisms]. 35419 KOG0198: MEKK and related serine/threonine protein kinases [Signal transduction mechanisms]. 35420 KOG0199: ACK and related non-receptor tyrosine kinases [Signal transduction mechanisms]. 35421 KOG0200: Fibroblast/platelet-derived growth factor receptor and related receptor tyrosine kinases [Signal transduction mechanisms]. 35422 KOG0201: Serine/threonine protein kinase [Signal transduction mechanisms]. 35423 KOG0202: Ca2+ transporting ATPase [Inorganic ion transport and metabolism]. 35424 KOG0203: Na+/K+ ATPase, alpha subunit [Inorganic ion transport and metabolism]. 
35425 KOG0204: Calcium transporting ATPase [Inorganic ion transport and metabolism]. 35426 KOG0205: Plasma membrane H+-transporting ATPase [Inorganic ion transport and metabolism]. 35427 KOG0206: P-type ATPase [General function prediction only]. 35428 KOG0207: Cation transport ATPase [Inorganic ion transport and metabolism]. 35429 KOG0208: Cation transport ATPase [Inorganic ion transport and metabolism]. 35430 KOG0209: P-type ATPase [Inorganic ion transport and metabolism]. 35431 KOG0210: P-type ATPase [Inorganic ion transport and metabolism]. 35432 KOG0211: Protein phosphatase 2A regulatory subunit A and related proteins [Signal transduction mechanisms]. 35433 KOG0212: Uncharacterized conserved protein [Function unknown]. 35434 KOG0213: Splicing factor 3b, subunit 1 [RNA processing and modification]. 35435 KOG0214: RNA polymerase II, second largest subunit [Transcription]. 35436 KOG0215: RNA polymerase III, second largest subunit [Transcription]. 35437 KOG0216: RNA polymerase I, second largest subunit [Transcription]. 35438 KOG0217: Mismatch repair ATPase MSH6 (MutS family) [Replication, recombination and repair]. 35439 KOG0218: Mismatch repair MSH3 [Replication, recombination and repair]. 35440 KOG0219: Mismatch repair ATPase MSH2 (MutS family) [Replication, recombination and repair]. 35441 KOG0220: Mismatch repair ATPase MSH4 (MutS family) [Replication, recombination and repair]. 35442 KOG0221: Mismatch repair ATPase MSH5 (MutS family) [Replication, recombination and repair]. 35443 KOG0222: Phenylalanine and histidine ammonia-lyase [Secondary metabolites biosynthesis, transport and catabolism]. 35444 KOG0223: Aquaporin (major intrinsic protein family) [Carbohydrate transport and metabolism]. 35445 KOG0224: Aquaporin (major intrinsic protein family) [Carbohydrate transport and metabolism]. 35446 KOG0225: Pyruvate dehydrogenase E1, alpha subunit [Energy production and conversion]. 35447 KOG0226: RNA-binding proteins [General function prediction only]. 
35448 KOG0227: Splicing factor 3a, subunit 2 [RNA processing and modification]. 35449 KOG0228: Beta-fructofuranosidase (invertase) [Carbohydrate transport and metabolism]. 35450 KOG0229: Phosphatidylinositol-4-phosphate 5-kinase [Signal transduction mechanisms]. 35451 KOG0230: Phosphatidylinositol-4-phosphate 5-kinase and related FYVE finger-containing proteins [Signal transduction mechanisms]. 35452 KOG0231: Junctional membrane complex protein Junctophilin and related MORN repeat proteins [General function prediction only]. 35453 KOG0232: Vacuolar H+-ATPase V0 sector, subunits c/c' [Energy production and conversion]. 35454 KOG0233: Vacuolar H+-ATPase V0 sector, subunit c'' [Energy production and conversion]. 35455 KOG0234: Fructose-6-phosphate 2-kinase/fructose-2,6-biphosphatase [Carbohydrate transport and metabolism]. 35456 KOG0235: Phosphoglycerate mutase [Carbohydrate transport and metabolism]. 35457 KOG0236: Sulfate/bicarbonate/oxalate exchanger SAT-1 and related transporters (SLC26 family) [Inorganic ion transport and metabolism]. 35458 KOG0237: Glycinamide ribonucleotide synthetase (GARS)/Aminoimidazole ribonucleotide synthetase (AIRS) [Nucleotide transport and metabolism]. 35459 KOG0238: 3-Methylcrotonyl-CoA carboxylase, biotin-containing subunit/Propionyl-CoA carboxylase, alpha chain/Acetyl-CoA carboxylase, biotin carboxylase subunit [Lipid transport and metabolism, Amino acid transport and metabolism]. 35460 KOG0239: Kinesin (KAR3 subfamily) [Cytoskeleton]. 35461 KOG0240: Kinesin (SMY1 subfamily) [Cytoskeleton]. 35462 KOG0241: Kinesin-like protein [Cytoskeleton]. 35463 KOG0242: Kinesin-like protein [Cytoskeleton]. 35464 KOG0243: Kinesin-like protein [Cytoskeleton]. 35465 KOG0244: Kinesin-like protein [Cytoskeleton]. 35466 KOG0245: Kinesin-like protein [Cytoskeleton]. 35467 KOG0246: Kinesin-like protein [Cytoskeleton]. 35468 KOG0247: Kinesin-like protein [Cytoskeleton]. 
35469 KOG0248: Cytoplasmic protein Max-1, contains PH, MyTH4 and FERM domains [Cytoskeleton]. 35470 KOG0249: LAR-interacting protein and related proteins [General function prediction only]. 35471 KOG0250: DNA repair protein RAD18 (SMC family protein) [Replication, recombination and repair]. 35472 KOG0251: Clathrin assembly protein AP180 and related proteins, contain ENTH domain [Signal transduction mechanisms, Intracellular trafficking, secretion, and vesicular transport]. 35473 KOG0252: Inorganic phosphate transporter [Inorganic ion transport and metabolism]. 35474 KOG0253: Synaptic vesicle transporter SV2 (major facilitator superfamily) [General function prediction only]. 35475 KOG0254: Predicted transporter (major facilitator superfamily) [General function prediction only]. 35476 KOG0255: Synaptic vesicle transporter SVOP and related transporters (major facilitator superfamily) [General function prediction only]. 35477 KOG0256: 1-aminocyclopropane-1-carboxylate synthase, and related proteins [Signal transduction mechanisms]. 35478 KOG0257: Kynurenine aminotransferase, glutamine transaminase K [Amino acid transport and metabolism]. 35479 KOG0258: Alanine aminotransferase [Amino acid transport and metabolism]. 35480 KOG0259: Tyrosine aminotransferase [Amino acid transport and metabolism]. 35481 KOG0260: RNA polymerase II, large subunit [Transcription]. 35482 KOG0261: RNA polymerase III, large subunit [Transcription]. 35483 KOG0262: RNA polymerase I, large subunit [Transcription]. 35484 KOG0263: Transcription initiation factor TFIID, subunit TAF5 (also component of histone acetyltransferase SAGA) [Transcription]. 35485 KOG0264: Nucleosome remodeling factor, subunit CAF1/NURF55/MSI1 [Chromatin structure and dynamics]. 35486 KOG0265: U5 snRNP-specific protein-like factor and related proteins [RNA processing and modification]. 35487 KOG0266: WD40 repeat-containing protein [General function prediction only]. 
35488 KOG0267: Microtubule severing protein katanin p80 subunit B (contains WD40 repeats) [Cell cycle control, cell division, chromosome partitioning]. 35489 KOG0268: Sof1-like rRNA processing protein (contains WD40 repeats) [RNA processing and modification]. 35490 KOG0269: WD40 repeat-containing protein [Function unknown]. 35491 KOG0270: WD40 repeat-containing protein [Function unknown]. 35492 KOG0271: Notchless-like WD40 repeat-containing protein [Function unknown]. 35493 KOG0272: U4/U6 small nuclear ribonucleoprotein Prp4 (contains WD40 repeats) [RNA processing and modification]. 35494 KOG0273: Beta-transducin family (WD-40 repeat) protein [Chromatin structure and dynamics]. 35495 KOG0274: Cdc4 and related F-box and WD-40 proteins [General function prediction only]. 35496 KOG0275: Conserved WD40 repeat-containing protein [General function prediction only]. 35497 KOG0276: Vesicle coat complex COPI, beta' subunit [Intracellular trafficking, secretion, and vesicular transport]. 35498 KOG0277: Peroxisomal targeting signal type 2 receptor [Intracellular trafficking, secretion, and vesicular transport]. 35499 KOG0278: Serine/threonine kinase receptor-associated protein [Lipid transport and metabolism]. 35500 KOG0279: G protein beta subunit-like protein [Signal transduction mechanisms]. 35501 KOG0280: Uncharacterized conserved protein [Amino acid transport and metabolism]. 35502 KOG0281: Beta-TrCP (transducin repeats containing)/Slimb proteins [Function unknown]. 35503 KOG0282: mRNA splicing factor [Function unknown]. 35504 KOG0283: WD40 repeat-containing protein [Function unknown]. 35505 KOG0284: Polyadenylation factor I complex, subunit PFS2 [RNA processing and modification]. 35506 KOG0285: Pleiotropic regulator 1 [RNA processing and modification]. 35507 KOG0286: G-protein beta subunit [General function prediction only]. 35508 KOG0287: Postreplication repair protein RAD18 [Replication, recombination and repair]. 
35509 KOG0288: WD40 repeat protein TipD [General function prediction only]. 35510 KOG0289: mRNA splicing factor [General function prediction only]. 35511 KOG0290: Conserved WD40 repeat-containing protein AN11 [Function unknown]. 35512 KOG0291: WD40-repeat-containing subunit of the 18S rRNA processing complex [RNA processing and modification]. 35513 KOG0292: Vesicle coat complex COPI, alpha subunit [Intracellular trafficking, secretion, and vesicular transport]. 35514 KOG0293: WD40 repeat-containing protein [Function unknown]. 35515 KOG0294: WD40 repeat-containing protein [Function unknown]. 35516 KOG0295: WD40 repeat-containing protein [Function unknown]. 35517 KOG0296: Angio-associated migratory cell protein (contains WD40 repeats) [Function unknown]. 35518 KOG0297: TNF receptor-associated factor [Signal transduction mechanisms]. 35519 KOG0298: DEAD box-containing helicase-like transcription factor/DNA repair protein [Replication, recombination and repair]. 35520 KOG0299: U3 snoRNP-associated protein (contains WD40 repeats) [RNA processing and modification]. 35521 KOG0300: WD40 repeat-containing protein [Function unknown]. 35522 KOG0301: Phospholipase A2-activating protein (contains WD40 repeats) [Lipid transport and metabolism]. 35523 KOG0302: Ribosome Assembly protein [General function prediction only]. 35524 KOG0303: Actin-binding protein Coronin, contains WD40 repeats [Cytoskeleton]. 35525 KOG0304: mRNA deadenylase subunit [RNA processing and modification]. 35526 KOG0305: Anaphase promoting complex, Cdc20, Cdh1, and Ama1 subunits [Cell cycle control, cell division, chromosome partitioning, Posttranslational modification, protein turnover, chaperones]. 35527 KOG0306: WD40-repeat-containing subunit of the 18S rRNA processing complex [RNA processing and modification]. 35528 KOG0307: Vesicle coat complex COPII, subunit SEC31 [Intracellular trafficking, secretion, and vesicular transport]. 35529 KOG0308: Conserved WD40 repeat-containing protein [Function unknown]. 
35530 KOG0309: Conserved WD40 repeat-containing protein [Function unknown]. 35531 KOG0310: Conserved WD40 repeat-containing protein [Function unknown]. 35532 KOG0311: Predicted E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 35533 KOG0312: Uncharacterized conserved protein [Function unknown]. 35534 KOG0313: Microtubule binding protein YTM1 (contains WD40 repeats) [Cytoskeleton]. 35535 KOG0314: Predicted E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 35536 KOG0315: G-protein beta subunit-like protein (contains WD40 repeats) [General function prediction only]. 35537 KOG0316: Conserved WD40 repeat-containing protein [Function unknown]. 35538 KOG0317: Predicted E3 ubiquitin ligase, integral peroxisomal membrane protein [Posttranslational modification, protein turnover, chaperones]. 35539 KOG0318: WD40 repeat stress protein/actin interacting protein [Cytoskeleton]. 35540 KOG0319: WD40-repeat-containing subunit of the 18S rRNA processing complex [RNA processing and modification]. 35541 KOG0320: Predicted E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 35542 KOG0321: WD40 repeat-containing protein L2DTL [Function unknown]. 35543 KOG0322: G-protein beta subunit-like protein GNB1L, contains WD repeats [General function prediction only]. 35544 KOG0323: TFIIF-interacting CTD phosphatases, including NLI-interacting factor [Transcription]. 35545 KOG0324: Uncharacterized conserved protein [Function unknown]. 35546 KOG0325: Lipoyltransferase [Energy production and conversion, Coenzyme transport and metabolism]. 35547 KOG0326: ATP-dependent RNA helicase [RNA processing and modification]. 35548 KOG0327: Translation initiation factor 4F, helicase subunit (eIF-4A) and related helicases [Translation, ribosomal structure and biogenesis]. 
35549 KOG0328: Predicted ATP-dependent RNA helicase FAL1, involved in rRNA maturation, DEAD-box superfamily [Translation, ribosomal structure and biogenesis]. 35550 KOG0329: ATP-dependent RNA helicase [RNA processing and modification]. 35551 KOG0330: ATP-dependent RNA helicase [RNA processing and modification]. 35552 KOG0331: ATP-dependent RNA helicase [RNA processing and modification]. 35553 KOG0332: ATP-dependent RNA helicase [RNA processing and modification]. 35554 KOG0333: U5 snRNP-like RNA helicase subunit [RNA processing and modification]. 35555 KOG0334: RNA helicase [RNA processing and modification]. 35556 KOG0335: ATP-dependent RNA helicase [RNA processing and modification]. 35557 KOG0336: ATP-dependent RNA helicase [RNA processing and modification]. 35558 KOG0337: ATP-dependent RNA helicase [RNA processing and modification]. 35559 KOG0338: ATP-dependent RNA helicase [RNA processing and modification]. 35560 KOG0339: ATP-dependent RNA helicase [RNA processing and modification]. 35561 KOG0340: ATP-dependent RNA helicase [RNA processing and modification]. 35562 KOG0341: DEAD-box protein abstrakt [RNA processing and modification]. 35563 KOG0342: ATP-dependent RNA helicase pitchoune [RNA processing and modification]. 35564 KOG0343: RNA Helicase [RNA processing and modification]. 35565 KOG0344: ATP-dependent RNA helicase [RNA processing and modification]. 35566 KOG0345: ATP-dependent RNA helicase [RNA processing and modification]. 35567 KOG0346: RNA helicase [RNA processing and modification]. 35568 KOG0347: RNA helicase [RNA processing and modification]. 35569 KOG0348: ATP-dependent RNA helicase [RNA processing and modification]. 35570 KOG0349: Putative DEAD-box RNA helicase DDX1 [RNA processing and modification]. 35571 KOG0350: DEAD-box ATP-dependent RNA helicase [RNA processing and modification]. 35572 KOG0351: ATP-dependent DNA helicase [Replication, recombination and repair]. 35573 KOG0352: ATP-dependent DNA helicase [Replication, recombination and repair]. 
35574 KOG0353: ATP-dependent DNA helicase [General function prediction only]. 35575 KOG0354: DEAD-box like helicase [General function prediction only]. 35576 KOG0355: DNA topoisomerase type II [Chromatin structure and dynamics]. 35577 KOG0356: Mitochondrial chaperonin, Cpn60/Hsp60p [Posttranslational modification, protein turnover, chaperones]. 35578 KOG0357: Chaperonin complex component, TCP-1 epsilon subunit (CCT5) [Posttranslational modification, protein turnover, chaperones]. 35579 KOG0358: Chaperonin complex component, TCP-1 delta subunit (CCT4) [Posttranslational modification, protein turnover, chaperones]. 35580 KOG0359: Chaperonin complex component, TCP-1 zeta subunit (CCT6) [Posttranslational modification, protein turnover, chaperones]. 35581 KOG0360: Chaperonin complex component, TCP-1 alpha subunit (CCT1) [Posttranslational modification, protein turnover, chaperones]. 35582 KOG0361: Chaperonin complex component, TCP-1 eta subunit (CCT7) [Posttranslational modification, protein turnover, chaperones]. 35583 KOG0362: Chaperonin complex component, TCP-1 theta subunit (CCT8) [Posttranslational modification, protein turnover, chaperones]. 35584 KOG0363: Chaperonin complex component, TCP-1 beta subunit (CCT2) [Posttranslational modification, protein turnover, chaperones]. 35585 KOG0364: Chaperonin complex component, TCP-1 gamma subunit (CCT3) [Posttranslational modification, protein turnover, chaperones]. 35586 KOG0365: Beta subunit of farnesyltransferase [Posttranslational modification, protein turnover, chaperones]. 35587 KOG0366: Protein geranylgeranyltransferase type II, beta subunit [Posttranslational modification, protein turnover, chaperones]. 35588 KOG0367: Protein geranylgeranyltransferase Type I, beta subunit [Posttranslational modification, protein turnover, chaperones]. 35589 KOG0368: Acetyl-CoA carboxylase [Lipid transport and metabolism]. 35590 KOG0369: Pyruvate carboxylase [Energy production and conversion]. 
35591 KOG0370: Multifunctional pyrimidine synthesis protein CAD (includes carbamoyl-phosphate synthetase, aspartate transcarbamylase, and glutamine amidotransferase) [General function prediction only]. 35592 KOG0371: Serine/threonine protein phosphatase 2A, catalytic subunit [Signal transduction mechanisms]. 35593 KOG0372: Serine/threonine specific protein phosphatase involved in glycogen accumulation, PP2A-related [Carbohydrate transport and metabolism, Signal transduction mechanisms]. 35594 KOG0373: Serine/threonine specific protein phosphatase involved in cell cycle control, PP2A-related [Cell cycle control, cell division, chromosome partitioning, Signal transduction mechanisms]. 35595 KOG0374: Serine/threonine specific protein phosphatase PP1, catalytic subunit [Signal transduction mechanisms, General function prediction only]. 35596 KOG0375: Serine-threonine phosphatase 2B, catalytic subunit [General function prediction only]. 35597 KOG0376: Serine-threonine phosphatase 2A, catalytic subunit [General function prediction only]. 35598 KOG0377: Protein serine/threonine phosphatase RDGC/PPEF, contains STphosphatase and EF-hand domains [Signal transduction mechanisms]. 35599 KOG0378: 40S ribosomal protein S4 [Translation, ribosomal structure and biogenesis]. 35600 KOG0379: Kelch repeat-containing proteins [General function prediction only]. 35601 KOG0380: Sterol O-acyltransferase/Diacylglycerol O-acyltransferase [Lipid transport and metabolism]. 35602 KOG0381: HMG box-containing protein [General function prediction only]. 35603 KOG0382: Carbonic anhydrase [General function prediction only]. 35604 KOG0383: Predicted helicase [General function prediction only]. 35605 KOG0384: Chromodomain-helicase DNA-binding protein [Transcription]. 35606 KOG0385: Chromatin remodeling complex WSTF-ISWI, small subunit [Transcription]. 
35607 KOG0386: Chromatin remodeling complex SWI/SNF, component SWI2 and related ATPases (DNA/RNA helicase superfamily) [Chromatin structure and dynamics, Transcription]. 35608 KOG0387: Transcription-coupled repair protein CSB/RAD26 (contains SNF2 family DNA-dependent ATPase domain) [Transcription, Replication, recombination and repair]. 35609 KOG0388: SNF2 family DNA-dependent ATPase [Replication, recombination and repair]. 35610 KOG0389: SNF2 family DNA-dependent ATPase [Chromatin structure and dynamics]. 35611 KOG0390: DNA repair protein, SNF2 family [Replication, recombination and repair]. 35612 KOG0391: SNF2 family DNA-dependent ATPase [General function prediction only]. 35613 KOG0392: SNF2 family DNA-dependent ATPase domain-containing protein [Transcription]. 35614 KOG0393: Ras-related small GTPase, Rho type [General function prediction only]. 35615 KOG0394: Ras-related GTPase [General function prediction only]. 35616 KOG0395: Ras-related GTPase [General function prediction only]. 35617 KOG0396: Uncharacterized conserved protein [Function unknown]. 35618 KOG0397: 60S ribosomal protein L11 [Translation, ribosomal structure and biogenesis]. 35619 KOG0398: Mitochondrial/chloroplast ribosomal protein L5/L7 [Translation, ribosomal structure and biogenesis]. 35620 KOG0399: Glutamate synthase [Amino acid transport and metabolism]. 35621 KOG0400: 40S ribosomal protein S13 [Translation, ribosomal structure and biogenesis]. 35622 KOG0401: Translation initiation factor 4F, ribosome/mRNA-bridging subunit (eIF-4G) [Translation, ribosomal structure and biogenesis]. 35623 KOG0402: 60S ribosomal protein L37 [Translation, ribosomal structure and biogenesis]. 35624 KOG0403: Neoplastic transformation suppressor Pdcd4/MA-3, contains MA3 domain [Signal transduction mechanisms]. 35625 KOG0404: Thioredoxin reductase [Posttranslational modification, protein turnover, chaperones]. 
35626 KOG0405: Pyridine nucleotide-disulphide oxidoreductase [Secondary metabolites biosynthesis, transport and catabolism]. 35627 KOG0406: Glutathione S-transferase [Posttranslational modification, protein turnover, chaperones]. 35628 KOG0407: 40S ribosomal protein S14 [Translation, ribosomal structure and biogenesis]. 35629 KOG0408: Mitochondrial/chloroplast ribosomal protein S11 [Translation, ribosomal structure and biogenesis]. 35630 KOG0409: Predicted dehydrogenase [General function prediction only]. 35631 KOG0410: Predicted GTP binding protein [General function prediction only]. 35632 KOG0411: Uncharacterized membrane protein [Function unknown]. 35633 KOG0412: Golgi transport complex COD1 protein [Intracellular trafficking, secretion, and vesicular transport]. 35634 KOG0413: Uncharacterized conserved protein related to condensin complex subunit 1 [Function unknown]. 35635 KOG0414: Chromosome condensation complex Condensin, subunit D2 [Chromatin structure and dynamics, Cell cycle control, cell division, chromosome partitioning]. 35636 KOG0415: Predicted peptidyl prolyl cis-trans isomerase [Posttranslational modification, protein turnover, chaperones]. 35637 KOG0416: Ubiquitin-protein ligase [Posttranslational modification, protein turnover, chaperones]. 35638 KOG0417: Ubiquitin-protein ligase [Posttranslational modification, protein turnover, chaperones]. 35639 KOG0418: Ubiquitin-protein ligase [Posttranslational modification, protein turnover, chaperones]. 35640 KOG0419: Ubiquitin-protein ligase [Posttranslational modification, protein turnover, chaperones]. 35641 KOG0420: Ubiquitin-protein ligase [Posttranslational modification, protein turnover, chaperones]. 35642 KOG0421: Ubiquitin-protein ligase [Posttranslational modification, protein turnover, chaperones]. 35643 KOG0422: Ubiquitin-protein ligase [Posttranslational modification, protein turnover, chaperones]. 
35644 KOG0423: Ubiquitin-protein ligase [Posttranslational modification, protein turnover, chaperones]. 35645 KOG0424: Ubiquitin-protein ligase [Posttranslational modification, protein turnover, chaperones]. 35646 KOG0425: Ubiquitin-protein ligase [Posttranslational modification, protein turnover, chaperones]. 35647 KOG0426: Ubiquitin-protein ligase [Posttranslational modification, protein turnover, chaperones]. 35648 KOG0427: Ubiquitin conjugating enzyme [Posttranslational modification, protein turnover, chaperones]. 35649 KOG0428: Non-canonical ubiquitin conjugating enzyme 1 [Posttranslational modification, protein turnover, chaperones]. 35650 KOG0429: Ubiquitin-conjugating enzyme-related protein Ft1, involved in programmed cell death [Posttranslational modification, protein turnover, chaperones]. 35651 KOG0430: Xanthine dehydrogenase [Nucleotide transport and metabolism]. 35652 KOG0431: Auxilin-like protein and related proteins containing DnaJ domain [General function prediction only]. 35653 KOG0432: Valyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 35654 KOG0433: Isoleucyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 35655 KOG0434: Isoleucyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 35656 KOG0435: Leucyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 35657 KOG0436: Methionyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 35658 KOG0437: Leucyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 35659 KOG0438: Mitochondrial/chloroplast ribosomal protein L2 [Translation, ribosomal structure and biogenesis]. 35660 KOG0439: VAMP-associated protein involved in inositol metabolism [Intracellular trafficking, secretion, and vesicular transport]. 35661 KOG0440: Cell cycle-associated protein Mob1-1 [Cell cycle control, cell division, chromosome partitioning]. 
35662 KOG0441: Cu2+/Zn2+ superoxide dismutase SOD1 [Inorganic ion transport and metabolism]. 35663 KOG0442: Structure-specific endonuclease ERCC1-XPF, catalytic component XPF/ERCC4 [Replication, recombination and repair]. 35664 KOG0443: Actin regulatory proteins (gelsolin/villin family) [Cytoskeleton]. 35665 KOG0444: Cytoskeletal regulator Flightless-I (contains leucine-rich and gelsolin repeats) [Cytoskeleton]. 35666 KOG0445: Actin regulatory protein supervillin (gelsolin/villin family) [Cytoskeleton]. 35667 KOG0446: Vacuolar sorting protein VPS1, dynamin, and related proteins [Intracellular trafficking, secretion, and vesicular transport, General function prediction only]. 35668 KOG0447: Dynamin-like GTP binding protein [General function prediction only]. 35669 KOG0448: Mitofusin 1 GTPase, involved in mitochondrial biogenesis [Posttranslational modification, protein turnover, chaperones]. 35670 KOG0449: Succinate dehydrogenase, cytochrome b subunit [Carbohydrate transport and metabolism]. 35671 KOG0450: 2-oxoglutarate dehydrogenase, E1 subunit [Carbohydrate transport and metabolism]. 35672 KOG0451: Predicted 2-oxoglutarate dehydrogenase, E1 subunit [Carbohydrate transport and metabolism]. 35673 KOG0452: RNA-binding translational regulator IRP (aconitase superfamily) [RNA processing and modification, Translation, ribosomal structure and biogenesis]. 35674 KOG0453: Aconitase/homoaconitase (aconitase superfamily) [Energy production and conversion, Amino acid transport and metabolism]. 35675 KOG0454: 3-isopropylmalate dehydratase (aconitase superfamily) [Amino acid transport and metabolism]. 35676 KOG0455: Homoserine dehydrogenase [Amino acid transport and metabolism]. 35677 KOG0456: Aspartate kinase [Amino acid transport and metabolism]. 35678 KOG0457: Histone acetyltransferase complex SAGA/ADA, subunit ADA2 [Chromatin structure and dynamics]. 35679 KOG0458: Elongation factor 1 alpha [Translation, ribosomal structure and biogenesis]. 
35680 KOG0459: Polypeptide release factor 3 [Translation, ribosomal structure and biogenesis]. 35681 KOG0460: Mitochondrial translation elongation factor Tu [Translation, ribosomal structure and biogenesis]. 35682 KOG0461: Selenocysteine-specific elongation factor [Translation, ribosomal structure and biogenesis]. 35683 KOG0462: Elongation factor-type GTP-binding protein [Translation, ribosomal structure and biogenesis]. 35684 KOG0463: GTP-binding protein GP-1 [General function prediction only]. 35685 KOG0464: Elongation factor G [Translation, ribosomal structure and biogenesis]. 35686 KOG0465: Mitochondrial elongation factor [Translation, ribosomal structure and biogenesis]. 35687 KOG0466: Translation initiation factor 2, gamma subunit (eIF-2gamma; GTPase) [Translation, ribosomal structure and biogenesis]. 35688 KOG0467: Translation elongation factor 2/ribosome biogenesis protein RIA1 and related proteins [Translation, ribosomal structure and biogenesis]. 35689 KOG0468: U5 snRNP-specific protein [Translation, ribosomal structure and biogenesis]. 35690 KOG0469: Elongation factor 2 [Translation, ribosomal structure and biogenesis]. 35691 KOG0470: 1,4-alpha-glucan branching enzyme/starch branching enzyme II [Carbohydrate transport and metabolism]. 35692 KOG0471: Alpha-amylase [Carbohydrate transport and metabolism]. 35693 KOG0472: Leucine-rich repeat protein [Function unknown]. 35694 KOG0473: Leucine-rich repeat protein [Function unknown]. 35695 KOG0474: Cl- channel CLC-7 and related proteins (CLC superfamily) [Inorganic ion transport and metabolism]. 35696 KOG0475: Cl- channel CLC-3 and related proteins (CLC superfamily) [Inorganic ion transport and metabolism]. 35697 KOG0476: Cl- channel CLC-2 and related proteins (CLC superfamily) [Inorganic ion transport and metabolism]. 35698 KOG0477: DNA replication licensing factor, MCM2 component [Replication, recombination and repair]. 
35699 KOG0478: DNA replication licensing factor, MCM4 component [Replication, recombination and repair]. 35700 KOG0479: DNA replication licensing factor, MCM3 component [Replication, recombination and repair]. 35701 KOG0480: DNA replication licensing factor, MCM6 component [Replication, recombination and repair]. 35702 KOG0481: DNA replication licensing factor, MCM5 component [Replication, recombination and repair]. 35703 KOG0482: DNA replication licensing factor, MCM7 component [Replication, recombination and repair]. 35704 KOG0483: Transcription factor HEX, contains HOX and HALZ domains [Transcription]. 35705 KOG0484: Transcription factor PHOX2/ARIX, contains HOX domain [Transcription]. 35706 KOG0485: Transcription factor NKX-5.1/HMX1, contains HOX domain [Transcription]. 35707 KOG0486: Transcription factor PTX1, contains HOX domain [Transcription]. 35708 KOG0487: Transcription factor Abd-B, contains HOX domain [Transcription]. 35709 KOG0488: Transcription factor BarH and related HOX domain proteins [General function prediction only]. 35710 KOG0489: Transcription factor zerknullt and related HOX domain proteins [General function prediction only]. 35711 KOG0490: Transcription factor, contains HOX domain [General function prediction only]. 35712 KOG0491: Transcription factor BSH, contains HOX domain [General function prediction only]. 35713 KOG0492: Transcription factor MSH, contains HOX domain [General function prediction only]. 35714 KOG0493: Transcription factor Engrailed, contains HOX domain [General function prediction only]. 35715 KOG0494: Transcription factor CHX10 and related HOX domain proteins [General function prediction only]. 35716 KOG0495: HAT repeat protein [RNA processing and modification]. 35717 KOG0496: Beta-galactosidase [Carbohydrate transport and metabolism]. 35718 KOG0497: Oxidosqualene-lanosterol cyclase and related proteins [Lipid transport and metabolism]. 
35719 KOG0498: K+-channel ERG and related proteins, contain PAS/PAC sensor domain [Inorganic ion transport and metabolism, Signal transduction mechanisms]. 35720 KOG0499: Cyclic nucleotide-gated cation channel CNCG4 [Inorganic ion transport and metabolism, Signal transduction mechanisms]. 35721 KOG0500: Cyclic nucleotide-gated cation channel CNGA1-3 and related proteins [Inorganic ion transport and metabolism, Signal transduction mechanisms]. 35722 KOG0501: K+-channel KCNQ [Inorganic ion transport and metabolism]. 35723 KOG0502: Integral membrane ankyrin-repeat protein Kidins220 (protein kinase D substrate) [General function prediction only]. 35724 KOG0503: Asparaginase [Amino acid transport and metabolism]. 35725 KOG0505: Myosin phosphatase, regulatory subunit [Posttranslational modification, protein turnover, chaperones, Signal transduction mechanisms]. 35726 KOG0506: Glutaminase (contains ankyrin repeat) [Amino acid transport and metabolism]. 35727 KOG0507: CASK-interacting adaptor protein (caskin) and related proteins with ankyrin repeats and SAM domain [Signal transduction mechanisms]. 35728 KOG0508: Ankyrin repeat protein [General function prediction only]. 35729 KOG0509: Ankyrin repeat and DHHC-type Zn-finger domain containing proteins [General function prediction only]. 35730 KOG0510: Ankyrin repeat protein [General function prediction only]. 35731 KOG0511: Ankyrin repeat protein [General function prediction only]. 35732 KOG0512: Fetal globin-inducing factor (contains ankyrin repeats) [Transcription]. 35733 KOG0513: Ca2+-independent phospholipase A2 [Lipid transport and metabolism]. 35734 KOG0514: Ankyrin repeat protein [General function prediction only]. 35735 KOG0515: p53-interacting protein 53BP/ASPP, contains ankyrin and SH3 domains [Cell cycle control, cell division, chromosome partitioning]. 35736 KOG0516: Dystonin, GAS (Growth-arrest-specific protein), and related proteins [Cytoskeleton]. 35737 KOG0517: Beta-spectrin [Cytoskeleton]. 
35738 KOG0518: Actin-binding cytoskeleton protein, filamin [Cytoskeleton]. 35739 KOG0519: Sensory transduction histidine kinase [Signal transduction mechanisms]. 35740 KOG0520: Uncharacterized conserved protein, contains IPT/TIG domain [Function unknown]. 35741 KOG0521: Putative GTPase activating proteins (GAPs) [Signal transduction mechanisms]. 35742 KOG0522: Ankyrin repeat protein [General function prediction only]. 35743 KOG0523: Transketolase [Carbohydrate transport and metabolism]. 35744 KOG0524: Pyruvate dehydrogenase E1, beta subunit [Energy production and conversion]. 35745 KOG0525: Branched chain alpha-keto acid dehydrogenase E1, beta subunit [Energy production and conversion]. 35746 KOG0526: Nucleosome-binding factor SPN, POB3 subunit [Transcription, Replication, recombination and repair, Chromatin structure and dynamics]. 35747 KOG0527: HMG-box transcription factor [Transcription]. 35748 KOG0528: HMG-box transcription factor SOX5 [Transcription]. 35749 KOG0529: Protein geranylgeranyltransferase type II, alpha subunit [Posttranslational modification, protein turnover, chaperones]. 35750 KOG0530: Protein farnesyltransferase, alpha subunit/protein geranylgeranyltransferase type I, alpha subunit [Posttranslational modification, protein turnover, chaperones]. 35751 KOG0531: Protein phosphatase 1, regulatory subunit, and related proteins [Signal transduction mechanisms]. 35752 KOG0532: Leucine-rich repeat (LRR) protein, contains calponin homology domain [Cytoskeleton]. 35753 KOG0533: RRM motif-containing protein [RNA processing and modification]. 35754 KOG0534: NADH-cytochrome b-5 reductase [Coenzyme transport and metabolism, Energy production and conversion]. 35755 KOG0535: Sulfite oxidase, molybdopterin-binding component [Energy production and conversion]. 35756 KOG0536: Flavohemoprotein b5+b5R [Energy production and conversion]. 35757 KOG0537: Cytochrome b5 [Energy production and conversion]. 
35758 KOG0538: Glycolate oxidase [Energy production and conversion]. 35759 KOG0539: Sphingolipid fatty acid hydroxylase [Lipid transport and metabolism]. 35760 KOG0540: 3-Methylcrotonyl-CoA carboxylase, non-biotin containing subunit/Acetyl-CoA carboxylase carboxyl transferase, subunit beta [Amino acid transport and metabolism, Lipid transport and metabolism]. 35761 KOG0541: Alkyl hydroperoxide reductase/peroxiredoxin [Posttranslational modification, protein turnover, chaperones]. 35762 KOG0542: Predicted exonuclease [Replication, recombination and repair]. 35763 KOG0543: FKBP-type peptidyl-prolyl cis-trans isomerase [Posttranslational modification, protein turnover, chaperones]. 35764 KOG0544: FKBP-type peptidyl-prolyl cis-trans isomerase [Posttranslational modification, protein turnover, chaperones]. 35765 KOG0545: Aryl-hydrocarbon receptor-interacting protein [Posttranslational modification, protein turnover, chaperones]. 35766 KOG0546: HSP90 co-chaperone CPR7/Cyclophilin [Posttranslational modification, protein turnover, chaperones]. 35767 KOG0547: Translocase of outer mitochondrial membrane complex, subunit TOM70/TOM72 [Intracellular trafficking, secretion, and vesicular transport]. 35768 KOG0548: Molecular co-chaperone STI1 [Posttranslational modification, protein turnover, chaperones]. 35769 KOG0549: FKBP-type peptidyl-prolyl cis-trans isomerase [Posttranslational modification, protein turnover, chaperones]. 35770 KOG0550: Molecular chaperone (DnaJ superfamily) [Posttranslational modification, protein turnover, chaperones]. 35771 KOG0551: Hsp90 co-chaperone CNS1 (contains TPR repeats) [Posttranslational modification, protein turnover, chaperones]. 35772 KOG0552: FKBP-type peptidyl-prolyl cis-trans isomerase [Posttranslational modification, protein turnover, chaperones]. 35773 KOG0553: TPR repeat-containing protein [General function prediction only]. 35774 KOG0554: Asparaginyl-tRNA synthetase (mitochondrial) [Translation, ribosomal structure and biogenesis]. 
35775 KOG0555: Asparaginyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 35776 KOG0556: Aspartyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 35777 KOG0557: Dihydrolipoamide acetyltransferase [Energy production and conversion]. 35778 KOG0558: Dihydrolipoamide transacylase (alpha-keto acid dehydrogenase E2 subunit) [Energy production and conversion]. 35779 KOG0559: Dihydrolipoamide succinyltransferase (2-oxoglutarate dehydrogenase, E2 subunit) [Energy production and conversion]. 35780 KOG0560: Sulfite reductase (ferredoxin) [Inorganic ion transport and metabolism]. 35781 KOG0561: bHLH transcription factor [Transcription]. 35782 KOG0562: Predicted hydrolase (HIT family) [General function prediction only]. 35783 KOG0563: Glucose-6-phosphate 1-dehydrogenase [Carbohydrate transport and metabolism]. 35784 KOG0564: 5,10-methylenetetrahydrofolate reductase [Amino acid transport and metabolism]. 35785 KOG0565: Inositol polyphosphate 5-phosphatase and related proteins [Intracellular trafficking, secretion, and vesicular transport]. 35786 KOG0566: Inositol-1,4,5-triphosphate 5-phosphatase (synaptojanin), INP51/INP52/INP53 family [Intracellular trafficking, secretion, and vesicular transport]. 35787 KOG0567: HEAT repeat-containing protein [General function prediction only]. 35788 KOG0568: Molecular chaperone (DnaJ superfamily) [Posttranslational modification, protein turnover, chaperones]. 35789 KOG0569: Permease of the major facilitator superfamily [Carbohydrate transport and metabolism]. 35790 KOG0570: Transcriptional coactivator [Transcription]. 35791 KOG0571: Asparagine synthase (glutamine-hydrolyzing) [Amino acid transport and metabolism]. 35792 KOG0572: Glutamine phosphoribosylpyrophosphate amidotransferase [Nucleotide transport and metabolism]. 35793 KOG0573: Asparagine synthase [Amino acid transport and metabolism]. 35794 KOG0574: STE20-like serine/threonine kinase MST [Signal transduction mechanisms]. 
35795 KOG0575: Polo-like serine/threonine protein kinase [Cell cycle control, cell division, chromosome partitioning]. 35796 KOG0576: Mitogen-activated protein kinase kinase kinase kinase (MAP4K), germinal center kinase family [Signal transduction mechanisms]. 35797 KOG0577: Serine/threonine protein kinase [Signal transduction mechanisms]. 35798 KOG0578: p21-activated serine/threonine protein kinase [Signal transduction mechanisms]. 35799 KOG0579: Ste20-like serine/threonine protein kinase [Signal transduction mechanisms]. 35800 KOG0580: Serine/threonine protein kinase [Cell cycle control, cell division, chromosome partitioning]. 35801 KOG0581: Mitogen-activated protein kinase kinase (MAP2K) [Signal transduction mechanisms]. 35802 KOG0582: Ste20-like serine/threonine protein kinase [Signal transduction mechanisms]. 35803 KOG0583: Serine/threonine protein kinase [Signal transduction mechanisms]. 35804 KOG0584: Serine/threonine protein kinase [General function prediction only]. 35805 KOG0585: Ca2+/calmodulin-dependent protein kinase kinase beta and related serine/threonine protein kinases [Signal transduction mechanisms]. 35806 KOG0586: Serine/threonine protein kinase [General function prediction only]. 35807 KOG0587: Traf2- and Nck-interacting kinase and related germinal center kinase (GCK) family protein kinases [Signal transduction mechanisms]. 35808 KOG0588: Serine/threonine protein kinase [Cell cycle control, cell division, chromosome partitioning]. 35809 KOG0589: Serine/threonine protein kinase [General function prediction only]. 35810 KOG0590: Checkpoint kinase and related serine/threonine protein kinases [Cell cycle control, cell division, chromosome partitioning]. 35811 KOG0591: NIMA (never in mitosis)-related G2-specific serine/threonine protein kinase [Cell cycle control, cell division, chromosome partitioning]. 35812 KOG0592: 3-phosphoinositide-dependent protein kinase (PDK1) [Signal transduction mechanisms]. 
35813 KOG0593: Predicted protein kinase KKIAMRE [General function prediction only]. 35814 KOG0594: Protein kinase PCTAIRE and related kinases [General function prediction only]. 35815 KOG0595: Serine/threonine-protein kinase involved in autophagy [Posttranslational modification, protein turnover, chaperones, Intracellular trafficking, secretion, and vesicular transport, Signal transduction mechanisms]. 35816 KOG0596: Dual specificity; serine/threonine and tyrosine kinase [Cell cycle control, cell division, chromosome partitioning]. 35817 KOG0597: Serine-threonine protein kinase FUSED [General function prediction only]. 35818 KOG0598: Ribosomal protein S6 kinase and related proteins [General function prediction only, Signal transduction mechanisms]. 35819 KOG0599: Phosphorylase kinase gamma subunit [Carbohydrate transport and metabolism]. 35820 KOG0600: Cdc2-related protein kinase [Cell cycle control, cell division, chromosome partitioning]. 35821 KOG0601: Cyclin-dependent kinase WEE1 [Cell cycle control, cell division, chromosome partitioning]. 35822 KOG0602: Neutral trehalase [Carbohydrate transport and metabolism]. 35823 KOG0603: Ribosomal protein S6 kinase [Signal transduction mechanisms]. 35824 KOG0604: MAP kinase-activated protein kinase 2 [Signal transduction mechanisms]. 35825 KOG0605: NDR and related serine/threonine kinases [General function prediction only]. 35826 KOG0606: Microtubule-associated serine/threonine kinase and related proteins [Signal transduction mechanisms, General function prediction only]. 35827 KOG0607: MAP kinase-interacting kinase and related serine/threonine protein kinases [Signal transduction mechanisms]. 35828 KOG0608: Warts/lats-like serine threonine kinases [Cell cycle control, cell division, chromosome partitioning]. 35829 KOG0609: Calcium/calmodulin-dependent serine protein kinase/membrane-associated guanylate kinase [Signal transduction mechanisms]. 
35830 KOG0610: Putative serine/threonine protein kinase [General function prediction only]. 35831 KOG0611: Predicted serine/threonine protein kinase [General function prediction only]. 35832 KOG0612: Rho-associated, coiled-coil containing protein kinase [Signal transduction mechanisms]. 35833 KOG0613: Projectin/twitchin and related proteins [Cytoskeleton]. 35834 KOG0614: cGMP-dependent protein kinase [Signal transduction mechanisms]. 35835 KOG0615: Serine/threonine protein kinase Chk2 and related proteins [Cell cycle control, cell division, chromosome partitioning]. 35836 KOG0616: cAMP-dependent protein kinase catalytic subunit (PKA) [Signal transduction mechanisms]. 35837 KOG0617: Ras suppressor protein (contains leucine-rich repeats) [Signal transduction mechanisms]. 35838 KOG0618: Serine/threonine phosphatase 2C containing leucine-rich repeats, similar to SCN circadian oscillatory protein (SCOP) [Signal transduction mechanisms]. 35839 KOG0620: Glucose-repressible alcohol dehydrogenase transcriptional effector CCR4 and related proteins [Transcription]. 35840 KOG0621: Phospholipid scramblase [Cell wall/membrane/envelope biogenesis]. 35841 KOG0622: Ornithine decarboxylase [Amino acid transport and metabolism]. 35842 KOG0623: Glutamine amidotransferase/cyclase [Amino acid transport and metabolism]. 35843 KOG0624: dsRNA-activated protein kinase inhibitor P58, contains TPR and DnaJ domains [Defense mechanisms]. 35844 KOG0625: Phosphoglucomutase [Carbohydrate transport and metabolism]. 35845 KOG0626: Beta-glucosidase, lactase phlorizinhydrolase, and related proteins [Carbohydrate transport and metabolism]. 35846 KOG0627: Heat shock transcription factor [Transcription]. 35847 KOG0628: Aromatic-L-amino-acid/L-histidine decarboxylase [Amino acid transport and metabolism]. 35848 KOG0629: Glutamate decarboxylase and related proteins [Amino acid transport and metabolism]. 35849 KOG0630: Predicted pyridoxal-dependent decarboxylase [Amino acid transport and metabolism]. 
35850 KOG0631: Galactokinase [Carbohydrate transport and metabolism]. 35851 KOG0632: Phytochelatin synthase [Inorganic ion transport and metabolism]. 35852 KOG0633: Histidinol phosphate aminotransferase [Amino acid transport and metabolism]. 35853 KOG0634: Aromatic amino acid aminotransferase and related proteins [Amino acid transport and metabolism]. 35854 KOG0635: Adenosine 5'-phosphosulfate kinase [Inorganic ion transport and metabolism]. 35855 KOG0636: ATP sulfurylase (sulfate adenylyltransferase) [Inorganic ion transport and metabolism]. 35856 KOG0637: Sucrose transporter and related proteins [Carbohydrate transport and metabolism]. 35857 KOG0638: 4-hydroxyphenylpyruvate dioxygenase [Amino acid transport and metabolism]. 35858 KOG0639: Transducin-like enhancer of split protein (contains WD40 repeats) [Chromatin structure and dynamics]. 35859 KOG0640: mRNA cleavage stimulating factor complex; subunit 1 [RNA processing and modification]. 35860 KOG0641: WD40 repeat protein [General function prediction only]. 35861 KOG0642: Cell-cycle nuclear protein, contains WD-40 repeats [Cell cycle control, cell division, chromosome partitioning]. 35862 KOG0643: Translation initiation factor 3, subunit i (eIF-3i)/TGF-beta receptor-interacting protein (TRIP-1) [Translation, ribosomal structure and biogenesis, Signal transduction mechanisms]. 35863 KOG0644: Uncharacterized conserved protein, contains WD40 repeat and BROMO domains [General function prediction only]. 35864 KOG0645: WD40 repeat protein [General function prediction only]. 35865 KOG0646: WD40 repeat protein [General function prediction only]. 35866 KOG0647: mRNA export protein (contains WD40 repeats) [RNA processing and modification]. 35867 KOG0648: Predicted NUDIX hydrolase FGF-2 and related proteins [Signal transduction mechanisms]. 35868 KOG0649: WD40 repeat protein [General function prediction only]. 
35869 KOG0650: WD40 repeat nucleolar protein Bop1, involved in ribosome biogenesis [Translation, ribosomal structure and biogenesis]. 35870 KOG0651: 26S proteasome regulatory complex, ATPase RPT4 [Posttranslational modification, protein turnover, chaperones]. 35871 KOG0652: 26S proteasome regulatory complex, ATPase RPT5 [Posttranslational modification, protein turnover, chaperones]. 35872 KOG0653: Cyclin B and related kinase-activating proteins [Cell cycle control, cell division, chromosome partitioning]. 35873 KOG0654: G2/Mitotic-specific cyclin A [Cell cycle control, cell division, chromosome partitioning]. 35874 KOG0655: G1/S-specific cyclin E [Cell cycle control, cell division, chromosome partitioning]. 35875 KOG0656: G1/S-specific cyclin D [Cell cycle control, cell division, chromosome partitioning]. 35876 KOG0657: Glyceraldehyde 3-phosphate dehydrogenase [Carbohydrate transport and metabolism]. 35877 KOG0658: Glycogen synthase kinase-3 [Carbohydrate transport and metabolism]. 35878 KOG0659: Cdk activating kinase (CAK)/RNA polymerase II transcription initiation/nucleotide excision repair factor TFIIH/TFIIK, kinase subunit CDK7 [Cell cycle control, cell division, chromosome partitioning, Transcription, Replication, recombination and repair]. 35879 KOG0660: Mitogen-activated protein kinase [Signal transduction mechanisms]. 35880 KOG0661: MAPK related serine/threonine protein kinase [Signal transduction mechanisms]. 35881 KOG0662: Cyclin-dependent kinase CDK5 [Intracellular trafficking, secretion, and vesicular transport, Signal transduction mechanisms]. 35882 KOG0663: Protein kinase PITSLRE and related kinases [General function prediction only]. 35883 KOG0664: Nemo-like MAPK-related serine/threonine protein kinase [Signal transduction mechanisms]. 35884 KOG0665: Jun-N-terminal kinase (JNK) [Signal transduction mechanisms]. 35885 KOG0666: Cyclin C-dependent kinase CDK8 [Transcription]. 
35886 KOG0667: Dual-specificity tyrosine-phosphorylation regulated kinase [General function prediction only]. 35887 KOG0668: Casein kinase II, alpha subunit [Signal transduction mechanisms, Cell cycle control, cell division, chromosome partitioning, Transcription]. 35888 KOG0669: Cyclin T-dependent kinase CDK9 [Cell cycle control, cell division, chromosome partitioning]. 35889 KOG0670: U4/U6-associated splicing factor PRP4 [RNA processing and modification]. 35890 KOG0671: LAMMER dual specificity kinases [Signal transduction mechanisms]. 35891 KOG0672: Halotolerance protein HAL3 (contains flavoprotein domain) [Inorganic ion transport and metabolism, Cell cycle control, cell division, chromosome partitioning]. 35892 KOG0673: Thymidylate synthase [Nucleotide transport and metabolism]. 35893 KOG0674: Calreticulin [Posttranslational modification, protein turnover, chaperones]. 35894 KOG0675: Calnexin [Posttranslational modification, protein turnover, chaperones]. 35895 KOG0676: Actin and related proteins [Cytoskeleton]. 35896 KOG0677: Actin-related protein Arp2/3 complex, subunit Arp2 [Cytoskeleton]. 35897 KOG0678: Actin-related protein Arp2/3 complex, subunit Arp3 [Cytoskeleton]. 35898 KOG0679: Actin-related protein - Arp4p/Act3p [Cytoskeleton]. 35899 KOG0680: Actin-related protein - Arp6p [Cytoskeleton]. 35900 KOG0681: Actin-related protein - Arp5p [Cytoskeleton]. 35901 KOG0682: Ammonia permease [Inorganic ion transport and metabolism]. 35902 KOG0683: Glutamine synthetase [Amino acid transport and metabolism]. 35903 KOG0684: Cytochrome P450 [Secondary metabolites biosynthesis, transport and catabolism]. 35904 KOG0685: Flavin-containing amine oxidase [Coenzyme transport and metabolism]. 35905 KOG0686: COP9 signalosome, subunit CSN1 [Posttranslational modification, protein turnover, chaperones, Signal transduction mechanisms]. 35906 KOG0687: 26S proteasome regulatory complex, subunit RPN7/PSMD6 [Posttranslational modification, protein turnover, chaperones]. 
35907 KOG0688: Peptide chain release factor 1 (eRF1) [Translation, ribosomal structure and biogenesis]. 35908 KOG0689: Guanine nucleotide exchange factor for Rho and Rac GTPases [Signal transduction mechanisms]. 35909 KOG0690: Serine/threonine protein kinase [Signal transduction mechanisms]. 35910 KOG0691: Molecular chaperone (DnaJ superfamily) [Posttranslational modification, protein turnover, chaperones]. 35911 KOG0692: Pentafunctional AROM protein [Amino acid transport and metabolism]. 35912 KOG0693: Myo-inositol-1-phosphate synthase [Lipid transport and metabolism]. 35913 KOG0694: Serine/threonine protein kinase [Signal transduction mechanisms]. 35914 KOG0695: Serine/threonine protein kinase [Signal transduction mechanisms]. 35915 KOG0696: Serine/threonine protein kinase [Signal transduction mechanisms]. 35916 KOG0697: Protein phosphatase 1B (formerly 2C) [Signal transduction mechanisms]. 35917 KOG0698: Serine/threonine protein phosphatase [Signal transduction mechanisms]. 35918 KOG0699: Serine/threonine protein phosphatase [Signal transduction mechanisms]. 35919 KOG0700: Protein phosphatase 2C/pyruvate dehydrogenase (lipoamide) phosphatase [Signal transduction mechanisms]. 35920 KOG0701: dsRNA-specific nuclease Dicer and related ribonucleases [RNA processing and modification]. 35921 KOG0702: Predicted GTPase-activating protein [Signal transduction mechanisms]. 35922 KOG0703: Predicted GTPase-activating protein [Signal transduction mechanisms]. 35923 KOG0704: ADP-ribosylation factor GTPase activator [Signal transduction mechanisms, Intracellular trafficking, secretion, and vesicular transport, Cytoskeleton]. 35924 KOG0705: GTPase-activating protein Centaurin gamma (contains Ras-like GTPase, PH and ankyrin repeat domains) [Signal transduction mechanisms]. 35925 KOG0706: Predicted GTPase-activating protein [Signal transduction mechanisms]. 35926 KOG0707: Guanylate kinase [Nucleotide transport and metabolism]. 
35927 KOG0708: Membrane-associated guanylate kinase MAGUK (contains PDZ, SH3, HOOK and GUK domains) [Nucleotide transport and metabolism]. 35928 KOG0709: CREB/ATF family transcription factor [Transcription]. 35929 KOG0710: Molecular chaperone (small heat-shock protein Hsp26/Hsp42) [Posttranslational modification, protein turnover, chaperones]. 35930 KOG0711: Polyprenyl synthetase [Coenzyme transport and metabolism]. 35931 KOG0712: Molecular chaperone (DnaJ superfamily) [Posttranslational modification, protein turnover, chaperones]. 35932 KOG0713: Molecular chaperone (DnaJ superfamily) [Posttranslational modification, protein turnover, chaperones]. 35933 KOG0714: Molecular chaperone (DnaJ superfamily) [Posttranslational modification, protein turnover, chaperones]. 35934 KOG0715: Molecular chaperone (DnaJ superfamily) [Posttranslational modification, protein turnover, chaperones]. 35935 KOG0716: Molecular chaperone (DnaJ superfamily) [Posttranslational modification, protein turnover, chaperones]. 35936 KOG0717: Molecular chaperone (DnaJ superfamily) [Posttranslational modification, protein turnover, chaperones]. 35937 KOG0718: Molecular chaperone (DnaJ superfamily) [Posttranslational modification, protein turnover, chaperones]. 35938 KOG0719: Molecular chaperone (DnaJ superfamily) [Posttranslational modification, protein turnover, chaperones]. 35939 KOG0720: Molecular chaperone (DnaJ superfamily) [Posttranslational modification, protein turnover, chaperones]. 35940 KOG0721: Molecular chaperone (DnaJ superfamily) [Posttranslational modification, protein turnover, chaperones]. 35941 KOG0722: Molecular chaperone (DnaJ superfamily) [Posttranslational modification, protein turnover, chaperones]. 35942 KOG0723: Molecular chaperone (DnaJ superfamily) [Posttranslational modification, protein turnover, chaperones]. 
35943 KOG0724: Zuotin and related molecular chaperones (DnaJ superfamily), contains DNA-binding domains [Posttranslational modification, protein turnover, chaperones]. 35944 KOG0725: Reductases with broad range of substrate specificities [General function prediction only]. 35945 KOG0726: 26S proteasome regulatory complex, ATPase RPT2 [Posttranslational modification, protein turnover, chaperones]. 35946 KOG0727: 26S proteasome regulatory complex, ATPase RPT3 [Posttranslational modification, protein turnover, chaperones]. 35947 KOG0728: 26S proteasome regulatory complex, ATPase RPT6 [Posttranslational modification, protein turnover, chaperones]. 35948 KOG0729: 26S proteasome regulatory complex, ATPase RPT1 [Posttranslational modification, protein turnover, chaperones]. 35949 KOG0730: AAA+-type ATPase [Posttranslational modification, protein turnover, chaperones]. 35950 KOG0731: AAA+-type ATPase containing the peptidase M41 domain [Posttranslational modification, protein turnover, chaperones]. 35951 KOG0732: AAA+-type ATPase containing the bromodomain [Posttranslational modification, protein turnover, chaperones]. 35952 KOG0733: Nuclear AAA ATPase (VCP subfamily) [Posttranslational modification, protein turnover, chaperones]. 35953 KOG0734: AAA+-type ATPase containing the peptidase M41 domain [Posttranslational modification, protein turnover, chaperones]. 35954 KOG0735: AAA+-type ATPase [Posttranslational modification, protein turnover, chaperones]. 35955 KOG0736: Peroxisome assembly factor 2 containing the AAA+-type ATPase domain [Posttranslational modification, protein turnover, chaperones]. 35956 KOG0737: AAA+-type ATPase [Posttranslational modification, protein turnover, chaperones]. 35957 KOG0738: AAA+-type ATPase [Posttranslational modification, protein turnover, chaperones]. 35958 KOG0739: AAA+-type ATPase [Posttranslational modification, protein turnover, chaperones]. 
35959 KOG0740: AAA+-type ATPase [Posttranslational modification, protein turnover, chaperones]. 35960 KOG0741: AAA+-type ATPase [Posttranslational modification, protein turnover, chaperones]. 35961 KOG0742: AAA+-type ATPase [Posttranslational modification, protein turnover, chaperones]. 35962 KOG0743: AAA+-type ATPase [Posttranslational modification, protein turnover, chaperones]. 35963 KOG0744: AAA+-type ATPase [Posttranslational modification, protein turnover, chaperones]. 35964 KOG0745: Putative ATP-dependent Clp-type protease (AAA+ ATPase superfamily) [Posttranslational modification, protein turnover, chaperones]. 35965 KOG0746: 60S ribosomal protein L3 and related proteins [Translation, ribosomal structure and biogenesis]. 35966 KOG0747: Putative NAD+-dependent epimerases [Carbohydrate transport and metabolism]. 35967 KOG0748: Predicted membrane proteins, contain hemolysin III domain [General function prediction only, Signal transduction mechanisms]. 35968 KOG0749: Mitochondrial ADP/ATP carrier proteins [Energy production and conversion]. 35969 KOG0750: Mitochondrial solute carrier protein [Energy production and conversion]. 35970 KOG0751: Mitochondrial aspartate/glutamate carrier protein Aralar/Citrin (contains EF-hand Ca2+-binding domains) [Energy production and conversion]. 35971 KOG0752: Mitochondrial solute carrier protein [Energy production and conversion]. 35972 KOG0753: Mitochondrial fatty acid anion carrier protein/Uncoupling protein [Energy production and conversion]. 35973 KOG0754: Mitochondrial oxodicarboxylate carrier protein [Energy production and conversion]. 35974 KOG0755: Mitochondrial oxaloacetate carrier protein [Energy production and conversion]. 35975 KOG0756: Mitochondrial tricarboxylate/dicarboxylate carrier proteins [Energy production and conversion]. 35976 KOG0757: Mitochondrial carrier protein - Rim2p/Mrs12p [Energy production and conversion]. 
35977 KOG0758: Mitochondrial carnitine-acylcarnitine carrier protein [Energy production and conversion]. 35978 KOG0759: Mitochondrial oxoglutarate/malate carrier proteins [Energy production and conversion]. 35979 KOG0760: Mitochondrial carrier protein MRS3/4 [Energy production and conversion]. 35980 KOG0761: Mitochondrial carrier protein CGI-69 [Energy production and conversion]. 35981 KOG0762: Mitochondrial carrier protein [Energy production and conversion]. 35982 KOG0763: Mitochondrial ornithine transporter [Energy production and conversion]. 35983 KOG0764: Mitochondrial FAD carrier protein [Energy production and conversion]. 35984 KOG0765: Predicted mitochondrial carrier protein [Energy production and conversion]. 35985 KOG0766: Predicted mitochondrial carrier protein [Energy production and conversion]. 35986 KOG0767: Mitochondrial phosphate carrier protein [Energy production and conversion]. 35987 KOG0768: Mitochondrial carrier protein PET8 [Energy production and conversion]. 35988 KOG0769: Predicted mitochondrial carrier protein [Energy production and conversion]. 35989 KOG0770: Predicted mitochondrial carrier protein [Energy production and conversion]. 35990 KOG0771: Prolactin regulatory element-binding protein/Protein transport protein SEC12p [Intracellular trafficking, secretion, and vesicular transport]. 35991 KOG0772: Uncharacterized conserved protein, contains WD40 repeat [Function unknown]. 35992 KOG0773: Transcription factor MEIS1 and related HOX domain proteins [Transcription]. 35993 KOG0774: Transcription factor PBX and related HOX domain proteins [Transcription]. 35994 KOG0775: Transcription factor SIX and related HOX domain proteins [Transcription]. 35995 KOG0776: Geranylgeranyl pyrophosphate synthase/Polyprenyl synthetase [Coenzyme transport and metabolism]. 35996 KOG0777: Geranylgeranyl pyrophosphate synthase/Polyprenyl synthetase [Coenzyme transport and metabolism]. 
35997 KOG0778: Protease, Ulp1 family [Posttranslational modification, protein turnover, chaperones]. 35998 KOG0779: Protease, Ulp1 family [Posttranslational modification, protein turnover, chaperones]. 35999 KOG0780: Signal recognition particle, subunit Srp54 [Intracellular trafficking, secretion, and vesicular transport]. 36000 KOG0781: Signal recognition particle receptor, alpha subunit [Intracellular trafficking, secretion, and vesicular transport]. 36001 KOG0782: Predicted diacylglycerol kinase [Signal transduction mechanisms]. 36002 KOG0783: Uncharacterized conserved protein, contains ankyrin and BTB/POZ domains [Function unknown]. 36003 KOG0784: Isocitrate dehydrogenase, gamma subunit [Amino acid transport and metabolism]. 36004 KOG0785: Isocitrate dehydrogenase, alpha subunit [Amino acid transport and metabolism]. 36005 KOG0786: 3-isopropylmalate dehydrogenase [Amino acid transport and metabolism]. 36006 KOG0787: Dehydrogenase kinase [Signal transduction mechanisms]. 36007 KOG0788: S-adenosylmethionine decarboxylase [Signal transduction mechanisms]. 36008 KOG0789: Protein tyrosine phosphatase [Signal transduction mechanisms]. 36009 KOG0790: Protein tyrosine phosphatase Corkscrew and related SH2 domain enzymes [Signal transduction mechanisms]. 36010 KOG0791: Protein tyrosine phosphatase, contains fn3 domain [Signal transduction mechanisms]. 36011 KOG0792: Protein tyrosine phosphatase PTPMEG, contains FERM domain [Signal transduction mechanisms]. 36012 KOG0793: Protein tyrosine phosphatase [Signal transduction mechanisms]. 36013 KOG0794: CDK8 kinase-activating protein cyclin C [Transcription]. 36014 KOG0795: Chorismate mutase [Amino acid transport and metabolism]. 36015 KOG0796: Spliceosome subunit [RNA processing and modification]. 36016 KOG0797: Actin-related protein [Cytoskeleton]. 36017 KOG0798: Uncharacterized conserved protein [Cell cycle control, cell division, chromosome partitioning]. 
36018 KOG0799: Branching enzyme [Carbohydrate transport and metabolism]. 36019 KOG0801: Predicted E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 36020 KOG0802: E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 36021 KOG0803: Predicted E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 36022 KOG0804: Cytoplasmic Zn-finger protein BRAP2 (BRCA1 associated protein) [General function prediction only]. 36023 KOG0805: Carbon-nitrogen hydrolase [Amino acid transport and metabolism]. 36024 KOG0806: Carbon-nitrogen hydrolase [Amino acid transport and metabolism]. 36025 KOG0807: Carbon-nitrogen hydrolase [Amino acid transport and metabolism]. 36026 KOG0808: Carbon-nitrogen hydrolase [Amino acid transport and metabolism]. 36027 KOG0809: SNARE protein TLG2/Syntaxin 16 [Intracellular trafficking, secretion, and vesicular transport]. 36028 KOG0810: SNARE protein Syntaxin 1 and related proteins [Intracellular trafficking, secretion, and vesicular transport]. 36029 KOG0811: SNARE protein PEP12/VAM3/Syntaxin 7/Syntaxin 17 [Intracellular trafficking, secretion, and vesicular transport]. 36030 KOG0812: SNARE protein SED5/Syntaxin 5 [Intracellular trafficking, secretion, and vesicular transport]. 36031 KOG0813: Glyoxylase [General function prediction only]. 36032 KOG0814: Glyoxylase [General function prediction only]. 36033 KOG0815: 60S acidic ribosomal protein P0 [Translation, ribosomal structure and biogenesis]. 36034 KOG0816: Protein involved in mRNA turnover [RNA processing and modification]. 36035 KOG0817: Acyl-CoA-binding protein [Lipid transport and metabolism]. 36036 KOG0818: GTPase-activating proteins of the GIT family [Signal transduction mechanisms]. 36037 KOG0819: Annexin [Intracellular trafficking, secretion, and vesicular transport]. 36038 KOG0820: Ribosomal RNA adenine dimethylase [RNA processing and modification]. 
36039 KOG0821: Predicted ribosomal RNA adenine dimethylase [RNA processing and modification]. 36040 KOG0822: Protein kinase inhibitor [Cell cycle control, cell division, chromosome partitioning]. 36041 KOG0823: Predicted E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 36042 KOG0824: Predicted E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 36043 KOG0825: PHD Zn-finger protein [General function prediction only]. 36044 KOG0826: Predicted E3 ubiquitin ligase involved in peroxisome organization [Posttranslational modification, protein turnover, chaperones]. 36045 KOG0827: Predicted E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 36046 KOG0828: Predicted E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 36047 KOG0829: 60S ribosomal protein L18A [Translation, ribosomal structure and biogenesis]. 36048 KOG0830: 40S ribosomal protein SA (P40)/Laminin receptor 1 [Translation, ribosomal structure and biogenesis]. 36049 KOG0831: Acyl-CoA:diacylglycerol acyltransferase (DGAT) [Lipid transport and metabolism]. 36050 KOG0832: Mitochondrial/chloroplast ribosomal protein S2 [Translation, ribosomal structure and biogenesis]. 36051 KOG0833: Cytidine deaminase [Nucleotide transport and metabolism]. 36052 KOG0834: CDK9 kinase-activating protein cyclin T [Cell cycle control, cell division, chromosome partitioning]. 36053 KOG0835: Cyclin L [General function prediction only]. 36054 KOG0836: F-actin capping protein, alpha subunit [Cytoskeleton]. 36055 KOG0837: Transcriptional activator of the JUN family [Transcription]. 36056 KOG0838: RNA Methylase, SpoU family [RNA processing and modification]. 36057 KOG0839: RNA Methylase, SpoU family [RNA processing and modification]. 36058 KOG0840: ATP-dependent Clp protease, proteolytic subunit [Posttranslational modification, protein turnover, chaperones]. 
36059 KOG0841: Multifunctional chaperone (14-3-3 family) [Posttranslational modification, protein turnover, chaperones]. 36060 KOG0842: Transcription factor tinman/NKX2-3, contains HOX domain [Transcription]. 36061 KOG0843: Transcription factor EMX1 and related HOX domain proteins [Transcription]. 36062 KOG0844: Transcription factor EVX1, contains HOX domain [Transcription]. 36063 KOG0845: Nuclear pore complex, Nup98 component (sc Nup145/Nup100/Nup116) [Nuclear structure, Intracellular trafficking, secretion, and vesicular transport]. 36064 KOG0846: Mitochondrial/chloroplast ribosomal protein L15/L10 [Translation, ribosomal structure and biogenesis]. 36065 KOG0847: Transcription factor, contains HOX domain [Transcription]. 36066 KOG0848: Transcription factor Caudal, contains HOX domain [Transcription]. 36067 KOG0849: Transcription factor PRD and related proteins, contain PAX and HOX domains [Transcription]. 36068 KOG0850: Transcription factor DLX and related proteins with LIM Zn-binding and HOX domains [Transcription]. 36069 KOG0851: Single-stranded DNA-binding replication protein A (RPA), large (70 kD) subunit and related ssDNA-binding proteins [Replication, recombination and repair]. 36070 KOG0852: Alkyl hydroperoxide reductase, thiol specific antioxidant and related enzymes [Posttranslational modification, protein turnover, chaperones]. 36071 KOG0853: Glycosyltransferase [Cell wall/membrane/envelope biogenesis]. 36072 KOG0854: Alkyl hydroperoxide reductase, thiol specific antioxidant and related enzymes [Posttranslational modification, protein turnover, chaperones]. 36073 KOG0855: Alkyl hydroperoxide reductase, thiol specific antioxidant and related enzymes [Posttranslational modification, protein turnover, chaperones]. 36074 KOG0856: Predicted pilin-like transcription factor [Posttranslational modification, protein turnover, chaperones]. 36075 KOG0857: 60S ribosomal protein L10 [Translation, ribosomal structure and biogenesis]. 
36076 KOG0858: Predicted membrane protein [Function unknown]. 36077 KOG0859: Synaptobrevin/VAMP-like protein [Intracellular trafficking, secretion, and vesicular transport]. 36078 KOG0860: Synaptobrevin/VAMP-like protein [Intracellular trafficking, secretion, and vesicular transport]. 36079 KOG0861: SNARE protein YKT6, synaptobrevin/VAMP superfamily [Intracellular trafficking, secretion, and vesicular transport]. 36080 KOG0862: Synaptobrevin/VAMP-like protein SEC22 [Intracellular trafficking, secretion, and vesicular transport]. 36081 KOG0863: 20S proteasome, regulatory subunit alpha type PSMA1/PRE5 [Posttranslational modification, protein turnover, chaperones]. 36082 KOG0864: Ran-binding protein RANBP1 and related RanBD domain proteins [Intracellular trafficking, secretion, and vesicular transport]. 36083 KOG0865: Cyclophilin type peptidyl-prolyl cis-trans isomerase [Posttranslational modification, protein turnover, chaperones]. 36084 KOG0866: Ran-binding protein RANBP3 [Intracellular trafficking, secretion, and vesicular transport]. 36085 KOG0867: Glutathione S-transferase [Posttranslational modification, protein turnover, chaperones]. 36086 KOG0868: Glutathione S-transferase [Posttranslational modification, protein turnover, chaperones]. 36087 KOG0869: CCAAT-binding factor, subunit A (HAP3) [Transcription]. 36088 KOG0870: DNA polymerase epsilon, subunit D [Transcription]. 36089 KOG0871: Class 2 transcription repressor NC2, beta subunit (Dr1) [Transcription]. 36090 KOG0872: Sterol C5 desaturase [Lipid transport and metabolism]. 36091 KOG0873: C-4 sterol methyl oxidase [Lipid transport and metabolism]. 36092 KOG0874: Sphingolipid hydroxylase [Lipid transport and metabolism]. 36093 KOG0875: 60S ribosomal protein L5 [Translation, ribosomal structure and biogenesis]. 36094 KOG0876: Manganese superoxide dismutase [Inorganic ion transport and metabolism]. 36095 KOG0877: 40S ribosomal protein S2/30S ribosomal protein S5 [Translation, ribosomal structure and biogenesis]. 
36096 KOG0878: 60S ribosomal protein L32 [Translation, ribosomal structure and biogenesis]. 36097 KOG0879: U-snRNP-associated cyclophilin type peptidyl-prolyl cis-trans isomerase [Posttranslational modification, protein turnover, chaperones]. 36098 KOG0880: Peptidyl-prolyl cis-trans isomerase [Posttranslational modification, protein turnover, chaperones]. 36099 KOG0881: Cyclophilin type peptidyl-prolyl cis-trans isomerase [Posttranslational modification, protein turnover, chaperones]. 36100 KOG0882: Cyclophilin-related peptidyl-prolyl cis-trans isomerase [Posttranslational modification, protein turnover, chaperones]. 36101 KOG0883: Cyclophilin type, U box-containing peptidyl-prolyl cis-trans isomerase [Posttranslational modification, protein turnover, chaperones]. 36102 KOG0884: Similar to cyclophilin-type peptidyl-prolyl cis-trans isomerase [Posttranslational modification, protein turnover, chaperones]. 36103 KOG0885: Peptidyl-prolyl cis-trans isomerase [Posttranslational modification, protein turnover, chaperones]. 36104 KOG0886: 40S ribosomal protein S2 [Translation, ribosomal structure and biogenesis]. 36105 KOG0887: 60S ribosomal protein L35A/L37 [Translation, ribosomal structure and biogenesis]. 36106 KOG0888: Nucleoside diphosphate kinase [Nucleotide transport and metabolism]. 36107 KOG0889: Histone acetyltransferase SAGA, TRRAP/TRA1 component, PI-3 kinase superfamily [Signal transduction mechanisms, Chromatin structure and dynamics, Replication, recombination and repair, Cell cycle control, cell division, chromosome partitioning]. 36108 KOG0890: Protein kinase of the PI-3 kinase family involved in mitotic growth, DNA repair and meiotic recombination [Signal transduction mechanisms, Chromatin structure and dynamics, Replication, recombination and repair, Cell cycle control, cell division, chromosome partitioning]. 36109 KOG0891: DNA-dependent protein kinase [Replication, recombination and repair]. 
36110 KOG0892: Protein kinase ATM/Tel1, involved in telomere length regulation and DNA repair [Signal transduction mechanisms, Chromatin structure and dynamics, Replication, recombination and repair, Cell cycle control, cell division, chromosome partitioning]. 36111 KOG0893: 60S ribosomal protein L31 [Translation, ribosomal structure and biogenesis]. 36112 KOG0894: Ubiquitin-protein ligase [Posttranslational modification, protein turnover, chaperones]. 36113 KOG0895: Ubiquitin-conjugating enzyme [Posttranslational modification, protein turnover, chaperones]. 36114 KOG0896: Ubiquitin-conjugating enzyme E2 [Posttranslational modification, protein turnover, chaperones]. 36115 KOG0897: Predicted ubiquitin-conjugating enzyme [Posttranslational modification, protein turnover, chaperones]. 36116 KOG0898: 40S ribosomal protein S15 [Translation, ribosomal structure and biogenesis]. 36117 KOG0899: Mitochondrial/chloroplast ribosomal protein S19 [Translation, ribosomal structure and biogenesis]. 36118 KOG0900: 40S ribosomal protein S20 [Translation, ribosomal structure and biogenesis]. 36119 KOG0901: 60S ribosomal protein L14/L17/L23 [Translation, ribosomal structure and biogenesis]. 36120 KOG0902: Phosphatidylinositol 4-kinase [Signal transduction mechanisms]. 36121 KOG0903: Phosphatidylinositol 4-kinase, involved in intracellular trafficking and secretion [Signal transduction mechanisms, Intracellular trafficking, secretion, and vesicular transport]. 36122 KOG0904: Phosphatidylinositol 3-kinase catalytic subunit (p110) [Signal transduction mechanisms]. 36123 KOG0905: Phosphoinositide 3-kinase [Signal transduction mechanisms]. 36124 KOG0906: Phosphatidylinositol 3-kinase VPS34, involved in signal transduction [Signal transduction mechanisms, Intracellular trafficking, secretion, and vesicular transport]. 36125 KOG0907: Thioredoxin [Posttranslational modification, protein turnover, chaperones]. 
36126 KOG0908: Thioredoxin-like protein [Posttranslational modification, protein turnover, chaperones]. 36127 KOG0909: Peptide:N-glycanase [Posttranslational modification, protein turnover, chaperones]. 36128 KOG0910: Thioredoxin-like protein [Posttranslational modification, protein turnover, chaperones]. 36129 KOG0911: Glutaredoxin-related protein [Posttranslational modification, protein turnover, chaperones]. 36130 KOG0912: Thiol-disulfide isomerase and thioredoxin [Posttranslational modification, protein turnover, chaperones, Energy production and conversion]. 36131 KOG0913: Thiol-disulfide isomerase and thioredoxin [Posttranslational modification, protein turnover, chaperones, Energy production and conversion]. 36132 KOG0914: Thioredoxin-like protein [Posttranslational modification, protein turnover, chaperones]. 36133 KOG0915: Uncharacterized conserved protein [Function unknown]. 36134 KOG0916: 1,3-beta-glucan synthase/callose synthase catalytic subunit [Cell wall/membrane/envelope biogenesis]. 36135 KOG0917: Uncharacterized conserved protein [Function unknown]. 36136 KOG0918: Selenium-binding protein [Inorganic ion transport and metabolism]. 36137 KOG0919: C-5 cytosine-specific DNA methylase [Transcription]. 36138 KOG0920: ATP-dependent RNA helicase A [RNA processing and modification]. 36139 KOG0921: Dosage compensation complex, subunit MLE [Transcription]. 36140 KOG0922: DEAH-box RNA helicase [RNA processing and modification]. 36141 KOG0923: mRNA splicing factor ATP-dependent RNA helicase [RNA processing and modification]. 36142 KOG0924: mRNA splicing factor ATP-dependent RNA helicase [RNA processing and modification]. 36143 KOG0925: mRNA splicing factor ATP-dependent RNA helicase [RNA processing and modification]. 36144 KOG0926: DEAH-box RNA helicase [RNA processing and modification, Translation, ribosomal structure and biogenesis]. 36145 KOG0927: Predicted transporter (ABC superfamily) [General function prediction only]. 
36146 KOG0928: Pattern-formation protein/guanine nucleotide exchange factor [Intracellular trafficking, secretion, and vesicular transport]. 36147 KOG0929: Guanine nucleotide exchange factor [Intracellular trafficking, secretion, and vesicular transport]. 36148 KOG0930: Guanine nucleotide exchange factor Cytohesin, contains PH and Sec7 domains [Intracellular trafficking, secretion, and vesicular transport]. 36149 KOG0931: Predicted guanine nucleotide exchange factor, contains Sec7 domain [Intracellular trafficking, secretion, and vesicular transport]. 36150 KOG0932: Guanine nucleotide exchange factor EFA6 [Intracellular trafficking, secretion, and vesicular transport]. 36151 KOG0933: Structural maintenance of chromosome protein 2 (chromosome condensation complex Condensin, subunit E) [Chromatin structure and dynamics, Cell cycle control, cell division, chromosome partitioning]. 36152 KOG0934: Clathrin adaptor complex, small subunit [Intracellular trafficking, secretion, and vesicular transport]. 36153 KOG0935: Clathrin adaptor complex, small subunit [Intracellular trafficking, secretion, and vesicular transport]. 36154 KOG0936: Clathrin adaptor complex, small subunit [Intracellular trafficking, secretion, and vesicular transport]. 36155 KOG0937: Adaptor complexes medium subunit family [Intracellular trafficking, secretion, and vesicular transport]. 36156 KOG0938: Adaptor complexes medium subunit family [Intracellular trafficking, secretion, and vesicular transport]. 36157 KOG0939: E3 ubiquitin-protein ligase/Putative upstream regulatory element binding protein [Posttranslational modification, protein turnover, chaperones, Transcription]. 36158 KOG0940: Ubiquitin protein ligase RSP5/NEDD4 [Posttranslational modification, protein turnover, chaperones]. 36159 KOG0941: E3 ubiquitin protein ligase [Posttranslational modification, protein turnover, chaperones]. 36160 KOG0942: E3 ubiquitin protein ligase [Posttranslational modification, protein turnover, chaperones]. 
36161 KOG0943: Predicted ubiquitin-protein ligase/hyperplastic discs protein, HECT superfamily [Posttranslational modification, protein turnover, chaperones]. 36162 KOG0944: Ubiquitin-specific protease UBP14 [Posttranslational modification, protein turnover, chaperones]. 36163 KOG0945: Alpha-aminoadipic semialdehyde dehydrogenase-phosphopantetheinyl transferase [Amino acid transport and metabolism, Coenzyme transport and metabolism]. 36164 KOG0946: ER-Golgi vesicle-tethering protein p115 [Intracellular trafficking, secretion, and vesicular transport]. 36165 KOG0947: Cytoplasmic exosomal RNA helicase SKI2, DEAD-box superfamily [RNA processing and modification]. 36166 KOG0948: Nuclear exosomal RNA helicase MTR4, DEAD-box superfamily [RNA processing and modification]. 36167 KOG0949: Predicted helicase, DEAD-box superfamily [General function prediction only]. 36168 KOG0950: DNA polymerase theta/eta, DEAD-box superfamily [General function prediction only]. 36169 KOG0951: RNA helicase BRR2, DEAD-box superfamily [RNA processing and modification]. 36170 KOG0952: DNA/RNA helicase MER3/SLH1, DEAD-box superfamily [RNA processing and modification]. 36171 KOG0953: Mitochondrial RNA helicase SUV3, DEAD-box superfamily [RNA processing and modification]. 36172 KOG0954: PHD finger protein [General function prediction only]. 36173 KOG0955: PHD finger protein BR140/LIN-49 [General function prediction only]. 36174 KOG0956: PHD finger protein AF10 [General function prediction only]. 36175 KOG0957: PHD finger protein [General function prediction only]. 36176 KOG0958: DNA damage-responsive repressor GIS1/RPH1, jumonji superfamily [Replication, recombination and repair]. 36177 KOG0959: N-arginine dibasic convertase NRD1 and related Zn2+-dependent endopeptidases, insulinase superfamily [Posttranslational modification, protein turnover, chaperones]. 
36178 KOG0960: Mitochondrial processing peptidase, beta subunit, and related enzymes (insulinase superfamily) [Posttranslational modification, protein turnover, chaperones]. 36179 KOG0961: Predicted Zn2+-dependent endopeptidase, insulinase superfamily [Posttranslational modification, protein turnover, chaperones]. 36180 KOG0962: DNA repair protein RAD50, ABC-type ATPase/SMC superfamily [Replication, recombination and repair]. 36181 KOG0963: Transcription factor/CCAAT displacement protein CDP1 [Transcription]. 36182 KOG0964: Structural maintenance of chromosome protein 3 (sister chromatid cohesion complex Cohesin, subunit SMC3) [Cell cycle control, cell division, chromosome partitioning]. 36183 KOG0965: Predicted RNA-binding protein, contains SWAP and G-patch domains [General function prediction only]. 36184 KOG0966: ATP-dependent DNA ligase IV [Replication, recombination and repair]. 36185 KOG0967: ATP-dependent DNA ligase I [Replication, recombination and repair]. 36186 KOG0968: DNA polymerase zeta, catalytic subunit [Replication, recombination and repair]. 36187 KOG0969: DNA polymerase delta, catalytic subunit [Replication, recombination and repair]. 36188 KOG0970: DNA polymerase alpha, catalytic subunit [Replication, recombination and repair]. 36189 KOG0971: Microtubule-associated protein dynactin DCTN1/Glued [Cell cycle control, cell division, chromosome partitioning, Cytoskeleton]. 36190 KOG0972: Huntingtin interacting protein 1 (Hip1) interactor Hippi [Signal transduction mechanisms]. 36191 KOG0973: Histone transcription regulator HIRA, WD repeat superfamily [Cell cycle control, cell division, chromosome partitioning, Transcription]. 36192 KOG0974: WD-repeat protein WDR6, WD repeat superfamily [General function prediction only]. 36193 KOG0975: Branched chain aminotransferase BCAT1, pyridoxal phosphate enzymes type IV superfamily [Amino acid transport and metabolism]. 
36194 KOG0976: Rho/Rac1-interacting serine/threonine kinase Citron [Signal transduction mechanisms]. 36195 KOG0977: Nuclear envelope protein lamin, intermediate filament superfamily [Cell cycle control, cell division, chromosome partitioning, Nuclear structure]. 36196 KOG0978: E3 ubiquitin ligase involved in syntaxin degradation [Posttranslational modification, protein turnover, chaperones]. 36197 KOG0979: Structural maintenance of chromosome protein SMC5/Spr18, SMC superfamily [Chromatin structure and dynamics, Cell cycle control, cell division, chromosome partitioning, Replication, recombination and repair]. 36198 KOG0980: Actin-binding protein SLA2/Huntingtin-interacting protein Hip1 [Cytoskeleton]. 36199 KOG0981: DNA topoisomerase I [Replication, recombination and repair]. 36200 KOG0982: Centrosomal protein Nuf [Cell cycle control, cell division, chromosome partitioning, Cytoskeleton]. 36201 KOG0983: Mitogen-activated protein kinase (MAPK) kinase MKK7/JNKK2 [Signal transduction mechanisms]. 36202 KOG0984: Mitogen-activated protein kinase (MAPK) kinase MKK3/MKK6 [Signal transduction mechanisms]. 36203 KOG0985: Vesicle coat protein clathrin, heavy chain [Intracellular trafficking, secretion, and vesicular transport]. 36204 KOG0986: G protein-coupled receptor kinase [Signal transduction mechanisms]. 36205 KOG0987: DNA helicase PIF1/RRM3 [Cell cycle control, cell division, chromosome partitioning]. 36206 KOG0988: RNA-directed RNA polymerase QDE-1 required for posttranscriptional gene silencing and RNA interference [RNA processing and modification]. 36207 KOG0989: Replication factor C, subunit RFC4 [Replication, recombination and repair]. 36208 KOG0990: Replication factor C, subunit RFC5 [Replication, recombination and repair]. 36209 KOG0991: Replication factor C, subunit RFC2 [Replication, recombination and repair]. 36210 KOG0992: Uncharacterized conserved protein [Function unknown]. 
36211 KOG0993: Rab5 GTPase effector Rabaptin-5 [Intracellular trafficking, secretion, and vesicular transport]. 36212 KOG0994: Extracellular matrix glycoprotein Laminin subunit beta [Extracellular structures]. 36213 KOG0995: Centromere-associated protein HEC1 [Cell cycle control, cell division, chromosome partitioning]. 36214 KOG0996: Structural maintenance of chromosome protein 4 (chromosome condensation complex Condensin, subunit C) [Chromatin structure and dynamics, Cell cycle control, cell division, chromosome partitioning]. 36215 KOG0997: Uncharacterized conserved protein Sand [Function unknown]. 36216 KOG0998: Synaptic vesicle protein EHS-1 and related EH domain proteins [Signal transduction mechanisms, Intracellular trafficking, secretion, and vesicular transport]. 36217 KOG0999: Microtubule-associated protein Bicaudal-D [Intracellular trafficking, secretion, and vesicular transport]. 36218 KOG1000: Chromatin remodeling protein HARP/SMARCAL1, DEAD-box superfamily [Chromatin structure and dynamics]. 36219 KOG1001: Helicase-like transcription factor HLTF/DNA helicase RAD5, DEAD-box superfamily [Transcription, Replication, recombination and repair]. 36220 KOG1002: Nucleotide excision repair protein RAD16 [Replication, recombination and repair]. 36221 KOG1003: Actin filament-coating protein tropomyosin [Cytoskeleton]. 36222 KOG1004: Exosomal 3'-5' exoribonuclease complex subunit Rrp40 [Translation, ribosomal structure and biogenesis]. 36223 KOG1005: Telomerase catalytic subunit/reverse transcriptase TERT [Replication, recombination and repair, Chromatin structure and dynamics]. 36224 KOG1006: Mitogen-activated protein kinase (MAPK) kinase MKK4 [Signal transduction mechanisms]. 36225 KOG1007: WD repeat protein TSSC1, WD repeat superfamily [Function unknown]. 36226 KOG1008: Uncharacterized conserved protein, contains WD40 repeats [Function unknown]. 
36227 KOG1009: Chromatin assembly complex 1 subunit B/CAC2 (contains WD40 repeats) [Chromatin structure and dynamics, Replication, recombination and repair]. 36228 KOG1010: Rb (Retinoblastoma tumor suppressor)-related protein [Cell cycle control, cell division, chromosome partitioning]. 36229 KOG1011: Neurotransmitter release regulator, UNC-13 [Signal transduction mechanisms, Intracellular trafficking, secretion, and vesicular transport]. 36230 KOG1012: Ca2+-dependent lipid-binding protein CLB1/vesicle protein vp115/Granuphilin A, contains C2 domain [General function prediction only]. 36231 KOG1013: Synaptic vesicle protein rabphilin-3A [Intracellular trafficking, secretion, and vesicular transport]. 36232 KOG1014: 17 beta-hydroxysteroid dehydrogenase type 3, HSD17B3 [Lipid transport and metabolism]. 36233 KOG1015: Transcription regulator XNP/ATRX, DEAD-box superfamily [Transcription]. 36234 KOG1016: Predicted DNA helicase, DEAD-box superfamily [General function prediction only]. 36235 KOG1017: Predicted uracil phosphoribosyltransferase [General function prediction only]. 36236 KOG1018: Cytosine deaminase FCY1 and related enzymes [Nucleotide transport and metabolism]. 36237 KOG1019: Retinoblastoma pathway protein LIN-9/chromatin-associated protein Aly [Chromatin structure and dynamics, Cell cycle control, cell division, chromosome partitioning, Signal transduction mechanisms]. 36238 KOG1020: Sister chromatid cohesion protein SCC2/Nipped-B [Chromatin structure and dynamics, Cell cycle control, cell division, chromosome partitioning, Replication, recombination and repair]. 36239 KOG1021: Acetylglucosaminyltransferase EXT1/exostosin 1 [Carbohydrate transport and metabolism, Cell wall/membrane/envelope biogenesis, Extracellular structures]. 36240 KOG1022: Acetylglucosaminyltransferase EXT2/exostosin 2 [Carbohydrate transport and metabolism, Cell wall/membrane/envelope biogenesis, Extracellular structures]. 
36241 KOG1023: Natriuretic peptide receptor, guanylate cyclase [Signal transduction mechanisms]. 36242 KOG1024: Receptor-like protein tyrosine kinase RYK/derailed [Signal transduction mechanisms]. 36243 KOG1025: Epidermal growth factor receptor EGFR and related tyrosine kinases [Signal transduction mechanisms]. 36244 KOG1026: Nerve growth factor receptor TRKA and related tyrosine kinases [Signal transduction mechanisms, Intracellular trafficking, secretion, and vesicular transport]. 36245 KOG1027: Serine/threonine protein kinase and endoribonuclease ERN1/IRE1, sensor of the unfolded protein response pathway [Signal transduction mechanisms]. 36246 KOG1028: Ca2+-dependent phospholipid-binding protein Synaptotagmin, required for synaptic vesicle and secretory granule exocytosis [Signal transduction mechanisms, Intracellular trafficking, secretion, and vesicular transport]. 36247 KOG1029: Endocytic adaptor protein intersectin [Signal transduction mechanisms, Intracellular trafficking, secretion, and vesicular transport]. 36248 KOG1030: Predicted Ca2+-dependent phospholipid-binding protein [General function prediction only]. 36249 KOG1031: Predicted Ca2+-dependent phospholipid-binding protein [General function prediction only]. 36250 KOG1032: Uncharacterized conserved protein, contains GRAM domain [Function unknown]. 36251 KOG1033: eIF-2alpha kinase PEK/EIF2AK3 [Translation, ribosomal structure and biogenesis]. 36252 KOG1034: Transcriptional repressor EED/ESC/FIE, required for transcriptional silencing, WD repeat superfamily [Transcription]. 36253 KOG1035: eIF-2alpha kinase GCN2 [Translation, ribosomal structure and biogenesis]. 36254 KOG1036: Mitotic spindle checkpoint protein BUB3, WD repeat superfamily [Cell cycle control, cell division, chromosome partitioning]. 
36255 KOG1037: NAD+ ADP-ribosyltransferase Parp, required for poly-ADP ribosylation of nuclear proteins [Transcription, Replication, recombination and repair, Posttranslational modification, protein turnover, chaperones]. 36256 KOG1038: Mitochondrial/chloroplast DNA-directed RNA polymerase RPO41, provides primers for DNA replication-initiation [Transcription, Replication, recombination and repair]. 36257 KOG1039: Predicted E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 36258 KOG1040: Polyadenylation factor I complex, subunit, Yth1 (CPSF subunit) [RNA processing and modification]. 36259 KOG1041: Translation initiation factor 2C (eIF-2C) and related proteins [Translation, ribosomal structure and biogenesis]. 36260 KOG1042: Germ-line stem cell division protein Hiwi/Piwi; negative developmental regulator [Cell cycle control, cell division, chromosome partitioning]. 36261 KOG1043: Ca2+-binding transmembrane protein LETM1/MRS7 [Function unknown]. 36262 KOG1044: Actin-binding LIM Zn-finger protein Limatin involved in axon guidance [Signal transduction mechanisms, Cytoskeleton]. 36263 KOG1045: Uncharacterized conserved protein HEN1/CORYMBOSA2 [Function unknown]. 36264 KOG1046: Puromycin-sensitive aminopeptidase and related aminopeptidases [Amino acid transport and metabolism, Posttranslational modification, protein turnover, chaperones]. 36265 KOG1047: Bifunctional leukotriene A4 hydrolase/aminopeptidase LTA4H [Lipid transport and metabolism, Posttranslational modification, protein turnover, chaperones, Defense mechanisms, Amino acid transport and metabolism]. 36266 KOG1048: Neural adherens junction protein Plakophilin and related Armadillo repeat proteins [Signal transduction mechanisms, Extracellular structures]. 36267 KOG1049: Polyadenylation factor I complex, subunit FIP1 [RNA processing and modification]. 36268 KOG1050: Trehalose-6-phosphate synthase component TPS1 and related subunits [Carbohydrate transport and metabolism]. 
36269 KOG1051: Chaperone HSP104 and related ATP-dependent Clp proteases [Posttranslational modification, protein turnover, chaperones]. 36270 KOG1052: Glutamate-gated kainate-type ion channel receptor subunit GluR5 and related subunits [Inorganic ion transport and metabolism, Amino acid transport and metabolism, Signal transduction mechanisms]. 36271 KOG1053: Glutamate-gated NMDA-type ion channel receptor subunit GRIN2A and related subunits [Inorganic ion transport and metabolism, Amino acid transport and metabolism, Signal transduction mechanisms]. 36272 KOG1054: Glutamate-gated AMPA-type ion channel receptor subunit GluR2 and related subunits [Inorganic ion transport and metabolism, Amino acid transport and metabolism, Signal transduction mechanisms]. 36273 KOG1055: GABA-B ion channel receptor subunit GABABR1 and related subunits, G-protein coupled receptor superfamily [Inorganic ion transport and metabolism, Amino acid transport and metabolism, Signal transduction mechanisms]. 36274 KOG1056: Glutamate-gated metabotropic ion channel receptor subunit GRM2 and related subunits, G-protein coupled receptor superfamily [Inorganic ion transport and metabolism, Signal transduction mechanisms]. 36275 KOG1057: Arp2/3 complex-interacting protein VIP1/Asp1, involved in regulation of actin cytoskeleton [Cytoskeleton]. 36276 KOG1058: Vesicle coat complex COPI, beta subunit [Intracellular trafficking, secretion, and vesicular transport]. 36277 KOG1059: Vesicle coat complex AP-3, delta subunit [Intracellular trafficking, secretion, and vesicular transport]. 36278 KOG1060: Vesicle coat complex AP-3, beta subunit [Intracellular trafficking, secretion, and vesicular transport]. 36279 KOG1061: Vesicle coat complex AP-1/AP-2/AP-4, beta subunit [Intracellular trafficking, secretion, and vesicular transport]. 36280 KOG1062: Vesicle coat complex AP-1, gamma subunit [Intracellular trafficking, secretion, and vesicular transport]. 
36281 KOG1063: RNA polymerase II elongator complex, subunit ELP2, WD repeat superfamily [Chromatin structure and dynamics, Transcription]. 36282 KOG1064: RAVE (regulator of V-ATPase assembly) complex subunit RAV1/DMX protein, WD repeat superfamily [General function prediction only]. 36283 KOG1065: Maltase glucoamylase and related hydrolases, glycosyl hydrolase family 31 [Carbohydrate transport and metabolism]. 36284 KOG1066: Glucosidase II catalytic (alpha) subunit and related enzymes, glycosyl hydrolase family 31 [Carbohydrate transport and metabolism, Cell wall/membrane/envelope biogenesis, Posttranslational modification, protein turnover, chaperones]. 36285 KOG1067: Predicted RNA-binding polyribonucleotide nucleotidyltransferase [General function prediction only]. 36286 KOG1068: Exosomal 3'-5' exoribonuclease complex, subunit Rrp41 and related exoribonucleases [Translation, ribosomal structure and biogenesis]. 36287 KOG1069: Exosomal 3'-5' exoribonuclease complex, subunit Rrp46 [Translation, ribosomal structure and biogenesis]. 36288 KOG1070: rRNA processing protein Rrp5 [RNA processing and modification]. 36289 KOG1071: Mitochondrial translation elongation factor EF-Tsmt, catalyzes nucleotide exchange on EF-Tumt [Translation, ribosomal structure and biogenesis]. 36290 KOG1073: Uncharacterized mRNA-associated protein RAP55 [Intracellular trafficking, secretion, and vesicular transport]. 36291 KOG1074: Transcriptional repressor SALM [Transcription]. 36292 KOG1076: Translation initiation factor 3, subunit c (eIF-3c) [Translation, ribosomal structure and biogenesis]. 36293 KOG1077: Vesicle coat complex AP-2, alpha subunit [Intracellular trafficking, secretion, and vesicular transport]. 36294 KOG1078: Vesicle coat complex COPI, gamma subunit [Intracellular trafficking, secretion, and vesicular transport]. 36295 KOG1079: Transcriptional repressor EZH1 [Transcription]. 
36296 KOG1080: Histone H3 (Lys4) methyltransferase complex, subunit SET1 and related methyltransferases [Chromatin structure and dynamics, Transcription]. 36297 KOG1081: Transcription factor NSD1 and related SET domain proteins [Transcription]. 36298 KOG1082: Histone H3 (Lys9) methyltransferase SUV39H1/Clr4, required for transcriptional silencing [Chromatin structure and dynamics, Transcription]. 36299 KOG1083: Putative transcription factor ASH1/LIN-59 [Transcription]. 36300 KOG1084: Transcription factor TCF20 [Transcription]. 36301 KOG1085: Predicted methyltransferase (contains a SET domain) [General function prediction only]. 36302 KOG1086: Cytosolic sorting protein/ADP-ribosylation factor effector GGA [Intracellular trafficking, secretion, and vesicular transport]. 36303 KOG1087: Cytosolic sorting protein GGA2/TOM1 [Intracellular trafficking, secretion, and vesicular transport]. 36304 KOG1088: Uncharacterized conserved protein [Function unknown]. 36305 KOG1089: Myotubularin-related phosphatidylinositol 3-phosphate 3-phosphatase MTM6 [General function prediction only]. 36306 KOG1090: Predicted dual-specificity phosphatase [General function prediction only]. 36307 KOG1091: Ypt/Rab-specific GTPase-activating protein GYP6 [Intracellular trafficking, secretion, and vesicular transport]. 36308 KOG1092: Ypt/Rab-specific GTPase-activating protein GYP1 [Intracellular trafficking, secretion, and vesicular transport]. 36309 KOG1093: Predicted protein kinase (contains TBC and RHOD domains) [General function prediction only]. 36310 KOG1094: Discoidin domain receptor DDR1 [Signal transduction mechanisms]. 36311 KOG1095: Protein tyrosine kinase [Signal transduction mechanisms]. 36312 KOG1096: Adenosine monophosphate deaminase [Nucleotide transport and metabolism]. 36313 KOG1097: Adenine deaminase/adenosine deaminase [Nucleotide transport and metabolism]. 
36314 KOG1098: Putative SAM-dependent rRNA methyltransferase SPB1 [RNA processing and modification, General function prediction only]. 36315 KOG1099: SAM-dependent methyltransferase/cell division protein FtsJ [Cell cycle control, cell division, chromosome partitioning, General function prediction only]. 36316 KOG1100: Predicted E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 36317 KOG1101: Apoptosis inhibitor IAP1 and related BIR domain proteins [Cell cycle control, cell division, chromosome partitioning, General function prediction only]. 36318 KOG1102: Rab6 GTPase activator GAPCenA and related TBC domain proteins [General function prediction only]. 36319 KOG1103: Predicted coiled-coil protein [Function unknown]. 36320 KOG1104: Nuclear cap-binding complex, subunit NCBP1/CBP80 [RNA processing and modification]. 36321 KOG1105: Transcription elongation factor TFIIS/Cofactor of enhancer-binding protein Sp1 [Transcription]. 36322 KOG1106: Uncharacterized conserved protein [Function unknown]. 36323 KOG1107: Membrane coat complex Retromer, subunit VPS35 [Intracellular trafficking, secretion, and vesicular transport]. 36324 KOG1108: Predicted heme/steroid binding protein [General function prediction only]. 36325 KOG1109: Vacuole membrane protein VMP1 [General function prediction only]. 36326 KOG1110: Putative steroid membrane receptor Hpr6.6/25-Dx [General function prediction only]. 36327 KOG1111: N-acetylglucosaminyltransferase complex, subunit PIG-A/SPT14, required for phosphatidylinositol biosynthesis/Sulfolipid synthase [Cell wall/membrane/envelope biogenesis, Posttranslational modification, protein turnover, chaperones, Lipid transport and metabolism]. 36328 KOG1112: Ribonucleotide reductase, alpha subunit [Nucleotide transport and metabolism]. 36329 KOG1113: cAMP-dependent protein kinase types I and II, regulatory subunit [Signal transduction mechanisms]. 
36330 KOG1114: Tripeptidyl peptidase II [Posttranslational modification, protein turnover, chaperones]. 36331 KOG1115: Ceramide kinase [Lipid transport and metabolism, Signal transduction mechanisms]. 36332 KOG1116: Sphingosine kinase, involved in sphingolipid metabolism [Lipid transport and metabolism, Signal transduction mechanisms]. 36333 KOG1117: Rho- and Arf-GTPase activating protein ARAP3 [Signal transduction mechanisms, Cytoskeleton]. 36334 KOG1118: Lysophosphatidic acid acyltransferase endophilin/SH3GL, involved in synaptic vesicle formation [Lipid transport and metabolism, Signal transduction mechanisms]. 36335 KOG1119: Mitochondrial Fe-S cluster biosynthesis protein ISA2 (contains a HesB-like domain) [Energy production and conversion, Intracellular trafficking, secretion, and vesicular transport]. 36336 KOG1120: Fe-S cluster biosynthesis protein ISA1 (contains a HesB-like domain) [Inorganic ion transport and metabolism]. 36337 KOG1121: Tam3-transposase (Ac family) [Replication, recombination and repair]. 36338 KOG1122: tRNA and rRNA cytosine-C5-methylase (nucleolar protein NOL1/NOP2) [RNA processing and modification]. 36339 KOG1123: RNA polymerase II transcription initiation/nucleotide excision repair factor TFIIH, 3'-5' helicase subunit SSL2 [Transcription, Replication, recombination and repair]. 36340 KOG1125: TPR repeat-containing protein [General function prediction only]. 36341 KOG1126: DNA-binding cell division cycle control protein [Cell cycle control, cell division, chromosome partitioning]. 36342 KOG1127: TPR repeat-containing protein [RNA processing and modification]. 36343 KOG1128: Uncharacterized conserved protein, contains TPR repeats [General function prediction only]. 36344 KOG1129: TPR repeat-containing protein [General function prediction only]. 36345 KOG1130: Predicted G-alpha GTPase interaction protein, contains GoLoco domain [Signal transduction mechanisms]. 
36346 KOG1131: RNA polymerase II transcription initiation/nucleotide excision repair factor TFIIH, 5'-3' helicase subunit RAD3 [Transcription, Replication, recombination and repair]. 36347 KOG1132: Helicase of the DEAD superfamily [Replication, recombination and repair]. 36348 KOG1133: Helicase of the DEAD superfamily [Replication, recombination and repair]. 36349 KOG1134: Uncharacterized conserved protein [General function prediction only]. 36350 KOG1135: mRNA cleavage and polyadenylation factor II complex, subunit CFT2 (CPSF subunit) [RNA processing and modification]. 36351 KOG1136: Predicted cleavage and polyadenylation specificity factor (CPSF subunit) [RNA processing and modification]. 36352 KOG1137: mRNA cleavage and polyadenylation factor II complex, BRR5 (CPSF subunit) [RNA processing and modification]. 36353 KOG1138: Predicted cleavage and polyadenylation specificity factor (CPSF subunit) [RNA processing and modification]. 36354 KOG1139: Predicted ubiquitin-protein ligase of the N-recognin family [Posttranslational modification, protein turnover, chaperones]. 36355 KOG1140: N-end rule pathway, recognition component UBR1 [Posttranslational modification, protein turnover, chaperones]. 36356 KOG1141: Predicted histone methyl transferase [Chromatin structure and dynamics]. 36357 KOG1142: Transcription initiation factor TFIID, subunit TAF12 (also component of histone acetyltransferase SAGA) [Transcription]. 36358 KOG1143: Predicted translation elongation factor [Translation, ribosomal structure and biogenesis]. 36359 KOG1144: Translation initiation factor 5B (eIF-5B) [Translation, ribosomal structure and biogenesis]. 36360 KOG1145: Mitochondrial translation initiation factor 2 (IF-2; GTPase) [Translation, ribosomal structure and biogenesis]. 36361 KOG1146: Homeobox protein [General function prediction only]. 36362 KOG1147: Glutamyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 
36363 KOG1148: Glutaminyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 36364 KOG1149: Glutamyl-tRNA synthetase (mitochondrial) [Translation, ribosomal structure and biogenesis]. 36365 KOG1150: Predicted molecular chaperone (DnaJ superfamily) [Posttranslational modification, protein turnover, chaperones]. 36366 KOG1151: Tousled-like protein kinase [Signal transduction mechanisms]. 36367 KOG1152: Signal transduction serine/threonine kinase with PAS/PAC sensor domain [Signal transduction mechanisms]. 36368 KOG1153: Subtilisin-related protease/Vacuolar protease B [Posttranslational modification, protein turnover, chaperones]. 36369 KOG1154: Gamma-glutamyl kinase [Amino acid transport and metabolism]. 36370 KOG1155: Anaphase-promoting complex (APC), Cdc23 subunit [Cell cycle control, cell division, chromosome partitioning, Posttranslational modification, protein turnover, chaperones]. 36371 KOG1156: N-terminal acetyltransferase [Chromatin structure and dynamics]. 36372 KOG1157: Predicted guanosine polyphosphate pyrophosphohydrolase/synthase [Signal transduction mechanisms]. 36373 KOG1158: NADP/FAD dependent oxidoreductase [Energy production and conversion]. 36374 KOG1159: NADP-dependent flavoprotein reductase [Energy production and conversion]. 36375 KOG1160: Fe-S oxidoreductase [Energy production and conversion]. 36376 KOG1161: Protein involved in vacuolar polyphosphate accumulation, contains SPX domain [Inorganic ion transport and metabolism]. 36377 KOG1162: Predicted small molecule transporter [Intracellular trafficking, secretion, and vesicular transport]. 36378 KOG1163: Casein kinase (serine/threonine/tyrosine protein kinase) [Signal transduction mechanisms]. 36379 KOG1164: Casein kinase (serine/threonine/tyrosine protein kinase) [Signal transduction mechanisms]. 36380 KOG1165: Casein kinase (serine/threonine/tyrosine protein kinase) [Signal transduction mechanisms]. 
36381 KOG1166: Mitotic checkpoint serine/threonine protein kinase [Cell cycle control, cell division, chromosome partitioning]. 36382 KOG1167: Serine/threonine protein kinase of the CDC7 subfamily involved in DNA synthesis, repair and recombination [Replication, recombination and repair]. 36383 KOG1168: Transcription factor ACJ6/BRN-3, contains POU and HOX domains [Transcription]. 36384 KOG1169: Diacylglycerol kinase [Lipid transport and metabolism, Signal transduction mechanisms]. 36385 KOG1170: Diacylglycerol kinase [Lipid transport and metabolism]. 36386 KOG1171: Metallothionein-like protein [Inorganic ion transport and metabolism]. 36387 KOG1172: Na+-independent Cl/HCO3 exchanger AE1 and related transporters (SLC4 family) [Inorganic ion transport and metabolism]. 36388 KOG1173: Anaphase-promoting complex (APC), Cdc16 subunit [Cell cycle control, cell division, chromosome partitioning, Posttranslational modification, protein turnover, chaperones]. 36389 KOG1174: Anaphase-promoting complex (APC), subunit 7 [Cell cycle control, cell division, chromosome partitioning, Posttranslational modification, protein turnover, chaperones]. 36390 KOG1175: Acyl-CoA synthetase [Lipid transport and metabolism]. 36391 KOG1176: Acyl-CoA synthetase [Lipid transport and metabolism]. 36392 KOG1177: Long chain fatty acid acyl-CoA ligase [Lipid transport and metabolism]. 36393 KOG1178: Non-ribosomal peptide synthetase/alpha-aminoadipate reductase and related enzymes [Secondary metabolites biosynthesis, transport and catabolism]. 36394 KOG1179: Very long-chain acyl-CoA synthetase/fatty acid transporter [Lipid transport and metabolism]. 36395 KOG1180: Acyl-CoA synthetase [Lipid transport and metabolism]. 36396 KOG1182: Branched chain alpha-keto acid dehydrogenase complex, alpha subunit [Energy production and conversion]. 
36397 KOG1183: N-acetylglucosaminyltransferase complex, subunit PIG-Q/GPI1, required for phosphatidylinositol biosynthesis [Cell wall/membrane/envelope biogenesis, Posttranslational modification, protein turnover, chaperones]. 36398 KOG1184: Thiamine pyrophosphate-requiring enzyme [Amino acid transport and metabolism, Coenzyme transport and metabolism]. 36399 KOG1185: Thiamine pyrophosphate-requiring enzyme [Amino acid transport and metabolism, Coenzyme transport and metabolism]. 36400 KOG1186: Copper amine oxidase [Secondary metabolites biosynthesis, transport and catabolism]. 36401 KOG1187: Serine/threonine protein kinase [Signal transduction mechanisms]. 36402 KOG1188: WD40 repeat protein [General function prediction only]. 36403 KOG1189: Global transcriptional regulator, cell division control protein [Amino acid transport and metabolism]. 36404 KOG1190: Polypyrimidine tract-binding protein [RNA processing and modification]. 36405 KOG1191: Mitochondrial GTPase [Translation, ribosomal structure and biogenesis]. 36406 KOG1192: UDP-glucuronosyl and UDP-glucosyl transferase [Carbohydrate transport and metabolism, Energy production and conversion]. 36407 KOG1193: Arginyl-tRNA-protein transferase [Posttranslational modification, protein turnover, chaperones]. 36408 KOG1194: Predicted DNA-binding protein, contains Myb-like, SANT and ELM2 domains [Transcription]. 36409 KOG1195: Arginyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 36410 KOG1196: Predicted NAD-dependent oxidoreductase [General function prediction only]. 36411 KOG1197: Predicted quinone oxidoreductase [Energy production and conversion, General function prediction only]. 36412 KOG1198: Zinc-binding oxidoreductase [Energy production and conversion, General function prediction only]. 36413 KOG1199: Short-chain alcohol dehydrogenase/3-hydroxyacyl-CoA dehydrogenase [Secondary metabolites biosynthesis, transport and catabolism]. 
36414 KOG1200: Mitochondrial/plastidial beta-ketoacyl-ACP reductase [Lipid transport and metabolism]. 36415 KOG1201: Hydroxysteroid 17-beta dehydrogenase 11 [Secondary metabolites biosynthesis, transport and catabolism]. 36416 KOG1202: Animal-type fatty acid synthase and related proteins [Lipid transport and metabolism]. 36417 KOG1203: Predicted dehydrogenase [Carbohydrate transport and metabolism]. 36418 KOG1204: Predicted dehydrogenase [Secondary metabolites biosynthesis, transport and catabolism]. 36419 KOG1205: Predicted dehydrogenase [Secondary metabolites biosynthesis, transport and catabolism]. 36420 KOG1206: Peroxisomal multifunctional beta-oxidation protein and related enzymes [Lipid transport and metabolism]. 36421 KOG1207: Diacetyl reductase/L-xylulose reductase [Secondary metabolites biosynthesis, transport and catabolism]. 36422 KOG1208: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) [Secondary metabolites biosynthesis, transport and catabolism]. 36423 KOG1209: 1-Acyl dihydroxyacetone phosphate reductase and related dehydrogenases [Secondary metabolites biosynthesis, transport and catabolism]. 36424 KOG1210: Predicted 3-ketosphinganine reductase [Secondary metabolites biosynthesis, transport and catabolism]. 36425 KOG1211: Amidases [Translation, ribosomal structure and biogenesis]. 36426 KOG1212: Amidases [Translation, ribosomal structure and biogenesis, Lipid transport and metabolism, Signal transduction mechanisms]. 36427 KOG1213: Sister chromatid cohesion complex Cohesin, subunit RAD21/SCC1 [Cell cycle control, cell division, chromosome partitioning]. 36428 KOG1214: Nidogen and related basement membrane proteins [Cell wall/membrane/envelope biogenesis, Extracellular structures]. 36429 KOG1215: Low-density lipoprotein receptors containing Ca2+-binding EGF-like domains [Signal transduction mechanisms]. 
36430 KOG1216: von Willebrand factor and related coagulation proteins [Extracellular structures, Defense mechanisms]. 36431 KOG1217: Fibrillins and related proteins containing Ca2+-binding EGF-like domains [Signal transduction mechanisms]. 36432 KOG1218: Proteins containing Ca2+-binding EGF-like domains [Signal transduction mechanisms]. 36433 KOG1219: Uncharacterized conserved protein, contains laminin, cadherin and EGF domains [Signal transduction mechanisms]. 36434 KOG1220: Phosphoglucomutase/phosphomannomutase [Carbohydrate transport and metabolism]. 36435 KOG1221: Acyl-CoA reductase [Lipid transport and metabolism]. 36436 KOG1222: Kinesin associated protein KAP [Intracellular trafficking, secretion, and vesicular transport]. 36437 KOG1223: Isochorismate synthase [Amino acid transport and metabolism]. 36438 KOG1224: Para-aminobenzoate (PABA) synthase ABZ1 [Translation, ribosomal structure and biogenesis]. 36439 KOG1225: Teneurin-1 and related extracellular matrix proteins, contain EGF-like repeats [Signal transduction mechanisms, Extracellular structures]. 36440 KOG1226: Integrin beta subunit (N-terminal portion of extracellular region) [Signal transduction mechanisms, Extracellular structures]. 36441 KOG1227: Putative methyltransferase [General function prediction only]. 36442 KOG1228: Uncharacterized conserved protein [Function unknown]. 36443 KOG1229: 3',5'-cyclic nucleotide phosphodiesterases [Signal transduction mechanisms]. 36444 KOG1230: Protein containing repeated kelch motifs [General function prediction only]. 36445 KOG1231: Proteins containing the FAD binding domain [Energy production and conversion]. 36446 KOG1232: Proteins containing the FAD binding domain [Energy production and conversion]. 36447 KOG1233: Alkyl-dihydroxyacetonephosphate synthase [General function prediction only]. 36448 KOG1234: ABC (ATP binding cassette) 1 protein [General function prediction only]. 36449 KOG1235: Predicted unusual protein kinase [General function prediction only]. 
36450 KOG1236: Predicted unusual protein kinase [General function prediction only]. 36451 KOG1237: H+/oligopeptide symporter [Amino acid transport and metabolism]. 36452 KOG1238: Glucose dehydrogenase/choline dehydrogenase/mandelonitrile lyase (GMC oxidoreductase family) [General function prediction only]. 36453 KOG1239: Inner membrane protein translocase involved in respiratory chain assembly [Posttranslational modification, protein turnover, chaperones, Intracellular trafficking, secretion, and vesicular transport]. 36454 KOG1240: Protein kinase containing WD40 repeats [Signal transduction mechanisms]. 36455 KOG1241: Karyopherin (importin) beta 1 [Nuclear structure, Intracellular trafficking, secretion, and vesicular transport]. 36456 KOG1242: Protein containing adaptin N-terminal region [Translation, ribosomal structure and biogenesis]. 36457 KOG1243: Protein kinase [General function prediction only]. 36458 KOG1244: Predicted transcription factor Requiem/NEURO-D4 [Transcription]. 36459 KOG1245: Chromatin remodeling complex WSTF-ISWI, large subunit (contains heterochromatin localization, PHD and BROMO domains) [Chromatin structure and dynamics]. 36460 KOG1246: DNA-binding protein jumonji/RBP2/SMCY, contains JmjC domain [General function prediction only]. 36461 KOG1247: Methionyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 36462 KOG1248: Uncharacterized conserved protein [Function unknown]. 36463 KOG1249: Predicted GTPases [General function prediction only]. 36464 KOG1250: Threonine/serine dehydratases [Amino acid transport and metabolism]. 36465 KOG1251: Serine racemase [Signal transduction mechanisms, Amino acid transport and metabolism]. 36466 KOG1252: Cystathionine beta-synthase and related enzymes [Amino acid transport and metabolism]. 36467 KOG1253: tRNA methyltransferase [Translation, ribosomal structure and biogenesis]. 36468 KOG1254: ATP-citrate lyase [Energy production and conversion]. 
36469 KOG1255: Succinyl-CoA synthetase, alpha subunit [Energy production and conversion]. 36470 KOG1256: Long-chain acyl-CoA synthetases (AMP-forming) [Lipid transport and metabolism]. 36471 KOG1257: NADP+-dependent malic enzyme [Energy production and conversion]. 36472 KOG1258: mRNA processing protein [RNA processing and modification]. 36473 KOG1259: Nischarin, modulator of integrin alpha5 subunit action [Signal transduction mechanisms, Cytoskeleton]. 36474 KOG1260: Isocitrate lyase [Energy production and conversion]. 36475 KOG1261: Malate synthase [Energy production and conversion]. 36476 KOG1262: FAD-binding protein DIMINUTO [General function prediction only]. 36477 KOG1263: Multicopper oxidases [Secondary metabolites biosynthesis, transport and catabolism]. 36478 KOG1264: Phospholipase C [Lipid transport and metabolism]. 36479 KOG1265: Phospholipase C [Lipid transport and metabolism]. 36480 KOG1266: Protein kinase [Signal transduction mechanisms]. 36481 KOG1267: Mitochondrial transcription termination factor, mTERF [Transcription, General function prediction only]. 36482 KOG1268: Glucosamine 6-phosphate synthetases, contain amidotransferase and phosphosugar isomerase domains [Cell wall/membrane/envelope biogenesis]. 36483 KOG1269: SAM-dependent methyltransferases [Lipid transport and metabolism, General function prediction only]. 36484 KOG1270: Methyltransferases [Coenzyme transport and metabolism]. 36485 KOG1271: Methyltransferases [General function prediction only]. 36486 KOG1272: WD40-repeat-containing subunit of the 18S rRNA processing complex [RNA processing and modification]. 36487 KOG1273: WD40 repeat protein [General function prediction only]. 36488 KOG1274: WD40 repeat protein [General function prediction only]. 36489 KOG1275: PAB-dependent poly(A) ribonuclease, subunit PAN2 [Replication, recombination and repair]. 36490 KOG1276: Protoporphyrinogen oxidase [Coenzyme transport and metabolism]. 
36491 KOG1277: Endosomal membrane proteins, EMP70 [Intracellular trafficking, secretion, and vesicular transport]. 36492 KOG1278: Endosomal membrane proteins, EMP70 [Intracellular trafficking, secretion, and vesicular transport]. 36493 KOG1279: Chromatin remodeling factor subunit and related transcription factors [Chromatin structure and dynamics]. 36494 KOG1280: Uncharacterized conserved protein containing ZZ-type Zn-finger [General function prediction only]. 36495 KOG1281: Na+/dicarboxylate, Na+/tricarboxylate and phosphate transporters [Inorganic ion transport and metabolism]. 36496 KOG1282: Serine carboxypeptidases (lysosomal cathepsin A) [Amino acid transport and metabolism, Posttranslational modification, protein turnover, chaperones]. 36497 KOG1283: Serine carboxypeptidases [Posttranslational modification, protein turnover, chaperones]. 36498 KOG1284: Bifunctional GTP cyclohydrolase II/3,4-dihydroxy-2-butanone-4-phosphate synthase [Coenzyme transport and metabolism]. 36499 KOG1285: Beta, beta-carotene 15,15'-dioxygenase and related enzymes [Secondary metabolites biosynthesis, transport and catabolism]. 36500 KOG1286: Amino acid transporters [Amino acid transport and metabolism]. 36501 KOG1287: Amino acid transporters [Amino acid transport and metabolism]. 36502 KOG1288: Amino acid transporters [Amino acid transport and metabolism]. 36503 KOG1289: Amino acid transporters [Amino acid transport and metabolism]. 36504 KOG1290: Serine/threonine protein kinase [Signal transduction mechanisms]. 36505 KOG1291: Mn2+ and Fe2+ transporters of the NRAMP family [Inorganic ion transport and metabolism]. 36506 KOG1292: Xanthine/uracil transporters [Nucleotide transport and metabolism]. 36507 KOG1293: Proteins containing armadillo/beta-catenin-like repeat [General function prediction only]. 36508 KOG1294: Apurinic/apyrimidinic endonuclease and related enzymes [Replication, recombination and repair]. 
36509 KOG1295: Nonsense-mediated decay protein Upf3 [RNA processing and modification]. 36510 KOG1296: Uncharacterized conserved protein [Function unknown]. 36511 KOG1297: Uncharacterized conserved protein [Function unknown]. 36512 KOG1298: Squalene monooxygenase [Lipid transport and metabolism]. 36513 KOG1299: Vacuolar sorting protein VPS45/Stt10 (Sec1 family) [Intracellular trafficking, secretion, and vesicular transport]. 36514 KOG1300: Vesicle trafficking protein Sec1 [Intracellular trafficking, secretion, and vesicular transport]. 36515 KOG1301: Vesicle trafficking protein Sly1 (Sec1 family) [Intracellular trafficking, secretion, and vesicular transport]. 36516 KOG1302: Vacuolar sorting protein VPS33/slp1 (Sec1 family) [Intracellular trafficking, secretion, and vesicular transport]. 36517 KOG1303: Amino acid transporters [Amino acid transport and metabolism]. 36518 KOG1304: Amino acid transporters [Amino acid transport and metabolism]. 36519 KOG1305: Amino acid transporter protein [Amino acid transport and metabolism]. 36520 KOG1306: Ca2+/Na+ exchanger NCX1 and related proteins [Inorganic ion transport and metabolism, Signal transduction mechanisms]. 36521 KOG1307: K+-dependent Ca2+/Na+ exchanger NCKX1 and related proteins [Inorganic ion transport and metabolism, Signal transduction mechanisms]. 36522 KOG1308: Hsp70-interacting protein Hip/Transient component of progesterone receptor complexes and an Hsp70-binding protein [Posttranslational modification, protein turnover, chaperones, Signal transduction mechanisms]. 36523 KOG1309: Suppressor of G2 allele of skp1 [Signal transduction mechanisms]. 36524 KOG1310: WD40 repeat protein [General function prediction only]. 36525 KOG1311: DHHC-type Zn-finger proteins [General function prediction only]. 36526 KOG1312: DHHC-type Zn-finger proteins [General function prediction only]. 36527 KOG1313: DHHC-type Zn-finger proteins [General function prediction only]. 
36528 KOG1314: DHHC-type Zn-finger protein [General function prediction only]. 36529 KOG1315: Predicted DHHC-type Zn-finger protein [General function prediction only]. 36530 KOG1316: Argininosuccinate lyase [Amino acid transport and metabolism]. 36531 KOG1317: Fumarase [Energy production and conversion]. 36532 KOG1318: Helix loop helix transcription factor EB [Transcription]. 36533 KOG1319: bHLHZip transcription factor BIGMAX [Transcription]. 36534 KOG1320: Serine protease [Posttranslational modification, protein turnover, chaperones]. 36535 KOG1321: Protoheme ferro-lyase (ferrochelatase) [Coenzyme transport and metabolism]. 36536 KOG1322: GDP-mannose pyrophosphorylase/mannose-1-phosphate guanylyltransferase [Cell wall/membrane/envelope biogenesis]. 36537 KOG1323: Serine/threonine phosphatase [Signal transduction mechanisms]. 36538 KOG1324: Dihydrofolate reductase [Coenzyme transport and metabolism]. 36539 KOG1325: Lysophospholipase [Lipid transport and metabolism]. 36540 KOG1326: Membrane-associated protein FER-1 and related ferlins, contain multiple C2 domains [Cell wall/membrane/envelope biogenesis]. 36541 KOG1327: Copine [Signal transduction mechanisms]. 36542 KOG1328: Synaptic vesicle protein BAIAP3, involved in vesicle priming/regulation [Intracellular trafficking, secretion, and vesicular transport, Signal transduction mechanisms]. 36543 KOG1329: Phospholipase D1 [Lipid transport and metabolism]. 36544 KOG1330: Sugar transporter/spinster transmembrane protein [Carbohydrate transport and metabolism]. 36545 KOG1331: Predicted methyltransferase [General function prediction only]. 36546 KOG1332: Vesicle coat complex COPII, subunit SEC13 [Intracellular trafficking, secretion, and vesicular transport]. 36547 KOG1333: Uncharacterized conserved protein [Function unknown]. 36548 KOG1334: WD40 repeat protein [General function prediction only]. 36549 KOG1335: Dihydrolipoamide dehydrogenase [Energy production and conversion]. 
36550 KOG1336: Monodehydroascorbate/ferredoxin reductase [General function prediction only]. 36551 KOG1337: N-methyltransferase [General function prediction only]. 36552 KOG1338: Uncharacterized conserved protein [Function unknown]. 36553 KOG1339: Aspartyl protease [Posttranslational modification, protein turnover, chaperones]. 36554 KOG1340: Prosaposin [Lipid transport and metabolism, Carbohydrate transport and metabolism]. 36555 KOG1341: Na+/K+ transporter [Inorganic ion transport and metabolism]. 36556 KOG1342: Histone deacetylase complex, catalytic component RPD3 [Chromatin structure and dynamics]. 36557 KOG1343: Histone deacetylase complex, catalytic component HDA1 [Chromatin structure and dynamics]. 36558 KOG1344: Predicted histone deacetylase [Chromatin structure and dynamics]. 36559 KOG1345: Serine/threonine kinase [Signal transduction mechanisms]. 36560 KOG1346: Programmed cell death 8 (apoptosis-inducing factor) [Signal transduction mechanisms]. 36561 KOG1347: Uncharacterized membrane protein, predicted efflux pump [General function prediction only]. 36562 KOG1348: Asparaginyl peptidases [Posttranslational modification, protein turnover, chaperones]. 36563 KOG1349: Gpi-anchor transamidase [Posttranslational modification, protein turnover, chaperones]. 36564 KOG1350: F0F1-type ATP synthase, beta subunit [Energy production and conversion]. 36565 KOG1351: Vacuolar H+-ATPase V1 sector, subunit B [Energy production and conversion]. 36566 KOG1352: Vacuolar H+-ATPase V1 sector, subunit A [Energy production and conversion]. 36567 KOG1353: F0F1-type ATP synthase, alpha subunit [Energy production and conversion]. 36568 KOG1354: Serine/threonine protein phosphatase 2A, regulatory subunit [Signal transduction mechanisms]. 36569 KOG1355: Adenylosuccinate synthase [Nucleotide transport and metabolism]. 36570 KOG1356: Putative transcription factor 5qNCA, contains JmjC domain [Transcription]. 
36571 KOG1357: Serine palmitoyltransferase [Posttranslational modification, protein turnover, chaperones]. 36572 KOG1358: Serine palmitoyltransferase [Posttranslational modification, protein turnover, chaperones]. 36573 KOG1359: Glycine C-acetyltransferase/2-amino-3-ketobutyrate-CoA ligase [Amino acid transport and metabolism]. 36574 KOG1360: 5-aminolevulinate synthase [Coenzyme transport and metabolism]. 36575 KOG1361: Predicted hydrolase involved in interstrand cross-link repair [Replication, recombination and repair]. 36576 KOG1362: Choline transporter-like protein [Lipid transport and metabolism]. 36577 KOG1363: Predicted regulator of the ubiquitin pathway (contains UAS and UBX domains) [Signal transduction mechanisms]. 36578 KOG1364: Predicted ubiquitin regulatory protein, contains UAS and UBX domains [Posttranslational modification, protein turnover, chaperones]. 36579 KOG1365: RNA-binding protein Fusilli, contains RRM domain [RNA processing and modification, General function prediction only]. 36580 KOG1366: Alpha-macroglobulin [Posttranslational modification, protein turnover, chaperones]. 36581 KOG1367: 3-phosphoglycerate kinase [Carbohydrate transport and metabolism]. 36582 KOG1368: Threonine aldolase [Amino acid transport and metabolism]. 36583 KOG1369: Hexokinase [Carbohydrate transport and metabolism]. 36584 KOG1370: S-adenosylhomocysteine hydrolase [Coenzyme transport and metabolism]. 36585 KOG1371: UDP-glucose 4-epimerase/UDP-sulfoquinovose synthase [Cell wall/membrane/envelope biogenesis]. 36586 KOG1372: GDP-mannose 4,6 dehydratase [Carbohydrate transport and metabolism]. 36587 KOG1373: Transport protein Sec61, alpha subunit [Intracellular trafficking, secretion, and vesicular transport, Posttranslational modification, protein turnover, chaperones]. 36588 KOG1374: Gamma tubulin [Cytoskeleton]. 36589 KOG1375: Beta tubulin [Cytoskeleton]. 36590 KOG1376: Alpha tubulin [Cytoskeleton]. 
36591 KOG1377: Uridine 5'-monophosphate synthase/orotate phosphoribosyltransferase [Nucleotide transport and metabolism]. 36592 KOG1378: Purple acid phosphatase [Carbohydrate transport and metabolism]. 36593 KOG1379: Serine/threonine protein phosphatase [Signal transduction mechanisms]. 36594 KOG1380: Heme A farnesyltransferase [Coenzyme transport and metabolism]. 36595 KOG1381: Para-hydroxybenzoate-polyprenyl transferase [Coenzyme transport and metabolism]. 36596 KOG1382: Multiple inositol polyphosphate phosphatase [General function prediction only]. 36597 KOG1383: Glutamate decarboxylase/sphingosine phosphate lyase [Amino acid transport and metabolism]. 36598 KOG1384: tRNA delta(2)-isopentenylpyrophosphate transferase [Translation, ribosomal structure and biogenesis]. 36599 KOG1385: Nucleoside phosphatase [Nucleotide transport and metabolism]. 36600 KOG1386: Nucleoside phosphatase [Nucleotide transport and metabolism]. 36601 KOG1387: Glycosyltransferase [Cell wall/membrane/envelope biogenesis]. 36602 KOG1388: Attractin and platelet-activating factor acetylhydrolase [Signal transduction mechanisms, Defense mechanisms]. 36603 KOG1389: 3-oxoacyl CoA thiolase [Lipid transport and metabolism]. 36604 KOG1390: Acetyl-CoA acetyltransferase [Lipid transport and metabolism]. 36605 KOG1391: Acetyl-CoA acetyltransferase [Lipid transport and metabolism]. 36606 KOG1392: Acetyl-CoA acetyltransferase [Lipid transport and metabolism]. 36607 KOG1393: Hydroxymethylglutaryl-CoA synthase [Lipid transport and metabolism]. 36608 KOG1394: 3-oxoacyl-(acyl-carrier-protein) synthase (I and II) [Lipid transport and metabolism, Secondary metabolites biosynthesis, transport and catabolism]. 36609 KOG1395: Tryptophan synthase beta chain [Amino acid transport and metabolism]. 36610 KOG1396: Uncharacterized conserved protein [Function unknown]. 36611 KOG1397: Ca2+/H+ antiporter VCX1 and related proteins [Inorganic ion transport and metabolism]. 
36612 KOG1398: Uncharacterized conserved protein [Function unknown]. 36613 KOG1399: Flavin-containing monooxygenase [Secondary metabolites biosynthesis, transport and catabolism]. 36614 KOG1400: Predicted ATP-dependent protease PIL, contains LON domain [General function prediction only]. 36615 KOG1401: Acetylornithine aminotransferase [Amino acid transport and metabolism]. 36616 KOG1402: Ornithine aminotransferase [Amino acid transport and metabolism]. 36617 KOG1403: Predicted alanine-glyoxylate aminotransferase [General function prediction only]. 36618 KOG1404: Alanine-glyoxylate aminotransferase AGT2 [Amino acid transport and metabolism]. 36619 KOG1405: 4-aminobutyrate aminotransferase [Amino acid transport and metabolism]. 36620 KOG1406: Peroxisomal 3-ketoacyl-CoA-thiolase P-44/SCP2 [Lipid transport and metabolism]. 36621 KOG1407: WD40 repeat protein [Function unknown]. 36622 KOG1408: WD40 repeat protein [Function unknown]. 36623 KOG1409: Uncharacterized conserved protein, contains WD40 repeats and FYVE domains [Function unknown]. 36624 KOG1410: Nuclear transport receptor RanBP16 (importin beta superfamily) [Nuclear structure, Intracellular trafficking, secretion, and vesicular transport]. 36625 KOG1411: Aspartate aminotransferase/Glutamic oxaloacetic transaminase AAT1/GOT2 [Amino acid transport and metabolism]. 36626 KOG1412: Aspartate aminotransferase/Glutamic oxaloacetic transaminase AAT2/GOT1 [Amino acid transport and metabolism]. 36627 KOG1413: N-acetylglucosaminyltransferase I [Carbohydrate transport and metabolism]. 36628 KOG1414: Transcriptional activator FOSB/c-Fos and related bZIP transcription factors [Transcription]. 36629 KOG1415: Ubiquitin C-terminal hydrolase UCHL1 [Posttranslational modification, protein turnover, chaperones]. 36630 KOG1416: tRNA(1-methyladenosine) methyltransferase, subunit GCD10 [Translation, ribosomal structure and biogenesis]. 36631 KOG1417: Homogentisate 1,2-dioxygenase [Amino acid transport and metabolism]. 
36632 KOG1418: Tandem pore domain K+ channel [Inorganic ion transport and metabolism]. 36633 KOG1419: Voltage-gated K+ channel KCNQ [Inorganic ion transport and metabolism]. 36634 KOG1420: Ca2+-activated K+ channel Slowpoke, alpha subunit [Inorganic ion transport and metabolism, Signal transduction mechanisms]. 36635 KOG1421: Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]. 36636 KOG1422: Intracellular Cl- channel CLIC, contains GST domain [Inorganic ion transport and metabolism]. 36637 KOG1423: Ras-like GTPase ERA [Cell cycle control, cell division, chromosome partitioning, Signal transduction mechanisms]. 36638 KOG1424: Predicted GTP-binding protein MMR1 [General function prediction only]. 36639 KOG1425: Microfibrillar-associated protein MFAP1 [Cytoskeleton]. 36640 KOG1427: Uncharacterized conserved protein, contains RCC1 domain [Function unknown]. 36641 KOG1428: Inhibitor of type V adenylyl cyclases/Neuronal presynaptic protein Highwire/PAM/RPM-1 [Signal transduction mechanisms]. 36642 KOG1429: dTDP-glucose 4-6-dehydratase/UDP-glucuronic acid decarboxylase [Carbohydrate transport and metabolism, Cell wall/membrane/envelope biogenesis]. 36643 KOG1430: C-3 sterol dehydrogenase/3-beta-hydroxysteroid dehydrogenase and related dehydrogenases [Lipid transport and metabolism, Amino acid transport and metabolism]. 36644 KOG1431: GDP-L-fucose synthetase [Carbohydrate transport and metabolism, Posttranslational modification, protein turnover, chaperones]. 36645 KOG1432: Predicted DNA repair exonuclease SIA1 [General function prediction only]. 36646 KOG1433: DNA repair protein RAD51/RHP55 [Replication, recombination and repair]. 36647 KOG1434: Meiotic recombination protein Dmc1 [Energy production and conversion, Replication, recombination and repair]. 36648 KOG1435: Sterol reductase/lamin B receptor [Lipid transport and metabolism, Signal transduction mechanisms]. 
36649 KOG1436: Dihydroorotate dehydrogenase [Nucleotide transport and metabolism]. 36650 KOG1437: Fasciclin and related adhesion glycoproteins [Cell wall/membrane/envelope biogenesis, Extracellular structures]. 36651 KOG1438: Anthranilate phosphoribosyltransferase [Amino acid transport and metabolism]. 36652 KOG1439: RAB proteins geranylgeranyltransferase component A (RAB escort protein) [Posttranslational modification, protein turnover, chaperones]. 36653 KOG1440: CDP-diacylglycerol synthase [Lipid transport and metabolism]. 36654 KOG1441: Glucose-6-phosphate/phosphate and phosphoenolpyruvate/phosphate antiporter [Carbohydrate transport and metabolism, Amino acid transport and metabolism]. 36655 KOG1442: GDP-fucose transporter [Carbohydrate transport and metabolism, Posttranslational modification, protein turnover, chaperones, Intracellular trafficking, secretion, and vesicular transport]. 36656 KOG1443: Predicted integral membrane protein [Function unknown]. 36657 KOG1444: Nucleotide-sugar transporter VRG4/SQV-7 [Carbohydrate transport and metabolism, Posttranslational modification, protein turnover, chaperones, Intracellular trafficking, secretion, and vesicular transport]. 36658 KOG1445: Tumor-specific antigen (contains WD repeats) [Cytoskeleton]. 36659 KOG1446: Histone H3 (Lys4) methyltransferase complex and RNA cleavage factor II complex, subunit SWD2 [RNA processing and modification, Chromatin structure and dynamics, Posttranslational modification, protein turnover, chaperones]. 36660 KOG1447: GTP-specific succinyl-CoA synthetase, beta subunit [Energy production and conversion]. 36661 KOG1448: Ribose-phosphate pyrophosphokinase [Nucleotide transport and metabolism, Amino acid transport and metabolism]. 36662 KOG1449: Predicted Rho GTPase-activating protein CdGAPr [Signal transduction mechanisms]. 36663 KOG1450: Predicted Rho GTPase-activating protein [Signal transduction mechanisms]. 
36664 KOG1451: Oligophrenin-1 and related Rho GTPase-activating proteins [Signal transduction mechanisms]. 36665 KOG1452: Predicted Rho GTPase-activating protein [Signal transduction mechanisms]. 36666 KOG1453: Chimaerin and related Rho GTPase activating proteins [Signal transduction mechanisms]. 36667 KOG1454: Predicted hydrolase/acyltransferase (alpha/beta hydrolase superfamily) [General function prediction only]. 36668 KOG1455: Lysophospholipase [Lipid transport and metabolism]. 36669 KOG1456: Heterogeneous nuclear ribonucleoprotein L (contains RRM repeats) [RNA processing and modification]. 36670 KOG1457: RNA binding protein (contains RRM repeats) [General function prediction only]. 36671 KOG1458: Fructose-1,6-bisphosphatase [Carbohydrate transport and metabolism]. 36672 KOG1459: Squalene synthetase [Lipid transport and metabolism]. 36673 KOG1460: GDP-mannose pyrophosphorylase [Carbohydrate transport and metabolism, Cell wall/membrane/envelope biogenesis, Posttranslational modification, protein turnover, chaperones]. 36674 KOG1461: Translation initiation factor 2B, epsilon subunit (eIF-2Bepsilon/GCD6) [Translation, ribosomal structure and biogenesis]. 36675 KOG1462: Translation initiation factor 2B, gamma subunit (eIF-2Bgamma/GCD1) [Translation, ribosomal structure and biogenesis]. 36676 KOG1463: 26S proteasome regulatory complex, subunit RPN6/PSMD11 [Posttranslational modification, protein turnover, chaperones]. 36677 KOG1464: COP9 signalosome, subunit CSN2 [Posttranslational modification, protein turnover, chaperones, Signal transduction mechanisms]. 36678 KOG1465: Translation initiation factor 2B, beta subunit (eIF-2Bbeta/GCD7) [Translation, ribosomal structure and biogenesis]. 36679 KOG1466: Translation initiation factor 2B, alpha subunit (eIF-2Balpha/GCN3) [Translation, ribosomal structure and biogenesis]. 36680 KOG1467: Translation initiation factor 2B, delta subunit (eIF-2Bdelta/GCD2) [Translation, ribosomal structure and biogenesis]. 
36681 KOG1468: Predicted translation initiation factor related to eIF-2B alpha/beta/delta subunits (CIG2/IDI2) [Translation, ribosomal structure and biogenesis]. 36682 KOG1469: Predicted acyl-CoA dehydrogenase [General function prediction only]. 36683 KOG1470: Phosphatidylinositol transfer protein PDR16 and related proteins [Lipid transport and metabolism]. 36684 KOG1471: Phosphatidylinositol transfer protein SEC14 and related proteins [Lipid transport and metabolism]. 36685 KOG1472: Histone acetyltransferase SAGA/ADA, catalytic subunit PCAF/GCN5 and related proteins [Chromatin structure and dynamics, Transcription]. 36686 KOG1473: Nucleosome remodeling factor, subunit NURF301/BPTF [Chromatin structure and dynamics, Transcription]. 36687 KOG1474: Transcription initiation factor TFIID, subunit BDF1 and related bromodomain proteins [Transcription]. 36688 KOG1475: Ribosomal protein RPL1/RPL2/RL4L4 [RNA processing and modification]. 36689 KOG1476: Beta-1,3-glucuronyltransferase B3GAT1/SQV-8 [Posttranslational modification, protein turnover, chaperones]. 36690 KOG1477: SPRY domain-containing proteins [General function prediction only]. 36691 KOG1478: 3-keto sterol reductase [Lipid transport and metabolism]. 36692 KOG1479: Nucleoside transporter [Nucleotide transport and metabolism]. 36693 KOG1480: Netrin transmembrane receptor unc-5 [Signal transduction mechanisms]. 36694 KOG1481: Cysteine synthase [Amino acid transport and metabolism]. 36695 KOG1482: Zn2+ transporter [Inorganic ion transport and metabolism]. 36696 KOG1483: Zn2+ transporter ZNT1 and related Cd2+/Zn2+ transporters (cation diffusion facilitator superfamily) [Inorganic ion transport and metabolism]. 36697 KOG1484: Putative Zn2+ transporter MSC2 (cation diffusion facilitator superfamily) [Inorganic ion transport and metabolism]. 36698 KOG1485: Mitochondrial Fe2+ transporter MMT1 and related transporters (cation diffusion facilitator superfamily) [Inorganic ion transport and metabolism]. 
36699 KOG1486: GTP-binding protein DRG2 (ODN superfamily) [Signal transduction mechanisms]. 36700 KOG1487: GTP-binding protein DRG1 (ODN superfamily) [Signal transduction mechanisms]. 36701 KOG1488: Translational repressor Pumilio/PUF3 and related RNA-binding proteins (Puf superfamily) [Translation, ribosomal structure and biogenesis]. 36702 KOG1489: Predicted GTP-binding protein (ODN superfamily) [General function prediction only]. 36703 KOG1490: GTP-binding protein CRFG/NOG1 (ODN superfamily) [General function prediction only]. 36704 KOG1491: Predicted GTP-binding protein (ODN superfamily) [General function prediction only]. 36705 KOG1492: C3H1-type Zn-finger protein [General function prediction only]. 36706 KOG1493: Anaphase-promoting complex (APC), subunit 11 [Cell cycle control, cell division, chromosome partitioning, Posttranslational modification, protein turnover, chaperones]. 36707 KOG1494: NAD-dependent malate dehydrogenase [Energy production and conversion]. 36708 KOG1495: Lactate dehydrogenase [Energy production and conversion]. 36709 KOG1496: Malate dehydrogenase [Energy production and conversion]. 36710 KOG1497: COP9 signalosome, subunit CSN4 [Posttranslational modification, protein turnover, chaperones, Signal transduction mechanisms]. 36711 KOG1498: 26S proteasome regulatory complex, subunit RPN5/PSMD12 [Posttranslational modification, protein turnover, chaperones]. 36712 KOG1499: Protein arginine N-methyltransferase PRMT1 and related enzymes [Posttranslational modification, protein turnover, chaperones, Transcription, Signal transduction mechanisms]. 36713 KOG1500: Protein arginine N-methyltransferase CARM1 [Posttranslational modification, protein turnover, chaperones, Transcription]. 36714 KOG1501: Arginine N-methyltransferase [General function prediction only]. 36715 KOG1502: Flavonol reductase/cinnamoyl-CoA reductase [Defense mechanisms]. 
36716 KOG1503: Phosphoribosylpyrophosphate synthetase-associated protein [Amino acid transport and metabolism, Nucleotide transport and metabolism]. 36717 KOG1504: Ornithine carbamoyltransferase OTC/ARG3 [Amino acid transport and metabolism]. 36718 KOG1505: Lysophosphatidic acid acyltransferase LPAAT and related acyltransferases [Lipid transport and metabolism]. 36719 KOG1506: S-adenosylmethionine synthetase [Coenzyme transport and metabolism]. 36720 KOG1507: Nucleosome assembly protein NAP-1 [Chromatin structure and dynamics, Cell cycle control, cell division, chromosome partitioning]. 36721 KOG1508: DNA replication factor/protein phosphatase inhibitor SET/SPR-2 [Replication, recombination and repair]. 36722 KOG1509: Predicted nucleic acid-binding protein ASMTL [Cell cycle control, cell division, chromosome partitioning]. 36723 KOG1510: RNA polymerase II holoenzyme and mediator subcomplex, subunit SURB7/SRB7 [Transcription]. 36724 KOG1511: Mevalonate kinase MVK/ERG12 [Lipid transport and metabolism]. 36725 KOG1512: PHD Zn-finger protein [General function prediction only]. 36726 KOG1513: Nuclear helicase MOP-3/SNO (DEAD-box superfamily) [Transcription, Signal transduction mechanisms]. 36727 KOG1514: Origin recognition complex, subunit 1, and related proteins [Replication, recombination and repair]. 36728 KOG1515: Arylacetamide deacetylase [Defense mechanisms]. 36729 KOG1516: Carboxylesterase and related proteins [General function prediction only]. 36730 KOG1517: Guanine nucleotide binding protein MIP1 [Cell cycle control, cell division, chromosome partitioning]. 36731 KOG1518: Coproporphyrinogen III oxidase CPO/HEM13 [Coenzyme transport and metabolism]. 36732 KOG1519: Predicted mitochondrial carrier protein [General function prediction only]. 36733 KOG1520: Predicted alkaloid synthase/Surface mucin Hemomucin [General function prediction only]. 36734 KOG1521: RNA polymerase I and III, subunit RPA40/RPC40 [Transcription]. 
36735 KOG1522: RNA polymerase II, subunit POLR2C/RPB3 [Transcription]. 36736 KOG1523: Actin-related protein Arp2/3 complex, subunit ARPC1/p41-ARC [Cytoskeleton]. 36737 KOG1524: WD40 repeat-containing protein CHE-2 [General function prediction only]. 36738 KOG1525: Sister chromatid cohesion complex Cohesin, subunit PDS5 [Cell cycle control, cell division, chromosome partitioning]. 36739 KOG1526: NADP-dependent isocitrate dehydrogenase [Energy production and conversion]. 36740 KOG1527: Uroporphyrin III methyltransferase [Coenzyme transport and metabolism]. 36741 KOG1528: Salt-sensitive 3'-phosphoadenosine-5'-phosphatase HAL2/SAL1 [Nucleotide transport and metabolism, Inorganic ion transport and metabolism]. 36742 KOG1529: Mercaptopyruvate sulfurtransferase/thiosulfate sulfurtransferase [Defense mechanisms]. 36743 KOG1530: Rhodanese-related sulfurtransferase [Inorganic ion transport and metabolism]. 36744 KOG1531: F0F1-type ATP synthase, gamma subunit [Energy production and conversion]. 36745 KOG1532: GTPase XAB1, interacts with DNA repair protein XPA [Replication, recombination and repair]. 36746 KOG1533: Predicted GTPase [General function prediction only]. 36747 KOG1534: Putative transcription factor FET5 [Transcription]. 36748 KOG1535: Predicted fumarylacetoacetate hydrolase [General function prediction only]. 36749 KOG1536: Biotin holocarboxylase synthetase/biotin-protein ligase [Coenzyme transport and metabolism]. 36750 KOG1537: Homoserine kinase [Amino acid transport and metabolism]. 36751 KOG1538: Uncharacterized conserved protein WDR10, contains WD40 repeats [General function prediction only]. 36752 KOG1539: WD repeat protein [General function prediction only]. 36753 KOG1540: Ubiquinone biosynthesis methyltransferase COQ5 [Coenzyme transport and metabolism]. 36754 KOG1541: Predicted protein carboxyl methylase [General function prediction only]. 36755 KOG1542: Cysteine proteinase Cathepsin F [Posttranslational modification, protein turnover, chaperones]. 
36756 KOG1543: Cysteine proteinase Cathepsin L [Posttranslational modification, protein turnover, chaperones]. 36757 KOG1544: Predicted cysteine proteinase TIN-ag [General function prediction only]. 36758 KOG1545: Voltage-gated shaker-like K+ channel KCNA [Inorganic ion transport and metabolism]. 36759 KOG1546: Metacaspase involved in regulation of apoptosis [Cell cycle control, cell division, chromosome partitioning, Posttranslational modification, protein turnover, chaperones]. 36760 KOG1547: Septin CDC10 and related P-loop GTPases [Cell cycle control, cell division, chromosome partitioning, Signal transduction mechanisms, Cytoskeleton]. 36761 KOG1548: Transcription elongation factor TAT-SF1 [Transcription]. 36762 KOG1549: Cysteine desulfurase NFS1 [Amino acid transport and metabolism]. 36763 KOG1550: Extracellular protein SEL-1 and related proteins [Cell wall/membrane/envelope biogenesis, Posttranslational modification, protein turnover, chaperones, Signal transduction mechanisms]. 36764 KOG1551: Uncharacterized conserved protein [Function unknown]. 36765 KOG1552: Predicted alpha/beta hydrolase [General function prediction only]. 36766 KOG1553: Predicted alpha/beta hydrolase BAT5 [General function prediction only]. 36767 KOG1554: COP9 signalosome, subunit CSN5 [Posttranslational modification, protein turnover, chaperones, Signal transduction mechanisms]. 36768 KOG1555: 26S proteasome regulatory complex, subunit RPN11 [Posttranslational modification, protein turnover, chaperones]. 36769 KOG1556: 26S proteasome regulatory complex, subunit RPN8/PSMD7 [Posttranslational modification, protein turnover, chaperones]. 36770 KOG1557: Fructose-bisphosphate aldolase [Carbohydrate transport and metabolism]. 36771 KOG1558: Fe2+/Zn2+ regulated transporter [Inorganic ion transport and metabolism]. 36772 KOG1559: Gamma-glutamyl hydrolase [Coenzyme transport and metabolism]. 
36773 KOG1560: Translation initiation factor 3, subunit h (eIF-3h) [Translation, ribosomal structure and biogenesis]. 36774 KOG1561: CCAAT-binding factor, subunit B (HAP2) [Transcription]. 36775 KOG1562: Spermidine synthase [Amino acid transport and metabolism]. 36776 KOG1563: Mitochondrial protein Surfeit 1/SURF1/SHY1, required for expression of cytochrome oxidase [Energy production and conversion]. 36777 KOG1564: DNA repair protein RHP57 [Replication, recombination and repair]. 36778 KOG1565: Gelatinase A and related matrix metalloproteases [Posttranslational modification, protein turnover, chaperones, Extracellular structures]. 36779 KOG1566: Conserved protein Mo25 [Function unknown]. 36780 KOG1567: Ribonucleotide reductase, beta subunit [Nucleotide transport and metabolism]. 36781 KOG1568: Mitochondrial inner membrane protease, subunit IMP2 [Posttranslational modification, protein turnover, chaperones, Intracellular trafficking, secretion, and vesicular transport]. 36782 KOG1569: 50S ribosomal protein L1 [Translation, ribosomal structure and biogenesis]. 36783 KOG1570: 60S ribosomal protein L10A [Translation, ribosomal structure and biogenesis]. 36784 KOG1571: Predicted E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 36785 KOG1572: Predicted protein tyrosine phosphatase [Defense mechanisms]. 36786 KOG1573: Aldehyde reductase [General function prediction only]. 36787 KOG1574: Predicted cell growth/differentiation regulator, contains RA domain [Extracellular structures]. 36788 KOG1575: Voltage-gated shaker-like K+ channel, subunit beta/KCNAB [Energy production and conversion]. 36789 KOG1576: Predicted oxidoreductase [Energy production and conversion]. 36790 KOG1577: Aldo/keto reductase family proteins [General function prediction only]. 36791 KOG1578: Predicted carbonic anhydrase involved in protection against oxidative damage [Inorganic ion transport and metabolism]. 
36792 KOG1579: Homocysteine S-methyltransferase [Amino acid transport and metabolism]. 36793 KOG1580: UDP-galactose transporter related protein [Carbohydrate transport and metabolism]. 36794 KOG1581: UDP-galactose transporter related protein [Carbohydrate transport and metabolism]. 36795 KOG1582: UDP-galactose transporter related protein [Carbohydrate transport and metabolism]. 36796 KOG1583: UDP-N-acetylglucosamine transporter [Carbohydrate transport and metabolism]. 36797 KOG1584: Sulfotransferase [General function prediction only]. 36798 KOG1585: Protein required for fusion of vesicles in vesicular transport, gamma-SNAP [Intracellular trafficking, secretion, and vesicular transport]. 36799 KOG1586: Protein required for fusion of vesicles in vesicular transport, alpha-SNAP [Intracellular trafficking, secretion, and vesicular transport]. 36800 KOG1587: Cytoplasmic dynein intermediate chain [Cytoskeleton]. 36801 KOG1588: RNA-binding protein Sam68 and related KH domain proteins [RNA processing and modification]. 36802 KOG1589: Uncharacterized conserved protein [Function unknown]. 36803 KOG1590: Uncharacterized conserved protein [Function unknown]. 36804 KOG1591: Prolyl 4-hydroxylase alpha subunit [Amino acid transport and metabolism]. 36805 KOG1592: Asparaginase [Amino acid transport and metabolism]. 36806 KOG1593: Asparaginase [Amino acid transport and metabolism]. 36807 KOG1594: Uncharacterized enzymes related to aldose 1-epimerase [Carbohydrate transport and metabolism]. 36808 KOG1595: CCCH-type Zn-finger protein [General function prediction only]. 36809 KOG1596: Fibrillarin and related nucleolar RNA-binding proteins [RNA processing and modification]. 36810 KOG1597: Transcription initiation factor TFIIB [Transcription]. 36811 KOG1598: Transcription initiation factor TFIIIB, Brf1 subunit [Transcription]. 36812 KOG1599: Uricase (urate oxidase) [Secondary metabolites biosynthesis, transport and catabolism]. 
36813 KOG1600: Fatty acid desaturase [Lipid transport and metabolism]. 36814 KOG1601: GATA-4/5/6 transcription factors [Transcription]. 36815 KOG1602: Cis-prenyltransferase [Lipid transport and metabolism]. 36816 KOG1603: Copper chaperone [Inorganic ion transport and metabolism]. 36817 KOG1604: Predicted mutarotase [Carbohydrate transport and metabolism]. 36818 KOG1605: TFIIF-interacting CTD phosphatase, including NLI-interacting factor (involved in RNA polymerase II regulation) [Transcription]. 36819 KOG1606: Stationary phase-induced protein, SOR/SNZ family [Coenzyme transport and metabolism]. 36820 KOG1607: Protein transporter of the TRAM (translocating chain-associating membrane) superfamily [Intracellular trafficking, secretion, and vesicular transport]. 36821 KOG1608: Protein transporter of the TRAM (translocating chain-associating membrane) superfamily [Intracellular trafficking, secretion, and vesicular transport]. 36822 KOG1609: Protein involved in mRNA turnover and stability [RNA processing and modification]. 36823 KOG1610: Corticosteroid 11-beta-dehydrogenase and related short chain-type dehydrogenases [Secondary metabolites biosynthesis, transport and catabolism, General function prediction only]. 36824 KOG1611: Predicted short chain-type dehydrogenase [General function prediction only]. 36825 KOG1612: Exosomal 3'-5' exoribonuclease complex, subunit Rrp42 [Translation, ribosomal structure and biogenesis]. 36826 KOG1613: Exosomal 3'-5' exoribonuclease complex, subunit Rrp43 [Translation, ribosomal structure and biogenesis]. 36827 KOG1614: Exosomal 3'-5' exoribonuclease complex, subunit Rrp45 [Translation, ribosomal structure and biogenesis]. 36828 KOG1615: Phosphoserine phosphatase [Amino acid transport and metabolism]. 36829 KOG1616: Protein involved in Snf1 protein kinase complex assembly [Carbohydrate transport and metabolism]. 36830 KOG1617: CDP-alcohol phosphatidyltransferase/Phosphatidylglycerol-phosphate synthase [Lipid transport and metabolism]. 
36831 KOG1618: Predicted phosphatase [General function prediction only]. 36832 KOG1619: Cytochrome b [Energy production and conversion]. 36833 KOG1620: Inositol polyphosphate multikinase, component of the ARGR transcription regulatory complex [Transcription, Lipid transport and metabolism, Signal transduction mechanisms]. 36834 KOG1621: 1D-myo-inositol-triphosphate 3-kinase A [Lipid transport and metabolism]. 36835 KOG1622: GMP synthase [Nucleotide transport and metabolism]. 36836 KOG1623: Multitransmembrane protein [General function prediction only]. 36837 KOG1624: Mitochondrial/chloroplast ribosomal protein L4 [Translation, ribosomal structure and biogenesis]. 36838 KOG1625: DNA polymerase alpha-primase complex, polymerase-associated subunit B [Replication, recombination and repair]. 36839 KOG1626: Inorganic pyrophosphatase/Nucleosome remodeling factor, subunit NURF38 [Energy production and conversion]. 36840 KOG1627: Translation elongation factor EF-1 gamma [Translation, ribosomal structure and biogenesis]. 36841 KOG1628: 40S ribosomal protein S3A [Translation, ribosomal structure and biogenesis]. 36842 KOG1629: Bax-mediated apoptosis inhibitor TEGT/BI-1 [Defense mechanisms]. 36843 KOG1630: Growth hormone-induced protein and related proteins [Signal transduction mechanisms]. 36844 KOG1631: Translocon-associated complex TRAP, alpha subunit [Intracellular trafficking, secretion, and vesicular transport]. 36845 KOG1632: Uncharacterized PHD Zn-finger protein [General function prediction only]. 36846 KOG1633: F-box protein JEMMA and related proteins with JmjC, PHD, F-box and LRR domains [Chromatin structure and dynamics]. 36847 KOG1634: Predicted transcription factor DATF1, contains PHD and TFS2M domains [Transcription]. 36848 KOG1635: Peptide methionine sulfoxide reductase [Posttranslational modification, protein turnover, chaperones]. 
36849 KOG1636: DNA polymerase delta processivity factor (proliferating cell nuclear antigen) [Replication, recombination and repair]. 36850 KOG1637: Threonyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 36851 KOG1638: Steroid reductase [Lipid transport and metabolism]. 36852 KOG1639: Steroid reductase required for elongation of the very long chain fatty acids [Lipid transport and metabolism]. 36853 KOG1640: Predicted steroid reductase [Lipid transport and metabolism]. 36854 KOG1641: Mitochondrial chaperonin [Posttranslational modification, protein turnover, chaperones]. 36855 KOG1642: Ribonuclease, T2 family [RNA processing and modification]. 36856 KOG1643: Triosephosphate isomerase [Carbohydrate transport and metabolism]. 36857 KOG1644: U2-associated snRNP A' protein [RNA processing and modification]. 36858 KOG1645: RING-finger-containing E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 36859 KOG1646: 40S ribosomal protein S6 [Translation, ribosomal structure and biogenesis]. 36860 KOG1647: Vacuolar H+-ATPase V1 sector, subunit D [Energy production and conversion]. 36861 KOG1648: Uncharacterized conserved protein, contains RUN, BRK and TBC domains [General function prediction only]. 36862 KOG1649: SWI-SNF chromatin remodeling complex, Snf5 subunit [Chromatin structure and dynamics, Transcription]. 36863 KOG1650: Predicted K+/H+-antiporter [Inorganic ion transport and metabolism]. 36864 KOG1651: Glutathione peroxidase [Posttranslational modification, protein turnover, chaperones]. 36865 KOG1652: Mitochondrial import inner membrane translocase, subunit TIM17 [Intracellular trafficking, secretion, and vesicular transport]. 36866 KOG1653: Single-stranded DNA-binding protein [Replication, recombination and repair]. 36867 KOG1654: Microtubule-associated anchor protein involved in autophagy and membrane trafficking [Cytoskeleton]. 
36868 KOG1655: Protein involved in vacuolar protein sorting [Intracellular trafficking, secretion, and vesicular transport]. 36869 KOG1656: Protein involved in glucose derepression and pre-vacuolar endosome protein sorting [Intracellular trafficking, secretion, and vesicular transport]. 36870 KOG1657: CCAAT-binding factor, subunit C (HAP5) [Transcription]. 36871 KOG1658: DNA polymerase epsilon, subunit C [Replication, recombination and repair]. 36872 KOG1659: Class 2 transcription repressor NC2, alpha subunit (DRAP1) [Transcription]. 36873 KOG1660: Sorting nexin SNX6/TFAF2, contains PX domain [Defense mechanisms]. 36874 KOG1661: Protein-L-isoaspartate(D-aspartate) O-methyltransferase [Posttranslational modification, protein turnover, chaperones]. 36875 KOG1662: Mitochondrial F1F0-ATP synthase, subunit OSCP/ATP5 [Energy production and conversion]. 36876 KOG1663: O-methyltransferase [Secondary metabolites biosynthesis, transport and catabolism]. 36877 KOG1664: Vacuolar H+-ATPase V1 sector, subunit E [Energy production and conversion]. 36878 KOG1665: AFH1-interacting protein FIP2, contains BTB/POZ domain and pentapeptide repeats [General function prediction only]. 36879 KOG1666: V-SNARE [Intracellular trafficking, secretion, and vesicular transport]. 36880 KOG1667: Zn2+-binding protein Melusin/RAR1, contains CHORD domain [General function prediction only]. 36881 KOG1668: Elongation factor 1 beta/delta chain [Transcription]. 36882 KOG1669: Predicted mRNA cap-binding protein related to eIF-4E [Translation, ribosomal structure and biogenesis]. 36883 KOG1670: Translation initiation factor 4F, cap-binding subunit (eIF-4E) and related cap-binding proteins [Translation, ribosomal structure and biogenesis]. 36884 KOG1671: Ubiquinol cytochrome c reductase, subunit RIP1 [Energy production and conversion]. 36885 KOG1672: ATP binding protein [Posttranslational modification, protein turnover, chaperones, Energy production and conversion]. 
36886 KOG1673: Ras GTPases [General function prediction only]. 36887 KOG1674: Cyclin [General function prediction only]. 36888 KOG1675: Predicted cyclin [General function prediction only]. 36889 KOG1676: K-homology type RNA binding proteins [RNA processing and modification]. 36890 KOG1677: CCCH-type Zn-finger protein [General function prediction only]. 36891 KOG1678: 60S ribosomal protein L15 [Translation, ribosomal structure and biogenesis]. 36892 KOG1679: Enoyl-CoA hydratase [Lipid transport and metabolism]. 36893 KOG1680: Enoyl-CoA hydratase [Lipid transport and metabolism]. 36894 KOG1681: Enoyl-CoA isomerase [Lipid transport and metabolism]. 36895 KOG1682: Enoyl-CoA isomerase [Lipid transport and metabolism]. 36896 KOG1683: Hydroxyacyl-CoA dehydrogenase/enoyl-CoA hydratase [Lipid transport and metabolism]. 36897 KOG1684: Enoyl-CoA hydratase [Lipid transport and metabolism]. 36898 KOG1685: Uncharacterized conserved protein [Function unknown]. 36899 KOG1686: Mitochondrial/chloroplast ribosomal L21 protein [Translation, ribosomal structure and biogenesis]. 36900 KOG1687: NADH-ubiquinone oxidoreductase, NUFS7/PSST/20 kDa subunit [Energy production and conversion]. 36901 KOG1688: Golgi proteins involved in ER retention (RER) [Intracellular trafficking, secretion, and vesicular transport]. 36902 KOG1689: mRNA cleavage factor I subunit [RNA processing and modification]. 36903 KOG1690: emp24/gp25L/p24 family of membrane trafficking proteins [Intracellular trafficking, secretion, and vesicular transport]. 36904 KOG1691: emp24/gp25L/p24 family of membrane trafficking proteins [Intracellular trafficking, secretion, and vesicular transport]. 36905 KOG1692: Putative cargo transport protein EMP24 (p24 protein family) [Intracellular trafficking, secretion, and vesicular transport]. 36906 KOG1693: emp24/gp25L/p24 family of membrane trafficking proteins [Intracellular trafficking, secretion, and vesicular transport]. 
36907 KOG1694: 60S ribosomal protein L6 [Translation, ribosomal structure and biogenesis]. 36908 KOG1695: Glutathione S-transferase [Posttranslational modification, protein turnover, chaperones]. 36909 KOG1696: 60S ribosomal protein L19 [Translation, ribosomal structure and biogenesis]. 36910 KOG1697: Mitochondrial/chloroplast ribosomal protein S9 [Translation, ribosomal structure and biogenesis]. 36911 KOG1698: Mitochondrial/chloroplast ribosomal protein L19 [Translation, ribosomal structure and biogenesis]. 36912 KOG1699: O-acetyltransferase [General function prediction only]. 36913 KOG1700: Regulatory protein MLP and related LIM proteins [Signal transduction mechanisms, Cytoskeleton]. 36914 KOG1701: Focal adhesion adaptor protein Paxillin and related LIM proteins [Signal transduction mechanisms]. 36915 KOG1702: Nebulin repeat protein [Cytoskeleton]. 36916 KOG1703: Adaptor protein Enigma and related PDZ-LIM proteins [Signal transduction mechanisms, Cytoskeleton]. 36917 KOG1705: Uncharacterized conserved protein, contains CXXC motifs [Function unknown]. 36918 KOG1706: Argininosuccinate synthase [Amino acid transport and metabolism]. 36919 KOG1707: Predicted Ras related/Rac-GTP binding protein [Defense mechanisms]. 36920 KOG1708: Mitochondrial/chloroplast ribosomal protein L24 [Translation, ribosomal structure and biogenesis]. 36921 KOG1709: Guanidinoacetate methyltransferase and related proteins [Amino acid transport and metabolism]. 36922 KOG1710: MYND Zn-finger and ankyrin repeat protein [General function prediction only]. 36923 KOG1711: Mitochondrial/chloroplast ribosomal protein L22 [Translation, ribosomal structure and biogenesis]. 36924 KOG1712: Adenine phosphoribosyl transferases [Nucleotide transport and metabolism]. 36925 KOG1713: NADH-ubiquinone oxidoreductase, NDUFS3/30 kDa subunit [Energy production and conversion]. 36926 KOG1714: 60S ribosomal protein L18 [Translation, ribosomal structure and biogenesis]. 
36927 KOG1715: Mitochondrial/chloroplast ribosomal protein L12 [Translation, ribosomal structure and biogenesis]. 36928 KOG1716: Dual specificity phosphatase [Defense mechanisms]. 36929 KOG1717: Dual specificity phosphatase [Defense mechanisms]. 36930 KOG1718: Dual specificity phosphatase [Defense mechanisms]. 36931 KOG1719: Dual specificity phosphatase [Defense mechanisms]. 36932 KOG1720: Protein tyrosine phosphatase CDC14 [Defense mechanisms]. 36933 KOG1722: 60S ribosomal protein L24 [Translation, ribosomal structure and biogenesis]. 36934 KOG1723: 60S ribosomal protein L30 isolog [Translation, ribosomal structure and biogenesis]. 36935 KOG1724: SCF ubiquitin ligase, Skp1 component [Posttranslational modification, protein turnover, chaperones]. 36936 KOG1725: Protein involved in membrane traffic (YOP1/TB2/DP1/HVA22 family) [Intracellular trafficking, secretion, and vesicular transport]. 36937 KOG1726: HVA22/DP1 gene product-related proteins [Defense mechanisms]. 36938 KOG1727: Microtubule-binding protein (translationally controlled tumor protein) [Cell cycle control, cell division, chromosome partitioning, Cytoskeleton]. 36939 KOG1728: 40S ribosomal protein S11 [Translation, ribosomal structure and biogenesis]. 36940 KOG1729: FYVE finger containing protein [General function prediction only]. 36941 KOG1730: Thioredoxin-like protein [Posttranslational modification, protein turnover, chaperones]. 36942 KOG1731: FAD-dependent sulfhydryl oxidase/quiescin and related proteins [Cell cycle control, cell division, chromosome partitioning]. 36943 KOG1732: 60S ribosomal protein L21 [Translation, ribosomal structure and biogenesis]. 36944 KOG1733: Mitochondrial import inner membrane translocase, subunit TIM13 [Intracellular trafficking, secretion, and vesicular transport]. 36945 KOG1734: Predicted RING-containing E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 36946 KOG1735: Actin depolymerizing factor [Cytoskeleton]. 
36947 KOG1736: Glia maturation factor beta [Extracellular structures]. 36948 KOG1737: Oxysterol-binding protein [Lipid transport and metabolism]. 36949 KOG1738: Membrane-associated guanylate kinase-interacting protein/connector enhancer of KSR-like [Nucleotide transport and metabolism]. 36950 KOG1739: Serine/threonine protein kinase GPBP [Signal transduction mechanisms, Defense mechanisms]. 36951 KOG1740: Predicted mitochondrial/chloroplast ribosomal protein S17 [Translation, ribosomal structure and biogenesis]. 36952 KOG1741: Mitochondrial/chloroplast ribosomal protein S14/S29 [Translation, ribosomal structure and biogenesis]. 36953 KOG1742: 60S ribosomal protein L15/L27 [Translation, ribosomal structure and biogenesis]. 36954 KOG1743: Ferric reductase-like proteins [Inorganic ion transport and metabolism]. 36955 KOG1744: Histone H2B [Chromatin structure and dynamics]. 36956 KOG1745: Histones H3 and H4 [Chromatin structure and dynamics]. 36957 KOG1746: Defender against cell death protein/oligosaccharyltransferase, epsilon subunit [Cell cycle control, cell division, chromosome partitioning, Posttranslational modification, protein turnover, chaperones]. 36958 KOG1747: Protein tyrosine kinase 9/actin monomer-binding protein [Extracellular structures]. 36959 KOG1748: Acyl carrier protein/NADH-ubiquinone oxidoreductase, NDUFAB1/SDAP subunit [Energy production and conversion, Lipid transport and metabolism, Secondary metabolites biosynthesis, transport and catabolism]. 36960 KOG1749: 40S ribosomal protein S23 [Translation, ribosomal structure and biogenesis]. 36961 KOG1750: Mitochondrial/chloroplast ribosomal protein S12 [Translation, ribosomal structure and biogenesis]. 36962 KOG1751: 60S ribosomal protein L23 [Translation, ribosomal structure and biogenesis]. 36963 KOG1752: Glutaredoxin and related proteins [Posttranslational modification, protein turnover, chaperones]. 36964 KOG1753: 40S ribosomal protein S16 [Translation, ribosomal structure and biogenesis]. 
36965 KOG1754: 40S ribosomal protein S15/S22 [Translation, ribosomal structure and biogenesis]. 36966 KOG1755: Profilin [Cytoskeleton]. 36967 KOG1756: Histone 2A [Chromatin structure and dynamics]. 36968 KOG1757: Histone 2A [Chromatin structure and dynamics]. 36969 KOG1758: Mitochondrial F1F0-ATP synthase, subunit delta/ATP16 [Energy production and conversion]. 36970 KOG1759: Macrophage migration inhibitory factor [Defense mechanisms]. 36971 KOG1760: Molecular chaperone Prefoldin, subunit 4 [Posttranslational modification, protein turnover, chaperones]. 36972 KOG1761: Signal recognition particle, subunit Srp14 [Intracellular trafficking, secretion, and vesicular transport]. 36973 KOG1762: 60S acidic ribosomal protein P1 [Translation, ribosomal structure and biogenesis]. 36974 KOG1763: Uncharacterized conserved protein, contains CCCH-type Zn-finger [General function prediction only]. 36975 KOG1764: 5'-AMP-activated protein kinase, gamma subunit [Energy production and conversion]. 36976 KOG1765: Regulator of ribosome synthesis [Translation, ribosomal structure and biogenesis]. 36977 KOG1766: Enhancer of rudimentary [General function prediction only]. 36978 KOG1767: 40S ribosomal protein S25 [Translation, ribosomal structure and biogenesis]. 36979 KOG1768: 40S ribosomal protein S26 [Translation, ribosomal structure and biogenesis]. 36980 KOG1769: Ubiquitin-like proteins [Posttranslational modification, protein turnover, chaperones]. 36981 KOG1770: Translation initiation factor 1 (eIF-1/SUI1) [Translation, ribosomal structure and biogenesis]. 36982 KOG1771: GPI-alpha-mannosyltransferase III (GPI10/PIG-B) involved in glycosylphosphatidylinositol anchor biosynthesis [Cell wall/membrane/envelope biogenesis, Posttranslational modification, protein turnover, chaperones]. 36983 KOG1772: Vacuolar H+-ATPase V1 sector, subunit G [Energy production and conversion]. 36984 KOG1773: Stress responsive protein [General function prediction only]. 
36985 KOG1774: Small nuclear ribonucleoprotein E [RNA processing and modification]. 36986 KOG1775: U6 snRNA-associated Sm-like protein [RNA processing and modification]. 36987 KOG1776: Zn-binding protein Push [Signal transduction mechanisms]. 36988 KOG1777: Putative Zn-finger protein [General function prediction only]. 36989 KOG1778: CREB binding protein/P300 and related TAZ Zn-finger proteins [Transcription]. 36990 KOG1779: 40S ribosomal protein S27 [Translation, ribosomal structure and biogenesis]. 36991 KOG1780: Small nuclear ribonucleoprotein G [RNA processing and modification]. 36992 KOG1781: Small nuclear ribonucleoprotein splicing factor [RNA processing and modification]. 36993 KOG1782: Small nuclear ribonucleoprotein splicing factor [RNA processing and modification]. 36994 KOG1783: Small nuclear ribonucleoprotein F [RNA processing and modification]. 36995 KOG1784: Small nuclear ribonucleoprotein splicing factor [RNA processing and modification]. 36996 KOG1785: Tyrosine kinase negative regulator CBL [Defense mechanisms]. 36997 KOG1786: Lysosomal trafficking regulator LYST and related BEACH and WD40 repeat proteins [Signal transduction mechanisms, Intracellular trafficking, secretion, and vesicular transport]. 36998 KOG1787: Kinase A-anchor protein Neurobeachin and related BEACH and WD40 repeat proteins [Intracellular trafficking, secretion, and vesicular transport]. 36999 KOG1788: Uncharacterized conserved protein [Function unknown]. 37000 KOG1789: Endocytosis protein RME-8, contains DnaJ domain [Intracellular trafficking, secretion, and vesicular transport, Posttranslational modification, protein turnover, chaperones]. 37001 KOG1790: 60S ribosomal protein L34 [Translation, ribosomal structure and biogenesis]. 37002 KOG1791: Uncharacterized conserved protein [Function unknown]. 37003 KOG1792: Reticulon [Intracellular trafficking, secretion, and vesicular transport]. 37004 KOG1793: Uncharacterized conserved protein [Function unknown]. 
37005 KOG1794: N-Acetylglucosamine kinase [Carbohydrate transport and metabolism]. 37006 KOG1795: U5 snRNP spliceosome subunit [RNA processing and modification]. 37007 KOG1796: Vacuolar protein sorting-associated protein [Intracellular trafficking, secretion, and vesicular transport]. 37008 KOG1797: Uncharacterized conserved protein (Neuroblastoma-amplified protein) [Function unknown]. 37009 KOG1798: DNA polymerase epsilon, catalytic subunit A [Replication, recombination and repair]. 37010 KOG1799: Dihydropyrimidine dehydrogenase [Nucleotide transport and metabolism]. 37011 KOG1800: Ferredoxin/adrenodoxin reductase [Nucleotide transport and metabolism]. 37012 KOG1801: tRNA-splicing endonuclease positive effector (SEN1) [RNA processing and modification]. 37013 KOG1802: RNA helicase nonsense mRNA reducing factor (pNORF1) [RNA processing and modification]. 37014 KOG1803: DNA helicase [Replication, recombination and repair]. 37015 KOG1804: RNA helicase [RNA processing and modification]. 37016 KOG1805: DNA replication helicase [Replication, recombination and repair]. 37017 KOG1806: DEAD box containing helicases [Replication, recombination and repair]. 37018 KOG1807: Helicases [Replication, recombination and repair]. 37019 KOG1808: AAA ATPase containing von Willebrand factor type A (vWA) domain [General function prediction only]. 37020 KOG1809: Vacuolar protein sorting-associated protein [Intracellular trafficking, secretion, and vesicular transport]. 37021 KOG1810: Cell cycle-associated protein [Cell cycle control, cell division, chromosome partitioning]. 37022 KOG1811: Predicted Zn2+-binding protein, contains FYVE domain [General function prediction only]. 37023 KOG1812: Predicted E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 37024 KOG1813: Predicted E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 
37025 KOG1814: Predicted E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 37026 KOG1815: Predicted E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 37027 KOG1816: Ubiquitin fusion-degradation protein [Posttranslational modification, protein turnover, chaperones]. 37028 KOG1817: Ribonuclease [RNA processing and modification]. 37029 KOG1818: Membrane trafficking and cell signaling protein HRS, contains VHS and FYVE domains [Signal transduction mechanisms, Intracellular trafficking, secretion, and vesicular transport]. 37030 KOG1819: FYVE finger-containing proteins [General function prediction only]. 37031 KOG1820: Microtubule-associated protein [Cytoskeleton]. 37032 KOG1821: Uncharacterized conserved protein [Function unknown]. 37033 KOG1822: Uncharacterized conserved protein [Function unknown]. 37034 KOG1823: DRIM (Down-regulated in metastasis)-like proteins [Defense mechanisms]. 37035 KOG1824: TATA-binding protein-interacting protein [General function prediction only]. 37036 KOG1825: Fry-like conserved proteins [General function prediction only]. 37037 KOG1826: Ras GTPase activating protein RasGAP/neurofibromin [Defense mechanisms]. 37038 KOG1827: Chromatin remodeling complex RSC, subunit RSC1/Polybromo and related proteins [Chromatin structure and dynamics, Transcription]. 37039 KOG1828: IRF-2-binding protein CELTIX-1, contains BROMO domain [Transcription]. 37040 KOG1829: Uncharacterized conserved protein, contains C1, PH and RUN domains [Signal transduction mechanisms]. 37041 KOG1830: Wiskott Aldrich syndrome proteins [Cytoskeleton]. 37042 KOG1831: Negative regulator of transcription [Transcription]. 37043 KOG1832: HIV-1 Vpr-binding protein [Cell cycle control, cell division, chromosome partitioning]. 37044 KOG1833: Nuclear pore complex, gp210 component [Nuclear structure, Intracellular trafficking, secretion, and vesicular transport]. 37045 KOG1834: Calsyntenin [Extracellular structures]. 
37046 KOG1835: Uncharacterized conserved protein [Function unknown]. 37047 KOG1836: Extracellular matrix glycoprotein Laminin subunits alpha and gamma [Extracellular structures]. 37048 KOG1837: Uncharacterized conserved protein [Function unknown]. 37049 KOG1838: Alpha/beta hydrolase [General function prediction only]. 37050 KOG1839: Uncharacterized protein CLU1/cluA/TIF31 involved in mitochondrial morphology/distribution, also found associated with eIF-3 [General function prediction only]. 37051 KOG1840: Kinesin light chain [Cytoskeleton]. 37052 KOG1841: Smad anchor for receptor activation [Defense mechanisms]. 37053 KOG1842: FYVE finger-containing protein [General function prediction only]. 37054 KOG1843: Uncharacterized conserved protein [Function unknown]. 37055 KOG1844: PHD Zn-finger proteins [General function prediction only]. 37056 KOG1845: MORC family ATPases [Cell cycle control, cell division, chromosome partitioning]. 37057 KOG1846: Uncharacterized conserved protein, contains Sec7 domain [Intracellular trafficking, secretion, and vesicular transport]. 37058 KOG1847: mRNA splicing factor [RNA processing and modification]. 37059 KOG1848: Uncharacterized conserved protein [Function unknown]. 37060 KOG1849: Regulator of spindle pole body duplication [Cell cycle control, cell division, chromosome partitioning]. 37061 KOG1850: Myosin-like coiled-coil protein [Cytoskeleton]. 37062 KOG1851: Uncharacterized conserved protein [Function unknown]. 37063 KOG1852: Cell cycle-associated protein [Cell cycle control, cell division, chromosome partitioning]. 37064 KOG1853: LIS1-interacting protein NUDE [Cytoskeleton]. 37065 KOG1854: Mitochondrial inner membrane protein (mitofilin) [Cell wall/membrane/envelope biogenesis]. 37066 KOG1855: Predicted RNA-binding protein [General function prediction only]. 37067 KOG1856: Transcription elongation factor SPT6 [RNA processing and modification]. 37068 KOG1857: Transcription accessory protein TEX, contains S1 domain [Transcription]. 
37069 KOG1858: Anaphase-promoting complex (APC), subunit 1 (meiotic check point regulator/Tsg24) [Cell cycle control, cell division, chromosome partitioning, Posttranslational modification, protein turnover, chaperones]. 37070 KOG1859: Leucine-rich repeat proteins [General function prediction only]. 37071 KOG1860: Nuclear protein export factor [Intracellular trafficking, secretion, and vesicular transport, Cell cycle control, cell division, chromosome partitioning]. 37072 KOG1861: Leucine permease transcriptional regulator [Transcription]. 37073 KOG1862: GYF domain containing proteins [General function prediction only]. 37074 KOG1863: Ubiquitin carboxyl-terminal hydrolase [Posttranslational modification, protein turnover, chaperones]. 37075 KOG1864: Ubiquitin-specific protease [Posttranslational modification, protein turnover, chaperones]. 37076 KOG1865: Ubiquitin carboxyl-terminal hydrolase [Posttranslational modification, protein turnover, chaperones]. 37077 KOG1866: Ubiquitin carboxyl-terminal hydrolase [Posttranslational modification, protein turnover, chaperones]. 37078 KOG1867: Ubiquitin-specific protease [Posttranslational modification, protein turnover, chaperones]. 37079 KOG1868: Ubiquitin C-terminal hydrolase [Posttranslational modification, protein turnover, chaperones]. 37080 KOG1869: Splicing coactivator SRm160/300, subunit SRm300 [RNA processing and modification]. 37081 KOG1870: Ubiquitin C-terminal hydrolase [Posttranslational modification, protein turnover, chaperones]. 37082 KOG1871: Ubiquitin-specific protease [Posttranslational modification, protein turnover, chaperones]. 37083 KOG1872: Ubiquitin-specific protease [Posttranslational modification, protein turnover, chaperones]. 37084 KOG1873: Ubiquitin-specific protease [Posttranslational modification, protein turnover, chaperones]. 37085 KOG1874: KEKE-like motif-containing transcription regulator (Rlr1)/suppressor of sin4 [Transcription]. 
37086 KOG1875: Thyroid hormone receptor-associated coactivator complex component (TRAP170) [Transcription]. 37087 KOG1876: Actin-related protein Arp2/3 complex, subunit ARPC4 [Cytoskeleton]. 37088 KOG1877: Putative transmembrane protein cmp44E [General function prediction only]. 37089 KOG1878: Nuclear receptor coregulator SMRT/SMRTER, contains Myb-like domains [Transcription]. 37090 KOG1879: UDP-glucose:glycoprotein glucosyltransferase [Carbohydrate transport and metabolism]. 37091 KOG1880: Nuclear inhibitor of phosphatase-1 [General function prediction only]. 37092 KOG1881: Anion exchanger adaptor protein Kanadaptin, contains FHA domain [General function prediction only]. 37093 KOG1882: Transcriptional regulator SNIP1, contains FHA domain [Signal transduction mechanisms]. 37094 KOG1883: Cofactor required for Sp1 transcriptional activation, subunit 3 [Transcription]. 37095 KOG1884: Uncharacterized conserved protein [Function unknown]. 37096 KOG1885: Lysyl-tRNA synthetase (class II) [Translation, ribosomal structure and biogenesis]. 37097 KOG1886: BAH domain proteins [Transcription]. 37098 KOG1887: Ubiquitin carboxyl-terminal hydrolase [Posttranslational modification, protein turnover, chaperones]. 37099 KOG1888: Putative phosphoinositide phosphatase [Lipid transport and metabolism]. 37100 KOG1889: Putative phosphoinositide phosphatase [Lipid transport and metabolism]. 37101 KOG1890: Phosphoinositide phosphatase SAC1 [Lipid transport and metabolism]. 37102 KOG1891: Proline binding protein WW45 [General function prediction only]. 37103 KOG1892: Actin filament-binding protein Afadin [Cytoskeleton]. 37104 KOG1893: Uncharacterized conserved protein [Function unknown]. 37105 KOG1894: Uncharacterized conserved protein [Function unknown]. 37106 KOG1895: mRNA cleavage and polyadenylation factor II complex, subunit PTA1 [RNA processing and modification]. 
37107 KOG1896: mRNA cleavage and polyadenylation factor II complex, subunit CFT1 (CPSF subunit) [RNA processing and modification]. 37108 KOG1897: Damage-specific DNA binding complex, subunit DDB1 [Replication, recombination and repair]. 37109 KOG1898: Splicing factor 3b, subunit 3 [RNA processing and modification]. 37110 KOG1899: LAR transmembrane tyrosine phosphatase-interacting protein liprin [General function prediction only]. 37111 KOG1900: Nuclear pore complex, Nup155 component (D Nup154, sc Nup157/Nup170) [Nuclear structure, Intracellular trafficking, secretion, and vesicular transport]. 37112 KOG1901: Uncharacterized high-glucose-regulated protein [General function prediction only]. 37113 KOG1902: Putative signal transduction protein involved in RNA splicing [Signal transduction mechanisms, RNA processing and modification]. 37114 KOG1903: Cell cycle-associated protein [Cell cycle control, cell division, chromosome partitioning]. 37115 KOG1904: Transcription coactivator [Transcription]. 37116 KOG1905: Class IV sirtuins (SIR2 family) [Chromatin structure and dynamics, Transcription]. 37117 KOG1906: DNA polymerase sigma [Replication, recombination and repair]. 37118 KOG1907: Phosphoribosylformylglycinamidine synthase [Nucleotide transport and metabolism]. 37119 KOG1908: Ribonuclease inhibitor type leucine-rich repeat proteins [RNA processing and modification]. 37120 KOG1909: Ran GTPase-activating protein [RNA processing and modification, Nuclear structure, Signal transduction mechanisms]. 37121 KOG1910: Uncharacterized conserved protein [Function unknown]. 37122 KOG1911: Heterochromatin-associated protein HP1 and related CHROMO domain proteins [Chromatin structure and dynamics]. 37123 KOG1912: WD40 repeat protein [General function prediction only]. 37124 KOG1913: Regucalcin gene promoter region-related protein (RGPR) [Transcription]. 37125 KOG1914: mRNA cleavage and polyadenylation factor I complex, subunit RNA14 [RNA processing and modification]. 
37126 KOG1915: Cell cycle control protein (crooked neck) [Cell cycle control, cell division, chromosome partitioning]. 37127 KOG1916: Nuclear protein, contains WD40 repeats [General function prediction only]. 37128 KOG1917: Membrane-associated hematopoietic protein [General function prediction only]. 37129 KOG1918: 3-methyladenine DNA glycosidase [Replication, recombination and repair]. 37130 KOG1919: RNA pseudouridylate synthases [RNA processing and modification]. 37131 KOG1920: IkappaB kinase complex, IKAP component [Transcription]. 37132 KOG1921: Endonuclease III [Replication, recombination and repair]. 37133 KOG1922: Rho GTPase effector BNI1 and related formins [Signal transduction mechanisms, Cytoskeleton]. 37134 KOG1923: Rac1 GTPase effector FRL [Signal transduction mechanisms, Cytoskeleton]. 37135 KOG1924: RhoA GTPase effector DIA/Diaphanous [Signal transduction mechanisms, Cytoskeleton]. 37136 KOG1925: Rac1 GTPase effector FHOS [Signal transduction mechanisms, Cytoskeleton]. 37137 KOG1926: Predicted regulator of rRNA gene transcription (MYB-binding protein) [Transcription]. 37138 KOG1927: R-kappa-B and related transcription factors [Transcription]. 37139 KOG1928: Alpha-1,4-N-acetylglucosaminyltransferase [Carbohydrate transport and metabolism]. 37140 KOG1929: Nucleotide excision repair factor NEF2, RAD4/CUT5 component [Replication, recombination and repair]. 37141 KOG1930: Focal adhesion protein Tensin, contains PTB domain [Signal transduction mechanisms, Cytoskeleton]. 37142 KOG1931: Putative transmembrane protein [General function prediction only]. 37143 KOG1932: TATA binding protein associated factor [Transcription]. 37144 KOG1933: Cholesterol transport protein (Niemann-Pick C disease protein) [Lipid transport and metabolism]. 37145 KOG1934: Predicted membrane protein (patched superfamily) [General function prediction only]. 37146 KOG1935: Membrane protein Patched/PTCH [Signal transduction mechanisms]. 
37147 KOG1936: Histidyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 37148 KOG1937: Uncharacterized conserved protein [Function unknown]. 37149 KOG1938: Protein with predicted involvement in meiosis (GSG1) [Cell cycle control, cell division, chromosome partitioning]. 37150 KOG1939: Oxoprolinase [Amino acid transport and metabolism]. 37151 KOG1940: Zn-finger protein [General function prediction only]. 37152 KOG1941: Acetylcholine receptor-associated protein of the synapse (rapsyn) [Extracellular structures]. 37153 KOG1942: DNA helicase, TBP-interacting protein [Replication, recombination and repair]. 37154 KOG1943: Beta-tubulin folding cofactor D [Posttranslational modification, protein turnover, chaperones]. 37155 KOG1944: Peroxisomal membrane protein MPV17 and related proteins [General function prediction only]. 37156 KOG1945: Protein phosphatase 1 binding protein spinophilin/neurabin II [Signal transduction mechanisms]. 37157 KOG1946: RNA polymerase I transcription factor UAF [Transcription]. 37158 KOG1947: Leucine rich repeat proteins, some proteins contain F-box [General function prediction only]. 37159 KOG1948: Metalloproteinase-related collagenase pM5 [Posttranslational modification, protein turnover, chaperones]. 37160 KOG1949: Uncharacterized conserved protein [Function unknown]. 37161 KOG1950: Glycosyl transferase, family 8 - glycogenin [Carbohydrate transport and metabolism]. 37162 KOG1951: GTP-binding protein AARP2 involved in 40S ribosome biogenesis [Translation, ribosomal structure and biogenesis]. 37163 KOG1952: Transcription factor NF-X1, contains NFX-type Zn2+-binding and R3H domains [Transcription]. 37164 KOG1953: Targeting complex (TRAPP) subunit [Intracellular trafficking, secretion, and vesicular transport]. 37165 KOG1954: Endocytosis/signaling protein EHD1 [Signal transduction mechanisms, Intracellular trafficking, secretion, and vesicular transport]. 
37166 KOG1955: Ral-GTPase effector RALBP1 [Intracellular trafficking, secretion, and vesicular transport]. 37167 KOG1956: DNA topoisomerase III alpha [Replication, recombination and repair]. 37168 KOG1957: DNA topoisomerase III beta [Replication, recombination and repair]. 37169 KOG1958: Glycosyl hydrolase, family 38 - alpha-mannosidase [Carbohydrate transport and metabolism]. 37170 KOG1959: Glycosyl hydrolase, family 38 - alpha-mannosidase [Carbohydrate transport and metabolism]. 37171 KOG1960: Predicted RNA-binding protein, contains KH domains [RNA processing and modification]. 37172 KOG1961: Vacuolar sorting protein VPS52/suppressor of actin Sac2 [Intracellular trafficking, secretion, and vesicular transport, Cytoskeleton]. 37173 KOG1962: B-cell receptor-associated protein and related proteins [Defense mechanisms]. 37174 KOG1963: WD40 repeat protein [General function prediction only]. 37175 KOG1964: Nuclear pore complex, rNup107 component (sc Nup84) [Nuclear structure, Intracellular trafficking, secretion, and vesicular transport]. 37176 KOG1965: Sodium/hydrogen exchanger protein [Inorganic ion transport and metabolism]. 37177 KOG1966: Sodium/hydrogen exchanger protein [Inorganic ion transport and metabolism]. 37178 KOG1967: DNA repair/transcription protein Mms19 [Replication, recombination and repair, Transcription]. 37179 KOG1968: Replication factor C, subunit RFC1 (large subunit) [Replication, recombination and repair]. 37180 KOG1969: DNA replication checkpoint protein CHL12/CTF18 [Energy production and conversion, Replication, recombination and repair]. 37181 KOG1970: Checkpoint RAD17-RFC complex, RAD17/RAD24 component [Energy production and conversion, Replication, recombination and repair]. 37182 KOG1971: Lysyl hydroxylase [Posttranslational modification, protein turnover, chaperones]. 37183 KOG1972: Uncharacterized conserved protein [Function unknown]. 37184 KOG1973: Chromatin remodeling protein, contains PHD Zn-finger [Chromatin structure and dynamics]. 
37185 KOG1974: DNA topoisomerase I-interacting protein [Replication, recombination and repair]. 37186 KOG1975: mRNA cap methyltransferase [RNA processing and modification]. 37187 KOG1976: Inositol polyphosphate 5-phosphatase, type I [Lipid transport and metabolism]. 37188 KOG1977: DNA mismatch repair protein - MLH3 family [Replication, recombination and repair]. 37189 KOG1978: DNA mismatch repair protein - MLH2/PMS1/Pms2 family [Replication, recombination and repair]. 37190 KOG1979: DNA mismatch repair protein - MLH1 family [Replication, recombination and repair]. 37191 KOG1980: Uncharacterized conserved protein [Function unknown]. 37192 KOG1981: SOK1 kinase belonging to the STE20/SPS1/GC kinase family [Signal transduction mechanisms]. 37193 KOG1982: Nuclear 5'-3' exoribonuclease-interacting protein, Rai1p [Replication, recombination and repair]. 37194 KOG1983: Tomosyn and related SNARE-interacting proteins [Intracellular trafficking, secretion, and vesicular transport]. 37195 KOG1984: Vesicle coat complex COPII, subunit SFB3 [Intracellular trafficking, secretion, and vesicular transport]. 37196 KOG1985: Vesicle coat complex COPII, subunit SEC24/subunit SFB2 [Intracellular trafficking, secretion, and vesicular transport]. 37197 KOG1986: Vesicle coat complex COPII, subunit SEC23 [Intracellular trafficking, secretion, and vesicular transport]. 37198 KOG1987: Speckle-type POZ protein SPOP and related proteins with TRAF, MATH and BTB/POZ domains [Cell cycle control, cell division, chromosome partitioning, General function prediction only]. 37199 KOG1988: Uncharacterized conserved protein [Function unknown]. 37200 KOG1989: ARK protein kinase family [Signal transduction mechanisms]. 37201 KOG1990: Poly(A)-specific exoribonuclease PARN [Replication, recombination and repair]. 37202 KOG1991: Nuclear transport receptor RANBP7/RANBP8 (importin beta superfamily) [Nuclear structure, Intracellular trafficking, secretion, and vesicular transport]. 
37203 KOG1992: Nuclear export receptor CSE1/CAS (importin beta superfamily) [Nuclear structure, Intracellular trafficking, secretion, and vesicular transport]. 37204 KOG1993: Nuclear transport receptor KAP120 (importin beta superfamily) [Nuclear structure, Intracellular trafficking, secretion, and vesicular transport]. 37205 KOG1994: Predicted RNA binding protein, contains G-patch and Zn-finger domains [RNA processing and modification]. 37206 KOG1995: Conserved Zn-finger protein [General function prediction only]. 37207 KOG1996: mRNA splicing factor [RNA processing and modification]. 37208 KOG1997: PH domain-containing protein [Signal transduction mechanisms]. 37209 KOG1998: Signaling protein DOCK180 [Signal transduction mechanisms]. 37210 KOG1999: RNA polymerase II transcription elongation factor DSIF/SUPT5H/SPT5 [Transcription]. 37211 KOG2000: Gamma-tubulin complex, DGRIP91/SPC98 component [Cytoskeleton]. 37212 KOG2001: Gamma-tubulin complex, DGRIP84/SPC97 component [Cytoskeleton]. 37213 KOG2002: TPR-containing nuclear phosphoprotein that regulates K(+) uptake [Inorganic ion transport and metabolism]. 37214 KOG2003: TPR repeat-containing protein [General function prediction only]. 37215 KOG2004: Mitochondrial ATP-dependent protease PIM1/LON [Posttranslational modification, protein turnover, chaperones]. 37216 KOG2005: 26S proteasome regulatory complex, subunit RPN1/PSMD2 [Posttranslational modification, protein turnover, chaperones]. 37217 KOG2006: WD40 repeat protein [General function prediction only]. 37218 KOG2007: Cysteinyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 37219 KOG2008: BTK-associated SH3-domain binding protein SAB [Signal transduction mechanisms]. 37220 KOG2009: Transcription initiation factor TFIIIB, Bdp1 subunit [Transcription]. 37221 KOG2010: Double stranded RNA binding protein [General function prediction only]. 
37222 KOG2011: Sister chromatid cohesion complex Cohesin, subunit STAG/IRR1/SCC3 [Cell cycle control, cell division, chromosome partitioning]. 37223 KOG2012: Ubiquitin activating enzyme UBA1 [Posttranslational modification, protein turnover, chaperones]. 37224 KOG2013: SMT3/SUMO-activating complex, catalytic component UBA2 [Posttranslational modification, protein turnover, chaperones]. 37225 KOG2014: SMT3/SUMO-activating complex, AOS1/RAD31 component [Posttranslational modification, protein turnover, chaperones]. 37226 KOG2015: NEDD8-activating complex, catalytic component UBA3 [Posttranslational modification, protein turnover, chaperones]. 37227 KOG2016: NEDD8-activating complex, APP-BP1/UBA5 component [Posttranslational modification, protein turnover, chaperones]. 37228 KOG2017: Molybdopterin synthase sulfurylase [Coenzyme transport and metabolism]. 37229 KOG2018: Predicted dinucleotide-utilizing enzyme involved in molybdopterin and thiamine biosynthesis [Posttranslational modification, protein turnover, chaperones]. 37230 KOG2019: Metalloendoprotease HMP1 (insulinase superfamily) [General function prediction only, Posttranslational modification, protein turnover, chaperones]. 37231 KOG2020: Nuclear transport receptor CRM1/MSN5 (importin beta superfamily) [Nuclear structure, Intracellular trafficking, secretion, and vesicular transport]. 37232 KOG2021: Nuclear mRNA export factor receptor LOS1/Exportin-t (importin beta superfamily) [Nuclear structure, Intracellular trafficking, secretion, and vesicular transport, Translation, ribosomal structure and biogenesis]. 37233 KOG2022: Nuclear transport receptor LGL2 (importin beta superfamily) [Nuclear structure, Intracellular trafficking, secretion, and vesicular transport]. 37234 KOG2023: Nuclear transport receptor Karyopherin-beta2/Transportin (importin beta superfamily) [Nuclear structure, Intracellular trafficking, secretion, and vesicular transport]. 
37235 KOG2024: Beta-Glucuronidase GUSB (glycosylhydrolase superfamily 2) [Carbohydrate transport and metabolism]. 37236 KOG2025: Chromosome condensation complex Condensin, subunit G [Chromatin structure and dynamics, Cell cycle control, cell division, chromosome partitioning]. 37237 KOG2026: Spindle pole body protein - Sad1p [Cytoskeleton]. 37238 KOG2027: Spindle pole body protein [Cytoskeleton]. 37239 KOG2028: ATPase related to the helicase subunit of the Holliday junction resolvase [Replication, recombination and repair]. 37240 KOG2029: Uncharacterized conserved protein [Function unknown]. 37241 KOG2030: Predicted RNA-binding protein [General function prediction only]. 37242 KOG2031: Tyrosyl-DNA phosphodiesterase [Replication, recombination and repair]. 37243 KOG2032: Uncharacterized conserved protein [Function unknown]. 37244 KOG2033: Low density lipoprotein B-like protein [Lipid transport and metabolism]. 37245 KOG2034: Vacuolar sorting protein PEP3/VPS18 [Intracellular trafficking, secretion, and vesicular transport]. 37246 KOG2035: Replication factor C, subunit RFC3 [Energy production and conversion, Replication, recombination and repair]. 37247 KOG2036: Predicted P-loop ATPase fused to an acetyltransferase [General function prediction only]. 37248 KOG2037: Guanylate-binding protein [General function prediction only]. 37249 KOG2038: CCAAT-binding transcription factor/60S ribosomal subunit biogenesis protein [Translation, ribosomal structure and biogenesis, Transcription]. 37250 KOG2039: Transcriptional coactivator p100 [Transcription]. 37251 KOG2040: Glycine dehydrogenase (decarboxylating) [Amino acid transport and metabolism]. 37252 KOG2041: WD40 repeat protein [General function prediction only]. 37253 KOG2042: Ubiquitin fusion degradation protein-2 [Posttranslational modification, protein turnover, chaperones]. 
37254 KOG2043: Signaling protein SWIFT and related BRCT domain proteins [Transcription, Signal transduction mechanisms, Cell cycle control, cell division, chromosome partitioning, Replication, recombination and repair]. 37255 KOG2044: 5'-3' exonuclease HKE1/RAT1 [Replication, recombination and repair, RNA processing and modification]. 37256 KOG2045: 5'-3' exonuclease XRN1/KEM1/SEP1 involved in DNA strand exchange and mRNA turnover [Replication, recombination and repair, Cell cycle control, cell division, chromosome partitioning]. 37257 KOG2046: Calponin [Cytoskeleton]. 37258 KOG2047: mRNA splicing factor [RNA processing and modification]. 37259 KOG2048: WD40 repeat protein [General function prediction only]. 37260 KOG2049: Translational repressor MPT5/PUF4 and related RNA-binding proteins (Puf superfamily) [Translation, ribosomal structure and biogenesis]. 37261 KOG2050: Puf family RNA-binding protein [Translation, ribosomal structure and biogenesis]. 37262 KOG2051: Nonsense-mediated mRNA decay 2 protein [RNA processing and modification]. 37263 KOG2052: Activin A type IB receptor, serine/threonine protein kinase [Signal transduction mechanisms]. 37264 KOG2053: Mitochondrial inheritance and actin cytoskeleton organization protein [Cytoskeleton]. 37265 KOG2054: Nucleolar RNA-associated protein (NRAP) [Function unknown]. 37266 KOG2055: WD40 repeat protein [General function prediction only]. 37267 KOG2056: Equilibrative nucleoside transporter protein [Nucleotide transport and metabolism]. 37268 KOG2057: Predicted equilibrative nucleoside transporter protein [Nucleotide transport and metabolism]. 37269 KOG2058: Ypt/Rab GTPase activating protein [Intracellular trafficking, secretion, and vesicular transport]. 37270 KOG2059: Ras GTPase-activating protein [Signal transduction mechanisms]. 37271 KOG2060: Rab3 effector RIM1 and related proteins, contain PDZ and C2 domains [Intracellular trafficking, secretion, and vesicular transport]. 
37272 KOG2061: Uncharacterized MYND Zn-finger protein [General function prediction only]. 37273 KOG2062: 26S proteasome regulatory complex, subunit RPN2/PSMD1 [Posttranslational modification, protein turnover, chaperones]. 37274 KOG2063: Vacuolar assembly/sorting proteins VPS39/VAM6/VPS3 [Intracellular trafficking, secretion, and vesicular transport]. 37275 KOG2064: Poly(ADP-ribose) glycohydrolase [Signal transduction mechanisms]. 37276 KOG2065: Gamma-tubulin ring complex protein [Cytoskeleton]. 37277 KOG2066: Vacuolar assembly/sorting protein VPS41 [Intracellular trafficking, secretion, and vesicular transport]. 37278 KOG2067: Mitochondrial processing peptidase, alpha subunit [Posttranslational modification, protein turnover, chaperones]. 37279 KOG2068: MOT2 transcription factor [Transcription]. 37280 KOG2069: Golgi transport complex subunit [Intracellular trafficking, secretion, and vesicular transport]. 37281 KOG2070: Guanine nucleotide exchange factor [Nucleotide transport and metabolism]. 37282 KOG2071: mRNA cleavage and polyadenylation factor I/II complex, subunit Pcf11 [RNA processing and modification]. 37283 KOG2072: Translation initiation factor 3, subunit a (eIF-3a) [Translation, ribosomal structure and biogenesis]. 37284 KOG2073: SAP family cell cycle dependent phosphatase-associated protein [Cell cycle control, cell division, chromosome partitioning]. 37285 KOG2074: RNA polymerase II transcription initiation/nucleotide excision repair factor TFIIH, subunit TFB1 [Transcription, Replication, recombination and repair]. 37286 KOG2075: Topoisomerase TOP1-interacting protein BTBD1 [Function unknown]. 37287 KOG2076: RNA polymerase III transcription factor TFIIIC [Transcription]. 37288 KOG2077: JNK/SAPK-associated protein-1 [Signal transduction mechanisms]. 37289 KOG2078: tRNA modification enzyme [RNA processing and modification]. 37290 KOG2079: Vacuolar assembly/sorting protein VPS8 [Intracellular trafficking, secretion, and vesicular transport]. 
37291 KOG2080: Uncharacterized conserved protein, contains DENN and RUN domains [Signal transduction mechanisms]. 37292 KOG2081: Nuclear transport regulator [Intracellular trafficking, secretion, and vesicular transport]. 37293 KOG2082: K+/Cl- cotransporter KCC1 and related transporters [Inorganic ion transport and metabolism]. 37294 KOG2083: Na+/K+ symporter [Inorganic ion transport and metabolism]. 37295 KOG2084: Predicted histone tail methylase containing SET domain [Chromatin structure and dynamics]. 37296 KOG2085: Serine/threonine protein phosphatase 2A, regulatory subunit [Signal transduction mechanisms]. 37297 KOG2086: Protein tyrosine phosphatase SHP1/Cofactor for p97 ATPase-mediated vesicle membrane fusion [Nuclear structure]. 37298 KOG2087: Glycoprotein hormone receptor [Signal transduction mechanisms]. 37299 KOG2088: Predicted lipase/calmodulin-binding heat-shock protein [Lipid transport and metabolism, Posttranslational modification, protein turnover, chaperones, Signal transduction mechanisms]. 37300 KOG2089: Metalloendopeptidase family - saccharolysin & thimet oligopeptidase [Posttranslational modification, protein turnover, chaperones]. 37301 KOG2090: Metalloendopeptidase family - mitochondrial intermediate peptidase [Posttranslational modification, protein turnover, chaperones]. 37302 KOG2091: Predicted member of glycosyl hydrolase family 18 [Carbohydrate transport and metabolism]. 37303 KOG2092: Uncharacterized conserved protein [Function unknown]. 37304 KOG2093: Translesion DNA polymerase - REV1 deoxycytidyl transferase [Replication, recombination and repair]. 37305 KOG2094: Predicted DNA damage inducible protein [Replication, recombination and repair]. 37306 KOG2095: DNA polymerase iota/DNA damage inducible protein [Replication, recombination and repair]. 37307 KOG2096: WD40 repeat protein [General function prediction only]. 37308 KOG2097: Predicted N6-adenine methylase involved in transcription regulation [Transcription]. 
37309 KOG2098: Predicted N6-adenine RNA methylase [RNA processing and modification]. 37310 KOG2099: Glycogen phosphorylase [Carbohydrate transport and metabolism]. 37311 KOG2100: Dipeptidyl aminopeptidase [Posttranslational modification, protein turnover, chaperones]. 37312 KOG2101: Intermediate filament-like protein, sorting nexins, and related proteins containing PX (PhoX) domain(s) [Cytoskeleton, Intracellular trafficking, secretion, and vesicular transport, Cell cycle control, cell division, chromosome partitioning]. 37313 KOG2102: Exosomal 3'-5' exoribonuclease complex, subunit Rrp44/Dis3 [Translation, ribosomal structure and biogenesis]. 37314 KOG2103: Uncharacterized conserved protein [Function unknown]. 37315 KOG2104: Nuclear transport factor 2 [Intracellular trafficking, secretion, and vesicular transport]. 37316 KOG2105: Predicted metal-dependent hydrolase, contains AlaS domain [General function prediction only]. 37317 KOG2106: Uncharacterized conserved protein, contains HELP and WD40 domains [Function unknown]. 37318 KOG2107: Uncharacterized conserved protein, contains double-stranded beta-helix domain [Function unknown]. 37319 KOG2108: 3'-5' DNA helicase [Replication, recombination and repair]. 37320 KOG2109: WD40 repeat protein [General function prediction only]. 37321 KOG2110: Uncharacterized conserved protein, contains WD40 repeats [Function unknown]. 37322 KOG2111: Uncharacterized conserved protein, contains WD40 repeats [Function unknown]. 37323 KOG2112: Lysophospholipase [Lipid transport and metabolism]. 37324 KOG2113: Predicted RNA binding protein, contains KH domain [General function prediction only]. 37325 KOG2114: Vacuolar assembly/sorting protein PEP5/VPS11 [Intracellular trafficking, secretion, and vesicular transport]. 37326 KOG2115: Vacuolar sorting protein VPS45 [Intracellular trafficking, secretion, and vesicular transport]. 
37327 KOG2116: Protein involved in plasmid maintenance/nuclear protein involved in lipid metabolism [Cell motility, Lipid transport and metabolism]. 37328 KOG2117: Uncharacterized conserved protein [Function unknown]. 37329 KOG2118: Predicted membrane protein, contains two CBS domains [Function unknown]. 37330 KOG2119: Predicted bile acid beta-glucosidase [Carbohydrate transport and metabolism]. 37331 KOG2120: SCF ubiquitin ligase, Skp2 component [Posttranslational modification, protein turnover, chaperones]. 37332 KOG2121: Predicted metal-dependent hydrolase (beta-lactamase superfamily) [General function prediction only]. 37333 KOG2122: Beta-catenin-binding protein APC, contains ARM repeats [Signal transduction mechanisms, Cytoskeleton]. 37334 KOG2123: Uncharacterized conserved protein [Function unknown]. 37335 KOG2124: Glycosylphosphatidylinositol anchor synthesis protein [Signal transduction mechanisms]. 37336 KOG2125: Glycosylphosphatidylinositol anchor synthesis protein [Signal transduction mechanisms]. 37337 KOG2126: Glycosylphosphatidylinositol anchor synthesis protein [Signal transduction mechanisms]. 37338 KOG2127: Calmodulin-binding protein CRAG, contains DENN domain [Signal transduction mechanisms]. 37339 KOG2128: Ras GTPase-activating protein family - IQGAP [Signal transduction mechanisms]. 37340 KOG2129: Uncharacterized conserved protein H4 [Function unknown]. 37341 KOG2130: Phosphatidylserine-specific receptor PtdSerR, contains JmjC domain [Chromatin structure and dynamics, Signal transduction mechanisms]. 37342 KOG2131: Uncharacterized conserved protein, contains JmjC domain [Chromatin structure and dynamics, Signal transduction mechanisms]. 37343 KOG2132: Uncharacterized conserved protein, contains JmjC domain [Chromatin structure and dynamics, Signal transduction mechanisms]. 37344 KOG2133: Transcriptional corepressor Atrophin-1/DRPLA [General function prediction only]. 
37345 KOG2134: Polynucleotide kinase 3' phosphatase [Replication, recombination and repair]. 37346 KOG2135: Proteins containing the RNA recognition motif [General function prediction only]. 37347 KOG2136: Transcriptional regulators binding to the GC-rich sequences [Transcription]. 37348 KOG2137: Protein kinase [Signal transduction mechanisms]. 37349 KOG2138: Predicted RNA binding protein, contains G-patch domain [RNA processing and modification]. 37350 KOG2139: WD40 repeat protein [General function prediction only]. 37351 KOG2140: Uncharacterized conserved protein [General function prediction only]. 37352 KOG2141: Protein involved in high osmolarity signaling pathway [Signal transduction mechanisms]. 37353 KOG2142: Molybdenum cofactor sulfurase [Coenzyme transport and metabolism]. 37354 KOG2143: Uncharacterized conserved protein [Function unknown]. 37355 KOG2144: Tyrosyl-tRNA synthetase, cytoplasmic [Translation, ribosomal structure and biogenesis]. 37356 KOG2145: Cytoplasmic tryptophanyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 37357 KOG2146: Splicing coactivator SRm160/300, subunit SRm160 (contains PWI domain) [RNA processing and modification, General function prediction only]. 37358 KOG2147: Nucleolar protein involved in 40S ribosome biogenesis [Translation, ribosomal structure and biogenesis]. 37359 KOG2148: Exocyst protein Sec3 [Intracellular trafficking, secretion, and vesicular transport]. 37360 KOG2149: Uncharacterized conserved protein [Function unknown]. 37361 KOG2150: CCR4-NOT transcriptional regulation complex, NOT5 subunit [Transcription]. 37362 KOG2151: Predicted transcriptional regulator [Transcription, Cell cycle control, cell division, chromosome partitioning, General function prediction only]. 37363 KOG2152: Sister chromatid cohesion protein [Cell cycle control, cell division, chromosome partitioning]. 
37364 KOG2153: Protein involved in the nuclear export of pre-ribosomes [Translation, ribosomal structure and biogenesis, Intracellular trafficking, secretion, and vesicular transport]. 37365 KOG2154: Predicted nucleolar protein involved in ribosome biogenesis [Translation, ribosomal structure and biogenesis]. 37366 KOG2155: Tubulin-tyrosine ligase-related protein [Posttranslational modification, protein turnover, chaperones]. 37367 KOG2156: Tubulin-tyrosine ligase-related protein [Posttranslational modification, protein turnover, chaperones]. 37368 KOG2157: Predicted tubulin-tyrosine ligase [Posttranslational modification, protein turnover, chaperones]. 37369 KOG2158: Tubulin-tyrosine ligase-related protein [Posttranslational modification, protein turnover, chaperones]. 37370 KOG2159: tRNA nucleotidyltransferase/poly(A) polymerase [Translation, ribosomal structure and biogenesis]. 37371 KOG2160: Armadillo/beta-catenin-like repeat-containing protein [Posttranslational modification, protein turnover, chaperones]. 37372 KOG2161: Glucosidase I [Carbohydrate transport and metabolism]. 37373 KOG2162: Nonsense-mediated mRNA decay protein [RNA processing and modification]. 37374 KOG2163: Centromere/kinetochore protein zw10 involved in mitotic chromosome segregation [Cell cycle control, cell division, chromosome partitioning]. 37375 KOG2164: Predicted E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 37376 KOG2165: Anaphase-promoting complex (APC), subunit 2 [Cell cycle control, cell division, chromosome partitioning, Posttranslational modification, protein turnover, chaperones]. 37377 KOG2166: Cullins [Cell cycle control, cell division, chromosome partitioning]. 37378 KOG2167: Cullins [Cell cycle control, cell division, chromosome partitioning]. 37379 KOG2168: Cullins [Cell cycle control, cell division, chromosome partitioning]. 37380 KOG2169: Zn-finger transcription factor [Transcription]. 
37381 KOG2170: ATPase of the AAA+ superfamily [General function prediction only]. 37382 KOG2171: Karyopherin (importin) beta 3 [Nuclear structure, Intracellular trafficking, secretion, and vesicular transport]. 37383 KOG2172: Uncharacterized conserved protein [Function unknown]. 37384 KOG2173: Integral membrane protein [General function prediction only]. 37385 KOG2174: Leptin receptor gene-related protein [Signal transduction mechanisms]. 37386 KOG2175: Protein predicted to be involved in carbohydrate metabolism [Carbohydrate transport and metabolism]. 37387 KOG2176: Exocyst complex, subunit SEC15 [Intracellular trafficking, secretion, and vesicular transport]. 37388 KOG2177: Predicted E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 37389 KOG2178: Predicted sugar kinase [Carbohydrate transport and metabolism]. 37390 KOG2179: Nucleotide excision repair complex XPC-HR23B, subunit XPC/DPB11 [Replication, recombination and repair]. 37391 KOG2180: Late Golgi protein sorting complex, subunit Vps53 [Intracellular trafficking, secretion, and vesicular transport]. 37392 KOG2181: LIM domain binding protein LDB1/NLI/CLIM [Transcription]. 37393 KOG2182: Hydrolytic enzymes of the alpha/beta hydrolase fold [Posttranslational modification, protein turnover, chaperones, General function prediction only]. 37394 KOG2183: Prolylcarboxypeptidase (angiotensinase C) [Posttranslational modification, protein turnover, chaperones, General function prediction only]. 37395 KOG2184: Tuftelin-interacting protein TIP39, contains G-patch domain [RNA processing and modification]. 37396 KOG2185: Predicted RNA-processing protein, contains G-patch domain [RNA processing and modification]. 37397 KOG2186: Cell growth-regulating nucleolar protein [Cell cycle control, cell division, chromosome partitioning]. 37398 KOG2187: tRNA uracil-5-methyltransferase and related tRNA-modifying enzymes [Translation, ribosomal structure and biogenesis]. 
37399 KOG2188: Predicted RNA-binding protein, contains Pumilio domains [Translation, ribosomal structure and biogenesis]. 37400 KOG2189: Vacuolar H+-ATPase V0 sector, subunit a [Energy production and conversion]. 37401 KOG2190: PolyC-binding proteins alphaCP-1 and related KH domain proteins [RNA processing and modification, General function prediction only]. 37402 KOG2191: RNA-binding protein NOVA1/PASILLA and related KH domain proteins [RNA processing and modification, General function prediction only]. 37403 KOG2192: PolyC-binding hnRNP-K protein HRB57A/hnRNP, contains KH domain [RNA processing and modification, General function prediction only]. 37404 KOG2193: IGF-II mRNA-binding protein IMP, contains RRM and KH domains [RNA processing and modification, General function prediction only]. 37405 KOG2194: Aminopeptidases of the M20 family [Posttranslational modification, protein turnover, chaperones, General function prediction only]. 37406 KOG2195: Transferrin receptor and related proteins containing the protease-associated (PA) domain [Posttranslational modification, protein turnover, chaperones, Inorganic ion transport and metabolism, General function prediction only]. 37407 KOG2196: Nuclear porin [Nuclear structure]. 37408 KOG2197: Ypt/Rab-specific GTPase-activating protein GYP7 and related proteins [Signal transduction mechanisms]. 37409 KOG2198: tRNA cytosine-5-methylases and related enzymes of the NOL1/NOP2/sun superfamily [Translation, ribosomal structure and biogenesis]. 37410 KOG2199: Signal transducing adaptor protein STAM/STAM2 [Signal transduction mechanisms]. 37411 KOG2200: Tumour suppressor protein p122-RhoGAP/DLC1 [Signal transduction mechanisms]. 37412 KOG2201: Pantothenate kinase PanK and related proteins [Coenzyme transport and metabolism]. 37413 KOG2202: U2 snRNP splicing factor, small subunit, and related proteins [RNA processing and modification]. 37414 KOG2203: GTP-binding protein [General function prediction only]. 
37415 KOG2204: Mannosyl-oligosaccharide alpha-1,2-mannosidase and related glycosyl hydrolases [Carbohydrate transport and metabolism]. 37416 KOG2205: Uncharacterized conserved protein [Function unknown]. 37417 KOG2206: Exosome 3'-5' exoribonuclease complex, subunit PM/SCL-100 (Rrp6) [Translation, ribosomal structure and biogenesis]. 37418 KOG2207: Predicted 3'-5' exonuclease [Replication, recombination and repair]. 37419 KOG2208: Vigilin [Lipid transport and metabolism]. 37420 KOG2209: Oxysterol-binding protein [Signal transduction mechanisms]. 37421 KOG2210: Oxysterol-binding protein [Signal transduction mechanisms]. 37422 KOG2211: Predicted Golgi transport complex 1 protein [Intracellular trafficking, secretion, and vesicular transport]. 37423 KOG2212: Alpha-amylase [Carbohydrate transport and metabolism]. 37424 KOG2213: Apoptosis inhibitor 5/fibroblast growth factor 2-interacting factor 2, and related proteins [Signal transduction mechanisms]. 37425 KOG2214: Predicted esterase of the alpha-beta hydrolase superfamily [General function prediction only]. 37426 KOG2215: Exocyst complex subunit [Intracellular trafficking, secretion, and vesicular transport]. 37427 KOG2216: Conserved coiled/coiled coil protein [Function unknown]. 37428 KOG2217: U4/U6.U5 snRNP associated protein [RNA processing and modification]. 37429 KOG2218: ER to Golgi transport protein/RAD50-interacting protein 1 [Intracellular trafficking, secretion, and vesicular transport, Cell cycle control, cell division, chromosome partitioning]. 37430 KOG2219: Uncharacterized conserved protein [Function unknown]. 37431 KOG2220: Predicted signal transduction protein [General function prediction only]. 37432 KOG2221: PDZ-domain interacting protein EPI64, contains TBC domain [Intracellular trafficking, secretion, and vesicular transport]. 37433 KOG2222: Uncharacterized conserved protein, contains TBC, SH3 and RUN domains [Signal transduction mechanisms, General function prediction only]. 
37434 KOG2223: Uncharacterized conserved protein, contains TBC domain [Signal transduction mechanisms, General function prediction only]. 37435 KOG2224: Uncharacterized conserved protein, contains TBC domain [Signal transduction mechanisms, General function prediction only]. 37436 KOG2225: Proteins containing regions of low-complexity [General function prediction only]. 37437 KOG2226: Proteins containing regions of low-complexity [General function prediction only]. 37438 KOG2227: Pre-initiation complex, subunit CDC6, AAA+ superfamily ATPase [Replication, recombination and repair, Cell cycle control, cell division, chromosome partitioning]. 37439 KOG2228: Origin recognition complex, subunit 4 [Replication, recombination and repair]. 37440 KOG2229: Protein required for actin cytoskeleton organization and cell cycle progression [Cell cycle control, cell division, chromosome partitioning, Cytoskeleton]. 37441 KOG2230: Predicted beta-mannosidase [Carbohydrate transport and metabolism]. 37442 KOG2231: Predicted E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 37443 KOG2232: Ceramidases [Signal transduction mechanisms]. 37444 KOG2233: Alpha-N-acetylglucosaminidase [Intracellular trafficking, secretion, and vesicular transport]. 37445 KOG2234: Predicted UDP-galactose transporter [Carbohydrate transport and metabolism]. 37446 KOG2235: Uncharacterized conserved protein [Function unknown]. 37447 KOG2236: Uncharacterized conserved protein [Function unknown]. 37448 KOG2237: Predicted serine protease [Posttranslational modification, protein turnover, chaperones]. 37449 KOG2238: Uncharacterized conserved protein TEX2, contains PH domain [General function prediction only]. 37450 KOG2239: Transcription factor containing NAC and TS-N domains [Transcription]. 37451 KOG2240: RNA polymerase II general transcription factor BTF3 and related proteins [Transcription]. 37452 KOG2241: tRNA-binding protein [Translation, ribosomal structure and biogenesis]. 
37453 KOG2242: Scaffold/matrix specific factor hnRNP-U/SAF-A, contains SPRY domain [RNA processing and modification]. 37454 KOG2243: Ca2+ release channel (ryanodine receptor) [Signal transduction mechanisms]. 37455 KOG2244: Highly conserved protein containing a thioredoxin domain [General function prediction only]. 37456 KOG2245: Poly(A) polymerase and related nucleotidyltransferases [RNA processing and modification]. 37457 KOG2246: Galactosyltransferases [Carbohydrate transport and metabolism]. 37458 KOG2247: WD40 repeat-containing protein [General function prediction only]. 37459 KOG2248: 3'-5' exonuclease [Replication, recombination and repair]. 37460 KOG2249: 3'-5' exonuclease [Replication, recombination and repair]. 37461 KOG2250: Glutamate/leucine/phenylalanine/valine dehydrogenases [Amino acid transport and metabolism]. 37462 KOG2251: Homeobox transcription factor [Transcription]. 37463 KOG2252: CCAAT displacement protein and related homeoproteins [Transcription]. 37464 KOG2253: U1 snRNP complex, subunit SNU71 and related PWI-motif proteins [RNA processing and modification]. 37465 KOG2254: Predicted endo-1,3-beta-glucanase [Carbohydrate transport and metabolism]. 37466 KOG2255: Peptidyl-tRNA hydrolase [Translation, ribosomal structure and biogenesis]. 37467 KOG2256: Predicted protein involved in nuclear export of pre-ribosomes [Translation, ribosomal structure and biogenesis]. 37468 KOG2257: N-acetylglucosaminyltransferase complex, subunit PIG-P, required for phosphatidylinositol biosynthesis [Function unknown]. 37469 KOG2258: Glycerophosphoryl diester phosphodiesterase [Energy production and conversion]. 37470 KOG2259: Uncharacterized conserved protein [Function unknown]. 37471 KOG2260: Cell division cycle 37 protein, CDC37 [Cell cycle control, cell division, chromosome partitioning]. 37472 KOG2261: Polycomb enhancer protein, EPC [Transcription]. 37473 KOG2262: Sexual differentiation process protein ISP4 [Signal transduction mechanisms]. 
37474 KOG2263: Methionine synthase II (cobalamin-independent) [Amino acid transport and metabolism]. 37475 KOG2264: Exostosin EXT1L [Signal transduction mechanisms]. 37476 KOG2265: Nuclear distribution protein NUDC [Signal transduction mechanisms]. 37477 KOG2266: Chromatin-associated protein Dek and related proteins, contains SAP DNA binding domain [Chromatin structure and dynamics]. 37478 KOG2267: Eukaryotic-type DNA primase, large subunit [Replication, recombination and repair]. 37479 KOG2268: Serine/threonine protein kinase [Signal transduction mechanisms, General function prediction only]. 37480 KOG2269: Serine/threonine protein kinase [Signal transduction mechanisms]. 37481 KOG2270: Serine/threonine protein kinase involved in cell cycle control [Signal transduction mechanisms, Cell cycle control, cell division, chromosome partitioning]. 37482 KOG2271: Nuclear pore complex component (sc Nup85) [Nuclear structure, Intracellular trafficking, secretion, and vesicular transport]. 37483 KOG2272: Focal adhesion protein PINCH-1, contains LIM domains [Signal transduction mechanisms, Cytoskeleton]. 37484 KOG2273: Membrane coat complex Retromer, subunit VPS5/SNX1, Sorting nexins, and related PX domain-containing proteins [Intracellular trafficking, secretion, and vesicular transport]. 37485 KOG2274: Predicted importin 9 [Intracellular trafficking, secretion, and vesicular transport, Nuclear structure]. 37486 KOG2275: Aminoacylase ACY1 and related metalloexopeptidases [Amino acid transport and metabolism]. 37487 KOG2276: Metalloexopeptidases [Amino acid transport and metabolism]. 37488 KOG2277: S-M checkpoint control protein CID1 and related nucleotidyltransferases [Cell cycle control, cell division, chromosome partitioning]. 37489 KOG2278: RNA:NAD 2'-phosphotransferase TPT1 [Translation, ribosomal structure and biogenesis]. 37490 KOG2279: Kinase anchor protein AKAP149, contains KH and Tudor RNA-binding domains [Signal transduction mechanisms]. 
37491 KOG2280: Vacuolar assembly/sorting protein VPS16 [Intracellular trafficking, secretion, and vesicular transport]. 37492 KOG2281: Dipeptidyl aminopeptidases/acylaminoacyl-peptidases [Posttranslational modification, protein turnover, chaperones]. 37493 KOG2282: NADH-ubiquinone oxidoreductase, NDUFS1/75 kDa subunit [Energy production and conversion]. 37494 KOG2283: Clathrin coat dissociation kinase GAK/PTEN/Auxilin and related tyrosine phosphatases [Signal transduction mechanisms, General function prediction only]. 37495 KOG2284: E3 ubiquitin ligase, Cullin 2 component [Posttranslational modification, protein turnover, chaperones]. 37496 KOG2285: E3 ubiquitin ligase, Cullin 1 component [Posttranslational modification, protein turnover, chaperones]. 37497 KOG2286: Exocyst complex subunit SEC6 [Intracellular trafficking, secretion, and vesicular transport]. 37498 KOG2287: Galactosyltransferases [Carbohydrate transport and metabolism]. 37499 KOG2288: Galactosyltransferases [Carbohydrate transport and metabolism]. 37500 KOG2289: Rhomboid family proteins [Signal transduction mechanisms]. 37501 KOG2290: Rhomboid family proteins [Signal transduction mechanisms]. 37502 KOG2291: Oligosaccharyltransferase, alpha subunit (ribophorin I) [Posttranslational modification, protein turnover, chaperones]. 37503 KOG2292: Oligosaccharyltransferase, STT3 subunit [Posttranslational modification, protein turnover, chaperones]. 37504 KOG2293: Daxx-interacting protein MSP58/p78, contains FHA domain [Transcription, Signal transduction mechanisms]. 37505 KOG2294: Transcription factor of the Forkhead/HNF3 family [Transcription]. 37506 KOG2295: C2H2 Zn-finger protein [General function prediction only]. 37507 KOG2296: Integral membrane protein [General function prediction only]. 37508 KOG2297: Predicted translation factor, contains W2 domain [Translation, ribosomal structure and biogenesis]. 
37509 KOG2298: Glycyl-tRNA synthetase and related class II tRNA synthetase [Translation, ribosomal structure and biogenesis]. 37510 KOG2299: Ribonuclease HI [Replication, recombination and repair]. 37511 KOG2300: Uncharacterized conserved protein [Function unknown]. 37512 KOG2301: Voltage-gated Ca2+ channels, alpha1 subunits [Inorganic ion transport and metabolism, Signal transduction mechanisms]. 37513 KOG2302: T-type voltage-gated Ca2+ channel, pore-forming alpha1I subunit [Inorganic ion transport and metabolism, Signal transduction mechanisms]. 37514 KOG2303: Predicted NAD synthase, contains CN hydrolase domain [Coenzyme transport and metabolism, General function prediction only]. 37515 KOG2304: 3-hydroxyacyl-CoA dehydrogenase [Lipid transport and metabolism]. 37516 KOG2305: 3-hydroxyacyl-CoA dehydrogenase [Lipid transport and metabolism]. 37517 KOG2306: Uncharacterized conserved protein [Function unknown]. 37518 KOG2307: Low density lipoprotein receptor [Intracellular trafficking, secretion, and vesicular transport]. 37519 KOG2308: Phosphatidic acid-preferring phospholipase A1, contains DDHD domain [Lipid transport and metabolism, Intracellular trafficking, secretion, and vesicular transport]. 37520 KOG2309: 60s ribosomal protein L2/L8 [Translation, ribosomal structure and biogenesis]. 37521 KOG2310: DNA repair exonuclease MRE11 [Replication, recombination and repair]. 37522 KOG2311: NAD/FAD-utilizing protein possibly involved in translation [Translation, ribosomal structure and biogenesis]. 37523 KOG2312: Predicted transcriptional regulator, contains ARID domain [Transcription]. 37524 KOG2313: Stress-induced protein UVI31+ [Signal transduction mechanisms]. 37525 KOG2314: Translation initiation factor 3, subunit b (eIF-3b) [Translation, ribosomal structure and biogenesis]. 37526 KOG2315: Predicted translation initiation factor related to eIF-3a [Translation, ribosomal structure and biogenesis]. 
37527 KOG2316: Predicted ATPase (PP-loop superfamily) [General function prediction only]. 37528 KOG2317: Putative translation initiation inhibitor UK114/IBM1 [Translation, ribosomal structure and biogenesis]. 37529 KOG2318: Uncharacterized conserved protein [Function unknown]. 37530 KOG2319: Vacuolar assembly/sorting protein VPS9 [Intracellular trafficking, secretion, and vesicular transport]. 37531 KOG2320: RAS effector RIN1 (contains VPS domain) [Intracellular trafficking, secretion, and vesicular transport]. 37532 KOG2321: WD40 repeat protein [General function prediction only]. 37533 KOG2322: N-methyl-D-aspartate receptor glutamate-binding subunit [Signal transduction mechanisms]. 37534 KOG2323: Pyruvate kinase [Carbohydrate transport and metabolism]. 37535 KOG2324: Prolyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 37536 KOG2325: Predicted transporter/transmembrane protein [General function prediction only]. 37537 KOG2326: DNA-binding subunit of a DNA-dependent protein kinase (Ku80 autoantigen) [Replication, recombination and repair]. 37538 KOG2327: DNA-binding subunit of a DNA-dependent protein kinase (Ku70 autoantigen) [Replication, recombination and repair]. 37539 KOG2328: Chromosome condensation complex Condensin, subunit H [Chromatin structure and dynamics, Cell cycle control, cell division, chromosome partitioning]. 37540 KOG2329: Alkaline ceramidase [Lipid transport and metabolism]. 37541 KOG2330: Splicing factor 3b, subunit 2 [RNA processing and modification]. 37542 KOG2331: Predicted glycosylhydrolase [General function prediction only]. 37543 KOG2332: Ferritin [Inorganic ion transport and metabolism]. 37544 KOG2333: Uncharacterized conserved protein [General function prediction only]. 37545 KOG2334: tRNA-dihydrouridine synthase [Translation, ribosomal structure and biogenesis]. 37546 KOG2335: tRNA-dihydrouridine synthase [Translation, ribosomal structure and biogenesis]. 
37547 KOG2336: Molybdopterin biosynthesis-related protein [Coenzyme transport and metabolism]. 37548 KOG2337: Ubiquitin activating E1 enzyme-like protein [Coenzyme transport and metabolism]. 37549 KOG2338: Transcriptional effector CCR4-related protein [Transcription]. 37550 KOG2339: Uncharacterized conserved protein [Function unknown]. 37551 KOG2340: Uncharacterized conserved protein [Function unknown]. 37552 KOG2341: TATA box binding protein (TBP)-associated factor, RNA polymerase II [Transcription]. 37553 KOG2342: Uncharacterized conserved protein [Function unknown]. 37554 KOG2343: Glucose-repressible protein and related proteins [General function prediction only]. 37555 KOG2344: Exocyst component protein and related proteins [Intracellular trafficking, secretion, and vesicular transport]. 37556 KOG2345: Serine/threonine protein kinase/TGF-beta stimulated factor [Transcription, Lipid transport and metabolism, Signal transduction mechanisms]. 37557 KOG2346: Uncharacterized conserved protein [Function unknown]. 37558 KOG2347: Sec5 subunit of exocyst complex [Intracellular trafficking, secretion, and vesicular transport]. 37559 KOG2348: Urea transporter [Amino acid transport and metabolism]. 37560 KOG2349: Na+:iodide/myo-inositol/multivitamin symporters [Inorganic ion transport and metabolism]. 37561 KOG2350: Zn-finger protein joined to JAZF1 (predicted suppressor) [General function prediction only]. 37562 KOG2351: RNA polymerase II, fourth largest subunit [Transcription]. 37563 KOG2352: Predicted spermine/spermidine synthase [Amino acid transport and metabolism]. 37564 KOG2353: L-type voltage-dependent Ca2+ channel, alpha2/delta subunit [Inorganic ion transport and metabolism, Signal transduction mechanisms]. 37565 KOG2354: RNA Polymerase C (III) 37 kDa subunit [Transcription]. 37566 KOG2355: Predicted ABC-type transport, ATPase component/CCR4 associated factor [General function prediction only, Transcription]. 
37567 KOG2356: Transcriptional activator, adenine-specific DNA methyltransferase [Transcription, Signal transduction mechanisms]. 37568 KOG2357: Uncharacterized conserved protein [Function unknown]. 37569 KOG2358: NifU-like domain-containing proteins [Posttranslational modification, protein turnover, chaperones]. 37570 KOG2359: Uncharacterized conserved protein [Function unknown]. 37571 KOG2360: Proliferation-associated nucleolar protein (NOL1) [Cell cycle control, cell division, chromosome partitioning]. 37572 KOG2361: Predicted methyltransferase [General function prediction only]. 37573 KOG2362: Uncharacterized Fe-S protein [General function prediction only]. 37574 KOG2363: Protein subunit of nuclear ribonuclease P (RNase P) [Translation, ribosomal structure and biogenesis]. 37575 KOG2364: Predicted pseudouridylate synthase [Translation, ribosomal structure and biogenesis]. 37576 KOG2365: Uncharacterized membrane protein [Function unknown]. 37577 KOG2366: Alpha-D-galactosidase (melibiase) [Carbohydrate transport and metabolism]. 37578 KOG2367: Alpha-isopropylmalate synthase/homocitrate synthase [Amino acid transport and metabolism]. 37579 KOG2368: Hydroxymethylglutaryl-CoA lyase [Energy production and conversion, Amino acid transport and metabolism]. 37580 KOG2369: Lecithin:cholesterol acyltransferase (LCAT)/Acyl-ceramide synthase [Lipid transport and metabolism]. 37581 KOG2370: Cactin [Signal transduction mechanisms]. 37582 KOG2371: Molybdopterin biosynthesis protein [Coenzyme transport and metabolism]. 37583 KOG2372: Oxidation resistance protein [Replication, recombination and repair]. 37584 KOG2373: Predicted mitochondrial DNA helicase twinkle [Replication, recombination and repair]. 37585 KOG2374: Uncharacterized conserved protein [Function unknown]. 37586 KOG2375: Protein interacting with poly(A)-binding protein [RNA processing and modification]. 
37587 KOG2376: Signal recognition particle, subunit Srp72 [Intracellular trafficking, secretion, and vesicular transport]. 37588 KOG2377: Uncharacterized conserved protein [Function unknown]. 37589 KOG2378: cAMP-regulated guanine nucleotide exchange factor [Signal transduction mechanisms]. 37590 KOG2379: Endonuclease MUS81 [Replication, recombination and repair]. 37591 KOG2380: Prephenate dehydrogenase (NADP+) [Amino acid transport and metabolism]. 37592 KOG2381: Phosphatidylinositol 4-kinase [Signal transduction mechanisms]. 37593 KOG2382: Predicted alpha/beta hydrolase [General function prediction only]. 37594 KOG2383: Predicted ATPase [General function prediction only]. 37595 KOG2384: Major histocompatibility complex protein BAT4, contains G-patch and ankyrin domains [General function prediction only]. 37596 KOG2385: Uncharacterized conserved protein [Function unknown]. 37597 KOG2386: mRNA capping enzyme, guanylyltransferase (alpha) subunit [RNA processing and modification]. 37598 KOG2387: CTP synthase (UTP-ammonia lyase) [Nucleotide transport and metabolism]. 37599 KOG2388: UDP-N-acetylglucosamine pyrophosphorylase [Cell wall/membrane/envelope biogenesis]. 37600 KOG2389: Predicted bromodomain transcription factor [Transcription]. 37601 KOG2390: Uncharacterized conserved protein [Function unknown]. 37602 KOG2391: Vacuolar sorting protein/ubiquitin receptor VPS23 [Posttranslational modification, protein turnover, chaperones, Intracellular trafficking, secretion, and vesicular transport]. 37603 KOG2392: Serpin [Defense mechanisms]. 37604 KOG2393: Transcription initiation factor IIF, large subunit (RAP74) [Transcription]. 37605 KOG2394: WD40 protein DMR-N9 [General function prediction only]. 37606 KOG2395: Protein involved in vacuole import and degradation [Intracellular trafficking, secretion, and vesicular transport]. 37607 KOG2396: HAT (Half-A-TPR) repeat-containing protein [General function prediction only]. 
37608 KOG2397: Protein kinase C substrate, 80 KD protein, heavy chain [Signal transduction mechanisms]. 37609 KOG2398: Predicted proline-serine-threonine phosphatase-interacting protein (PSTPIP) [Cell cycle control, cell division, chromosome partitioning]. 37610 KOG2399: K+-dependent Na+:Ca2+ antiporter [Inorganic ion transport and metabolism]. 37611 KOG2400: Nuclear protein ZAP [Defense mechanisms]. 37612 KOG2401: Predicted MutS-related protein involved in mismatch repair [Replication, recombination and repair]. 37613 KOG2402: Paf1/RNA polymerase II complex, RTF1 component (involved in regulation of TATA box-binding protein) [Transcription]. 37614 KOG2403: Succinate dehydrogenase, flavoprotein subunit [Energy production and conversion]. 37615 KOG2404: Fumarate reductase, flavoprotein subunit [Energy production and conversion]. 37616 KOG2405: Predicted 3'-5' exonuclease [Replication, recombination and repair]. 37617 KOG2406: MADS box transcription factor [Transcription]. 37618 KOG2407: GPI transamidase complex, GPI16/PIG-T component, involved in glycosylphosphatidylinositol anchor biosynthesis [Cell wall/membrane/envelope biogenesis, Posttranslational modification, protein turnover, chaperones]. 37619 KOG2408: Peroxidase/oxygenase [General function prediction only]. 37620 KOG2409: KRR1-interacting protein involved in 40S ribosome biogenesis [Translation, ribosomal structure and biogenesis]. 37621 KOG2410: Gamma-glutamyltransferase [Amino acid transport and metabolism]. 37622 KOG2411: Aspartyl-tRNA synthetase, mitochondrial [Translation, ribosomal structure and biogenesis]. 37623 KOG2412: Nuclear-export-signal (NES)-containing protein/polyadenylated-RNA export factor [RNA processing and modification]. 37624 KOG2413: Xaa-Pro aminopeptidase [Amino acid transport and metabolism]. 37625 KOG2414: Putative Xaa-Pro aminopeptidase [Amino acid transport and metabolism]. 
37626 KOG2415: Electron transfer flavoprotein ubiquinone oxidoreductase [Energy production and conversion]. 37627 KOG2416: Acinus (induces apoptotic chromatin condensation) [Chromatin structure and dynamics]. 37628 KOG2417: Predicted G-protein coupled receptor [Signal transduction mechanisms]. 37629 KOG2418: Microtubule-associated protein TAU [Cytoskeleton]. 37630 KOG2419: Phosphatidylserine decarboxylase [Lipid transport and metabolism]. 37631 KOG2420: Phosphatidylserine decarboxylase [Lipid transport and metabolism]. 37632 KOG2421: Predicted starch-binding protein [General function prediction only]. 37633 KOG2422: Uncharacterized conserved protein [Function unknown]. 37634 KOG2423: Nucleolar GTPase [General function prediction only]. 37635 KOG2424: Protein involved in transcription start site selection [Transcription]. 37636 KOG2425: Nuclear protein involved in cell morphogenesis and cell surface growth [General function prediction only]. 37637 KOG2426: Dihydroxyacetone kinase/glycerone kinase [Carbohydrate transport and metabolism]. 37638 KOG2427: Uncharacterized conserved protein [Function unknown]. 37639 KOG2428: Uncharacterized conserved protein [Function unknown]. 37640 KOG2429: Glycosyl hydrolase, family 47 [Carbohydrate transport and metabolism]. 37641 KOG2430: Glycosyl hydrolase, family 47 [Carbohydrate transport and metabolism]. 37642 KOG2431: 1,2-alpha-mannosidase [Carbohydrate transport and metabolism]. 37643 KOG2432: Uncharacterized conserved protein [Function unknown]. 37644 KOG2433: Uncharacterized conserved protein [Function unknown]. 37645 KOG2434: RNA polymerase I transcription factor [Transcription]. 37646 KOG2435: Uncharacterized conserved protein [Function unknown]. 37647 KOG2436: Acetylglutamate kinase/acetylglutamate synthase [Amino acid transport and metabolism]. 37648 KOG2437: Muskelin [Signal transduction mechanisms]. 37649 KOG2438: Glutamyl-tRNA amidotransferase subunit B [Translation, ribosomal structure and biogenesis].
37650 KOG2439: Nuclear architecture related protein [Nuclear structure]. 37651 KOG2440: Pyrophosphate-dependent phosphofructo-1-kinase [Carbohydrate transport and metabolism]. 37652 KOG2441: mRNA splicing factor/probable chromatin binding snw family nuclear protein [RNA processing and modification, Chromatin structure and dynamics]. 37653 KOG2442: Uncharacterized conserved protein, contains PA domain [General function prediction only]. 37654 KOG2443: Uncharacterized conserved protein [Function unknown]. 37655 KOG2444: WD40 repeat protein [General function prediction only]. 37656 KOG2445: Nuclear pore complex component (sc Seh1) [Nuclear structure, Intracellular trafficking, secretion, and vesicular transport]. 37657 KOG2446: Glucose-6-phosphate isomerase [Carbohydrate transport and metabolism]. 37658 KOG2447: Oligosaccharyltransferase, delta subunit (ribophorin II) [Posttranslational modification, protein turnover, chaperones]. 37659 KOG2448: Dihydroxy-acid dehydratase [Amino acid transport and metabolism]. 37660 KOG2449: Methylmalonate semialdehyde dehydrogenase [Amino acid transport and metabolism, Carbohydrate transport and metabolism]. 37661 KOG2450: Aldehyde dehydrogenase [Energy production and conversion]. 37662 KOG2451: Aldehyde dehydrogenase [Energy production and conversion]. 37663 KOG2452: Formyltetrahydrofolate dehydrogenase [Nucleotide transport and metabolism]. 37664 KOG2453: Aldehyde dehydrogenase [Energy production and conversion]. 37665 KOG2454: Betaine aldehyde dehydrogenase [Energy production and conversion]. 37666 KOG2455: Delta-1-pyrroline-5-carboxylate dehydrogenase [Amino acid transport and metabolism]. 37667 KOG2456: Aldehyde dehydrogenase [Energy production and conversion]. 37668 KOG2457: A/G-specific adenine DNA glycosylase [Replication, recombination and repair]. 37669 KOG2458: Endoplasmic reticulum protein EP58, contains filamin rod domain and KDEL motif [General function prediction only]. 
37670 KOG2459: GPI transamidase complex, GPI17/PIG-S component, involved in glycosylphosphatidylinositol anchor biosynthesis [Cell wall/membrane/envelope biogenesis, Posttranslational modification, protein turnover, chaperones]. 37671 KOG2460: Signal recognition particle, subunit Srp68 [Intracellular trafficking, secretion, and vesicular transport]. 37672 KOG2461: Transcription factor BLIMP-1/PRDI-BF1, contains C2H2-type Zn-finger and SET domains [Transcription]. 37673 KOG2462: C2H2-type Zn-finger protein [Transcription]. 37674 KOG2463: Predicted RNA-binding protein Nob1p involved in 26S proteasome assembly [Posttranslational modification, protein turnover, chaperones]. 37675 KOG2464: Serine/threonine kinase (haspin family) [Cell cycle control, cell division, chromosome partitioning]. 37676 KOG2465: Uncharacterized conserved protein [Function unknown]. 37677 KOG2466: Uridine permease/thiamine transporter/allantoin transport [Nucleotide transport and metabolism, Coenzyme transport and metabolism]. 37678 KOG2467: Glycine/serine hydroxymethyltransferase [Amino acid transport and metabolism]. 37679 KOG2468: Dolichol kinase [Lipid transport and metabolism]. 37680 KOG2469: IMP-GMP specific 5'-nucleotidase [Nucleotide transport and metabolism]. 37681 KOG2470: Similar to IMP-GMP specific 5'-nucleotidase [Nucleotide transport and metabolism]. 37682 KOG2471: TPR repeat-containing protein [General function prediction only]. 37683 KOG2472: Phenylalanyl-tRNA synthetase beta subunit [Translation, ribosomal structure and biogenesis]. 37684 KOG2473: RNA polymerase III transcription factor (TF)IIIC subunit [Transcription]. 37685 KOG2474: Zinc transporter and related ZIP domain-containing proteins [Inorganic ion transport and metabolism]. 37686 KOG2475: CDC45 (cell division cycle 45)-like protein [Replication, recombination and repair]. 37687 KOG2476: Uncharacterized conserved protein [Function unknown]. 37688 KOG2477: Uncharacterized conserved protein [Function unknown].
37689 KOG2478: Putative RNA polymerase II regulator [Transcription]. 37690 KOG2479: Translation initiation factor 3, subunit d (eIF-3d) [Translation, ribosomal structure and biogenesis]. 37691 KOG2480: 3-hydroxy-3-methylglutaryl-CoA (HMG-CoA) reductase [Lipid transport and metabolism]. 37692 KOG2481: Protein required for normal rRNA processing [RNA processing and modification]. 37693 KOG2482: Predicted C2H2-type Zn-finger protein [Transcription]. 37694 KOG2483: Upstream transcription factor 2/L-myc-2 protein [Transcription]. 37695 KOG2484: GTPase [General function prediction only]. 37696 KOG2485: Conserved ATP/GTP binding protein [General function prediction only]. 37697 KOG2486: Predicted GTPase [General function prediction only]. 37698 KOG2487: RNA polymerase II transcription initiation/nucleotide excision repair factor TFIIH, subunit TFB4 [Transcription, Replication, recombination and repair]. 37699 KOG2488: Acetyltransferase (GNAT) domain-containing protein [General function prediction only]. 37700 KOG2489: Transmembrane protein [General function prediction only]. 37701 KOG2490: Predicted membrane protein [Function unknown]. 37702 KOG2491: Nuclear matrix protein [Nuclear structure]. 37703 KOG2492: CDK5 activator-binding protein [Signal transduction mechanisms]. 37704 KOG2493: Na+/Pi symporter [Inorganic ion transport and metabolism]. 37705 KOG2494: C3H1-type Zn-finger protein [Transcription]. 37706 KOG2495: NADH-dehydrogenase (ubiquinone) [Energy production and conversion]. 37707 KOG2496: Cdk activating kinase (CAK)/RNA polymerase II transcription initiation/nucleotide excision repair factor TFIIH/TFIIK, cyclin H subunit [Cell cycle control, cell division, chromosome partitioning, Transcription, Replication, recombination and repair]. 37708 KOG2497: Predicted methyltransferase [General function prediction only]. 37709 KOG2498: IK cytokine down-regulator of HLA class II [Signal transduction mechanisms]. 
37710 KOG2499: Beta-N-acetylhexosaminidase [Carbohydrate transport and metabolism]. 37711 KOG2500: Uncharacterized conserved protein [Function unknown]. 37712 KOG2501: Thioredoxin, nucleoredoxin and related proteins [General function prediction only]. 37713 KOG2502: Tub family proteins [General function prediction only]. 37714 KOG2503: Tubby superfamily protein TULP4 [General function prediction only]. 37715 KOG2504: Monocarboxylate transporter [Carbohydrate transport and metabolism]. 37716 KOG2505: Ankyrin repeat protein [General function prediction only]. 37717 KOG2506: SpoU rRNA Methylase family protein [Translation, ribosomal structure and biogenesis]. 37718 KOG2507: Ubiquitin regulatory protein UBXD2, contains UAS and UBX domains [General function prediction only]. 37719 KOG2508: Predicted phospholipase [Lipid transport and metabolism]. 37720 KOG2509: Seryl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 37721 KOG2510: SWI-SNF chromatin-remodeling complex protein [Chromatin structure and dynamics]. 37722 KOG2511: Nicotinic acid phosphoribosyltransferase [Coenzyme transport and metabolism]. 37723 KOG2512: Beta-tubulin folding cofactor C [Posttranslational modification, protein turnover, chaperones]. 37724 KOG2513: Protein required for meiotic chromosome segregation [Cell cycle control, cell division, chromosome partitioning]. 37725 KOG2514: Uncharacterized conserved protein [Function unknown]. 37726 KOG2515: Mannosyltransferase [Cell wall/membrane/envelope biogenesis, Intracellular trafficking, secretion, and vesicular transport]. 37727 KOG2516: Protein involved in dolichol pathway for N-glycosylation (mannosyltransferase family) [Cell wall/membrane/envelope biogenesis, Intracellular trafficking, secretion, and vesicular transport]. 37728 KOG2517: Ribulose kinase and related carbohydrate kinases [Carbohydrate transport and metabolism]. 37729 KOG2518: 5'-3' exonuclease [Replication, recombination and repair]. 
37730 KOG2519: 5'-3' exonuclease [Replication, recombination and repair]. 37731 KOG2520: 5'-3' exonuclease [Replication, recombination and repair]. 37732 KOG2521: Uncharacterized conserved protein [Function unknown]. 37733 KOG2522: Filamentous baseplate protein Ligatin, contains PUA domain [Translation, ribosomal structure and biogenesis]. 37734 KOG2523: Predicted RNA-binding protein with PUA domain [Translation, ribosomal structure and biogenesis]. 37735 KOG2524: Cobyrinic acid a,c-diamide synthase [Coenzyme transport and metabolism]. 37736 KOG2525: Folylpolyglutamate synthase [Coenzyme transport and metabolism]. 37737 KOG2526: Predicted aminopeptidases - M20/M25/M40 family [Amino acid transport and metabolism]. 37738 KOG2527: Sorting nexin SNX11 [Intracellular trafficking, secretion, and vesicular transport]. 37739 KOG2528: Sorting nexin SNX9/SH3PX1 and related proteins [Intracellular trafficking, secretion, and vesicular transport]. 37740 KOG2529: Pseudouridine synthase [Translation, ribosomal structure and biogenesis]. 37741 KOG2530: Members of tubulin/FtsZ family [Cytoskeleton]. 37742 KOG2531: Sugar (pentulose and hexulose) kinases [Carbohydrate transport and metabolism]. 37743 KOG2532: Permease of the major facilitator superfamily [Carbohydrate transport and metabolism]. 37744 KOG2533: Permease of the major facilitator superfamily [Carbohydrate transport and metabolism]. 37745 KOG2534: DNA polymerase IV (family X) [Replication, recombination and repair]. 37746 KOG2535: RNA polymerase II elongator complex, subunit ELP3/histone acetyltransferase [Chromatin structure and dynamics, Transcription]. 37747 KOG2536: MAM33, mitochondrial matrix glycoprotein [Energy production and conversion]. 37748 KOG2537: Phosphoglucomutase/phosphomannomutase [Carbohydrate transport and metabolism]. 37749 KOG2538: Origin recognition complex, subunit 3 [Replication, recombination and repair]. 
37750 KOG2539: Mitochondrial/chloroplast ribosome small subunit component [Translation, ribosomal structure and biogenesis]. 37751 KOG2540: Cytochrome oxidase assembly factor COX11 [Posttranslational modification, protein turnover, chaperones]. 37752 KOG2541: Palmitoyl protein thioesterase [Lipid transport and metabolism, Posttranslational modification, protein turnover, chaperones]. 37753 KOG2542: Uncharacterized conserved protein (YdiU family) [Function unknown]. 37754 KOG2543: Origin recognition complex, subunit 5 [Replication, recombination and repair]. 37755 KOG2544: Dihydropteroate synthase/7,8-dihydro-6-hydroxymethylpterin-pyrophosphokinase/Dihydroneopterin aldolase [Coenzyme transport and metabolism]. 37756 KOG2545: Conserved membrane protein [Function unknown]. 37757 KOG2546: Abl interactor ABI-1, contains SH3 domain [Signal transduction mechanisms, Cytoskeleton]. 37758 KOG2547: Ceramide glucosyltransferase [Lipid transport and metabolism, Cell wall/membrane/envelope biogenesis]. 37759 KOG2548: SWAP mRNA splicing regulator [RNA processing and modification]. 37760 KOG2549: Transcription initiation factor TFIID, subunit TAF6 (also component of histone acetyltransferase SAGA) [Transcription]. 37761 KOG2550: IMP dehydrogenase/GMP reductase [Nucleotide transport and metabolism]. 37762 KOG2551: Phospholipase/carboxyhydrolase [Amino acid transport and metabolism]. 37763 KOG2552: Major facilitator superfamily permease - Cdc91p [General function prediction only]. 37764 KOG2553: Pseudouridylate synthase [Translation, ribosomal structure and biogenesis]. 37765 KOG2554: Pseudouridylate synthase [Translation, ribosomal structure and biogenesis]. 37766 KOG2555: AICAR transformylase/IMP cyclohydrolase/methylglyoxal synthase [Nucleotide transport and metabolism]. 37767 KOG2556: Leishmanolysin-like peptidase (Peptidase M8 family) [Cell wall/membrane/envelope biogenesis, Defense mechanisms]. 
37768 KOG2557: Uncharacterized conserved protein, contains TLDc domain [Function unknown]. 37769 KOG2558: Negative regulator of histones [Transcription]. 37770 KOG2559: Predicted pseudouridine synthase [Translation, ribosomal structure and biogenesis]. 37771 KOG2560: RNA splicing factor - Slu7p [RNA processing and modification]. 37772 KOG2561: Adaptor protein NUB1, contains UBA domain [Posttranslational modification, protein turnover, chaperones, Signal transduction mechanisms]. 37773 KOG2562: Protein phosphatase 2 regulatory subunit [RNA processing and modification]. 37774 KOG2563: Permease of the major facilitator superfamily [General function prediction only]. 37775 KOG2564: Predicted acetyltransferases and hydrolases with the alpha/beta hydrolase fold [General function prediction only]. 37776 KOG2565: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) [General function prediction only]. 37777 KOG2566: Beta-glucocerebrosidase [Carbohydrate transport and metabolism]. 37778 KOG2567: Uncharacterized conserved protein [Function unknown]. 37779 KOG2568: Predicted membrane protein [Function unknown]. 37780 KOG2569: G protein-coupled seven transmembrane receptor [Signal transduction mechanisms]. 37781 KOG2570: SWI/SNF transcription activation complex subunit [Chromatin structure and dynamics, Transcription]. 37782 KOG2571: Chitin synthase/hyaluronan synthase (glycosyltransferases) [Cell wall/membrane/envelope biogenesis]. 37783 KOG2572: Ribosome biogenesis protein - Nop58p/Nop5p [RNA processing and modification, Translation, ribosomal structure and biogenesis]. 37784 KOG2573: Ribosome biogenesis protein - Nop56p/Sik1p [RNA processing and modification, Translation, ribosomal structure and biogenesis]. 37785 KOG2574: mRNA splicing factor PRP31 [RNA processing and modification]. 37786 KOG2575: Glucosyltransferase - Alg6p [Carbohydrate transport and metabolism, Amino acid transport and metabolism]. 
37787 KOG2576: Glucosyltransferase - Alg8p [Transcription]. 37788 KOG2577: Transcription factor E2F/dimerization partner (TDP) [Transcription]. 37789 KOG2578: Transcription factor E2F/dimerization partner (TDP)-like proteins [Transcription]. 37790 KOG2579: Ficolin and related extracellular proteins [General function prediction only]. 37791 KOG2580: Mitochondrial import inner membrane translocase, subunit TIM44 [Intracellular trafficking, secretion, and vesicular transport]. 37792 KOG2581: 26S proteasome regulatory complex, subunit RPN3/PSMD3 [Posttranslational modification, protein turnover, chaperones]. 37793 KOG2582: COP9 signalosome, subunit CSN3 [Posttranslational modification, protein turnover, chaperones, Signal transduction mechanisms]. 37794 KOG2583: Ubiquinol cytochrome c reductase, subunit QCR2 [Energy production and conversion]. 37795 KOG2584: Dihydroorotase and related enzymes [Nucleotide transport and metabolism]. 37796 KOG2585: Uncharacterized conserved protein [Function unknown]. 37797 KOG2586: Pyridoxamine-phosphate oxidase [Coenzyme transport and metabolism]. 37798 KOG2587: RNA polymerase III (C) subunit [Transcription]. 37799 KOG2588: Predicted DNA-binding protein [Transcription]. 37800 KOG2589: Histone tail methylase [Chromatin structure and dynamics]. 37801 KOG2590: RNA-binding protein LARP/SRO9 and related La domain proteins [Posttranslational modification, protein turnover, chaperones, Translation, ribosomal structure and biogenesis]. 37802 KOG2591: c-Mpl binding protein, contains La domain [Signal transduction mechanisms]. 37803 KOG2592: Tumor differentially expressed (TDE) protein [Function unknown]. 37804 KOG2593: Transcription initiation factor IIE, alpha subunit [Transcription]. 37805 KOG2594: Uncharacterized conserved protein [Function unknown]. 37806 KOG2595: Predicted GTPase activator protein [Signal transduction mechanisms]. 37807 KOG2596: Aminopeptidase I zinc metalloprotease (M18) [Amino acid transport and metabolism]. 
37808 KOG2597: Predicted aminopeptidase of the M17 family [General function prediction only]. 37809 KOG2598: Phosphomethylpyrimidine kinase [Coenzyme transport and metabolism, Transcription]. 37810 KOG2599: Pyridoxal/pyridoxine/pyridoxamine kinase [Coenzyme transport and metabolism]. 37811 KOG2600: U3 small nucleolar ribonucleoprotein (snoRNP) subunit - Mpp10p [RNA processing and modification]. 37812 KOG2601: Iron transporter [Inorganic ion transport and metabolism]. 37813 KOG2602: Predicted cell surface protein homologous to bacterial outer membrane proteins [General function prediction only]. 37814 KOG2603: Oligosaccharyltransferase, gamma subunit [Posttranslational modification, protein turnover, chaperones]. 37815 KOG2604: Subunit of cis-Golgi transport vesicle tethering complex - Sec34p [Intracellular trafficking, secretion, and vesicular transport]. 37816 KOG2605: OTU (ovarian tumor)-like cysteine protease [Signal transduction mechanisms, Posttranslational modification, protein turnover, chaperones]. 37817 KOG2606: OTU (ovarian tumor)-like cysteine protease [Signal transduction mechanisms, Posttranslational modification, protein turnover, chaperones]. 37818 KOG2607: CDK5 activator-binding protein [Signal transduction mechanisms]. 37819 KOG2608: Endoplasmic reticulum membrane-associated oxidoreductin involved in disulfide bond formation [Posttranslational modification, protein turnover, chaperones, Intracellular trafficking, secretion, and vesicular transport]. 37820 KOG2609: Cyclin D-interacting protein GCIP [Cell cycle control, cell division, chromosome partitioning, RNA processing and modification]. 37821 KOG2610: Uncharacterized conserved protein [Function unknown]. 37822 KOG2611: Neurochondrin/leucine-rich protein (Neurochondrin) [Function unknown]. 37823 KOG2612: Predicted integral membrane protein [Function unknown]. 37824 KOG2613: NMD protein affecting ribosome stability and mRNA decay [Translation, ribosomal structure and biogenesis]. 
37825 KOG2614: Kynurenine 3-monooxygenase and related flavoprotein monooxygenases [Energy production and conversion, General function prediction only]. 37826 KOG2615: Permease of the major facilitator superfamily [General function prediction only]. 37827 KOG2616: Pyridoxalphosphate-dependent enzyme/predicted threonine synthase [Amino acid transport and metabolism]. 37828 KOG2617: Citrate synthase [Energy production and conversion]. 37829 KOG2618: Uncharacterized conserved protein [Function unknown]. 37830 KOG2619: Fucosyltransferase [Carbohydrate transport and metabolism, Amino acid transport and metabolism]. 37831 KOG2620: Prohibitins and stomatins of the PID superfamily [Energy production and conversion]. 37832 KOG2621: Prohibitins and stomatins of the PID superfamily [Energy production and conversion]. 37833 KOG2622: Putative myrosinase precursor [Defense mechanisms]. 37834 KOG2623: Tyrosyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 37835 KOG2624: Triglyceride lipase-cholesterol esterase [Lipid transport and metabolism]. 37836 KOG2625: Uncharacterized conserved protein [Function unknown]. 37837 KOG2626: Histone H3 (Lys4) methyltransferase complex, subunit CPS60/ASH2/BRE2 [Chromatin structure and dynamics, Transcription]. 37838 KOG2627: Nuclear protein ES2 [General function prediction only]. 37839 KOG2628: Farnesyl cysteine-carboxyl methyltransferase [Posttranslational modification, protein turnover, chaperones]. 37840 KOG2629: Peroxisomal membrane anchor protein (peroxin) [Cell wall/membrane/envelope biogenesis, Posttranslational modification, protein turnover, chaperones, Intracellular trafficking, secretion, and vesicular transport]. 37841 KOG2630: Enolase-phosphatase E-1 [Amino acid transport and metabolism]. 37842 KOG2631: Class II aldolase/adducin N-terminal domain protein [Carbohydrate transport and metabolism]. 37843 KOG2632: Rhomboid family proteins [Function unknown].
37844 KOG2633: Hismacro and SEC14 domain-containing proteins [Chromatin structure and dynamics, Transcription]. 37845 KOG2634: Initiator tRNA phosphoribosyl-transferase [RNA processing and modification]. 37846 KOG2635: Medium subunit of clathrin adaptor complex [Intracellular trafficking, secretion, and vesicular transport]. 37847 KOG2636: Splicing factor 3a, subunit 3 [RNA processing and modification]. 37848 KOG2637: Uncharacterized conserved protein [Function unknown]. 37849 KOG2638: UDP-glucose pyrophosphorylase [Carbohydrate transport and metabolism]. 37850 KOG2639: Sodium sulfate symporter and related arsenite permeases [Inorganic ion transport and metabolism]. 37851 KOG2640: Thioredoxin [Function unknown]. 37852 KOG2641: Predicted seven transmembrane receptor - rhodopsin family [Signal transduction mechanisms]. 37853 KOG2642: Alpha-1,2 glucosyltransferase/transcriptional activator [Posttranslational modification, protein turnover, chaperones, Transcription, Lipid transport and metabolism, Signal transduction mechanisms]. 37854 KOG2643: Ca2+ binding protein, contains EF-hand motifs [Inorganic ion transport and metabolism]. 37855 KOG2644: 3'-phosphoadenosine 5'-phosphosulfate sulfotransferase (PAPS reductase)/FAD synthetase and related enzymes [Amino acid transport and metabolism, Coenzyme transport and metabolism]. 37856 KOG2645: Type I phosphodiesterase/nucleotide pyrophosphatase [General function prediction only]. 37857 KOG2646: Ribosomal protein S5 [Translation, ribosomal structure and biogenesis]. 37858 KOG2647: Predicted Dolichyl-phosphate-mannose-protein mannosyltransferase [General function prediction only]. 37859 KOG2648: Diphthamide biosynthesis protein [Translation, ribosomal structure and biogenesis]. 37860 KOG2649: Zinc carboxypeptidase [General function prediction only]. 37861 KOG2650: Zinc carboxypeptidase [Function unknown]. 37862 KOG2651: rRNA adenine N-6-methyltransferase [RNA processing and modification]. 
37863 KOG2652: RNA polymerase II transcription initiation factor TFIIA, large chain [Transcription]. 37864 KOG2653: 6-phosphogluconate dehydrogenase [Carbohydrate transport and metabolism]. 37865 KOG2654: Uncharacterized conserved protein [Function unknown]. 37866 KOG2655: Septin family protein (P-loop GTPase) [Cell cycle control, cell division, chromosome partitioning, Nuclear structure, Intracellular trafficking, secretion, and vesicular transport]. 37867 KOG2656: DNA methyltransferase 1-associated protein-1 [Chromatin structure and dynamics, Transcription]. 37868 KOG2657: Transmembrane glycoprotein nicastrin [Signal transduction mechanisms, Posttranslational modification, protein turnover, chaperones]. 37869 KOG2658: NADH:ubiquinone oxidoreductase, NDUFV1/51kDa subunit [Energy production and conversion]. 37870 KOG2659: LisH motif-containing protein [Cytoskeleton]. 37871 KOG2660: Locus-specific chromosome binding proteins [Function unknown]. 37872 KOG2661: Peptidase family M48 [Posttranslational modification, protein turnover, chaperones]. 37873 KOG2662: Magnesium transporters: CorA family [Inorganic ion transport and metabolism]. 37874 KOG2663: Acetolactate synthase, small subunit [Amino acid transport and metabolism]. 37875 KOG2664: Small nuclear RNA activating protein complex - 50kD subunit (SNAP50) [Transcription]. 37876 KOG2665: Predicted FAD-dependent oxidoreductase [Function unknown]. 37877 KOG2666: UDP-glucose/GDP-mannose dehydrogenase [Carbohydrate transport and metabolism, Signal transduction mechanisms]. 37878 KOG2667: COPII vesicle protein [Intracellular trafficking, secretion, and vesicular transport]. 37879 KOG2668: Flotillins [Intracellular trafficking, secretion, and vesicular transport, Cytoskeleton]. 37880 KOG2669: Regulator of nuclear mRNA [RNA processing and modification]. 37881 KOG2670: Enolase [Carbohydrate transport and metabolism]. 37882 KOG2671: Putative RNA methylase [Replication, recombination and repair]. 
37883 KOG2672: Lipoate synthase [Coenzyme transport and metabolism]. 37884 KOG2673: Uncharacterized conserved protein, contains PSP domain [Function unknown]. 37885 KOG2674: Cysteine protease required for autophagy - Apg4p/Aut2p [Cytoskeleton, Intracellular trafficking, secretion, and vesicular transport]. 37886 KOG2675: Adenylate cyclase-associated protein (CAP/Srv2p) [Cytoskeleton, Signal transduction mechanisms]. 37887 KOG2676: Uncharacterized conserved protein [Function unknown]. 37888 KOG2677: Stoned B synaptic vesicle biogenesis protein [Intracellular trafficking, secretion, and vesicular transport]. 37889 KOG2678: Predicted membrane protein [Function unknown]. 37890 KOG2679: Purple (tartrate-resistant) acid phosphatase [Posttranslational modification, protein turnover, chaperones]. 37891 KOG2680: DNA helicase TIP49, TBP-interacting protein [Transcription]. 37892 KOG2681: Metal-dependent phosphohydrolase [Function unknown]. 37893 KOG2682: NAD-dependent histone deacetylases and class I sirtuins (SIR2 family) [Chromatin structure and dynamics, Transcription]. 37894 KOG2683: Sirtuin 4 and related class II sirtuins (SIR2 family) [Chromatin structure and dynamics, Transcription]. 37895 KOG2684: Sirtuin 5 and related class III sirtuins (SIR2 family) [Chromatin structure and dynamics, Transcription]. 37896 KOG2685: Cytoskeletal protein Tektin [Cytoskeleton]. 37897 KOG2686: Choline kinase [Cell wall/membrane/envelope biogenesis]. 37898 KOG2687: Spindle pole body protein, contains UNC-84 domain [Cell cycle control, cell division, chromosome partitioning]. 37899 KOG2688: Transcription-associated recombination protein - Thp1p [Cell cycle control, cell division, chromosome partitioning]. 37900 KOG2689: Predicted ubiquitin regulatory protein [Posttranslational modification, protein turnover, chaperones]. 37901 KOG2690: Uncharacterized conserved protein, contains BSD domain [Function unknown]. 37902 KOG2691: RNA polymerase II subunit 9 [Transcription]. 
37903 KOG2692: Sialyltransferase [Carbohydrate transport and metabolism]. 37904 KOG2693: Putative zinc transporter [Inorganic ion transport and metabolism]. 37905 KOG2694: Putative zinc transporter [Inorganic ion transport and metabolism]. 37906 KOG2695: WD40 repeat protein [General function prediction only]. 37907 KOG2696: Histone acetyltransferase type b catalytic subunit [Chromatin structure and dynamics]. 37908 KOG2697: Histidinol dehydrogenase [Amino acid transport and metabolism]. 37909 KOG2698: GTP cyclohydrolase I [Coenzyme transport and metabolism]. 37910 KOG2699: Predicted ubiquitin regulatory protein [Posttranslational modification, protein turnover, chaperones]. 37911 KOG2700: Adenylosuccinate lyase [Nucleotide transport and metabolism]. 37912 KOG2701: Uncharacterized conserved protein [Function unknown]. 37913 KOG2702: Predicted pantothenate kinase/uridine kinase-related protein [Nucleotide transport and metabolism, Coenzyme transport and metabolism]. 37914 KOG2703: C4-type Zn-finger protein [General function prediction only]. 37915 KOG2704: Predicted membrane protein [Function unknown]. 37916 KOG2705: Predicted membrane protein [Function unknown]. 37917 KOG2706: Predicted membrane protein [Function unknown]. 37918 KOG2707: Predicted metalloprotease with chaperone activity (RNAse H/HSP70 fold) [Posttranslational modification, protein turnover, chaperones]. 37919 KOG2708: Predicted metalloprotease with chaperone activity (RNAse H/HSP70 fold) [Posttranslational modification, protein turnover, chaperones]. 37920 KOG2709: Uncharacterized conserved protein [Function unknown]. 37921 KOG2710: Rho GTPase-activating protein [Signal transduction mechanisms, Cytoskeleton]. 37922 KOG2711: Glycerol-3-phosphate dehydrogenase/dihydroxyacetone 3-phosphate reductase [Energy production and conversion]. 37923 KOG2712: Transcriptional coactivator [Transcription]. 37924 KOG2713: Mitochondrial tryptophanyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 
37925 KOG2714: SETA binding protein SB1 and related proteins, contain BTB/POZ domain [General function prediction only]. 37926 KOG2715: Uncharacterized conserved protein, contains BTB/POZ domain [General function prediction only]. 37927 KOG2716: Polymerase delta-interacting protein PDIP1 and related proteins, contain BTB/POZ domain [Inorganic ion transport and metabolism]. 37928 KOG2717: Uncharacterized conserved protein with similarity to embryogenesis protein H beta 58 and VPS26 [General function prediction only]. 37929 KOG2718: Na+-bile acid cotransporter [Inorganic ion transport and metabolism]. 37930 KOG2719: Metalloprotease [General function prediction only]. 37931 KOG2720: Predicted hydrolase (HIT family) [General function prediction only]. 37932 KOG2721: Uncharacterized conserved protein [Function unknown]. 37933 KOG2722: Predicted membrane protein [Function unknown]. 37934 KOG2723: Uncharacterized conserved protein, contains BTB/POZ domain [General function prediction only]. 37935 KOG2724: Nuclear pore complex component NPAP60L/NUP50 [Intracellular trafficking, secretion, and vesicular transport]. 37936 KOG2725: Cytochrome oxidase assembly factor COX15 [Posttranslational modification, protein turnover, chaperones]. 37937 KOG2726: Mitochondrial polypeptide chain release factor [Translation, ribosomal structure and biogenesis]. 37938 KOG2727: Rab3 GTPase-activating protein, non-catalytic subunit [Intracellular trafficking, secretion, and vesicular transport]. 37939 KOG2728: Uncharacterized conserved protein with similarity to phosphopantothenoylcysteine synthetase/decarboxylase [General function prediction only]. 37940 KOG2729: ER vesicle integral membrane protein involved in establishing cell polarity, signaling and protein degradation [Posttranslational modification, protein turnover, chaperones, Intracellular trafficking, secretion, and vesicular transport, Signal transduction mechanisms]. 37941 KOG2730: Methylase [General function prediction only]. 
37942 KOG2731: DNA alkylation damage repair protein [RNA processing and modification]. 37943 KOG2732: DNA polymerase delta, regulatory subunit 55 [Replication, recombination and repair]. 37944 KOG2733: Uncharacterized membrane protein [Function unknown]. 37945 KOG2734: Uncharacterized conserved protein [Function unknown]. 37946 KOG2735: Phosphatidylserine synthase [Lipid transport and metabolism]. 37947 KOG2736: Presenilin [Signal transduction mechanisms]. 37948 KOG2737: Putative metallopeptidase [General function prediction only]. 37949 KOG2738: Putative methionine aminopeptidase [Posttranslational modification, protein turnover, chaperones]. 37950 KOG2739: Leucine-rich acidic nuclear protein [Cell cycle control, cell division, chromosome partitioning, General function prediction only]. 37951 KOG2740: Clathrin-associated protein medium chain [Intracellular trafficking, secretion, and vesicular transport]. 37952 KOG2741: Dimeric dihydrodiol dehydrogenase [Carbohydrate transport and metabolism, Secondary metabolites biosynthesis, transport and catabolism]. 37953 KOG2742: Predicted oxidoreductase [General function prediction only]. 37954 KOG2743: Cobalamin synthesis protein [Coenzyme transport and metabolism]. 37955 KOG2744: DNA-binding proteins Bright/BRCAA1/RBP1 and related proteins containing BRIGHT domain [Transcription]. 37956 KOG2745: Mitochondrial carrier protein [General function prediction only]. 37957 KOG2746: HMG-box transcription factor Capicua and related proteins [Transcription]. 37958 KOG2747: Histone acetyltransferase (MYST family) [Chromatin structure and dynamics]. 37959 KOG2748: Uncharacterized conserved protein, contains chromo domain [Chromatin structure and dynamics]. 37960 KOG2749: mRNA cleavage and polyadenylation factor IA/II complex, subunit CLP1 [RNA processing and modification]. 37961 KOG2750: Uncharacterized conserved protein similar to ATP/GTP-binding protein [General function prediction only]. 
37962 KOG2751: Beclin-like protein [Signal transduction mechanisms]. 37963 KOG2752: Uncharacterized conserved protein, contains N-recognin-type Zn-finger [General function prediction only]. 37964 KOG2753: Uncharacterized conserved protein, contains PCI domain [General function prediction only]. 37965 KOG2754: Oligosaccharyltransferase, beta subunit [Posttranslational modification, protein turnover, chaperones]. 37966 KOG2755: Oxidoreductase [General function prediction only]. 37967 KOG2756: Predicted Mg2+-dependent phosphodiesterase TTRAP [Signal transduction mechanisms]. 37968 KOG2757: Mannose-6-phosphate isomerase [Carbohydrate transport and metabolism]. 37969 KOG2758: Translation initiation factor 3, subunit e (eIF-3e) [Translation, ribosomal structure and biogenesis]. 37970 KOG2759: Vacuolar H+-ATPase V1 sector, subunit H [Energy production and conversion]. 37971 KOG2760: Vacuolar sorting protein VPS36 [Intracellular trafficking, secretion, and vesicular transport]. 37972 KOG2761: START domain-containing proteins involved in steroidogenesis/phosphatidylcholine transfer [Lipid transport and metabolism]. 37973 KOG2762: Mannosyltransferase [Carbohydrate transport and metabolism]. 37974 KOG2763: Acyl-CoA thioesterase [Lipid transport and metabolism]. 37975 KOG2764: Putative transcriptional regulator DJ-1 [General function prediction only, Defense mechanisms]. 37976 KOG2765: Predicted membrane protein [Function unknown]. 37977 KOG2766: Predicted membrane protein [Function unknown]. 37978 KOG2767: Translation initiation factor 5 (eIF-5) [Translation, ribosomal structure and biogenesis]. 37979 KOG2768: Translation initiation factor 2, beta subunit (eIF-2beta) [Translation, ribosomal structure and biogenesis]. 37980 KOG2769: Putative u4/u6 small nuclear ribonucleoprotein [RNA processing and modification]. 37981 KOG2770: Aminomethyl transferase [Amino acid transport and metabolism]. 
37982 KOG2771: Subunit of tRNA-specific adenosine-34 deaminase [RNA processing and modification]. 37983 KOG2772: Transaldolase [Carbohydrate transport and metabolism]. 37984 KOG2773: Apoptosis antagonizing transcription factor/protein transport protein [Transcription, Intracellular trafficking, secretion, and vesicular transport]. 37985 KOG2774: NAD dependent epimerase [General function prediction only]. 37986 KOG2775: Metallopeptidase [General function prediction only]. 37987 KOG2776: Metallopeptidase [General function prediction only]. 37988 KOG2777: tRNA-specific adenosine deaminase 1 [RNA processing and modification]. 37989 KOG2778: Ubiquitin C-terminal hydrolase [Posttranslational modification, protein turnover, chaperones]. 37990 KOG2779: N-myristoyl transferase [Lipid transport and metabolism]. 37991 KOG2780: Ribosome biogenesis protein RPF1, contains IMP4 domain [RNA processing and modification]. 37992 KOG2781: U3 small nucleolar ribonucleoprotein (snoRNP) component [RNA processing and modification]. 37993 KOG2782: Putative SAM dependent methyltransferases [General function prediction only]. 37994 KOG2783: Phenylalanyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 37995 KOG2784: Phenylalanyl-tRNA synthetase, beta subunit [Translation, ribosomal structure and biogenesis]. 37996 KOG2785: C2H2-type Zn-finger protein [General function prediction only]. 37997 KOG2786: Putative glutamate/ornithine acetyltransferase [Amino acid transport and metabolism]. 37998 KOG2787: Lanthionine synthetase C-like protein 1 [Defense mechanisms]. 37999 KOG2788: Glycosyltransferase [Carbohydrate transport and metabolism]. 38000 KOG2789: Putative Zn-finger protein [General function prediction only]. 38001 KOG2790: Phosphoserine aminotransferase [Coenzyme transport and metabolism, Amino acid transport and metabolism]. 38002 KOG2791: N-acetylglucosaminyltransferase [Carbohydrate transport and metabolism]. 
38003 KOG2792: Putative cytochrome C oxidase assembly protein [Energy production and conversion]. 38004 KOG2793: Putative N2,N2-dimethylguanosine tRNA methyltransferase [RNA processing and modification]. 38005 KOG2794: Delta-aminolevulinic acid dehydratase [Coenzyme transport and metabolism]. 38006 KOG2795: Catalytic subunit of the meiotic double strand break transesterase [Replication, recombination and repair]. 38007 KOG2796: Uncharacterized conserved protein [Function unknown]. 38008 KOG2797: Prephenate dehydratase [Amino acid transport and metabolism]. 38009 KOG2798: Putative trehalase [Carbohydrate transport and metabolism]. 38010 KOG2799: Succinyl-CoA synthetase, beta subunit [Energy production and conversion]. 38011 KOG2800: Conserved developmentally regulated protein [General function prediction only]. 38012 KOG2801: Probable Rab-GAPs [Intracellular trafficking, secretion, and vesicular transport]. 38013 KOG2802: Membrane protein HUEL (cation efflux superfamily) [General function prediction only]. 38014 KOG2803: Choline phosphate cytidylyltransferase/Predicted CDP-ethanolamine synthase [Lipid transport and metabolism]. 38015 KOG2804: Phosphorylcholine transferase/cholinephosphate cytidylyltransferase [Lipid transport and metabolism]. 38016 KOG2805: tRNA (5-methylaminomethyl-2-thiouridylate)-methyltransferase [Translation, ribosomal structure and biogenesis]. 38017 KOG2806: Chitinase [Carbohydrate transport and metabolism]. 38018 KOG2807: RNA polymerase II transcription initiation/nucleotide excision repair factor TFIIH, subunit SSL1 [Transcription, Replication, recombination and repair]. 38019 KOG2808: U5 snRNP-associated RNA splicing factor [RNA processing and modification]. 38020 KOG2809: Telomerase elongation inhibitor/RNA maturation protein PINX1 [RNA processing and modification, Cell cycle control, cell division, chromosome partitioning]. 
38021 KOG2810: Checkpoint 9-1-1 complex, RAD9 component [Energy production and conversion, Replication, recombination and repair]. 38022 KOG2811: Uncharacterized conserved protein [Function unknown]. 38023 KOG2812: Uncharacterized conserved protein [Function unknown]. 38024 KOG2813: Predicted molecular chaperone, contains DnaJ domain [Posttranslational modification, protein turnover, chaperones]. 38025 KOG2814: Transcription coactivator complex, P50 component (LigT RNA ligase/phosphodiesterase family) [Transcription]. 38026 KOG2815: Mitochondrial/chloroplast ribosomal protein S15 [Translation, ribosomal structure and biogenesis]. 38027 KOG2816: Predicted transporter ADD1 (major facilitator superfamily) [General function prediction only]. 38028 KOG2817: Predicted E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 38029 KOG2818: Predicted undecaprenyl diphosphate synthase [Lipid transport and metabolism]. 38030 KOG2819: Uncharacterized conserved protein [Function unknown]. 38031 KOG2820: FAD-dependent oxidoreductase [General function prediction only]. 38032 KOG2821: RNA polymerase II transcription elongation factor Elongin/SIII, subunit elongin A [Transcription]. 38033 KOG2822: Sphingoid base-phosphate phosphatase [Lipid transport and metabolism]. 38034 KOG2823: Cellular protein (glioma tumor suppressor candidate region gene 2) [General function prediction only]. 38035 KOG2824: Glutaredoxin-related protein [Posttranslational modification, protein turnover, chaperones]. 38036 KOG2825: Putative arsenite-translocating ATPase [Inorganic ion transport and metabolism]. 38037 KOG2826: Actin-related protein Arp2/3 complex, subunit ARPC2 [Cytoskeleton]. 38038 KOG2827: Uncharacterized conserved protein [Function unknown]. 38039 KOG2828: Acetyl-CoA hydrolase [Energy production and conversion]. 38040 KOG2829: E2F-like protein [Transcription]. 38041 KOG2830: Protein phosphatase 2A-associated protein [Signal transduction mechanisms]. 
38042 KOG2831: ATP phosphoribosyltransferase [Amino acid transport and metabolism]. 38043 KOG2832: TFIIF-interacting CTD phosphatase, including NLI-interacting factor (involved in RNA polymerase II regulation) [Transcription]. 38044 KOG2833: Mevalonate pyrophosphate decarboxylase [Lipid transport and metabolism]. 38045 KOG2834: Nuclear pore complex, rNpl4 component (sc Npl4) [Nuclear structure, Intracellular trafficking, secretion, and vesicular transport]. 38046 KOG2835: Phosphoribosylamidoimidazole-succinocarboxamide synthase [Nucleotide transport and metabolism]. 38047 KOG2836: Protein tyrosine phosphatase IVA1 [Signal transduction mechanisms]. 38048 KOG2837: Protein containing a U1-type Zn-finger and implicated in RNA splicing or processing [RNA processing and modification]. 38049 KOG2838: Uncharacterized conserved protein, contains BTB/POZ domain [General function prediction only]. 38050 KOG2839: Diadenosine and diphosphoinositol polyphosphate phosphohydrolase [Signal transduction mechanisms]. 38051 KOG2840: Uncharacterized conserved protein with similarity to predicted ATPase of the PP-loop superfamily [General function prediction only]. 38052 KOG2841: Structure-specific endonuclease ERCC1-XPF, ERCC1 component [Replication, recombination and repair]. 38053 KOG2842: Interferon-related protein PC4 like [Cytoskeleton]. 38054 KOG2843: Fumarylacetoacetase [Carbohydrate transport and metabolism]. 38055 KOG2844: Dimethylglycine dehydrogenase precursor [Amino acid transport and metabolism]. 38056 KOG2845: Activating signal cointegrator 1 [Transcription]. 38057 KOG2846: Predicted membrane protein [Function unknown]. 38058 KOG2847: Phosphate acyltransferase [Lipid transport and metabolism]. 38059 KOG2848: 1-acyl-sn-glycerol-3-phosphate acyltransferase [Lipid transport and metabolism]. 38060 KOG2849: Placental protein 11 [General function prediction only]. 38061 KOG2850: Predicted peptidoglycan-binding protein, contains LysM domain [General function prediction only]. 
38062 KOG2851: Eukaryotic-type DNA primase, catalytic (small) subunit [Replication, recombination and repair]. 38063 KOG2852: Possible oxidoreductase [General function prediction only]. 38064 KOG2853: Possible oxidoreductase [General function prediction only]. 38065 KOG2854: Possible pfkB family carbohydrate kinase [Carbohydrate transport and metabolism]. 38066 KOG2855: Ribokinase [Carbohydrate transport and metabolism]. 38067 KOG2856: Adaptor protein PACSIN [Signal transduction mechanisms, Intracellular trafficking, secretion, and vesicular transport, Cytoskeleton]. 38068 KOG2857: Predicted MYND Zn-finger protein/hormone receptor interactor [Transcription]. 38069 KOG2858: Uncharacterized conserved protein [General function prediction only]. 38070 KOG2859: DNA repair protein, member of the recA/RAD51 family [Replication, recombination and repair]. 38071 KOG2860: Uncharacterized conserved protein, contains TraB domain [Signal transduction mechanisms]. 38072 KOG2861: Uncharacterized conserved protein [Function unknown]. 38073 KOG2862: Alanine-glyoxylate aminotransferase AGT1 [General function prediction only]. 38074 KOG2863: RNA lariat debranching enzyme [RNA processing and modification]. 38075 KOG2864: Nuclear division RFT1 protein [Cell cycle control, cell division, chromosome partitioning]. 38076 KOG2865: NADH:ubiquinone oxidoreductase, NDUFA9/39kDa subunit [Energy production and conversion]. 38077 KOG2866: Uncharacterized conserved protein [Function unknown]. 38078 KOG2867: Phosphotyrosyl phosphatase activator [Cell cycle control, cell division, chromosome partitioning, Signal transduction mechanisms]. 38079 KOG2868: Decapping enzyme complex component DCP1 [Transcription, RNA processing and modification]. 38080 KOG2869: Meiotic cell division protein Pelota/DOM34 [Translation, ribosomal structure and biogenesis]. 38081 KOG2870: NADH:ubiquinone oxidoreductase, NDUFS2/49 kDa subunit [Energy production and conversion]. 
38082 KOG2871: Uncharacterized conserved protein [Function unknown]. 38083 KOG2872: Uroporphyrinogen decarboxylase [Coenzyme transport and metabolism]. 38084 KOG2873: Ubiquinol cytochrome c reductase assembly protein CBP3 [Energy production and conversion]. 38085 KOG2874: rRNA processing protein [Translation, ribosomal structure and biogenesis, Cell cycle control, cell division, chromosome partitioning]. 38086 KOG2875: 8-oxoguanine DNA glycosylase [Replication, recombination and repair]. 38087 KOG2876: Molybdenum cofactor biosynthesis pathway protein [Coenzyme transport and metabolism]. 38088 KOG2877: sn-1,2-diacylglycerol ethanolamine- and cholinephosphotransferases [Lipid transport and metabolism]. 38089 KOG2878: Predicted kinase [General function prediction only]. 38090 KOG2879: Predicted E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 38091 KOG2880: SMAD6 interacting protein AMSH, contains JAB/MPN/Mov34 domain [Signal transduction mechanisms]. 38092 KOG2881: Predicted membrane protein [Function unknown]. 38093 KOG2882: p-Nitrophenyl phosphatase [Inorganic ion transport and metabolism]. 38094 KOG2883: NIPSNAP1 protein [Function unknown]. 38095 KOG2884: 26S proteasome regulatory complex, subunit RPN10/PSMD4 [Posttranslational modification, protein turnover, chaperones]. 38096 KOG2885: Uncharacterized conserved protein [Function unknown]. 38097 KOG2886: Uncharacterized conserved protein [Function unknown]. 38098 KOG2887: Membrane protein involved in ER to Golgi transport [Intracellular trafficking, secretion, and vesicular transport]. 38099 KOG2888: Putative RNA binding protein [General function prediction only]. 38100 KOG2889: Predicted PRP38-like splicing factor [Function unknown]. 38101 KOG2890: Predicted membrane protein [Function unknown]. 38102 KOG2891: Surface glycoprotein [General function prediction only]. 38103 KOG2892: Porphobilinogen deaminase [Coenzyme transport and metabolism]. 
38104 KOG2893: Zn finger protein [General function prediction only]. 38105 KOG2894: Uncharacterized conserved protein XAP-5 [Function unknown]. 38106 KOG2895: Uncharacterized conserved protein [Function unknown]. 38107 KOG2896: UV radiation resistance associated protein [General function prediction only]. 38108 KOG2897: DNA-binding protein YL1 and related proteins [General function prediction only]. 38109 KOG2898: Predicted phosphate acyltransferase, contains PlsC domain [Lipid transport and metabolism]. 38110 KOG2899: Predicted methyltransferase [General function prediction only]. 38111 KOG2900: Biotin synthase [Coenzyme transport and metabolism]. 38112 KOG2901: Uncharacterized conserved protein [Function unknown]. 38113 KOG2902: Dihydroorotase [Nucleotide transport and metabolism]. 38114 KOG2903: Predicted glutathione S-transferase [Posttranslational modification, protein turnover, chaperones]. 38115 KOG2904: Predicted methyltransferase [General function prediction only]. 38116 KOG2905: Transcription initiation factor IIF, small subunit (RAP30) [Transcription]. 38117 KOG2906: RNA polymerase III subunit C11 [Transcription]. 38118 KOG2907: RNA polymerase I transcription factor TFIIS, subunit A12.2/RPA12 [Transcription]. 38119 KOG2908: 26S proteasome regulatory complex, subunit RPN9/PSMD13 [Posttranslational modification, protein turnover, chaperones]. 38120 KOG2909: Vacuolar H+-ATPase V1 sector, subunit C [Energy production and conversion]. 38121 KOG2910: Uncharacterized conserved protein predicted to be involved in protein sorting [General function prediction only]. 38122 KOG2911: Uncharacterized conserved protein [Function unknown]. 38123 KOG2912: Predicted DNA methylase [Function unknown]. 38124 KOG2913: Predicted membrane protein [Function unknown]. 38125 KOG2914: Predicted haloacid-halidohydrolase and related hydrolases [General function prediction only]. 
38126 KOG2915: tRNA(1-methyladenosine) methyltransferase, subunit GCD14 [Translation, ribosomal structure and biogenesis]. 38127 KOG2916: Translation initiation factor 2, alpha subunit (eIF-2alpha) [Translation, ribosomal structure and biogenesis]. 38128 KOG2917: Predicted exosome subunit [Translation, ribosomal structure and biogenesis]. 38129 KOG2918: Carboxymethyl transferase [Posttranslational modification, protein turnover, chaperones]. 38130 KOG2919: Guanine nucleotide-binding protein [General function prediction only]. 38131 KOG2920: Predicted methyltransferase [General function prediction only]. 38132 KOG2921: Intramembrane metalloprotease (sterol-regulatory element-binding protein (SREBP) protease) [Posttranslational modification, protein turnover, chaperones]. 38133 KOG2922: Uncharacterized conserved protein [Function unknown]. 38134 KOG2923: Uncharacterized conserved protein [Function unknown]. 38135 KOG2924: Deoxyhypusine synthase [Posttranslational modification, protein turnover, chaperones]. 38136 KOG2925: Predicted translation initiation factor related to eIF-1A [Translation, ribosomal structure and biogenesis]. 38137 KOG2926: Malonyl-CoA:ACP transacylase [Lipid transport and metabolism]. 38138 KOG2927: Membrane component of ER protein translocation complex [Intracellular trafficking, secretion, and vesicular transport]. 38139 KOG2928: Origin recognition complex, subunit 2 [Replication, recombination and repair]. 38140 KOG2929: Transcription factor, component of CCR4 transcriptional complex [Transcription]. 38141 KOG2930: SCF ubiquitin ligase, Rbx1 component [Posttranslational modification, protein turnover, chaperones]. 38142 KOG2931: Differentiation-related gene 1 protein (NDR1 protein), related proteins [Function unknown]. 38143 KOG2932: E3 ubiquitin ligase involved in ubiquitination of E-cadherin complex [Posttranslational modification, protein turnover, chaperones]. 38144 KOG2933: Uncharacterized conserved protein [Function unknown]. 
38145 KOG2934: Uncharacterized conserved protein, contains Josephin domain [General function prediction only]. 38146 KOG2935: Ataxin 3/Josephin [General function prediction only]. 38147 KOG2936: Uncharacterized conserved protein [Function unknown]. 38148 KOG2937: Decapping enzyme complex, predicted pyrophosphatase DCP2 [RNA processing and modification]. 38149 KOG2938: Predicted inosine-uridine preferring nucleoside hydrolase [Nucleotide transport and metabolism]. 38150 KOG2939: Uncharacterized conserved protein [Function unknown]. 38151 KOG2940: Predicted methyltransferase [General function prediction only]. 38152 KOG2941: Beta-1,4-mannosyltransferase [Posttranslational modification, protein turnover, chaperones]. 38153 KOG2942: Uncharacterized conserved protein [Function unknown]. 38154 KOG2943: Predicted glyoxalase [Carbohydrate transport and metabolism]. 38155 KOG2944: Glyoxalase [Carbohydrate transport and metabolism]. 38156 KOG2945: Predicted RNA-binding protein [General function prediction only]. 38157 KOG2946: Uncharacterized conserved protein [Function unknown]. 38158 KOG2947: Carbohydrate kinase [Carbohydrate transport and metabolism]. 38159 KOG2948: Predicted metal-binding protein [General function prediction only]. 38160 KOG2949: Ketopantoate hydroxymethyltransferase [Coenzyme transport and metabolism]. 38161 KOG2950: Uncharacterized protein involved in protein-protein interaction, contains polyproline-binding GYF domain [General function prediction only]. 38162 KOG2951: Inositol monophosphatase [Carbohydrate transport and metabolism]. 38163 KOG2952: Cell cycle control protein [Cell cycle control, cell division, chromosome partitioning, Transcription, Signal transduction mechanisms]. 38164 KOG2953: mRNA-binding protein Encore [RNA processing and modification]. 38165 KOG2954: Mitochondrial carrier protein [General function prediction only]. 38166 KOG2955: Uncharacterized conserved protein [Function unknown]. 
38167 KOG2956: CLIP-associating protein [General function prediction only]. 38168 KOG2957: Vacuolar H+-ATPase V0 sector, subunit d [Energy production and conversion]. 38169 KOG2958: Galactose-1-phosphate uridylyltransferase [Energy production and conversion]. 38170 KOG2959: Transcriptional regulator [Transcription]. 38171 KOG2960: Protein involved in thiamine biosynthesis and DNA damage tolerance [General function prediction only]. 38172 KOG2961: Predicted hydrolase (HAD superfamily) [General function prediction only]. 38173 KOG2962: Prohibitin-related membrane protease subunits [General function prediction only]. 38174 KOG2963: RNA-binding protein required for 60S ribosomal subunit biogenesis [Translation, ribosomal structure and biogenesis]. 38175 KOG2964: Arginase family protein [Amino acid transport and metabolism]. 38176 KOG2965: Arginase [Amino acid transport and metabolism]. 38177 KOG2966: Uncharacterized conserved protein [General function prediction only]. 38178 KOG2967: Uncharacterized conserved protein [Function unknown]. 38179 KOG2968: Predicted esterase of the alpha-beta hydrolase superfamily (Neuropathy target esterase), contains cAMP-binding domains [General function prediction only]. 38180 KOG2969: Uncharacterized conserved protein [Function unknown]. 38181 KOG2970: Predicted membrane protein [Function unknown]. 38182 KOG2971: RNA-binding protein required for biogenesis of the ribosomal 60S subunit [Translation, ribosomal structure and biogenesis]. 38183 KOG2972: Uncharacterized conserved protein [Function unknown]. 38184 KOG2973: Uncharacterized conserved protein [Function unknown]. 38185 KOG2974: Uncharacterized conserved protein [Function unknown]. 38186 KOG2975: Translation initiation factor 3, subunit f (eIF-3f) [Translation, ribosomal structure and biogenesis]. 38187 KOG2976: Protein involved in autophagy and nutrient starvation [Posttranslational modification, protein turnover, chaperones]. 
38188 KOG2977: Glycosyltransferase [General function prediction only]. 38189 KOG2978: Dolichol-phosphate mannosyltransferase [General function prediction only]. 38190 KOG2979: Protein involved in DNA repair [General function prediction only]. 38191 KOG2980: Integral membrane protease of the rhomboid family involved in different forms of regulated intramembrane proteolysis [Signal transduction mechanisms]. 38192 KOG2981: Protein involved in autophagocytosis during starvation [General function prediction only]. 38193 KOG2982: Uncharacterized conserved protein [Function unknown]. 38194 KOG2983: Uncharacterized conserved protein [Function unknown]. 38195 KOG2984: Predicted hydrolase [General function prediction only]. 38196 KOG2985: Uncharacterized conserved protein [Function unknown]. 38197 KOG2986: Uncharacterized conserved protein [Function unknown]. 38198 KOG2987: Fatty acid desaturase [Lipid transport and metabolism]. 38199 KOG2988: 60S ribosomal protein L30 [Translation, ribosomal structure and biogenesis]. 38200 KOG2989: Uncharacterized conserved protein [Function unknown]. 38201 KOG2990: C2C2-type Zn-finger protein [Function unknown]. 38202 KOG2991: Splicing regulator [RNA processing and modification]. 38203 KOG2992: Nucleolar GTPase/ATPase p130 [Nuclear structure]. 38204 KOG2993: Cytoplasm to vacuole targeting protein [Intracellular trafficking, secretion, and vesicular transport]. 38205 KOG2994: Uracil DNA glycosylase [Replication, recombination and repair]. 38206 KOG2996: Rho guanine nucleotide exchange factor VAV3 [Signal transduction mechanisms]. 38207 KOG2997: F-box protein FBX9 [General function prediction only]. 38208 KOG2998: Uncharacterized conserved protein [Function unknown]. 38209 KOG2999: Regulator of Rac1, required for phagocytosis and cell migration [Signal transduction mechanisms]. 38210 KOG3000: Microtubule-binding protein involved in cell cycle control [Cell cycle control, cell division, chromosome partitioning, Cytoskeleton]. 
38211 KOG3001: Dosage compensation regulatory complex/histone acetyltransferase complex, subunit MSL-3/MRG15/EAF3, and related CHROMO domain-containing proteins [Chromatin structure and dynamics, Transcription]. 38212 KOG3002: Zn finger protein [General function prediction only]. 38213 KOG3003: Molecular chaperone of the GrpE family [Posttranslational modification, protein turnover, chaperones]. 38214 KOG3004: Meiotic chromosome segregation protein [Cell cycle control, cell division, chromosome partitioning]. 38215 KOG3005: GIY-YIG type nuclease [General function prediction only]. 38216 KOG3006: Transthyretin and related proteins [Lipid transport and metabolism]. 38217 KOG3007: Mu-crystallin [Amino acid transport and metabolism]. 38218 KOG3008: Quinolinate phosphoribosyl transferase [Nucleotide transport and metabolism]. 38219 KOG3009: Predicted carbohydrate kinase, contains PfkB domain [General function prediction only]. 38220 KOG3010: Methyltransferase [General function prediction only]. 38221 KOG3011: Ubiquitin-conjugating enzyme [Posttranslational modification, protein turnover, chaperones]. 38222 KOG3012: Uncharacterized conserved protein [Function unknown]. 38223 KOG3013: Exosomal 3'-5' exoribonuclease complex, subunit Rrp4 [RNA processing and modification]. 38224 KOG3014: Protein involved in establishing cohesion between sister chromatids during DNA replication [Replication, recombination and repair]. 38225 KOG3015: F1-ATP synthase assembly protein [Energy production and conversion]. 38226 KOG3016: Acyl-CoA thioesterase [Lipid transport and metabolism]. 38227 KOG3017: Defense-related protein containing SCP domain [Function unknown]. 38228 KOG3018: Malonyl-CoA decarboxylase [Carbohydrate transport and metabolism]. 38229 KOG3019: Predicted nucleoside-diphosphate sugar epimerase [Nucleotide transport and metabolism]. 38230 KOG3020: TatD-related DNase [Replication, recombination and repair]. 38231 KOG3021: Predicted kinase [General function prediction only]. 
38232 KOG3022: Predicted ATPase, nucleotide-binding [Cell cycle control, cell division, chromosome partitioning]. 38233 KOG3023: Glutamate-cysteine ligase regulatory subunit [Amino acid transport and metabolism]. 38234 KOG3024: Uncharacterized conserved protein [Function unknown]. 38235 KOG3025: Mitochondrial F1F0-ATP synthase, subunit c/ATP9/proteolipid [Energy production and conversion]. 38236 KOG3026: Splicing factor SPF30 [RNA processing and modification]. 38237 KOG3027: Mitochondrial outer membrane protein Metaxin 2, Metaxin 1-binding protein [Cell wall/membrane/envelope biogenesis, Intracellular trafficking, secretion, and vesicular transport]. 38238 KOG3028: Translocase of outer mitochondrial membrane complex, subunit TOM37/Metaxin 1 [Intracellular trafficking, secretion, and vesicular transport]. 38239 KOG3029: Glutathione S-transferase-related protein [General function prediction only]. 38240 KOG3030: Lipid phosphate phosphatase and related enzymes of the PAP2 family [Lipid transport and metabolism]. 38241 KOG3031: Protein required for biogenesis of the ribosomal 60S subunit [Translation, ribosomal structure and biogenesis]. 38242 KOG3032: Uncharacterized conserved protein [Function unknown]. 38243 KOG3033: Predicted PhzC/PhzF-type epimerase [General function prediction only]. 38244 KOG3034: Isoamyl acetate-hydrolyzing esterase and related enzymes [General function prediction only]. 38245 KOG3035: Isoamyl acetate-hydrolyzing esterase [Lipid transport and metabolism]. 38246 KOG3036: Protein involved in cell differentiation/sexual development [General function prediction only]. 38247 KOG3037: Cell membrane glycoprotein [General function prediction only]. 38248 KOG3038: Histone acetyltransferase SAGA associated factor SGF29 [General function prediction only]. 38249 KOG3039: Uncharacterized conserved protein [Function unknown]. 38250 KOG3040: Predicted sugar phosphatase (HAD superfamily) [General function prediction only]. 
38251 KOG3041: Nucleoside diphosphate-sugar hydrolase of the MutT (NUDIX) family [Replication, recombination and repair]. 38252 KOG3042: Pantothenate synthetase [Coenzyme transport and metabolism]. 38253 KOG3043: Predicted hydrolase related to dienelactone hydrolase [General function prediction only]. 38254 KOG3044: Uncharacterized conserved protein [Function unknown]. 38255 KOG3045: Predicted RNA methylase involved in rRNA processing [RNA processing and modification]. 38256 KOG3046: Transcription factor, subunit of SRB subcomplex of RNA polymerase II [Transcription]. 38257 KOG3047: Predicted transcriptional regulator UXT [Transcription]. 38258 KOG3048: Molecular chaperone Prefoldin, subunit 5 [Posttranslational modification, protein turnover, chaperones]. 38259 KOG3049: Succinate dehydrogenase, Fe-S protein subunit [Energy production and conversion]. 38260 KOG3050: COP9 signalosome, subunit CSN6 [Posttranslational modification, protein turnover, chaperones, Signal transduction mechanisms]. 38261 KOG3051: RNA binding/translational regulation protein of the SUA5 family [Translation, ribosomal structure and biogenesis]. 38262 KOG3052: Cytochrome c1 [Energy production and conversion]. 38263 KOG3053: Uncharacterized conserved protein [Function unknown]. 38264 KOG3054: Uncharacterized conserved protein [Function unknown]. 38265 KOG3055: Phosphoribosylformimino-5-aminoimidazole carboxamide ribonucleotide (ProFAR) isomerase [Amino acid transport and metabolism]. 38266 KOG3056: Protein required for S-phase initiation or completion [Cell cycle control, cell division, chromosome partitioning]. 38267 KOG3057: Cytochrome c oxidase, subunit VIb/COX12 [Energy production and conversion]. 38268 KOG3058: Uncharacterized conserved protein [Function unknown]. 38269 KOG3059: N-acetylglucosaminyltransferase complex, subunit PIG-C/GPI2, required for phosphatidylinositol biosynthesis [Lipid transport and metabolism]. 38270 KOG3060: Uncharacterized conserved protein [Function unknown]. 
38271 KOG3061: Proteasome maturation factor [Posttranslational modification, protein turnover, chaperones]. 38272 KOG3062: RNA polymerase II elongator associated protein [General function prediction only]. 38273 KOG3063: Membrane coat complex Retromer, subunit VPS26 [Intracellular trafficking, secretion, and vesicular transport]. 38274 KOG3064: RNA-binding nuclear protein (MAK16) containing a distinct C4 Zn-finger [RNA processing and modification]. 38275 KOG3065: SNAP-25 (synaptosome-associated protein) component of SNARE complex [Intracellular trafficking, secretion, and vesicular transport]. 38276 KOG3066: Translin-associated protein X [General function prediction only]. 38277 KOG3067: Translin family protein [General function prediction only]. 38278 KOG3068: mRNA splicing factor [RNA processing and modification]. 38279 KOG3069: Peroxisomal NUDIX hydrolase [Replication, recombination and repair]. 38280 KOG3070: Predicted RNA-binding protein containing PIN domain and involved in translation or RNA processing [Translation, ribosomal structure and biogenesis]. 38281 KOG3071: Fatty acyl-CoA elongase/Polyunsaturated fatty acid specific elongation enzyme [Lipid transport and metabolism]. 38282 KOG3072: Long chain fatty acid elongase [Lipid transport and metabolism]. 38283 KOG3073: Protein required for 18S rRNA maturation and 40S ribosome biogenesis [Translation, ribosomal structure and biogenesis]. 38284 KOG3074: Transcriptional regulator of the PUR family, single-stranded-DNA-binding [Transcription]. 38285 KOG3075: Ribose 5-phosphate isomerase [Carbohydrate transport and metabolism]. 38286 KOG3076: 5'-phosphoribosylglycinamide formyltransferase [Carbohydrate transport and metabolism]. 38287 KOG3077: Uncharacterized conserved protein [Function unknown]. 38288 KOG3078: Adenylate kinase [Nucleotide transport and metabolism]. 38289 KOG3079: Uridylate kinase/adenylate kinase [Nucleotide transport and metabolism]. 
38290 KOG3080: Nucleolar protein-like/EBNA1-binding protein [RNA processing and modification]. 38291 KOG3081: Vesicle coat complex COPI, epsilon subunit [Intracellular trafficking, secretion, and vesicular transport]. 38292 KOG3082: Methionyl-tRNA formyltransferase [Translation, ribosomal structure and biogenesis]. 38293 KOG3083: Prohibitin [Posttranslational modification, protein turnover, chaperones]. 38294 KOG3084: NADH pyrophosphatase I of the Nudix family of hydrolases [Replication, recombination and repair]. 38295 KOG3085: Predicted hydrolase (HAD superfamily) [General function prediction only]. 38296 KOG3086: Predicted dioxygenase [General function prediction only]. 38297 KOG3087: Serine/threonine protein kinase [General function prediction only]. 38298 KOG3088: Secretory carrier membrane protein [Intracellular trafficking, secretion, and vesicular transport]. 38299 KOG3089: Predicted DEAD-box-containing helicase [General function prediction only]. 38300 KOG3090: Prohibitin-like protein [Posttranslational modification, protein turnover, chaperones]. 38301 KOG3091: Nuclear pore complex, p54 component (sc Nup57) [Nuclear structure, Intracellular trafficking, secretion, and vesicular transport]. 38302 KOG3092: Casein kinase II, beta subunit [Signal transduction mechanisms, Cell cycle control, cell division, chromosome partitioning, Transcription]. 38303 KOG3093: 5-formyltetrahydrofolate cyclo-ligase [Coenzyme transport and metabolism]. 38304 KOG3094: Predicted membrane protein [Function unknown]. 38305 KOG3095: Transcription initiation factor IIE, beta subunit [Transcription]. 38306 KOG3096: Spliceosome-associated coiled-coil protein [Function unknown]. 38307 KOG3097: Predicted membrane protein [Function unknown]. 38308 KOG3098: Uncharacterized conserved protein [Function unknown]. 38309 KOG3099: Bisphosphate 3'-nucleotidase BPNT1/Inositol polyphosphate 1-phosphatase [Nucleotide transport and metabolism]. 
38310 KOG3100: Uncharacterized conserved protein [Function unknown]. 38311 KOG3101: Esterase D [General function prediction only]. 38312 KOG3102: Uncharacterized conserved protein [Function unknown]. 38313 KOG3103: Rab GTPase interacting factor, Golgi membrane protein [Intracellular trafficking, secretion, and vesicular transport]. 38314 KOG3104: Mod5 protein sorting/negative effector of RNA Pol III synthesis [Transcription]. 38315 KOG3105: DNA-binding centromere protein B (CENP-B) [Chromatin structure and dynamics, Cell cycle control, cell division, chromosome partitioning]. 38316 KOG3106: ER lumen protein retaining receptor [Intracellular trafficking, secretion, and vesicular transport]. 38317 KOG3107: Predicted haloacid dehalogenase-like hydrolase (eyes absent) [General function prediction only]. 38318 KOG3108: Single-stranded DNA-binding replication protein A (RPA), medium (30 kD) subunit [Replication, recombination and repair]. 38319 KOG3109: Haloacid dehalogenase-like hydrolase [General function prediction only]. 38320 KOG3110: Riboflavin kinase [Coenzyme transport and metabolism]. 38321 KOG3111: D-ribulose-5-phosphate 3-epimerase [Carbohydrate transport and metabolism]. 38322 KOG3112: Uncharacterized conserved protein [Function unknown]. 38323 KOG3113: Uncharacterized conserved protein [Function unknown]. 38324 KOG3114: Uncharacterized conserved protein [Function unknown]. 38325 KOG3115: Methyltransferase-like protein [General function prediction only]. 38326 KOG3116: Predicted C3H1-type Zn-finger protein [General function prediction only]. 38327 KOG3117: Protein involved in rRNA processing [RNA processing and modification]. 38328 KOG3118: Disrupter of silencing SAS10 [Chromatin structure and dynamics]. 38329 KOG3119: Basic region leucine zipper transcription factor [Transcription]. 38330 KOG3120: Predicted haloacid dehalogenase-like hydrolase [General function prediction only]. 38331 KOG3121: Dynactin, subunit p25 [Cytoskeleton]. 
38332 KOG3122: DNA-directed RNA polymerase III subunit [Transcription]. 38333 KOG3123: Diphthine synthase [Translation, ribosomal structure and biogenesis]. 38334 KOG3124: Pyrroline-5-carboxylate reductase [Amino acid transport and metabolism]. 38335 KOG3125: Thymidine kinase [Nucleotide transport and metabolism]. 38336 KOG3126: Porin/voltage-dependent anion-selective channel protein [Inorganic ion transport and metabolism]. 38337 KOG3127: Deoxycytidylate deaminase [Nucleotide transport and metabolism]. 38338 KOG3128: Uncharacterized conserved protein [Function unknown]. 38339 KOG3129: 26S proteasome regulatory complex, subunit PSMD9 [Posttranslational modification, protein turnover, chaperones]. 38340 KOG3130: Uncharacterized conserved protein [Function unknown]. 38341 KOG3131: Uncharacterized conserved protein [Function unknown]. 38342 KOG3132: m3G-cap-specific nuclear import receptor (Snurportin1) [RNA processing and modification]. 38343 KOG3133: 40 kDa farnesylated protein associated with peroxisomes [Intracellular trafficking, secretion, and vesicular transport]. 38344 KOG3134: Predicted membrane protein [Function unknown]. 38345 KOG3135: 1,4-benzoquinone reductase-like; Trp repressor binding protein-like/protoplast-secreted protein [General function prediction only]. 38346 KOG3136: Uncharacterized conserved protein [Function unknown]. 38347 KOG3137: Peptide deformylase [Translation, ribosomal structure and biogenesis]. 38348 KOG3138: Predicted N-acetyltransferase [General function prediction only]. 38349 KOG3139: N-acetyltransferase [General function prediction only]. 38350 KOG3140: Predicted membrane protein [Function unknown]. 38351 KOG3141: Mitochondrial/chloroplast ribosomal protein L3 [Translation, ribosomal structure and biogenesis]. 38352 KOG3142: Prenylated rab acceptor 1 [Intracellular trafficking, secretion, and vesicular transport]. 38353 KOG3143: Imidazoleglycerol-phosphate dehydratase [Amino acid transport and metabolism]. 
38354 KOG3144: Ethanolamine-P-transferase GPI11/PIG-F, involved in glycosylphosphatidylinositol anchor biosynthesis [Cell wall/membrane/envelope biogenesis, Posttranslational modification, protein turnover, chaperones]. 38355 KOG3145: Cystine transporter Cystinosin [Amino acid transport and metabolism]. 38356 KOG3146: Dolichyl pyrophosphate phosphatase and related acid phosphatases [Lipid transport and metabolism]. 38357 KOG3147: 6-phosphogluconolactonase-like protein [Carbohydrate transport and metabolism]. 38358 KOG3148: Glucosamine-6-phosphate isomerase [Carbohydrate transport and metabolism]. 38359 KOG3149: Transcription initiation factor IIF, auxiliary subunit [Transcription]. 38360 KOG3150: Uncharacterized conserved protein [Function unknown]. 38361 KOG3151: 26S proteasome regulatory complex, subunit RPN12/PSMD8 [Posttranslational modification, protein turnover, chaperones]. 38362 KOG3152: TBP-binding protein, activator of basal transcription (contains rrm motif) [Transcription]. 38363 KOG3153: Thiamine pyrophosphokinase [Coenzyme transport and metabolism]. 38364 KOG3154: Uncharacterized conserved protein [Function unknown]. 38365 KOG3155: Actin-related protein Arp2/3 complex, subunit ARPC3 [Cytoskeleton]. 38366 KOG3156: Uncharacterized membrane protein [Function unknown]. 38367 KOG3157: Proline synthetase co-transcribed protein [General function prediction only]. 38368 KOG3158: HSP90 co-chaperone p23 [Posttranslational modification, protein turnover, chaperones]. 38369 KOG3159: Lipoate-protein ligase A [Coenzyme transport and metabolism]. 38370 KOG3160: Gamma-interferon inducible lysosomal thiol reductase [Posttranslational modification, protein turnover, chaperones]. 38371 KOG3161: Predicted E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 38372 KOG3162: Mitochondrial/chloroplast ribosomal protein S18 [Translation, ribosomal structure and biogenesis]. 
38373 KOG3163: Uncharacterized conserved protein related to ribosomal protein S8E [General function prediction only]. 38374 KOG3164: Uncharacterized proteins of PilT N-terminus/VapC superfamily [General function prediction only]. 38375 KOG3165: Predicted nucleic-acid-binding protein, contains PIN domain [General function prediction only]. 38376 KOG3166: 60S ribosomal protein L7A [Translation, ribosomal structure and biogenesis]. 38377 KOG3167: Box H/ACA snoRNP component, involved in ribosomal RNA pseudouridinylation [RNA processing and modification]. 38378 KOG3168: U1 snRNP component [Transcription]. 38379 KOG3169: RNA polymerase II transcriptional regulation mediator [Transcription]. 38380 KOG3170: Conserved phosducin-like protein [Signal transduction mechanisms]. 38381 KOG3171: Conserved phosducin-like protein [Signal transduction mechanisms]. 38382 KOG3172: Small nuclear ribonucleoprotein Sm D3 [RNA processing and modification]. 38383 KOG3173: Predicted Zn-finger protein [General function prediction only]. 38384 KOG3174: F-actin capping protein, beta subunit [Cytoskeleton]. 38385 KOG3175: Protein phosphatase 4 regulatory subunit 2 related protein [General function prediction only]. 38386 KOG3176: Predicted alpha-helical protein, potentially involved in replication/repair [Replication, recombination and repair]. 38387 KOG3177: Oligoketide cyclase/lipid transport protein [Lipid transport and metabolism]. 38388 KOG3178: Hydroxyindole-O-methyltransferase and related SAM-dependent methyltransferases [General function prediction only]. 38389 KOG3179: Predicted glutamine synthetase [Nucleotide transport and metabolism]. 38390 KOG3180: Electron transfer flavoprotein, beta subunit [Energy production and conversion]. 38391 KOG3181: 40S ribosomal protein S3 [Translation, ribosomal structure and biogenesis]. 38392 KOG3182: Predicted cation transporter [Inorganic ion transport and metabolism]. 38393 KOG3183: Predicted Zn-finger protein [General function prediction only]. 
38394 KOG3184: 60S ribosomal protein L7 [Translation, ribosomal structure and biogenesis]. 38395 KOG3185: Translation initiation factor 6 (eIF-6) [Translation, ribosomal structure and biogenesis]. 38396 KOG3186: Mitotic spindle checkpoint protein [Cell cycle control, cell division, chromosome partitioning]. 38397 KOG3187: Protein tyrosine phosphatase-like protein PTPLA (contains Pro instead of catalytic Arg) [General function prediction only]. 38398 KOG3188: Uncharacterized conserved protein [Function unknown]. 38399 KOG3189: Phosphomannomutase [Lipid transport and metabolism]. 38400 KOG3190: Uncharacterized conserved protein [Function unknown]. 38401 KOG3191: Predicted N6-DNA-methyltransferase [Translation, ribosomal structure and biogenesis]. 38402 KOG3192: Mitochondrial J-type chaperone [Posttranslational modification, protein turnover, chaperones]. 38403 KOG3193: K+ channel subunit [Inorganic ion transport and metabolism]. 38404 KOG3194: Checkpoint 9-1-1 complex, RAD1 component [Energy production and conversion, Replication, recombination and repair]. 38405 KOG3195: Uncharacterized membrane protein NPD008/CGI-148 [General function prediction only]. 38406 KOG3196: NADH:ubiquinone oxidoreductase, NDUFV2/24 kD subunit [Energy production and conversion]. 38407 KOG3197: Predicted hydrolases of HD superfamily [General function prediction only]. 38408 KOG3198: Signal recognition particle, subunit Srp19 [Intracellular trafficking, secretion, and vesicular transport]. 38409 KOG3199: Nicotinamide mononucleotide adenylyl transferase [Coenzyme transport and metabolism]. 38410 KOG3200: Uncharacterized conserved protein [Function unknown]. 38411 KOG3201: Uncharacterized conserved protein [Function unknown]. 38412 KOG3202: SNARE protein TLG1/Syntaxin 6 [Intracellular trafficking, secretion, and vesicular transport]. 38413 KOG3203: Mitochondrial/chloroplast ribosomal protein L13 [Translation, ribosomal structure and biogenesis]. 
38414 KOG3204: 60S ribosomal protein L13a [Translation, ribosomal structure and biogenesis]. 38415 KOG3205: Rho GDP-dissociation inhibitor [Signal transduction mechanisms]. 38416 KOG3206: Alpha-tubulin folding cofactor B [Posttranslational modification, protein turnover, chaperones]. 38417 KOG3207: Beta-tubulin folding cofactor E [Posttranslational modification, protein turnover, chaperones]. 38418 KOG3208: SNARE protein GS28 [Intracellular trafficking, secretion, and vesicular transport]. 38419 KOG3209: WW domain-containing protein [General function prediction only]. 38420 KOG3210: Imidazoleglycerol-phosphate synthase subunit H-like [Coenzyme transport and metabolism]. 38421 KOG3211: Predicted endoplasmic reticulum membrane protein Lec35/MPDU1 involved in monosaccharide-P-dolichol utilization [General function prediction only]. 38422 KOG3212: Uncharacterized conserved protein related to IojAP [Function unknown]. 38423 KOG3213: Transcription factor IIB [Transcription]. 38424 KOG3214: Uncharacterized Zn ribbon-containing protein [Function unknown]. 38425 KOG3215: Uncharacterized conserved protein [Function unknown]. 38426 KOG3216: Diamine acetyltransferase [Amino acid transport and metabolism]. 38427 KOG3217: Protein tyrosine phosphatase [Signal transduction mechanisms]. 38428 KOG3218: RNA polymerase, 25-kDa subunit (common to polymerases I, II and III) [Transcription]. 38429 KOG3219: Transcription initiation factor TFIID, subunit TAF11 [Transcription]. 38430 KOG3220: Similar to bacterial dephospho-CoA kinase [Coenzyme transport and metabolism]. 38431 KOG3221: Glycolipid transfer protein [Carbohydrate transport and metabolism]. 38432 KOG3222: Inosine triphosphate pyrophosphatase [Nucleotide transport and metabolism]. 38433 KOG3223: Uncharacterized conserved protein [Function unknown]. 38434 KOG3224: Uncharacterized conserved protein [Function unknown]. 
38435 KOG3225: Mitochondrial import inner membrane translocase, subunit TIM22 [Intracellular trafficking, secretion, and vesicular transport]. 38436 KOG3226: DNA repair protein [Replication, recombination and repair]. 38437 KOG3227: Calcium-responsive transcription coactivator [Transcription]. 38438 KOG3228: Uncharacterized conserved protein [Function unknown]. 38439 KOG3229: Vacuolar sorting protein VPS24 [Intracellular trafficking, secretion, and vesicular transport]. 38440 KOG3230: Vacuolar assembly/sorting protein DID4 [Intracellular trafficking, secretion, and vesicular transport]. 38441 KOG3231: Predicted assembly/vacuolar sorting protein [Intracellular trafficking, secretion, and vesicular transport]. 38442 KOG3232: Vacuolar assembly/sorting protein DID2 [Intracellular trafficking, secretion, and vesicular transport]. 38443 KOG3233: RNA polymerase III, subunit C34 [Transcription]. 38444 KOG3234: Acetyltransferase, (GNAT) family [General function prediction only]. 38445 KOG3235: Subunit of the major N alpha-acetyltransferase [General function prediction only]. 38446 KOG3236: Predicted membrane protein [Function unknown]. 38447 KOG3237: Uncharacterized conserved protein [Function unknown]. 38448 KOG3238: Chloride ion current inducer protein [Inorganic ion transport and metabolism]. 38449 KOG3239: Density-regulated protein related to translation initiation factor 1 (eIF-1/SUI1) [General function prediction only]. 38450 KOG3240: Phosphatidylinositol synthase [Lipid transport and metabolism]. 38451 KOG3241: Uncharacterized conserved protein [Function unknown]. 38452 KOG3242: Oligoribonuclease (3'->5' exoribonuclease) [RNA processing and modification]. 38453 KOG3243: 6,7-dimethyl-8-ribityllumazine synthase [Coenzyme transport and metabolism]. 38454 KOG3244: Protein involved in ubiquinone biosynthesis [Coenzyme transport and metabolism]. 38455 KOG3245: Uncharacterized conserved protein [Function unknown]. 
38456 KOG3246: Sentrin-specific cysteine protease (Ulp1 family) [General function prediction only]. 38457 KOG3247: Uncharacterized conserved protein [Function unknown]. 38458 KOG3248: Transcription factor TCF-4 [Transcription]. 38459 KOG3249: Uncharacterized conserved protein [Function unknown]. 38460 KOG3250: COP9 signalosome, subunit CSN7 [Posttranslational modification, protein turnover, chaperones, Signal transduction mechanisms]. 38461 KOG3251: Golgi SNAP receptor complex member [Intracellular trafficking, secretion, and vesicular transport]. 38462 KOG3252: Uncharacterized conserved protein [Function unknown]. 38463 KOG3253: Predicted alpha/beta hydrolase [General function prediction only]. 38464 KOG3254: Mitochondrial/chloroplast ribosomal protein L6 [Translation, ribosomal structure and biogenesis]. 38465 KOG3255: 60S ribosomal protein L9 [Translation, ribosomal structure and biogenesis]. 38466 KOG3256: NADH:ubiquinone oxidoreductase, NDUFS8/23 kDa subunit [Energy production and conversion]. 38467 KOG3257: Mitochondrial/chloroplast ribosomal protein L11 [Translation, ribosomal structure and biogenesis]. 38468 KOG3258: Parvulin-like peptidyl-prolyl cis-trans isomerase [Posttranslational modification, protein turnover, chaperones]. 38469 KOG3259: Peptidyl-prolyl cis-trans isomerase [Posttranslational modification, protein turnover, chaperones]. 38470 KOG3260: Calcyclin-binding protein CacyBP [Signal transduction mechanisms]. 38471 KOG3261: Uncharacterized conserved protein [Function unknown]. 38472 KOG3262: H/ACA small nucleolar RNP component GAR1 [Translation, ribosomal structure and biogenesis]. 38473 KOG3263: Nucleic acid binding protein [General function prediction only]. 38474 KOG3264: Uncharacterized conserved protein [Function unknown]. 38475 KOG3265: Histone chaperone involved in gene silencing [Transcription, Chromatin structure and dynamics]. 38476 KOG3266: Predicted glycine cleavage system H protein [Amino acid transport and metabolism]. 
38477 KOG3267: Uncharacterized conserved protein [Function unknown]. 38478 KOG3268: Predicted E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 38479 KOG3269: Predicted membrane protein [Function unknown]. 38480 KOG3270: Uncharacterized conserved protein [Function unknown]. 38481 KOG3271: Translation initiation factor 5A (eIF-5A) [Translation, ribosomal structure and biogenesis]. 38482 KOG3272: Predicted coiled-coil protein [General function prediction only]. 38483 KOG3273: Predicted RNA-binding protein Pno1p interacting with Nob1p and involved in 26S proteasome assembly [Posttranslational modification, protein turnover, chaperones]. 38484 KOG3274: Uncharacterized conserved protein, AMMECR1 [Function unknown]. 38485 KOG3275: Zinc-binding protein of the histidine triad (HIT) family [Signal transduction mechanisms]. 38486 KOG3276: Uncharacterized conserved protein, contains YggU domain [Function unknown]. 38487 KOG3277: Uncharacterized conserved protein [Function unknown]. 38488 KOG3278: Mitochondrial/chloroplast ribosomal protein L28 [Translation, ribosomal structure and biogenesis]. 38489 KOG3279: Uncharacterized conserved protein (melanoma antigen P15) [Function unknown]. 38490 KOG3280: Mitochondrial/chloroplast ribosomal protein L17 [Translation, ribosomal structure and biogenesis]. 38491 KOG3281: Mitochondrial F1-ATPase assembly protein [Posttranslational modification, protein turnover, chaperones]. 38492 KOG3282: Uncharacterized conserved protein [Function unknown]. 38493 KOG3283: 40S ribosomal protein S8 [Translation, ribosomal structure and biogenesis]. 38494 KOG3284: Vacuolar sorting protein VPS28 [Intracellular trafficking, secretion, and vesicular transport]. 38495 KOG3285: Spindle assembly checkpoint protein [Cell cycle control, cell division, chromosome partitioning, Cytoskeleton]. 38496 KOG3286: Selenoprotein T [General function prediction only]. 
38497 KOG3287: Membrane trafficking protein, emp24/gp25L/p24 family [Intracellular trafficking, secretion, and vesicular transport]. 38498 KOG3288: OTU-like cysteine protease [Signal transduction mechanisms, Posttranslational modification, protein turnover, chaperones]. 38499 KOG3289: Uncharacterized conserved protein encoded by sequence overlapping the COX4 gene [General function prediction only]. 38500 KOG3290: Peroxisomal phytanoyl-CoA hydroxylase [Lipid transport and metabolism]. 38501 KOG3291: Ribosomal protein S7 [Translation, ribosomal structure and biogenesis]. 38502 KOG3292: Predicted membrane protein [Function unknown]. 38503 KOG3293: Small nuclear ribonucleoprotein (snRNP) [RNA processing and modification]. 38504 KOG3294: WW domain binding protein WBP-2, contains GRAM domain [Signal transduction mechanisms]. 38505 KOG3295: 60S Ribosomal protein L13 [Translation, ribosomal structure and biogenesis]. 38506 KOG3296: Translocase of outer mitochondrial membrane complex, subunit TOM40 [Intracellular trafficking, secretion, and vesicular transport]. 38507 KOG3297: DNA-directed RNA polymerase subunit E' [Transcription]. 38508 KOG3298: DNA-directed RNA polymerase subunit E' [Transcription]. 38509 KOG3299: Uncharacterized conserved protein [Function unknown]. 38510 KOG3300: NADH:ubiquinone oxidoreductase, B16.6 subunit/cell death-regulatory protein [Energy production and conversion, Cell cycle control, cell division, chromosome partitioning]. 38511 KOG3301: Ribosomal protein S4 [Translation, ribosomal structure and biogenesis]. 38512 KOG3302: TATA-box binding protein (TBP), component of TFIID and TFIIIB [Transcription]. 38513 KOG3303: Predicted alpha-helical protein, potentially involved in replication/repair [Replication, recombination and repair]. 38514 KOG3304: Surfeit family protein 5 [General function prediction only]. 38515 KOG3305: Uncharacterized conserved protein [Function unknown]. 38516 KOG3306: Predicted membrane protein [Function unknown]. 
38517 KOG3307: Molybdopterin converting factor subunit 2 [Coenzyme transport and metabolism]. 38518 KOG3308: Uncharacterized protein of the uridine kinase family [Nucleotide transport and metabolism]. 38519 KOG3309: Ferredoxin [Energy production and conversion]. 38520 KOG3310: Riboflavin synthase alpha chain [Coenzyme transport and metabolism]. 38521 KOG3311: Ribosomal protein S18 [Translation, ribosomal structure and biogenesis]. 38522 KOG3312: Predicted membrane protein [Function unknown]. 38523 KOG3313: Molecular chaperone Prefoldin, subunit 3 [Posttranslational modification, protein turnover, chaperones]. 38524 KOG3314: Ku70-binding protein [Replication, recombination and repair]. 38525 KOG3315: Transport protein particle (TRAPP) complex subunit [Intracellular trafficking, secretion, and vesicular transport]. 38526 KOG3316: Transport protein particle (TRAPP) complex subunit [Intracellular trafficking, secretion, and vesicular transport]. 38527 KOG3317: Translocon-associated complex TRAP, beta subunit [Intracellular trafficking, secretion, and vesicular transport]. 38528 KOG3318: Predicted membrane protein [Function unknown]. 38529 KOG3319: Predicted membrane protein [Function unknown]. 38530 KOG3320: 40S ribosomal protein S7 [Translation, ribosomal structure and biogenesis]. 38531 KOG3321: Mitochondrial ribosomal protein S10 [Translation, ribosomal structure and biogenesis]. 38532 KOG3322: Ribonucleases P/MRP protein subunit [RNA processing and modification]. 38533 KOG3323: D-Tyr-tRNA (Tyr) deacylase [Translation, ribosomal structure and biogenesis]. 38534 KOG3324: Mitochondrial import inner membrane translocase, subunit TIM23 [Intracellular trafficking, secretion, and vesicular transport]. 38535 KOG3325: Membrane coat complex Retromer, subunit VPS29/PEP11 [Intracellular trafficking, secretion, and vesicular transport]. 38536 KOG3326: Uncharacterized conserved protein [Function unknown]. 
38537 KOG3327: Thymidylate kinase/adenylate kinase [Nucleotide transport and metabolism]. 38538 KOG3328: HGG motif-containing thioesterase [General function prediction only]. 38539 KOG3329: RAN guanine nucleotide release factor [Signal transduction mechanisms]. 38540 KOG3330: Transport protein particle (TRAPP) complex subunit [Intracellular trafficking, secretion, and vesicular transport]. 38541 KOG3331: Mitochondrial/chloroplast ribosomal protein L4/L29 [Translation, ribosomal structure and biogenesis]. 38542 KOG3332: N-acetylglucosaminyl phosphatidylinositol de-N-acetylase [Cell wall/membrane/envelope biogenesis]. 38543 KOG3333: Mitochondrial/chloroplast ribosomal protein L18 [Translation, ribosomal structure and biogenesis]. 38544 KOG3334: Transcription initiation factor TFIID, subunit TAF9 (also component of histone acetyltransferase SAGA) [Transcription]. 38545 KOG3335: Predicted coiled-coil protein [General function prediction only]. 38546 KOG3336: Predicted member of the intramitochondrial sorting protein family [Intracellular trafficking, secretion, and vesicular transport]. 38547 KOG3337: Protein similar to predicted member of the intramitochondrial sorting protein family [Intracellular trafficking, secretion, and vesicular transport]. 38548 KOG3338: Divalent cation tolerance-related protein [Inorganic ion transport and metabolism]. 38549 KOG3339: Predicted glycosyltransferase [General function prediction only]. 38550 KOG3340: Alpha-L-fucosidase [Carbohydrate transport and metabolism]. 38551 KOG3341: RNA polymerase II transcription factor complex subunit [Transcription]. 38552 KOG3342: Signal peptidase I [Intracellular trafficking, secretion, and vesicular transport]. 38553 KOG3343: Vesicle coat complex COPI, zeta subunit [Intracellular trafficking, secretion, and vesicular transport]. 38554 KOG3344: 40S ribosomal protein S10 [Translation, ribosomal structure and biogenesis]. 38555 KOG3345: Uncharacterized conserved protein [Function unknown]. 
38556 KOG3346: Phosphatidylethanolamine binding protein [General function prediction only]. 38557 KOG3347: Predicted nucleotide kinase/nuclear protein involved in oxidative stress response [Nucleotide transport and metabolism]. 38558 KOG3348: BolA (bacterial stress-induced morphogen)-related protein [Signal transduction mechanisms]. 38559 KOG3349: Predicted glycosyltransferase [General function prediction only]. 38560 KOG3350: Uncharacterized conserved protein [Function unknown]. 38561 KOG3351: Predicted nucleotidyltransferase [General function prediction only]. 38562 KOG3352: Cytochrome c oxidase, subunit Vb/COX4 [Energy production and conversion]. 38563 KOG3353: 60S ribosomal protein L22 [Translation, ribosomal structure and biogenesis]. 38564 KOG3354: Gluconate kinase [Carbohydrate transport and metabolism]. 38565 KOG3355: Mitochondrial sulfhydryl oxidase involved in the biogenesis of cytosolic Fe/S proteins [Posttranslational modification, protein turnover, chaperones]. 38566 KOG3356: Predicted membrane protein [Function unknown]. 38567 KOG3357: Uncharacterized conserved protein [Function unknown]. 38568 KOG3358: Uncharacterized secreted protein SDF2 (Stromal cell-derived factor 2), contains MIR domains [General function prediction only]. 38569 KOG3359: Dolichyl-phosphate-mannose:protein O-mannosyl transferase [Posttranslational modification, protein turnover, chaperones]. 38570 KOG3360: Acylphosphatase [Energy production and conversion]. 38571 KOG3361: Iron binding protein involved in Fe-S cluster formation [Energy production and conversion]. 38572 KOG3362: Predicted BBOX Zn-finger protein [General function prediction only]. 38573 KOG3363: Uncharacterized conserved nuclear protein [Function unknown]. 38574 KOG3364: Membrane protein involved in organellar division [Cell wall/membrane/envelope biogenesis]. 38575 KOG3365: NADH:ubiquinone oxidoreductase, NDUFA5/B13 subunit [Energy production and conversion]. 
38576 KOG3366: Mitochondrial F1F0-ATP synthase, subunit d/ATP7 [Energy production and conversion]. 38577 KOG3367: Hypoxanthine-guanine phosphoribosyltransferase [Nucleotide transport and metabolism]. 38578 KOG3368: Transport protein particle (TRAPP) complex subunit [Intracellular trafficking, secretion, and vesicular transport]. 38579 KOG3369: Transport protein particle (TRAPP) complex subunit [Intracellular trafficking, secretion, and vesicular transport]. 38580 KOG3370: dUTPase [Nucleotide transport and metabolism]. 38581 KOG3371: Uncharacterized conserved protein [Function unknown]. 38582 KOG3372: Signal peptidase complex subunit [Intracellular trafficking, secretion, and vesicular transport]. 38583 KOG3373: Glycine cleavage system H protein (lipoate-binding) [Amino acid transport and metabolism]. 38584 KOG3374: Cellular repressor of transcription [Transcription]. 38585 KOG3375: Phosphoprotein/predicted coiled-coil protein [General function prediction only]. 38586 KOG3376: Uncharacterized conserved protein [Function unknown]. 38587 KOG3377: Uncharacterized conserved protein [Function unknown]. 38588 KOG3378: Hemoglobin-like flavoprotein [Energy production and conversion]. 38589 KOG3379: Diadenosine polyphosphate hydrolase and related proteins of the histidine triad (HIT) family [Nucleotide transport and metabolism, General function prediction only]. 38590 KOG3380: Actin-related protein Arp2/3 complex, subunit ARPC5 [Cytoskeleton]. 38591 KOG3381: Uncharacterized conserved protein [Function unknown]. 38592 KOG3382: NADH:ubiquinone oxidoreductase, B17.2 subunit [Energy production and conversion]. 38593 KOG3383: Uncharacterized conserved protein [Function unknown]. 38594 KOG3384: Selenoprotein [General function prediction only]. 38595 KOG3385: V-SNARE [Intracellular trafficking, secretion, and vesicular transport]. 38596 KOG3386: Copper transporter [Inorganic ion transport and metabolism]. 
38597 KOG3387: 60S ribosomal protein 15.5kD/SNU13, NHP2/L7A family (includes ribonuclease P subunit p38), involved in splicing [RNA processing and modification, Translation, ribosomal structure and biogenesis]. 38598 KOG3388: Predicted transcription regulator/nuclease, contains ParB domain [Replication, recombination and repair]. 38599 KOG3389: NADH:ubiquinone oxidoreductase, NDUFS4/18 kDa subunit [Energy production and conversion]. 38600 KOG3390: General control of amino-acid synthesis 5-like 1 [Transcription]. 38601 KOG3391: Transcriptional co-repressor component [Transcription]. 38602 KOG3392: Exon-exon junction complex, Magoh component [RNA processing and modification]. 38603 KOG3393: Predicted membrane protein [Function unknown]. 38604 KOG3394: Protein OS-9 [General function prediction only]. 38605 KOG3395: Uncharacterized conserved protein [Function unknown]. 38606 KOG3396: Glucosamine-phosphate N-acetyltransferase [Cell wall/membrane/envelope biogenesis]. 38607 KOG3397: Acetyltransferases [General function prediction only]. 38608 KOG3398: Transcription factor MBF1 [Transcription]. 38609 KOG3399: Predicted Yippee-type zinc-binding protein [General function prediction only]. 38610 KOG3400: RNA polymerase subunit 8 [Transcription]. 38611 KOG3401: 60S ribosomal protein L26 [Translation, ribosomal structure and biogenesis]. 38612 KOG3402: Predicted membrane protein [Function unknown]. 38613 KOG3403: Translation initiation factor 1A (eIF-1A) [Translation, ribosomal structure and biogenesis]. 38614 KOG3404: G10 protein/predicted nuclear transcription regulator [Transcription]. 38615 KOG3405: RNA polymerase subunit K [Transcription]. 38616 KOG3406: 40S ribosomal protein S12 [Translation, ribosomal structure and biogenesis]. 38617 KOG3407: Uncharacterized conserved protein [Function unknown]. 38618 KOG3408: U1-like Zn-finger-containing protein, probable role in RNA processing/splicing [RNA processing and modification]. 
38619 KOG3409: Exosomal 3'-5' exoribonuclease complex, subunit ski4 (Csl4) [Translation, ribosomal structure and biogenesis]. 38620 KOG3410: Conserved alpha-helical protein [Function unknown]. 38621 KOG3411: 40S ribosomal protein S19 [Translation, ribosomal structure and biogenesis]. 38622 KOG3412: 60S ribosomal protein L28 [Translation, ribosomal structure and biogenesis]. 38623 KOG3413: Mitochondrial matrix protein frataxin, involved in Fe/S protein biosynthesis [Inorganic ion transport and metabolism]. 38624 KOG3414: Component of the U4/U6.U5 snRNP/mitosis protein DIM1 [RNA processing and modification, Cell cycle control, cell division, chromosome partitioning]. 38625 KOG3415: Putative Rab5-interacting protein [Intracellular trafficking, secretion, and vesicular transport]. 38626 KOG3416: Predicted nucleic acid binding protein [General function prediction only]. 38627 KOG3417: Ras1 guanine nucleotide exchange factor [Signal transduction mechanisms]. 38628 KOG3418: 60S ribosomal protein L27 [Translation, ribosomal structure and biogenesis]. 38629 KOG3419: Mitochondrial/chloroplast ribosomal protein S16 [Translation, ribosomal structure and biogenesis]. 38630 KOG3420: Predicted RNA methylase [Translation, ribosomal structure and biogenesis]. 38631 KOG3421: 60S ribosomal protein L14 [Translation, ribosomal structure and biogenesis]. 38632 KOG3422: Mitochondrial ribosomal protein L16 [Translation, ribosomal structure and biogenesis]. 38633 KOG3423: Transcription initiation factor TFIID, subunit TAF10 (also component of histone acetyltransferase SAGA) [Transcription]. 38634 KOG3424: 40S ribosomal protein S24 [Translation, ribosomal structure and biogenesis]. 38635 KOG3425: Uncharacterized conserved protein [Function unknown]. 38636 KOG3426: NADH:ubiquinone oxidoreductase, NDUFA6/B14 subunit [Energy production and conversion]. 38637 KOG3427: Polyglutamine tract-binding protein PQBP-1 [Transcription]. 
38638 KOG3428: Small nuclear ribonucleoprotein SMD1 and related snRNPs [RNA processing and modification]. 38639 KOG3429: Predicted peptidyl-tRNA hydrolase [Translation, ribosomal structure and biogenesis]. 38640 KOG3430: Dynein light chain type 1 [Cytoskeleton]. 38641 KOG3431: Apoptosis-related protein/predicted DNA-binding protein [Cell cycle control, cell division, chromosome partitioning]. 38642 KOG3432: Vacuolar H+-ATPase V1 sector, subunit F [Energy production and conversion]. 38643 KOG3433: Protein involved in meiotic recombination/predicted coiled-coil protein [Cell cycle control, cell division, chromosome partitioning, General function prediction only]. 38644 KOG3434: 60S ribosomal protein L22 [Translation, ribosomal structure and biogenesis]. 38645 KOG3435: Mitochondrial/chloroplast ribosomal protein L54/L37 [Translation, ribosomal structure and biogenesis]. 38646 KOG3436: 60S ribosomal protein L35 [Translation, ribosomal structure and biogenesis]. 38647 KOG3437: Anaphase-promoting complex (APC), subunit 10 [Cell cycle control, cell division, chromosome partitioning, Posttranslational modification, protein turnover, chaperones]. 38648 KOG3438: DNA-directed RNA polymerase, subunit L [Transcription]. 38649 KOG3439: Protein conjugation factor involved in autophagy [Posttranslational modification, protein turnover, chaperones]. 38650 KOG3440: Ubiquinol cytochrome c reductase, subunit QCR7 [Energy production and conversion]. 38651 KOG3441: Mitochondrial ribosomal protein L14 [Translation, ribosomal structure and biogenesis]. 38652 KOG3442: Uncharacterized conserved protein [Function unknown]. 38653 KOG3443: Uncharacterized conserved protein [Function unknown]. 38654 KOG3444: Uncharacterized conserved protein [Function unknown]. 38655 KOG3445: Mitochondrial/chloroplast ribosomal protein 36a [Translation, ribosomal structure and biogenesis]. 38656 KOG3446: NADH:ubiquinone oxidoreductase NDUFA2/B8 subunit [Energy production and conversion]. 
38657 KOG3447: Mitochondrial/chloroplast ribosomal S17-like protein [Translation, ribosomal structure and biogenesis]. 38658 KOG3448: Predicted snRNP core protein [RNA processing and modification]. 38659 KOG3449: 60S acidic ribosomal protein P2 [Translation, ribosomal structure and biogenesis]. 38660 KOG3450: Huntingtin interacting protein HYPK [General function prediction only]. 38661 KOG3451: Uncharacterized conserved protein [Function unknown]. 38662 KOG3452: 60S ribosomal protein L36 [Translation, ribosomal structure and biogenesis]. 38663 KOG3453: Cytochrome c [Energy production and conversion]. 38664 KOG3454: U1 snRNP-specific protein C [RNA processing and modification]. 38665 KOG3455: Predicted membrane protein [Function unknown]. 38666 KOG3456: NADH:ubiquinone oxidoreductase, NDUFS6/13 kDa subunit [Energy production and conversion]. 38667 KOG3457: Sec61 protein translocation complex, beta subunit [Posttranslational modification, protein turnover, chaperones]. 38668 KOG3458: NADH:ubiquinone oxidoreductase, NDUFA8/PGIV/19 kDa subunit [Energy production and conversion]. 38669 KOG3459: Small nuclear ribonucleoprotein (snRNP) Sm core protein [RNA processing and modification]. 38670 KOG3460: Small nuclear ribonucleoprotein (snRNP) LSM3 [RNA processing and modification]. 38671 KOG3461: CDGSH-type Zn-finger containing protein [General function prediction only]. 38672 KOG3462: Predicted membrane protein [Function unknown]. 38673 KOG3463: Transcription initiation factor IIA, gamma subunit [Transcription]. 38674 KOG3464: 60S ribosomal protein L44 [Translation, ribosomal structure and biogenesis]. 38675 KOG3465: Signal recognition particle, subunit Srp9 [Intracellular trafficking, secretion, and vesicular transport]. 38676 KOG3466: NADH:ubiquinone oxidoreductase, NDUFB9/B22 subunit [Energy production and conversion]. 38677 KOG3467: Histone H4 [Chromatin structure and dynamics]. 
38678 KOG3468: NADH:ubiquinone oxidoreductase, NDUFB7/B18 subunit [Energy production and conversion]. 38679 KOG3469: Cytochrome c oxidase, subunit VIa/COX13 [Energy production and conversion]. 38680 KOG3470: Beta-tubulin folding cofactor A [Posttranslational modification, protein turnover, chaperones]. 38681 KOG3471: RNA polymerase II transcription initiation/nucleotide excision repair factor TFIIH, subunit TFB2 [Transcription, Replication, recombination and repair]. 38682 KOG3472: Predicted small membrane protein [Function unknown]. 38683 KOG3473: RNA polymerase II transcription elongation factor Elongin/SIII, subunit elongin C [Transcription]. 38684 KOG3474: Molybdopterin converting factor, small subunit [Energy production and conversion]. 38685 KOG3475: 60S ribosomal protein L37 [Translation, ribosomal structure and biogenesis]. 38686 KOG3476: Microtubule-associated protein CRIPT [Cytoskeleton]. 38687 KOG3477: Putative cytochrome c oxidase, subunit COX19 [Energy production and conversion]. 38688 KOG3478: Prefoldin subunit 6, KE2 family [Posttranslational modification, protein turnover, chaperones]. 38689 KOG3479: Mitochondrial import inner membrane translocase, subunit TIM9 [Intracellular trafficking, secretion, and vesicular transport]. 38690 KOG3480: Mitochondrial import inner membrane translocase, subunits TIM10/TIM12 [Intracellular trafficking, secretion, and vesicular transport]. 38691 KOG3481: Uncharacterized conserved protein [Function unknown]. 38692 KOG3482: Small nuclear ribonucleoprotein (snRNP) SMF [RNA processing and modification]. 38693 KOG3483: Uncharacterized conserved protein [Function unknown]. 38694 KOG3484: Cyclin-dependent protein kinase CDC28, regulatory subunit CKS1, and related proteins [Cell cycle control, cell division, chromosome partitioning]. 38695 KOG3485: Uncharacterized conserved protein [Function unknown]. 38696 KOG3486: 40S ribosomal protein S21 [Translation, ribosomal structure and biogenesis]. 
38697 KOG3487: TRAPP 20 K subunit [Intracellular trafficking, secretion, and vesicular transport]. 38698 KOG3488: Dolichol phosphate-mannose regulatory protein (DPM2) [Posttranslational modification, protein turnover, chaperones]. 38699 KOG3489: Mitochondrial import inner membrane translocase, subunit TIM8 [Intracellular trafficking, secretion, and vesicular transport]. 38700 KOG3490: Transcription elongation factor SPT4 [Transcription]. 38701 KOG3491: Predicted membrane protein [Function unknown]. 38702 KOG3492: Ribosome biogenesis protein NIP7 [Translation, ribosomal structure and biogenesis]. 38703 KOG3493: Ubiquitin-like protein [Posttranslational modification, protein turnover, chaperones]. 38704 KOG3494: Ubiquinol cytochrome c oxidoreductase, subunit QCR9 [Energy production and conversion]. 38705 KOG3495: Mitochondrial F1F0-ATP synthase, subunit epsilon/ATP15 [Energy production and conversion]. 38706 KOG3496: Cytochrome c oxidase assembly protein/Cu2+ chaperone COX17 [Posttranslational modification, protein turnover, chaperones]. 38707 KOG3497: DNA-directed RNA polymerase, subunit RPB10 [Transcription]. 38708 KOG3498: Preprotein translocase, gamma subunit [Intracellular trafficking, secretion, and vesicular transport]. 38709 KOG3499: 60S ribosomal protein L38 [Translation, ribosomal structure and biogenesis]. 38710 KOG3500: Vacuolar H+-ATPase V0 sector, subunit M9.7 (M9.2) [Energy production and conversion]. 38711 KOG3501: Molecular chaperone Prefoldin, subunit 1 [Posttranslational modification, protein turnover, chaperones]. 38712 KOG3502: 40S ribosomal protein S28 [Translation, ribosomal structure and biogenesis]. 38713 KOG3503: H/ACA snoRNP complex, subunit NOP10 [RNA processing and modification]. 38714 KOG3504: 60S ribosomal protein L29 [Translation, ribosomal structure and biogenesis]. 38715 KOG3505: Mitochondrial/chloroplast ribosomal protein L33-like [Translation, ribosomal structure and biogenesis]. 
38716 KOG3506: 40S ribosomal protein S29 [Translation, ribosomal structure and biogenesis]. 38717 KOG3507: DNA-directed RNA polymerase, subunit RPB7.0 [Transcription]. 38718 KOG3508: GTPase-activating protein [General function prediction only]. 38719 KOG3509: Basement membrane-specific heparan sulfate proteoglycan (HSPG) core protein [Posttranslational modification, protein turnover, chaperones]. 38720 KOG3511: Sortilin and related receptors [General function prediction only]. 38721 KOG3512: Netrin, axonal chemotropic factor [Signal transduction mechanisms]. 38722 KOG3513: Neural cell adhesion molecule L1 [Signal transduction mechanisms]. 38723 KOG3514: Neurexin III-alpha [Signal transduction mechanisms]. 38724 KOG3515: Predicted transmembrane protein of the immunoglobulin family of cell adhesion molecules [General function prediction only]. 38725 KOG3516: Neurexin IV [Signal transduction mechanisms]. 38726 KOG3517: Transcription factor PAX1/9 [Transcription]. 38727 KOG3518: Putative guanine nucleotide exchange factor [General function prediction only]. 38728 KOG3519: Invasion-inducing protein TIAM1/CDC24 and related RhoGEF GTPases [Signal transduction mechanisms]. 38729 KOG3520: Predicted guanine nucleotide exchange factor [Signal transduction mechanisms]. 38730 KOG3521: Predicted guanine nucleotide exchange factor [Signal transduction mechanisms]. 38731 KOG3522: Predicted guanine nucleotide exchange factor [Signal transduction mechanisms]. 38732 KOG3523: Putative guanine nucleotide exchange factor TIM [Signal transduction mechanisms]. 38733 KOG3524: Predicted guanine nucleotide exchange factor (PEBBLE) [Signal transduction mechanisms]. 38734 KOG3525: Subtilisin-like proprotein convertase [Posttranslational modification, protein turnover, chaperones]. 38735 KOG3526: Subtilisin-like proprotein convertase [Posttranslational modification, protein turnover, chaperones]. 
38736 KOG3527: Erythrocyte membrane protein 4.1 and related proteins of the ERM family [General function prediction only]. 38737 KOG3529: Radixin, moesin and related proteins of the ERM family [General function prediction only]. 38738 KOG3530: FERM domain protein EHM2 [General function prediction only]. 38739 KOG3531: Rho guanine nucleotide exchange factor CDEP [Signal transduction mechanisms]. 38740 KOG3532: Predicted protein kinase [General function prediction only]. 38741 KOG3533: Inositol 1,4,5-trisphosphate receptor [Signal transduction mechanisms]. 38742 KOG3534: p53 inducible protein PIR121 [General function prediction only]. 38743 KOG3535: Adaptor protein Disabled [Signal transduction mechanisms]. 38744 KOG3536: Adaptor protein CED-6, contains PTB domain [General function prediction only]. 38745 KOG3537: Adaptor protein NUMB [Signal transduction mechanisms]. 38746 KOG3538: Disintegrin metalloproteinases with thrombospondin repeats [Posttranslational modification, protein turnover, chaperones]. 38747 KOG3539: Spondins, extracellular matrix proteins [Extracellular structures]. 38748 KOG3540: Beta amyloid precursor protein [General function prediction only]. 38749 KOG3541: Predicted guanine nucleotide exchange factor [Signal transduction mechanisms]. 38750 KOG3542: cAMP-regulated guanine nucleotide exchange factor [Signal transduction mechanisms]. 38751 KOG3543: Ca2+-dependent activator protein [Signal transduction mechanisms]. 38752 KOG3544: Collagens (type IV and type XIII), and related proteins [Extracellular structures]. 38753 KOG3545: Olfactomedin and related extracellular matrix glycoproteins [Extracellular structures]. 38754 KOG3546: Collagens (type XV) [Extracellular structures]. 38755 KOG3547: Bestrophin (Best vitelliform macular dystrophy-associated protein) [General function prediction only]. 38756 KOG3548: DNA damage checkpoint protein RHP9/CRB2/53BP1 [Replication, recombination and repair]. 
38757 KOG3549: Syntrophins (type gamma) [Extracellular structures]. 38758 KOG3550: Receptor targeting protein Lin-7 [Extracellular structures]. 38759 KOG3551: Syntrophins (type beta) [Extracellular structures]. 38760 KOG3552: FERM domain protein FRM-8 [General function prediction only]. 38761 KOG3553: Tax interaction protein TIP1 [Cell wall/membrane/envelope biogenesis]. 38762 KOG3554: Histone deacetylase complex, MTA1 component [Chromatin structure and dynamics]. 38763 KOG3555: Ca2+-binding proteoglycan Testican [General function prediction only]. 38764 KOG3556: Familial cylindromatosis protein [General function prediction only]. 38765 KOG3557: Epidermal growth factor receptor kinase substrate [Signal transduction mechanisms]. 38766 KOG3558: Hypoxia-inducible factor 1/Neuronal PAS domain protein NPAS1 [Signal transduction mechanisms, Transcription]. 38767 KOG3559: Transcriptional regulator SIM1 [Transcription]. 38768 KOG3560: Aryl-hydrocarbon receptor [Transcription]. 38769 KOG3561: Aryl-hydrocarbon receptor nuclear translocator [Transcription]. 38770 KOG3562: Forkhead/HNF-3-related transcription factor [Transcription]. 38771 KOG3563: Forkhead/HNF-3-related transcription factor [Transcription]. 38772 KOG3564: GTPase-activating protein [General function prediction only]. 38773 KOG3565: Cdc42-interacting protein CIP4 [Cytoskeleton]. 38774 KOG3566: Glycosylphosphatidylinositol anchor attachment protein GAA1 [Posttranslational modification, protein turnover, chaperones]. 38775 KOG3567: Peptidylglycine alpha-amidating monooxygenase [Posttranslational modification, protein turnover, chaperones]. 38776 KOG3568: Dopamine beta-monooxygenase [Amino acid transport and metabolism]. 38777 KOG3569: RAS signaling inhibitor ST5 [Signal transduction mechanisms]. 38778 KOG3570: MAPK-activating protein DENN [Signal transduction mechanisms]. 38779 KOG3571: Dishevelled 3 and related proteins [General function prediction only]. 
38780 KOG3572: Uncharacterized conserved protein, contains DEP domain [Signal transduction mechanisms]. 38781 KOG3573: Caspase, apoptotic cysteine protease [Cell cycle control, cell division, chromosome partitioning]. 38782 KOG3574: Acetyl-CoA transporter [Inorganic ion transport and metabolism]. 38783 KOG3576: Ovo and related transcription factors [Transcription]. 38784 KOG3577: Smoothened and related G-protein-coupled receptors [Signal transduction mechanisms]. 38785 KOG3578: Uncharacterized conserved protein [Function unknown]. 38786 KOG3579: Predicted E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 38787 KOG3580: Tight junction proteins [Signal transduction mechanisms]. 38788 KOG3581: Creatine kinases [Energy production and conversion]. 38789 KOG3582: Mlx interactors and related transcription factors [Transcription]. 38790 KOG3583: Uncharacterized conserved protein [Function unknown]. 38791 KOG3584: cAMP response element binding protein and related transcription factors [Transcription]. 38792 KOG3585: TBX2 and related T-box transcription factors [Transcription]. 38793 KOG3586: TBX1 and related T-box transcription factors [Transcription]. 38794 KOG3587: Galectin, galactose-binding lectin [Extracellular structures]. 38795 KOG3588: Chondroitin synthase 1 [Carbohydrate transport and metabolism]. 38796 KOG3589: G protein signaling regulators [Signal transduction mechanisms]. 38797 KOG3590: Protein kinase A anchoring protein [Signal transduction mechanisms]. 38798 KOG3591: Alpha crystallins [Posttranslational modification, protein turnover, chaperones]. 38799 KOG3592: Microtubule-associated proteins [Cytoskeleton]. 38800 KOG3593: Predicted receptor-like serine/threonine kinase [Signal transduction mechanisms]. 38801 KOG3595: Dyneins, heavy chain [Cytoskeleton]. 38802 KOG3596: Uncharacterized conserved protein [Function unknown]. 38803 KOG3597: Proteoglycan [General function prediction only]. 
38804 KOG3598: Thyroid hormone receptor-associated protein complex, subunit TRAP230 [Transcription]. 38805 KOG3599: Ca2+-modulated nonselective cation channel polycystin [Inorganic ion transport and metabolism, Signal transduction mechanisms]. 38806 KOG3600: Thyroid hormone receptor-associated protein complex, subunit TRAP240 [Transcription]. 38807 KOG3601: Adaptor protein GRB2, contains SH2 and SH3 domains [Signal transduction mechanisms]. 38808 KOG3602: WD40 repeat-containing protein [General function prediction only]. 38809 KOG3603: Predicted phospholipase D [General function prediction only]. 38810 KOG3604: Pecanex [Function unknown]. 38811 KOG3605: Beta amyloid precursor-binding protein [General function prediction only]. 38812 KOG3606: Cell polarity protein PAR6 [Signal transduction mechanisms]. 38813 KOG3607: Meltrins, fertilins and related Zn-dependent metalloproteinases of the ADAMs family [Posttranslational modification, protein turnover, chaperones]. 38814 KOG3608: Zn finger proteins [General function prediction only]. 38815 KOG3609: Receptor-activated Ca2+-permeable cation channels (STRPC family) [Inorganic ion transport and metabolism, Signal transduction mechanisms]. 38816 KOG3610: Plexins (functional semaphorin receptors) [Signal transduction mechanisms]. 38817 KOG3611: Semaphorins [Signal transduction mechanisms]. 38818 KOG3612: PHD Zn-finger protein [General function prediction only]. 38819 KOG3613: Dopey and related predicted leucine zipper transcription factors [Transcription]. 38820 KOG3614: Ca2+/Mg2+-permeable cation channels (LTRPC family) [Inorganic ion transport and metabolism, Signal transduction mechanisms]. 38821 KOG3615: Uncharacterized conserved protein [Function unknown]. 38822 KOG3616: Selective LIM binding factor [Transcription]. 38823 KOG3617: WD40 and TPR repeat-containing protein [General function prediction only]. 38824 KOG3618: Adenylyl cyclase [General function prediction only]. 
38825 KOG3619: Adenylate/guanylate cyclase [Energy production and conversion]. 38826 KOG3620: Uncharacterized conserved protein [Function unknown]. 38827 KOG3621: WD40 repeat-containing protein [General function prediction only]. 38828 KOG3622: Uncharacterized conserved protein [Function unknown]. 38829 KOG3623: Homeobox transcription factor SIP1 [Transcription]. 38830 KOG3624: M13 family peptidase [Amino acid transport and metabolism]. 38831 KOG3625: Alpha amylase [Carbohydrate transport and metabolism]. 38832 KOG3626: Organic anion transporter [Secondary metabolites biosynthesis, transport and catabolism]. 38833 KOG3627: Trypsin [Amino acid transport and metabolism]. 38834 KOG3628: Predicted AMP-binding protein [General function prediction only]. 38835 KOG3629: Guanine-nucleotide releasing factor [Signal transduction mechanisms]. 38836 KOG3630: Nuclear pore complex, Nup214/CAN component [Nuclear structure, Intracellular trafficking, secretion, and vesicular transport]. 38837 KOG3631: Alpha-parvin and related focal adhesion proteins [Cytoskeleton]. 38838 KOG3632: Peripheral benzodiazepine receptor PRAX-1 [Signal transduction mechanisms, Cytoskeleton]. 38839 KOG3633: BAG family molecular chaperone regulator 2 [Posttranslational modification, protein turnover, chaperones]. 38840 KOG3634: Troponin [Cytoskeleton]. 38841 KOG3635: Phosphorylase kinase [Carbohydrate transport and metabolism]. 38842 KOG3636: Uncharacterized conserved protein, contains TBC and Rhodanese domains [General function prediction only]. 38843 KOG3637: Vitronectin receptor, alpha subunit [Extracellular structures]. 38844 KOG3638: Sonic hedgehog and related proteins [Signal transduction mechanisms]. 38845 KOG3639: C2 Ca2+-binding motif-containing protein [General function prediction only]. 38846 KOG3640: Actin binding protein Anillin [Cell cycle control, cell division, chromosome partitioning, Cytoskeleton]. 38847 KOG3641: Zinc carboxypeptidase [Amino acid transport and metabolism]. 
38848 KOG3642: GABA receptor [Signal transduction mechanisms]. 38849 KOG3643: GABA receptor [Signal transduction mechanisms]. 38850 KOG3644: Ligand-gated ion channel [Signal transduction mechanisms]. 38851 KOG3645: Acetylcholine receptor [Signal transduction mechanisms]. 38852 KOG3646: Acetylcholine receptor [Intracellular trafficking, secretion, and vesicular transport]. 38853 KOG3647: Predicted coiled-coil protein [General function prediction only]. 38854 KOG3648: Golgi apparatus protein (cysteine-rich fibroblast growth factor receptor) [Intracellular trafficking, secretion, and vesicular transport]. 38855 KOG3650: Predicted coiled-coil protein [General function prediction only]. 38856 KOG3651: Protein kinase C, alpha binding protein [Signal transduction mechanisms]. 38857 KOG3652: Uncharacterized conserved protein [Function unknown]. 38858 KOG3653: Transforming growth factor beta/activin receptor subfamily of serine/threonine kinases [Signal transduction mechanisms]. 38859 KOG3654: Uncharacterized CH domain protein [Cytoskeleton]. 38860 KOG3655: Drebrins and related actin binding proteins [Cytoskeleton]. 38861 KOG3657: Mitochondrial DNA polymerase gamma, catalytic subunit [Replication, recombination and repair]. 38862 KOG3658: Tumor necrosis factor-alpha-converting enzyme (TACE/ADAM17) and related metalloproteases [Extracellular structures]. 38863 KOG3659: Sodium-neurotransmitter symporter [Signal transduction mechanisms]. 38864 KOG3660: Sodium-neurotransmitter symporter [Signal transduction mechanisms]. 38865 KOG3661: Uncharacterized conserved protein [Function unknown]. 38866 KOG3662: Cell division control protein/predicted DNA repair exonuclease [Replication, recombination and repair]. 38867 KOG3663: Nuclear factor I [Transcription]. 38868 KOG3664: Predicted patched transmembrane receptor [Signal transduction mechanisms]. 38869 KOG3665: ZYG-1-like serine/threonine protein kinases [General function prediction only]. 
38870 KOG3666: Uncharacterized conserved protein [Function unknown]. 38871 KOG3667: STAT protein [Transcription, Signal transduction mechanisms]. 38872 KOG3668: Phosphatidylinositol transfer protein [Lipid transport and metabolism, Signal transduction mechanisms]. 38873 KOG3669: Uncharacterized conserved protein, contains dysferlin, TECPR and PH domains [General function prediction only]. 38874 KOG3670: Phospholipase [Lipid transport and metabolism]. 38875 KOG3671: Actin regulatory protein (Wiskott-Aldrich syndrome protein) [Signal transduction mechanisms, Cytoskeleton]. 38876 KOG3672: Histidine acid phosphatase [General function prediction only]. 38877 KOG3673: FtsJ-like RNA methyltransferase [RNA processing and modification]. 38878 KOG3674: FtsJ-like RNA methyltransferase [RNA processing and modification]. 38879 KOG3675: Dipeptidyl peptidase III [General function prediction only]. 38880 KOG3676: Ca2+-permeable cation channel OSM-9 and related channels (OTRPC family) [Inorganic ion transport and metabolism, Signal transduction mechanisms]. 38881 KOG3677: RNA polymerase I-associated factor - PAF67 [Translation, ribosomal structure and biogenesis, Transcription]. 38882 KOG3678: SARM protein (with sterile alpha and armadillo motifs) [Extracellular structures]. 38883 KOG3679: Predicted coiled-coil protein [General function prediction only]. 38884 KOG3680: Uncharacterized conserved protein [Function unknown]. 38885 KOG3681: Alpha-catenin [Extracellular structures]. 38886 KOG3682: Predicted membrane protein (associated with esophageal cancer in humans) [Function unknown]. 38887 KOG3683: Uncharacterized conserved protein [Function unknown]. 38888 KOG3684: Ca2+-activated K+ channel proteins (intermediate/small conductance classes) [Inorganic ion transport and metabolism]. 38889 KOG3685: Uncharacterized conserved protein [Function unknown]. 38890 KOG3686: Rap1-GTPase-activating protein (Rap1GAP) [Signal transduction mechanisms]. 
38891 KOG3687: Tuberin - Rap/ran-GTPase-activating protein [Cell cycle control, cell division, chromosome partitioning, Signal transduction mechanisms]. 38892 KOG3688: Cyclic GMP phosphodiesterase [Signal transduction mechanisms]. 38893 KOG3689: Cyclic nucleotide phosphodiesterase [Signal transduction mechanisms]. 38894 KOG3690: Angiotensin I-converting enzymes - M2 family peptidases [Amino acid transport and metabolism]. 38895 KOG3691: Exocyst complex subunit Sec8 [Intracellular trafficking, secretion, and vesicular transport]. 38896 KOG3692: Uncharacterized conserved protein [Function unknown]. 38897 KOG3693: Uncharacterized conserved protein [Function unknown]. 38898 KOG3694: Protein required for meiosis [Cell cycle control, cell division, chromosome partitioning]. 38899 KOG3695: Uncharacterized conserved protein [Function unknown]. 38900 KOG3696: Aspartyl beta-hydroxylase [Posttranslational modification, protein turnover, chaperones]. 38901 KOG3697: Adaptor protein SHC and related proteins [Signal transduction mechanisms]. 38902 KOG3698: Hyaluronoglucosaminidase [Posttranslational modification, protein turnover, chaperones]. 38903 KOG3699: Cytoskeletal protein Adducin [Signal transduction mechanisms, Cytoskeleton]. 38904 KOG3700: Predicted acyltransferase [General function prediction only]. 38905 KOG3701: TGFbeta receptor signaling protein SMAD and related proteins [Signal transduction mechanisms, Transcription]. 38906 KOG3702: Nuclear polyadenylated RNA binding protein [RNA processing and modification]. 38907 KOG3703: Heparan sulfate N-deacetylase/N-sulfotransferase [Posttranslational modification, protein turnover, chaperones]. 38908 KOG3704: Heparan sulfate D-glucosaminyl 3-O-sulfotransferase [Posttranslational modification, protein turnover, chaperones]. 38909 KOG3705: Glycoprotein 6-alpha-L-fucosyltransferase [Posttranslational modification, protein turnover, chaperones]. 38910 KOG3706: Uncharacterized conserved protein [Function unknown]. 
38911 KOG3707: Uncharacterized conserved protein [Function unknown]. 38912 KOG3708: Uncharacterized conserved protein [Function unknown]. 38913 KOG3709: PACS-1 cytosolic sorting protein [Intracellular trafficking, secretion, and vesicular transport]. 38914 KOG3710: EGL-Nine (EGLN) protein [Signal transduction mechanisms]. 38915 KOG3711: Uncharacterized conserved protein [Function unknown]. 38916 KOG3712: RFX family transcription factor [Transcription]. 38917 KOG3713: Voltage-gated K+ channel KCNB/KCNC [Inorganic ion transport and metabolism]. 38918 KOG3714: Meprin A metalloprotease [Posttranslational modification, protein turnover, chaperones]. 38919 KOG3715: LST7 amino acid permease Golgi transport protein [Intracellular trafficking, secretion, and vesicular transport, Amino acid transport and metabolism]. 38920 KOG3716: Carnitine O-acyltransferase CPTI [Lipid transport and metabolism]. 38921 KOG3717: Carnitine O-acyltransferase CRAT [Lipid transport and metabolism]. 38922 KOG3718: Carnitine O-acyltransferase CROT [Lipid transport and metabolism]. 38923 KOG3719: Carnitine O-acyltransferase CPT2/YAT1 [Lipid transport and metabolism]. 38924 KOG3720: Lysosomal & prostatic acid phosphatases [Lipid transport and metabolism]. 38925 KOG3721: Mitochondrial endonuclease [Nucleotide transport and metabolism]. 38926 KOG3722: Lipocalin-interacting membrane receptor (LIMR) [Defense mechanisms]. 38927 KOG3723: PH domain protein Melted [Signal transduction mechanisms]. 38928 KOG3724: Negative regulator of COPII vesicle formation [Intracellular trafficking, secretion, and vesicular transport]. 38929 KOG3725: SH3 domain protein SH3GLB [Signal transduction mechanisms]. 38930 KOG3726: Uncharacterized conserved protein [Function unknown]. 38931 KOG3727: Mitogen inducible gene product (contains ERM and PH domains) [Cell cycle control, cell division, chromosome partitioning]. 38932 KOG3728: Uridine phosphorylase [Nucleotide transport and metabolism]. 
38933 KOG3729: Mitochondrial glycerol-3-phosphate acyltransferase GPAT [Lipid transport and metabolism]. 38934 KOG3730: Acyl-CoA:dihydroxyacetone-phosphate acyltransferase DHAPAT [Lipid transport and metabolism]. 38935 KOG3731: Sulfatases [Carbohydrate transport and metabolism]. 38936 KOG3732: Staufen and related double-stranded-RNA-binding proteins [Intracellular trafficking, secretion, and vesicular transport, Transcription]. 38937 KOG3733: Mucolipidin and related proteins (TRML subfamily of transient receptor potential proteins) [Inorganic ion transport and metabolism]. 38938 KOG3734: Predicted phosphoglycerate mutase [Carbohydrate transport and metabolism]. 38939 KOG3735: Tropomodulin and leiomodulin [Cytoskeleton]. 38940 KOG3736: Polypeptide N-acetylgalactosaminyltransferase [Posttranslational modification, protein turnover, chaperones]. 38941 KOG3737: Predicted polypeptide N-acetylgalactosaminyltransferase [Posttranslational modification, protein turnover, chaperones]. 38942 KOG3738: Predicted polypeptide N-acetylgalactosaminyltransferase [Posttranslational modification, protein turnover, chaperones]. 38943 KOG3739: Stress-activated MAP kinase-interacting protein, Sin1p [Signal transduction mechanisms]. 38944 KOG3740: Uncharacterized conserved protein [Function unknown]. 38945 KOG3741: Poly(A) ribonuclease subunit [RNA processing and modification]. 38946 KOG3742: Glycogen synthase [Carbohydrate transport and metabolism]. 38947 KOG3743: Recombination signal binding protein-J kappa (CBF1, Su(H), HS2NF5) [Transcription]. 38948 KOG3744: Uncharacterized conserved protein [Function unknown]. 38949 KOG3745: Exocyst subunit - Sec10p [Intracellular trafficking, secretion, and vesicular transport]. 38950 KOG3746: Uncharacterized conserved protein [Function unknown]. 38951 KOG3747: Concentrative Na+-nucleoside cotransporter CNT1/CNT2 [Nucleotide transport and metabolism, Inorganic ion transport and metabolism]. 
38952 KOG3748: Uncharacterized conserved protein [Function unknown]. 38953 KOG3749: Phosphoenolpyruvate carboxykinase [Energy production and conversion]. 38954 KOG3750: Inositol phospholipid synthesis protein, Scs3p [Lipid transport and metabolism]. 38955 KOG3751: Growth factor receptor-bound proteins (GRB7, GRB10, GRB14) [Signal transduction mechanisms]. 38956 KOG3752: Ribonuclease H [Replication, recombination and repair]. 38957 KOG3753: Circadian clock protein period [Signal transduction mechanisms]. 38958 KOG3754: Gamma-glutamylcysteine synthetase [Coenzyme transport and metabolism]. 38959 KOG3755: SATB1 matrix attachment region binding protein [Transcription]. 38960 KOG3756: Pinin (desmosome-associated protein) [Cytoskeleton]. 38961 KOG3757: Microtubule assembly protein Doublecortin and related proteins, contain DCX domain [Cell cycle control, cell division, chromosome partitioning, Cytoskeleton]. 38962 KOG3758: Uncharacterized conserved protein [Function unknown]. 38963 KOG3759: Uncharacterized RUN domain protein [Signal transduction mechanisms]. 38964 KOG3760: Heparan sulfate-glucuronic acid C5-epimerase [Carbohydrate transport and metabolism]. 38965 KOG3761: Choline transporter [Lipid transport and metabolism]. 38966 KOG3762: Predicted transporter [General function prediction only]. 38967 KOG3763: mRNA export factor TAP/MEX67 [RNA processing and modification]. 38968 KOG3764: Vesicular amine transporter [Intracellular trafficking, secretion, and vesicular transport]. 38969 KOG3765: Predicted glycosyltransferase [Carbohydrate transport and metabolism]. 38970 KOG3766: Polycomb group protein SCM/L(3)MBT (tumor-suppressor in Drosophila and humans) [Transcription]. 38971 KOG3767: Sideroflexin [General function prediction only]. 38972 KOG3768: DEAD box RNA helicase [General function prediction only]. 38973 KOG3769: Ribonuclease III domain proteins [Translation, ribosomal structure and biogenesis]. 
38974 KOG3770: Acid sphingomyelinase and PHM5 phosphate metabolism protein [Lipid transport and metabolism]. 38975 KOG3771: Amphiphysin [Intracellular trafficking, secretion, and vesicular transport]. 38976 KOG3772: M-phase inducer phosphatase [Cell cycle control, cell division, chromosome partitioning]. 38977 KOG3773: Adiponutrin and related vesicular transport proteins; predicted alpha/beta hydrolase [Intracellular trafficking, secretion, and vesicular transport]. 38978 KOG3774: Uncharacterized conserved protein Lama [Signal transduction mechanisms]. 38979 KOG3775: Mitogen-activated protein kinase scaffold protein JIP [Signal transduction mechanisms]. 38980 KOG3776: Plasma membrane glycoprotein CD36 and related membrane receptors [Signal transduction mechanisms]. 38981 KOG3777: Uncharacterized conserved protein [Function unknown]. 38982 KOG3778: Uncharacterized conserved protein [Function unknown]. 38983 KOG3779: Homeobox transcription factor prospero [Transcription]. 38984 KOG3780: Thioredoxin binding protein TBP-2/VDUP1 [General function prediction only]. 38985 KOG3781: Dystroglycan [Extracellular structures]. 38986 KOG3782: Predicted membrane protein, contains type II SA sequence [General function prediction only]. 38987 KOG3783: Uncharacterized conserved protein [Function unknown]. 38988 KOG3784: Sorting nexin protein SNX27 [General function prediction only, Signal transduction mechanisms, Intracellular trafficking, secretion, and vesicular transport]. 38989 KOG3785: Uncharacterized conserved protein [Function unknown]. 38990 KOG3786: RNA polymerase II accessory factor Cdc73p [Transcription]. 38991 KOG3787: Glutamate/aspartate and neutral amino acid transporters [Amino acid transport and metabolism]. 38992 KOG3788: Predicted divalent cation transporter [Inorganic ion transport and metabolism]. 38993 KOG3789: Nitrogen permease regulator NLRG/NPR2 [Inorganic ion transport and metabolism]. 38994 KOG3790: Uncharacterized conserved protein [Function unknown]. 
38995 KOG3791: Predicted RNA-binding protein involved in translational regulation [Translation, ribosomal structure and biogenesis, Signal transduction mechanisms]. 38996 KOG3792: Transcription factor NFAT, subunit NF90 [General function prediction only]. 38997 KOG3793: Transcription factor NFAT, subunit NF45 [Transcription]. 38998 KOG3794: CBF1-interacting corepressor CIR and related proteins [Transcription]. 38999 KOG3795: Uncharacterized conserved protein [Function unknown]. 39000 KOG3796: Ammonium transporter RHBG [Intracellular trafficking, secretion, and vesicular transport, General function prediction only]. 39001 KOG3797: Peripheral-type benzodiazepine receptor and related proteins [Signal transduction mechanisms]. 39002 KOG3798: Predicted Zn-dependent hydrolase (beta-lactamase superfamily) [General function prediction only]. 39003 KOG3799: Rab3 effector RIM1 and related proteins, contain Rab3a binding domain [Intracellular trafficking, secretion, and vesicular transport]. 39004 KOG3800: Predicted E3 ubiquitin ligase containing RING finger, subunit of transcription/repair factor TFIIH and CDK-activating kinase assembly factor [Posttranslational modification, protein turnover, chaperones]. 39005 KOG3801: Uncharacterized conserved protein BCN92 [RNA processing and modification]. 39006 KOG3802: Transcription factor OCT-1, contains POU and HOX domains [Transcription]. 39007 KOG3803: Transcription factor containing C2HC type Zn finger [Transcription]. 39008 KOG3804: Transcription factor NERF and related proteins, contain ETS domain [Transcription]. 39009 KOG3805: ERG and related ETS transcription factors [Transcription]. 39010 KOG3806: Predicted transcription factor [Transcription]. 39011 KOG3807: Predicted membrane protein ST7 (tumor suppressor in humans) [General function prediction only]. 39012 KOG3808: Uncharacterized conserved protein [Function unknown]. 39013 KOG3809: Microtubule-binding protein MIP-T3 [Cytoskeleton]. 
39014 KOG3810: Micronutrient transporters (folate transporter family) [Coenzyme transport and metabolism]. 39015 KOG3811: Transcription factor AP-2 [Transcription]. 39016 KOG3812: L-type voltage-dependent Ca2+ channel, beta subunit [Inorganic ion transport and metabolism, Signal transduction mechanisms]. 39017 KOG3813: Uncharacterized conserved protein (tumor-suppressor AXUD1 in humans) [General function prediction only]. 39018 KOG3814: Signaling protein van gogh/strabismus [Signal transduction mechanisms]. 39019 KOG3815: Transcription factor Doublesex [Transcription]. 39020 KOG3816: Cell differentiation regulator of the Headcase family [Signal transduction mechanisms]. 39021 KOG3817: Uncharacterized conserved protein [Function unknown]. 39022 KOG3818: DNA polymerase epsilon, subunit B [Replication, recombination and repair]. 39023 KOG3819: Uncharacterized conserved proteins (Hepatitis delta antigen-interacting protein A) [Function unknown]. 39024 KOG3820: Aromatic amino acid hydroxylase [Amino acid transport and metabolism]. 39025 KOG3821: Heparin sulfate cell surface proteoglycan [Signal transduction mechanisms]. 39026 KOG3822: Succinyl-CoA:alpha-ketoacid-CoA transferase [Energy production and conversion]. 39027 KOG3823: Uncharacterized conserved protein [Function unknown]. 39028 KOG3824: Huntingtin interacting protein HYPE [General function prediction only]. 39029 KOG3825: Deoxyribonuclease II [Replication, recombination and repair]. 39030 KOG3826: Na+/H+ antiporter [Inorganic ion transport and metabolism]. 39031 KOG3827: Inward rectifier K+ channel [Inorganic ion transport and metabolism]. 39032 KOG3828: Uncharacterized conserved protein [Function unknown]. 39033 KOG3829: Uncharacterized conserved protein [Function unknown]. 39034 KOG3830: Uncharacterized conserved protein [Function unknown]. 39035 KOG3831: Uncharacterized conserved protein [Function unknown]. 39036 KOG3832: Predicted amino acid transporter [General function prediction only]. 
39037 KOG3833: Uncharacterized conserved protein, contains RtcB domain [Function unknown]. 39038 KOG3834: Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]. 39039 KOG3835: Transcriptional corepressor NAB1 [Transcription]. 39040 KOG3836: HLH transcription factor EBF/Olf-1 and related DNA binding proteins [Transcription]. 39041 KOG3837: Uncharacterized conserved protein, contains DM14 and C2 domains [General function prediction only]. 39042 KOG3838: Mannose lectin ERGIC-53, involved in glycoprotein traffic [Intracellular trafficking, secretion, and vesicular transport]. 39043 KOG3839: Lectin VIP36, involved in the transport of glycoproteins carrying high mannose-type glycans [Intracellular trafficking, secretion, and vesicular transport]. 39044 KOG3840: Uncharacterized conserved protein, contains BTB/POZ domain [General function prediction only]. 39045 KOG3841: TEF-1 and related transcription factor, TEAD family [Transcription]. 39046 KOG3842: Adaptor protein Pellino [Signal transduction mechanisms]. 39047 KOG3843: Predicted serine hydroxymethyltransferase SLA/LP (autoimmune hepatitis marker in humans) [Translation, ribosomal structure and biogenesis]. 39048 KOG3844: Predicted component of NuA3 histone acetyltransferase complex [Chromatin structure and dynamics]. 39049 KOG3845: MLN, STAR and related lipid-binding proteins [Lipid transport and metabolism]. 39050 KOG3846: L-kynurenine hydrolase [Amino acid transport and metabolism]. 39051 KOG3847: Phospholipase A2 (platelet-activating factor acetylhydrolase in humans) [Lipid transport and metabolism]. 39052 KOG3848: Extracellular protein TEM7, contains PSI domain (tumor endothelial marker in humans) [Extracellular structures]. 39053 KOG3849: GDP-fucose protein O-fucosyltransferase [Posttranslational modification, protein turnover, chaperones]. 39054 KOG3850: Predicted membrane protein [Function unknown]. 
39055 KOG3851: Sulfide:quinone oxidoreductase/flavo-binding protein [Energy production and conversion]. 39056 KOG3852: Uncharacterized conserved protein [Function unknown]. 39057 KOG3853: Inositol monophosphatase [Signal transduction mechanisms]. 39058 KOG3854: SPRT-like metalloprotease [Function unknown]. 39059 KOG3855: Monooxygenase involved in coenzyme Q (ubiquinone) biosynthesis [Coenzyme transport and metabolism, Energy production and conversion]. 39060 KOG3856: Uncharacterized conserved protein [Function unknown]. 39061 KOG3857: Alcohol dehydrogenase, class IV [Energy production and conversion]. 39062 KOG3858: Ephrin, ligand for Eph receptor tyrosine kinase [Signal transduction mechanisms]. 39063 KOG3859: Septins (P-loop GTPases) [Cell cycle control, cell division, chromosome partitioning]. 39064 KOG3860: Acyltransferase required for palmitoylation of Hedgehog (Hh) family of secreted signaling proteins [Signal transduction mechanisms]. 39065 KOG3861: Sensory cilia assembly protein [Extracellular structures]. 39066 KOG3862: Transcription factor PAX2/5/8, contains PAX domain [Transcription]. 39067 KOG3863: bZIP transcription factor NRF1 [Transcription]. 39068 KOG3864: Uncharacterized conserved protein [Function unknown]. 39069 KOG3865: Arrestin [Signal transduction mechanisms]. 39070 KOG3866: DNA-binding protein of the nucleobindin family [General function prediction only]. 39071 KOG3867: Sulfatase [General function prediction only]. 39072 KOG3868: Vacuolar H+-ATPase V0 sector, accessory subunit S1 (Ac45) [Energy production and conversion]. 39073 KOG3869: Uncharacterized conserved protein [Function unknown]. 39074 KOG3870: Uncharacterized conserved protein [Function unknown]. 39075 KOG3871: Cell adhesion complex protein bystin [Extracellular structures]. 39076 KOG3873: Sphingomyelinase family protein [Signal transduction mechanisms]. 39077 KOG3874: Uncharacterized conserved protein [Function unknown]. 
39078 KOG3875: Peroxisomal biogenesis protein peroxin [Intracellular trafficking, secretion, and vesicular transport]. 39079 KOG3876: Arfaptin and related proteins [Signal transduction mechanisms]. 39080 KOG3877: NADH:ubiquinone oxidoreductase, NDUFA10/42kDa subunit [Energy production and conversion]. 39081 KOG3878: Protein involved in maintenance of Golgi structure and ER-Golgi transport [Intracellular trafficking, secretion, and vesicular transport]. 39082 KOG3879: Predicted membrane protein [Function unknown]. 39083 KOG3880: Predicted small molecule transporter involved in cellular pH homeostasis (Batten disease protein in human) [General function prediction only]. 39084 KOG3881: Uncharacterized conserved protein [Function unknown]. 39085 KOG3882: Tetraspanin family integral membrane protein [General function prediction only]. 39086 KOG3883: Ras family small GTPase [Signal transduction mechanisms]. 39087 KOG3884: Neural proliferation, differentiation and control protein [Signal transduction mechanisms]. 39088 KOG3885: Fibroblast growth factor [Signal transduction mechanisms]. 39089 KOG3886: GTP-binding protein [Signal transduction mechanisms]. 39090 KOG3887: Predicted small GTPase involved in nuclear protein import [Intracellular trafficking, secretion, and vesicular transport]. 39091 KOG3888: Gamma-butyrobetaine,2-oxoglutarate dioxygenase [Lipid transport and metabolism]. 39092 KOG3889: Predicted gamma-butyrobetaine,2-oxoglutarate dioxygenase [Lipid transport and metabolism]. 39093 KOG3890: Mitochondrial 28S ribosomal protein S22 [Translation, ribosomal structure and biogenesis]. 39094 KOG3891: Secretory vesicle-associated protein ICA69, contains Arfaptin domain [Signal transduction mechanisms, Intracellular trafficking, secretion, and vesicular transport]. 39095 KOG3892: N-acetyl-glucosamine-6-phosphate deacetylase [Carbohydrate transport and metabolism]. 39096 KOG3893: Mannosyltransferase [Carbohydrate transport and metabolism]. 
39097 KOG3894: SNARE protein Syntaxin 18/UFE1 [Intracellular trafficking, secretion, and vesicular transport]. 39098 KOG3895: Synaptic vesicle protein Synapsin [Signal transduction mechanisms, Intracellular trafficking, secretion, and vesicular transport]. 39099 KOG3896: Dynactin, subunit p62 [Cell motility]. 39100 KOG3897: Conserved protein Knockout [Function unknown]. 39101 KOG3898: Transcription factor NeuroD and related HTH proteins [Transcription]. 39102 KOG3899: Uncharacterized conserved protein [Function unknown]. 39103 KOG3900: Transforming growth factor beta, bone morphogenetic protein and related proteins [Signal transduction mechanisms]. 39104 KOG3901: Transcription initiation factor IID subunit [Transcription]. 39105 KOG3902: Histone acetyltransferase PCAF/SAGA, subunit SUPT3H/SPT3 [Transcription]. 39106 KOG3903: Mitotic checkpoint protein PRCC [Cell cycle control, cell division, chromosome partitioning]. 39107 KOG3904: Predicted hydrolase RP2 (NUDIX/MutT superfamily) [Function unknown]. 39108 KOG3905: Dynein light intermediate chain [Cell motility]. 39109 KOG3906: Tryptophan 2,3-dioxygenase [Amino acid transport and metabolism]. 39110 KOG3907: ZIP-like zinc transporter proteins [Intracellular trafficking, secretion, and vesicular transport]. 39111 KOG3908: Queuine-tRNA ribosyltransferase [RNA processing and modification]. 39112 KOG3909: Queuine-tRNA ribosyltransferase [RNA processing and modification]. 39113 KOG3910: Helix loop helix transcription factor [Transcription]. 39114 KOG3911: Nucleolar protein NOP52/RRP1 [RNA processing and modification]. 39115 KOG3912: Predicted integral membrane protein [General function prediction only]. 39116 KOG3913: Wnt family of developmental regulators [Signal transduction mechanisms]. 39117 KOG3914: WD repeat protein WDR4 [Function unknown]. 39118 KOG3915: Transcription regulator dachshund, contains SKI/SNO domain [Transcription]. 
39119 KOG3916: UDP-Gal:glucosylceramide beta-1,4-galactosyltransferase [Carbohydrate transport and metabolism]. 39120 KOG3917: Beta-1,4-galactosyltransferase B4GALT7/SQV-3 [Carbohydrate transport and metabolism]. 39121 KOG3918: Predicted membrane protein [Function unknown]. 39122 KOG3919: Kinesin-associated fasciculation and elongation protein involved in axonal transport [Intracellular trafficking, secretion, and vesicular transport]. 39123 KOG3920: Uncharacterized conserved protein, contains PA domain [General function prediction only]. 39124 KOG3921: Uncharacterized conserved protein [Function unknown]. 39125 KOG3922: Sulfotransferases [Posttranslational modification, protein turnover, chaperones]. 39126 KOG3923: D-aspartate oxidase [Amino acid transport and metabolism]. 39127 KOG3924: Putative protein methyltransferase involved in meiosis and transcriptional silencing (Dot1) [Cell cycle control, cell division, chromosome partitioning, Transcription]. 39128 KOG3925: Uncharacterized conserved protein [Function unknown]. 39129 KOG3926: F-box proteins [Amino acid transport and metabolism]. 39130 KOG3927: Na+/K+ ATPase, beta subunit [Inorganic ion transport and metabolism]. 39131 KOG3928: Mitochondrial ribosome small subunit component, mediator of apoptosis DAP3 [Translation, ribosomal structure and biogenesis]. 39132 KOG3929: Uncharacterized conserved protein [Function unknown]. 39133 KOG3930: Uncharacterized conserved protein [Function unknown]. 39134 KOG3931: Uncharacterized conserved protein [Function unknown]. 39135 KOG3932: CDK5 kinase activator p35/Nck5a [Cell cycle control, cell division, chromosome partitioning]. 39136 KOG3933: Mitochondrial ribosomal protein S28 [Translation, ribosomal structure and biogenesis]. 39137 KOG3934: Histone mRNA stem-loop binding protein [RNA processing and modification]. 39138 KOG3935: Predicted glycerate kinase [Carbohydrate transport and metabolism]. 39139 KOG3936: Nitroreductases [Energy production and conversion]. 
39140 KOG3937: mRNA splicing factor [RNA processing and modification]. 39141 KOG3938: RGS-GAIP interacting protein GIPC, contains PDZ domain [Signal transduction mechanisms, Intracellular trafficking, secretion, and vesicular transport]. 39142 KOG3939: Selenophosphate synthetase [Signal transduction mechanisms]. 39143 KOG3940: Uncharacterized conserved protein [Function unknown]. 39144 KOG3941: Intermediate in Toll signal transduction pathway (ECSIT) [Signal transduction mechanisms]. 39145 KOG3942: MIF4G domain-containing protein [Translation, ribosomal structure and biogenesis]. 39146 KOG3943: THUMP domain-containing proteins [General function prediction only]. 39147 KOG3944: Uncharacterized conserved protein [Function unknown]. 39148 KOG3945: Uncharacterized conserved protein [Function unknown]. 39149 KOG3946: Glutaminyl cyclase [Posttranslational modification, protein turnover, chaperones]. 39150 KOG3947: Phosphoesterases [General function prediction only]. 39151 KOG3948: Mediator of U snRNA nuclear export PHAX [RNA processing and modification]. 39152 KOG3949: RNA polymerase II elongator complex, subunit ELP4 [Chromatin structure and dynamics, Transcription]. 39153 KOG3950: Gamma/delta sarcoglycan [Cytoskeleton]. 39154 KOG3951: Uncharacterized conserved protein [Function unknown]. 39155 KOG3952: Uncharacterized conserved protein [Function unknown]. 39156 KOG3953: SOCS box protein SSB-1, contains SPRY domain [General function prediction only]. 39157 KOG3954: Electron transfer flavoprotein, alpha subunit [Energy production and conversion]. 39158 KOG3955: Heparan sulfate 6-O-sulfotransferase [Cell wall/membrane/envelope biogenesis, Carbohydrate transport and metabolism]. 39159 KOG3956: Alpha 2-macroglobulin receptor-associated protein [Posttranslational modification, protein turnover, chaperones, Intracellular trafficking, secretion, and vesicular transport, Signal transduction mechanisms, Lipid transport and metabolism, Defense mechanisms]. 
39160 KOG3957: Predicted L-carnitine dehydratase/alpha-methylacyl-CoA racemase [Lipid transport and metabolism]. 39161 KOG3958: Putative dynamitin [Cytoskeleton]. 39162 KOG3959: 2-Oxoglutarate- and iron-dependent dioxygenase-related proteins [General function prediction only]. 39163 KOG3960: Myogenic helix-loop-helix transcription factor [Transcription]. 39164 KOG3961: Uncharacterized conserved protein [Function unknown]. 39165 KOG3962: Predicted actin-bundling protein [Cytoskeleton]. 39166 KOG3963: Mab-21-like cell fate specification proteins [Signal transduction mechanisms]. 39167 KOG3964: Phosphatidylglycerolphosphate synthase [Lipid transport and metabolism]. 39168 KOG3965: Uncharacterized conserved protein [Function unknown]. 39169 KOG3966: p53-mediated apoptosis protein EI24/PIG8 [Signal transduction mechanisms, Defense mechanisms]. 39170 KOG3967: Uncharacterized conserved protein [Function unknown]. 39171 KOG3968: Atrazine chlorohydrolase/guanine deaminase [Nucleotide transport and metabolism, Secondary metabolites biosynthesis, transport and catabolism]. 39172 KOG3969: Uncharacterized conserved protein [Function unknown]. 39173 KOG3970: Predicted E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 39174 KOG3971: SAP-90/PSD-95 associated protein-related protein (Vulcan) [Signal transduction mechanisms]. 39175 KOG3972: Predicted membrane protein [Function unknown]. 39176 KOG3973: Uncharacterized conserved glycine-rich protein [Function unknown]. 39177 KOG3974: Predicted sugar kinase [Carbohydrate transport and metabolism]. 39178 KOG3975: Uncharacterized conserved protein [Function unknown]. 39179 KOG3976: Mitochondrial F1F0-ATP synthase, subunit b/ATP4 [Energy production and conversion]. 39180 KOG3977: Troponin I [Cytoskeleton]. 39181 KOG3978: Predicted membrane protein [Function unknown]. 39182 KOG3979: FGF receptor activating protein 1 [Signal transduction mechanisms]. 
39183 KOG3980: RNA 3'-terminal phosphate cyclase [RNA processing and modification]. 39184 KOG3981: Deoxyribose-phosphate aldolase [Nucleotide transport and metabolism]. 39185 KOG3982: Runt and related transcription factors [Transcription]. 39186 KOG3983: Golgi protein [Intracellular trafficking, secretion, and vesicular transport]. 39187 KOG3984: Purine nucleoside phosphorylase [Nucleotide transport and metabolism]. 39188 KOG3985: Methylthioadenosine phosphorylase MTAP [Nucleotide transport and metabolism]. 39189 KOG3986: Protein phosphatase, regulatory subunit PPP1R3C/D [Posttranslational modification, protein turnover, chaperones, Signal transduction mechanisms]. 39190 KOG3987: Uncharacterized conserved protein DREV/CGI-81 [Function unknown]. 39191 KOG3988: Protein-tyrosine sulfotransferase TPST1/TPST2 [Posttranslational modification, protein turnover, chaperones]. 39192 KOG3989: Beta-2-glycoprotein I [Extracellular structures]. 39193 KOG3990: Uncharacterized conserved protein [Function unknown]. 39194 KOG3991: Uncharacterized conserved protein [Function unknown]. 39195 KOG3992: Uncharacterized conserved protein [Function unknown]. 39196 KOG3993: Transcription factor (contains Zn finger) [Transcription]. 39197 KOG3994: Uncharacterized conserved protein [Function unknown]. 39198 KOG3995: 3-hydroxyanthranilate oxygenase HAAO [Amino acid transport and metabolism]. 39199 KOG3996: Holocytochrome c synthase/heme-lyase [Energy production and conversion, Posttranslational modification, protein turnover, chaperones]. 39200 KOG3997: Major apurinic/apyrimidinic endonuclease/3'-repair diesterase APN1 [Replication, recombination and repair]. 39201 KOG3998: Putative cargo transport protein ERV29 [Intracellular trafficking, secretion, and vesicular transport]. 39202 KOG3999: Checkpoint 9-1-1 complex, HUS1 component [Energy production and conversion, Replication, recombination and repair]. 39203 KOG4000: Uncharacterized conserved protein [Function unknown]. 
39204 KOG4001: Axonemal dynein light chain [Cytoskeleton]. 39205 KOG4002: Uncharacterized integral membrane protein [Function unknown]. 39206 KOG4003: Pyrazinamidase/nicotinamidase PNC1 [Defense mechanisms]. 39207 KOG4004: Matricellular protein Osteonectin/SPARC/BM-40 [Extracellular structures]. 39208 KOG4005: Transcription factor XBP-1 [Transcription]. 39209 KOG4006: Anti-proliferation factor BTG1/TOB [Signal transduction mechanisms, General function prediction only]. 39210 KOG4007: Uncharacterized conserved protein [Function unknown]. 39211 KOG4008: rRNA processing protein RRP7 [RNA processing and modification]. 39212 KOG4009: NADH-ubiquinone oxidoreductase, subunit NDUFB10/PDSW [Energy production and conversion]. 39213 KOG4010: Coiled-coil protein TPD52 [General function prediction only]. 39214 KOG4011: Transcription initiation factor TFIID, subunit TAF7 [Transcription]. 39215 KOG4012: Histone H1 [Chromatin structure and dynamics]. 39216 KOG4013: Predicted Cu2+ homeostasis protein CutC [Inorganic ion transport and metabolism]. 39217 KOG4014: Uncharacterized conserved protein (contains TPR repeat) [Function unknown]. 39218 KOG4015: Fatty acid-binding protein FABP [Lipid transport and metabolism]. 39219 KOG4016: Synaptic vesicle protein Synaptogyrin involved in regulation of Ca2+-dependent exocytosis [Signal transduction mechanisms, Intracellular trafficking, secretion, and vesicular transport]. 39220 KOG4017: DNA excision repair protein XPA/XPAC/RAD14 [Replication, recombination and repair]. 39221 KOG4018: Uncharacterized conserved protein, contains RWD domain [Function unknown]. 39222 KOG4019: Calcineurin-mediated signaling pathway inhibitor DSCR1 [Signal transduction mechanisms, General function prediction only]. 39223 KOG4020: Protein DRE2, required for cell viability [Function unknown]. 39224 KOG4021: Mitochondrial ribosomal protein S18b [Translation, ribosomal structure and biogenesis]. 
39225 KOG4022: Dihydropteridine reductase DHPR/QDPR [Amino acid transport and metabolism]. 39226 KOG4023: Uncharacterized conserved protein [Function unknown]. 39227 KOG4024: Complement component 1, Q subcomponent binding protein/mRNA splicing factor SF2, subunit P32 [Defense mechanisms]. 39228 KOG4025: Putative apoptosis related protein [Function unknown]. 39229 KOG4026: Uncharacterized conserved protein [Function unknown]. 39230 KOG4027: Uncharacterized conserved protein [Function unknown]. 39231 KOG4028: Uncharacterized conserved protein [Function unknown]. 39232 KOG4029: Transcription factor HAND2/Transcription factor TAL1/TAL2/LYL1 [Transcription]. 39233 KOG4030: Uncharacterized conserved protein, contains SPRY domain [Function unknown]. 39234 KOG4031: Vesicle coat protein clathrin, light chain [Intracellular trafficking, secretion, and vesicular transport]. 39235 KOG4032: Uncharacterized conserved protein [Function unknown]. 39236 KOG4033: Uncharacterized conserved protein [Function unknown]. 39237 KOG4034: Uncharacterized conserved protein NOF (Neighbor of FAU) [Function unknown]. 39238 KOG4035: Coeffector of mDia Rho GTPase, regulates actin polymerization and cell adhesion turnover [Signal transduction mechanisms, Cytoskeleton]. 39239 KOG4036: Uncharacterized conserved protein [Function unknown]. 39240 KOG4037: Photoreceptor synaptic vesicle protein HRG4/UNC-119 [Intracellular trafficking, secretion, and vesicular transport, Signal transduction mechanisms]. 39241 KOG4038: cGMP-phosphodiesterase, delta subunit [Signal transduction mechanisms]. 39242 KOG4039: Serine/threonine kinase TIP30/CC3 [Signal transduction mechanisms]. 39243 KOG4040: NADH:ubiquinone oxidoreductase, NDUFB8/ASHI subunit [Energy production and conversion]. 39244 KOG4041: Protein phosphatase 1, regulatory (inhibitor) subunit PPP1R2 [Posttranslational modification, protein turnover, chaperones, Signal transduction mechanisms]. 
39245 KOG4042: Dynactin subunit p27/WS-3, involved in transport of organelles along microtubules [Intracellular trafficking, secretion, and vesicular transport, Cytoskeleton]. 39246 KOG4043: Uncharacterized conserved protein [Function unknown]. 39247 KOG4044: Mitochondrial associated endoribonuclease MAR1 (isochorismatase superfamily) [General function prediction only]. 39248 KOG4045: Uncharacterized conserved protein [Function unknown]. 39249 KOG4046: RNase MRP and P, subunit POP4/p29 [RNA processing and modification]. 39250 KOG4047: Docking protein 1 (p62dok) [Signal transduction mechanisms]. 39251 KOG4048: Uncharacterized conserved protein [Function unknown]. 39252 KOG4049: Proliferation-related protein MLF [Function unknown]. 39253 KOG4050: Glutamate transporter EAAC1-interacting protein GTRAP3-18 [Amino acid transport and metabolism, Signal transduction mechanisms]. 39254 KOG4051: Uncharacterized conserved protein [Function unknown]. 39255 KOG4052: Uncharacterized conserved protein [Function unknown]. 39256 KOG4053: Ataxin-1, involved in Ca2+ homeostasis [Function unknown]. 39257 KOG4054: Uncharacterized conserved protein [Function unknown]. 39258 KOG4055: Uncharacterized conserved protein [Function unknown]. 39259 KOG4056: Translocase of outer mitochondrial membrane complex, subunit TOM20 [Intracellular trafficking, secretion, and vesicular transport]. 39260 KOG4057: Uncharacterized conserved protein [Function unknown]. 39261 KOG4058: Uncharacterized conserved protein [Function unknown]. 39262 KOG4059: Uncharacterized conserved protein [Function unknown]. 39263 KOG4060: Uncharacterized conserved protein [Function unknown]. 39264 KOG4061: DMQ mono-oxygenase/Ubiquinone biosynthesis protein COQ7/CLK-1/CAT5 [General function prediction only]. 39265 KOG4062: 6-O-methylguanine-DNA methyltransferase MGMT/MGT1, involved in DNA repair [Replication, recombination and repair]. 39266 KOG4063: Major epididymal secretory protein HE1 [Function unknown]. 
39267 KOG4064: Cysteine dioxygenase CDO1 [Amino acid transport and metabolism]. 39268 KOG4065: Uncharacterized conserved protein [Function unknown]. 39269 KOG4066: Cell growth regulatory protein CGR11 [Function unknown]. 39270 KOG4067: Uncharacterized conserved protein [Function unknown]. 39271 KOG4068: Uncharacterized conserved protein [Function unknown]. 39272 KOG4069: Uncharacterized conserved protein [Function unknown]. 39273 KOG4070: Putative signal transduction protein p25 [General function prediction only, Signal transduction mechanisms]. 39274 KOG4071: Uncharacterized conserved protein [Function unknown]. 39275 KOG4072: Signal peptidase complex, subunit SPC25 [Intracellular trafficking, secretion, and vesicular transport]. 39276 KOG4073: Pterin carbinolamine dehydratase PCBD/dimerization cofactor of HNF1 [Transcription]. 39277 KOG4074: Leucine zipper nuclear factor [Function unknown]. 39278 KOG4075: Cytochrome c oxidase, subunit IV/COX5b [Energy production and conversion]. 39279 KOG4076: Regulator of ATP-sensitive K+ channels Alpha-endosulfine/ARPP-19 and related cAMP-regulated phosphoproteins [Signal transduction mechanisms, Intracellular trafficking, secretion, and vesicular transport]. 39280 KOG4077: Cytochrome c oxidase, subunit Va/COX6 [Energy production and conversion]. 39281 KOG4078: Putative mitochondrial ribosomal protein mRpS35 [Translation, ribosomal structure and biogenesis]. 39282 KOG4079: Putative mitochondrial ribosomal protein mRpS25 [Translation, ribosomal structure and biogenesis]. 39283 KOG4080: Mitochondrial ribosomal protein L32 [Translation, ribosomal structure and biogenesis]. 39284 KOG4081: Dynein light chain [Cell motility]. 39285 KOG4082: Uncharacterized conserved protein [Function unknown]. 39286 KOG4083: Head-elevated expression protein [Transcription]. 39287 KOG4084: Transmembrane protein [General function prediction only]. 39288 KOG4085: Uncharacterized conserved protein [Function unknown]. 
39289 KOG4086: Transcriptional regulator SOH1 [Transcription, Replication, recombination and repair]. 39290 KOG4087: Phospholipase A2 [Lipid transport and metabolism]. 39291 KOG4088: Translocon-associated complex TRAP, delta subunit [Intracellular trafficking, secretion, and vesicular transport]. 39292 KOG4089: Predicted mitochondrial ribosomal protein L23 [Translation, ribosomal structure and biogenesis]. 39293 KOG4090: Uncharacterized conserved protein [Function unknown]. 39294 KOG4091: Transcription factor [Transcription]. 39295 KOG4092: Mitochondrial F1F0-ATP synthase, subunit f [Energy production and conversion]. 39296 KOG4093: Uncharacterized conserved protein [Function unknown]. 39297 KOG4094: Uncharacterized conserved protein [Function unknown]. 39298 KOG4095: Uncharacterized conserved protein (tumor-specific protein BCL7 in humans) [General function prediction only]. 39299 KOG4096: Uncharacterized conserved protein [Function unknown]. 39300 KOG4097: Succinate dehydrogenase membrane anchor subunit and related proteins [Energy production and conversion, Intracellular trafficking, secretion, and vesicular transport]. 39301 KOG4098: Molecular chaperone Prefoldin, subunit 2 [Posttranslational modification, protein turnover, chaperones]. 39302 KOG4099: Predicted membrane protein [Function unknown]. 39303 KOG4100: Uncharacterized conserved protein [Function unknown]. 39304 KOG4101: Cysteine-rich hydrophobic proteins [General function prediction only]. 39305 KOG4102: Uncharacterized conserved protein [Function unknown]. 39306 KOG4103: Mitochondrial F1F0-ATP synthase, subunit g/ATP20 [Energy production and conversion]. 39307 KOG4104: Ganglioside-induced differentiation associated protein 3 [Signal transduction mechanisms]. 39308 KOG4105: 6-pyruvoyl tetrahydrobiopterin synthase [Coenzyme transport and metabolism]. 39309 KOG4106: Uncharacterized conserved protein [Function unknown]. 39310 KOG4107: MP1 adaptor interacting protein P14 [Signal transduction mechanisms]. 
39311 KOG4108: Dynein light chain [Cell motility]. 39312 KOG4109: Histone H3 (Lys4) methyltransferase complex, subunit CPS25/DPY-30 [Transcription]. 39313 KOG4110: NADH:ubiquinone oxidoreductase, NDUFS5/15kDa [Energy production and conversion]. 39314 KOG4111: Translocase of outer mitochondrial membrane complex, subunit TOM22 [Intracellular trafficking, secretion, and vesicular transport]. 39315 KOG4112: Signal peptidase subunit [Intracellular trafficking, secretion, and vesicular transport]. 39316 KOG4113: Guanine nucleotide exchange factor [Signal transduction mechanisms, Intracellular trafficking, secretion, and vesicular transport]. 39317 KOG4114: Cytochrome c oxidase assembly protein PET191 [Posttranslational modification, protein turnover, chaperones]. 39318 KOG4115: Dynein-associated protein Roadblock [Cell cycle control, cell division, chromosome partitioning, Cell motility]. 39319 KOG4116: Ubiquinol cytochrome c reductase, subunit QCR8 [Energy production and conversion]. 39320 KOG4117: Heat shock factor binding protein [Transcription, Posttranslational modification, protein turnover, chaperones]. 39321 KOG4118: Uncharacterized conserved protein [Function unknown]. 39322 KOG4119: G protein gamma subunit [Signal transduction mechanisms]. 39323 KOG4120: G/T mismatch-specific thymine DNA glycosylase [Replication, recombination and repair]. 39324 KOG4121: Nuclear pore complex, Nup133 component (sc Nup133) [Nuclear structure, Intracellular trafficking, secretion, and vesicular transport]. 39325 KOG4122: Mitochondrial/chloroplast ribosomal protein L36 [Translation, ribosomal structure and biogenesis]. 39326 KOG4123: Putative alpha 1,2 mannosyltransferase [Carbohydrate transport and metabolism]. 39327 KOG4124: Putative transcriptional repressor regulating G2/M transition [Transcription, Cell cycle control, cell division, chromosome partitioning]. 39328 KOG4125: Acid trehalase [Carbohydrate transport and metabolism]. 
39329 KOG4126: Alkaline phosphatase [Inorganic ion transport and metabolism]. 39330 KOG4127: Renal dipeptidase [Posttranslational modification, protein turnover, chaperones]. 39331 KOG4128: Bleomycin hydrolases and aminopeptidases of cysteine protease family [Amino acid transport and metabolism]. 39332 KOG4129: Exopolyphosphatases and related proteins [Energy production and conversion]. 39333 KOG4130: Prenyl protein protease [Posttranslational modification, protein turnover, chaperones]. 39334 KOG4131: Ngg1-interacting factor 3 protein NIF3L1 [General function prediction only]. 39335 KOG4132: Uroporphyrinogen III synthase UROS/HEM4 [Coenzyme transport and metabolism]. 39336 KOG4133: tRNA splicing endonuclease [Translation, ribosomal structure and biogenesis]. 39337 KOG4134: DNA-dependent RNA polymerase I [Transcription]. 39338 KOG4135: Predicted phosphoglucosamine acetyltransferase [Carbohydrate transport and metabolism]. 39339 KOG4136: Predicted mitochondrial cholesterol transporter [Signal transduction mechanisms, Lipid transport and metabolism]. 39340 KOG4137: Uncharacterized conserved protein [Function unknown]. 39341 KOG4138: Uncharacterized conserved protein (estrogen up-regulated protein E2IG2 in human) [General function prediction only]. 39342 KOG4139: Protein kinase essential for the initiation of DNA replication [Replication, recombination and repair, Cell cycle control, cell division, chromosome partitioning]. 39343 KOG4140: Nuclear protein Ataxin-7 [Chromatin structure and dynamics]. 39344 KOG4141: DNA repair and recombination protein RAD52/RAD22 [Replication, recombination and repair]. 39345 KOG4142: Phospholipid methyltransferase [Lipid transport and metabolism]. 39346 KOG4143: Sigma receptor and C-8 sterol isomerase [Signal transduction mechanisms]. 39347 KOG4144: Arylalkylamine N-acetyltransferase [General function prediction only]. 39348 KOG4145: Allantoicase [Nucleotide transport and metabolism]. 
39349 KOG4146: Ubiquitin-like protein [Posttranslational modification, protein turnover, chaperones]. 39350 KOG4147: Uncharacterized conserved protein [Function unknown]. 39351 KOG4148: Uncharacterized conserved protein [Function unknown]. 39352 KOG4149: Uncharacterized conserved protein [Function unknown]. 39353 KOG4150: Predicted ATP-dependent RNA helicase [RNA processing and modification]. 39354 KOG4151: Myosin assembly protein/sexual cycle protein and related proteins [Posttranslational modification, protein turnover, chaperones, Cell cycle control, cell division, chromosome partitioning, General function prediction only]. 39355 KOG4152: Host cell transcription factor HCFC1 [Cell cycle control, cell division, chromosome partitioning, Transcription]. 39356 KOG4153: Fructose 1,6-bisphosphate aldolase [Carbohydrate transport and metabolism]. 39357 KOG4154: Arginine-rich protein [General function prediction only]. 39358 KOG4156: Claspin, protein mediating phosphorylation and activation of Chk1 protein kinase in the DNA replication checkpoint response [Cell cycle control, cell division, chromosome partitioning, Signal transduction mechanisms]. 39359 KOG4157: beta-1,6-N-acetylglucosaminyltransferase, contains WSC domain [Carbohydrate transport and metabolism, Posttranslational modification, protein turnover, chaperones]. 39360 KOG4158: BRPK/PTEN-induced protein kinase [Signal transduction mechanisms]. 39361 KOG4159: Predicted E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 39362 KOG4160: BPI/LBP/CETP family protein [Defense mechanisms]. 39363 KOG4161: Methyl-CpG binding transcription regulators [Transcription, Chromatin structure and dynamics]. 39364 KOG4162: Predicted calmodulin-binding protein [Signal transduction mechanisms]. 39365 KOG4163: Prolyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 39366 KOG4164: Cyclin ik3-1/CABLES [Cell cycle control, cell division, chromosome partitioning]. 
39367 KOG4165: Gamma-glutamyl phosphate reductase [Amino acid transport and metabolism]. 39368 KOG4166: Thiamine pyrophosphate-requiring enzyme [Amino acid transport and metabolism, Coenzyme transport and metabolism]. 39369 KOG4167: Predicted DNA-binding protein, contains SANT and ELM2 domains [Transcription]. 39370 KOG4168: Predicted RNA polymerase III subunit C17 [Transcription]. 39371 KOG4169: 15-hydroxyprostaglandin dehydrogenase and related dehydrogenases [Lipid transport and metabolism, General function prediction only]. 39372 KOG4170: 2-enoyl-CoA hydratase/3-hydroxyacyl-CoA dehydrogenase/Peroxisomal 3-ketoacyl-CoA-thiolase, sterol-binding domain and related enzymes [Lipid transport and metabolism]. 39373 KOG4171: Adenylate/guanylate kinase [Nucleotide transport and metabolism]. 39374 KOG4172: Predicted E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 39375 KOG4173: Alpha-SNAP protein [Intracellular trafficking, secretion, and vesicular transport]. 39376 KOG4174: Uncharacterized conserved protein [Function unknown]. 39377 KOG4175: Tryptophan synthase alpha chain [Amino acid transport and metabolism]. 39378 KOG4176: Uncharacterized conserved protein [Function unknown]. 39379 KOG4177: Ankyrin [Cell wall/membrane/envelope biogenesis]. 39380 KOG4178: Soluble epoxide hydrolase [Lipid transport and metabolism]. 39381 KOG4179: Lysyl hydrolase/glycosyltransferase family 25 [Posttranslational modification, protein turnover, chaperones]. 39382 KOG4180: Predicted kinase [General function prediction only]. 39383 KOG4181: Uncharacterized conserved protein [Function unknown]. 39384 KOG4182: Uncharacterized conserved protein [Function unknown]. 39385 KOG4183: RNA polymerase I 49 kDa subunit [Transcription]. 39386 KOG4184: Predicted sugar kinase [Carbohydrate transport and metabolism, General function prediction only]. 39387 KOG4185: Predicted E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 
39388 KOG4186: Peroxisomal biogenesis protein (peroxin) [Intracellular trafficking, secretion, and vesicular transport]. 39389 KOG4187: Proprotein convertase (PC) 2 chaperone involved in secretion (neuroendocrine protein 7B2) [Posttranslational modification, protein turnover, chaperones, Intracellular trafficking, secretion, and vesicular transport]. 39390 KOG4188: Uncharacterized conserved protein [Function unknown]. 39391 KOG4189: Uncharacterized conserved protein [Function unknown]. 39392 KOG4190: Uncharacterized conserved protein [Function unknown]. 39393 KOG4191: Histone acetyltransferases PCAF/SAGA/ADA, subunit TADA3L/NGG1 [Chromatin structure and dynamics]. 39394 KOG4192: Uncharacterized conserved protein [Function unknown]. 39395 KOG4193: G protein-coupled receptors [Signal transduction mechanisms]. 39396 KOG4194: Membrane glycoprotein LIG-1 [Signal transduction mechanisms]. 39397 KOG4195: Transient receptor potential-related channel 7 [Inorganic ion transport and metabolism]. 39398 KOG4196: bZIP transcription factor MafK [Transcription]. 39399 KOG4198: RNA-binding Ran Zn-finger protein and related proteins [General function prediction only]. 39400 KOG4199: Uncharacterized conserved protein [Function unknown]. 39401 KOG4200: Uncharacterized conserved protein [Function unknown]. 39402 KOG4201: Anthranilate synthase component II [Amino acid transport and metabolism]. 39403 KOG4202: Phosphoribosylanthranilate isomerase [Amino acid transport and metabolism]. 39404 KOG4203: Armadillo/beta-Catenin/plakoglobin [Signal transduction mechanisms, Cytoskeleton]. 39405 KOG4204: Histone deacetylase complex, SIN3 component [Chromatin structure and dynamics]. 39406 KOG4205: RNA-binding protein musashi/mRNA cleavage and polyadenylation factor I complex, subunit HRP1 [RNA processing and modification]. 39407 KOG4206: Spliceosomal protein snRNP-U1A/U2B [RNA processing and modification]. 
39408 KOG4207: Predicted splicing factor, SR protein superfamily [RNA processing and modification]. 39409 KOG4208: Nucleolar RNA-binding protein NIFK [General function prediction only]. 39410 KOG4209: Splicing factor RNPS1, SR protein superfamily [RNA processing and modification]. 39411 KOG4210: Nuclear localization sequence binding protein [Transcription]. 39412 KOG4211: Splicing factor hnRNP-F and related RNA-binding proteins [RNA processing and modification]. 39413 KOG4212: RNA-binding protein hnRNP-M [RNA processing and modification]. 39414 KOG4213: RNA-binding protein La [RNA processing and modification]. 39415 KOG4214: Myotrophin and similar proteins [Transcription]. 39416 KOG4215: Hepatocyte nuclear factor 4 and similar steroid hormone receptors [Transcription]. 39417 KOG4216: Steroid hormone nuclear receptor [Transcription]. 39418 KOG4217: Nuclear receptors of the nerve growth factor-induced protein B type [Transcription]. 39419 KOG4218: Nuclear hormone receptor betaFTZ-F1 [Transcription]. 39420 KOG4219: G protein-coupled receptor [Signal transduction mechanisms]. 39421 KOG4220: Muscarinic acetylcholine receptor [Signal transduction mechanisms]. 39422 KOG4221: Receptor mediating netrin-dependent axon guidance [Signal transduction mechanisms]. 39423 KOG4222: Axon guidance receptor Dscam [Signal transduction mechanisms]. 39424 KOG4223: Reticulocalbin, calumenin, DNA supercoiling factor, and related Ca2+-binding proteins of the CREC family (EF-Hand protein superfamily) [Signal transduction mechanisms, Intracellular trafficking, secretion, and vesicular transport]. 39425 KOG4224: Armadillo repeat protein VAC8 required for vacuole fusion, inheritance and cytosol-to-vacuole protein targeting [Intracellular trafficking, secretion, and vesicular transport]. 39426 KOG4225: Sorbin and SH3 domain-containing protein [Signal transduction mechanisms]. 39427 KOG4226: Adaptor protein NCK/Dock, contains SH2 and SH3 domains [Signal transduction mechanisms]. 
39428 KOG4227: WD40 repeat protein [General function prediction only]. 39429 KOG4228: Protein tyrosine phosphatase [Signal transduction mechanisms]. 39430 KOG4229: Myosin VII, myosin IXB and related myosins [Cell motility]. 39431 KOG4230: C1-tetrahydrofolate synthase [Coenzyme transport and metabolism]. 39432 KOG4231: Intracellular membrane-bound Ca2+-independent phospholipase A2 [Lipid transport and metabolism]. 39433 KOG4232: Delta 6-fatty acid desaturase/delta-8 sphingolipid desaturase [Lipid transport and metabolism]. 39434 KOG4233: DNA-bridging protein BAF [Chromatin structure and dynamics, Replication, recombination and repair]. 39435 KOG4234: TPR repeat-containing protein [General function prediction only]. 39436 KOG4235: Mitochondrial thymidine kinase 2/deoxyguanosine kinase [Nucleotide transport and metabolism]. 39437 KOG4236: Serine/threonine protein kinase PKC mu/PKD and related proteins [Signal transduction mechanisms]. 39438 KOG4237: Extracellular matrix protein slit, contains leucine-rich and EGF-like repeats [Extracellular structures, Signal transduction mechanisms]. 39439 KOG4238: Bifunctional ATP sulfurylase/adenosine 5'-phosphosulfate kinase [Nucleotide transport and metabolism]. 39440 KOG4239: Ras GTPase effector RASSF2 [Signal transduction mechanisms]. 39441 KOG4240: Multidomain protein, contains SPEC repeats, PH, SH3, and separate rac-specific and rho-specific guanine nucleotide exchange factor domains [Signal transduction mechanisms]. 39442 KOG4241: Mitochondrial ribosomal protein L10 [Translation, ribosomal structure and biogenesis]. 39443 KOG4242: Predicted myosin-I-binding protein [Cell motility]. 39444 KOG4243: Macrophage maturation-associated protein [Defense mechanisms]. 39445 KOG4244: Failed axon connections (fax) protein/glutathione S-transferase-like protein [Signal transduction mechanisms]. 39446 KOG4245: Predicted metal-dependent hydrolase of the TIM-barrel fold [General function prediction only]. 
39447 KOG4246: Predicted DNA-binding protein, contains SAP domain [General function prediction only]. 39448 KOG4247: Mitochondrial DNA polymerase accessory subunit [Replication, recombination and repair]. 39449 KOG4248: Ubiquitin-like protein, regulator of apoptosis [Posttranslational modification, protein turnover, chaperones]. 39450 KOG4249: Uncharacterized conserved protein [Function unknown]. 39451 KOG4250: TANK binding protein kinase TBK1 [Signal transduction mechanisms]. 39452 KOG4251: Calcium binding protein [General function prediction only]. 39453 KOG4252: GTP-binding protein [Signal transduction mechanisms]. 39454 KOG4253: Tryptophan-rich basic nuclear protein [General function prediction only]. 39455 KOG4254: Phytoene desaturase [Coenzyme transport and metabolism]. 39456 KOG4255: Uncharacterized conserved protein [Function unknown]. 39457 KOG4256: Kinetochore component [Cell cycle control, cell division, chromosome partitioning]. 39458 KOG4257: Focal adhesion tyrosine kinase FAK, contains FERM domain [Signal transduction mechanisms]. 39459 KOG4258: Insulin/growth factor receptor (contains protein kinase domain) [Signal transduction mechanisms]. 39460 KOG4259: Putative nucleic acid-binding protein Hcc-1/proliferation associated cytokine-inducible protein, contains SAP domain [Cell cycle control, cell division, chromosome partitioning]. 39461 KOG4260: Uncharacterized conserved protein [Function unknown]. 39462 KOG4261: Talin [Cytoskeleton]. 39463 KOG4262: Uncharacterized conserved protein [Function unknown]. 39464 KOG4263: Putative receptor CCR1 [Signal transduction mechanisms]. 39465 KOG4264: Nucleo-cytoplasmic protein MLN51 [General function prediction only]. 39466 KOG4265: Predicted E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 39467 KOG4266: Subtilisin kexin isozyme-1/site 1 protease, subtilase superfamily [Posttranslational modification, protein turnover, chaperones]. 
39468 KOG4267: Predicted membrane protein [Function unknown]. 39469 KOG4268: Uncharacterized conserved protein containing PAP2 domain [Function unknown]. 39470 KOG4269: Rac GTPase-activating protein BCR/ABR [Signal transduction mechanisms]. 39471 KOG4270: GTPase-activator protein [Signal transduction mechanisms]. 39472 KOG4271: Rho-GTPase activating protein [Signal transduction mechanisms]. 39473 KOG4272: Predicted GTP-binding protein [General function prediction only]. 39474 KOG4273: Uncharacterized conserved protein [Function unknown]. 39475 KOG4274: Positive cofactor 2 (PC2), subunit of a multiprotein coactivator of RNA polymerase II [Transcription]. 39476 KOG4275: Predicted E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 39477 KOG4276: Predicted hormone receptor interactor [General function prediction only]. 39478 KOG4277: Uncharacterized conserved protein, contains thioredoxin domain [General function prediction only]. 39479 KOG4278: Protein tyrosine kinase [Signal transduction mechanisms]. 39480 KOG4279: Serine/threonine protein kinase [Signal transduction mechanisms]. 39481 KOG4280: Kinesin-like protein [Cytoskeleton]. 39482 KOG4281: Uncharacterized conserved protein [Function unknown]. 39483 KOG4282: Transcription factor GT-2 and related proteins, contains trihelix DNA-binding/SANT domain [Transcription]. 39484 KOG4283: Transcription-coupled repair protein CSA, contains WD40 domain [Transcription, Replication, recombination and repair]. 39485 KOG4284: DEAD box protein [Transcription]. 39486 KOG4285: Mitotic phosphoprotein [Cell cycle control, cell division, chromosome partitioning]. 39487 KOG4286: Dystrophin-like protein [Cell motility, Signal transduction mechanisms, Cytoskeleton]. 39488 KOG4287: Pectin acetylesterase and similar proteins [Cell wall/membrane/envelope biogenesis]. 39489 KOG4288: Predicted oxidoreductase [General function prediction only]. 
39490 KOG4289: Cadherin EGF LAG seven-pass G-type receptor [Signal transduction mechanisms]. 39491 KOG4290: Predicted membrane protein [Function unknown]. 39492 KOG4291: Mucin/alpha-tectorin [Extracellular structures]. 39493 KOG4292: Cubilin, multiligand receptor mediating cobalamin absorption [Coenzyme transport and metabolism]. 39494 KOG4293: Predicted membrane protein, contains DoH and Cytochrome b-561/ferric reductase transmembrane domains [Signal transduction mechanisms]. 39495 KOG4294: Non voltage-gated ion channels (DEG/ENaC family) [Inorganic ion transport and metabolism, Signal transduction mechanisms]. 39496 KOG4295: Serine proteinase inhibitor (KU family) [Posttranslational modification, protein turnover, chaperones]. 39497 KOG4296: Epithelin/granulin [Signal transduction mechanisms]. 39498 KOG4297: C-type lectin [Signal transduction mechanisms, Defense mechanisms]. 39499 KOG4298: CAP-binding protein complex interacting protein 2 [RNA processing and modification]. 39500 KOG4299: PHD Zn-finger protein [General function prediction only]. 39501 KOG4300: Predicted methyltransferase [General function prediction only]. 39502 KOG4301: Beta-dystrobrevin [Cytoskeleton]. 39503 KOG4302: Microtubule-associated protein essential for anaphase spindle elongation [Cell cycle control, cell division, chromosome partitioning, Cytoskeleton]. 39504 KOG4303: Vesicular inhibitory amino acid transporter [Amino acid transport and metabolism, Signal transduction mechanisms]. 39505 KOG4304: Transcriptional repressors of the hairy/E(spl) family (contains HLH) [Transcription]. 39506 KOG4305: RhoGEF GTPase [Signal transduction mechanisms]. 39507 KOG4306: Glycosylphosphatidylinositol-specific phospholipase C [Signal transduction mechanisms]. 39508 KOG4307: RNA binding protein RBM12/SWAN [General function prediction only]. 39509 KOG4308: LRR-containing protein [Function unknown]. 39510 KOG4309: Transcription mediator-related factor [Transcription]. 
39511 KOG4310: Synapse-associated protein [Intracellular trafficking, secretion, and vesicular transport]. 39512 KOG4311: Histidinol dehydrogenase [Amino acid transport and metabolism]. 39513 KOG4312: Predicted acyltransferase [General function prediction only]. 39514 KOG4313: Thiamine pyrophosphokinase [Nucleotide transport and metabolism]. 39515 KOG4314: Predicted carbohydrate/phosphate translocator [General function prediction only]. 39516 KOG4315: G-patch nucleic acid binding protein [General function prediction only]. 39517 KOG4316: Uncharacterized conserved protein [Function unknown]. 39518 KOG4317: Predicted Zn-finger protein [Function unknown]. 39519 KOG4318: Bicoid mRNA stability factor [RNA processing and modification]. 39520 KOG4319: DNA-binding nuclear protein p8 [Transcription]. 39521 KOG4320: Uncharacterized conserved protein [Function unknown]. 39522 KOG4321: Predicted phosphate acyltransferases [Lipid transport and metabolism]. 39523 KOG4322: Anaphase-promoting complex (APC), subunit 5 [Cell cycle control, cell division, chromosome partitioning, Posttranslational modification, protein turnover, chaperones]. 39524 KOG4323: Polycomb-like PHD Zn-finger protein [General function prediction only]. 39525 KOG4324: Guanine nucleotide exchange factor [Intracellular trafficking, secretion, and vesicular transport]. 39526 KOG4325: Uncharacterized conserved protein [Function unknown]. 39527 KOG4326: Mitochondrial F1F0-ATP synthase, subunit e [Energy production and conversion]. 39528 KOG4327: mRNA splicing protein SMN (survival motor neuron) [RNA processing and modification]. 39529 KOG4328: WD40 protein [Function unknown]. 39530 KOG4329: DNA-binding protein [General function prediction only]. 39531 KOG4330: Uncharacterized conserved protein [Function unknown]. 39532 KOG4331: Polytopic membrane protein Prominin [General function prediction only]. 39533 KOG4332: Predicted sugar transporter [Carbohydrate transport and metabolism]. 
39534 KOG4333: Nuclear DEAF-1 related transcriptional regulator (suppressin) and related SAND domain proteins [Cell cycle control, cell division, chromosome partitioning, Transcription]. 39535 KOG4334: Uncharacterized conserved protein, contains double-stranded RNA-binding motif and WW domain [General function prediction only]. 39536 KOG4335: FERM domain-containing protein KRIT1 [Signal transduction mechanisms]. 39537 KOG4336: TBP-associated transcription factor Prodos [Transcription]. 39538 KOG4337: Microsomal triglyceride transfer protein [Lipid transport and metabolism, Intracellular trafficking, secretion, and vesicular transport]. 39539 KOG4338: Predicted lipoprotein [Lipid transport and metabolism]. 39540 KOG4339: RPEL repeat-containing protein [General function prediction only]. 39541 KOG4340: Uncharacterized conserved protein [Function unknown]. 39542 KOG4341: F-box protein containing LRR [General function prediction only]. 39543 KOG4342: Alpha-mannosidase [Carbohydrate transport and metabolism]. 39544 KOG4343: bZIP transcription factor ATF6 [Transcription]. 39545 KOG4344: Uncharacterized conserved protein [Function unknown]. 39546 KOG4345: NF-kappa B regulator AP20/Cezanne [Signal transduction mechanisms]. 39547 KOG4346: Uncharacterized conserved protein [Function unknown]. 39548 KOG4347: GTPase-activating protein VRP [General function prediction only]. 39549 KOG4348: Adaptor protein CMS/SETA [Signal transduction mechanisms]. 39550 KOG4349: Uncharacterized conserved protein [Function unknown]. 39551 KOG4350: Uncharacterized conserved protein, contains BTB/POZ domain [General function prediction only]. 39552 KOG4351: Uncharacterized conserved protein [Function unknown]. 39553 KOG4352: Fas-mediated apoptosis inhibitor FAIM [Signal transduction mechanisms]. 39554 KOG4353: RNA export factor NXT1 [RNA processing and modification]. 39555 KOG4354: N-acetyl-gamma-glutamyl-phosphate reductase [Amino acid transport and metabolism]. 
39556 KOG4355: Predicted Fe-S oxidoreductase [General function prediction only]. 39557 KOG4356: Uncharacterized conserved protein [Function unknown]. 39558 KOG4357: Uncharacterized conserved protein (involved in mesoderm differentiation in humans) [General function prediction only]. 39559 KOG4358: Uncharacterized conserved protein [Function unknown]. 39560 KOG4359: Protein kinase C inhibitor-like protein [General function prediction only]. 39561 KOG4360: Uncharacterized coiled coil protein [Function unknown]. 39562 KOG4361: BCL2-associated athanogene-like proteins and related BAG family chaperone regulators [Signal transduction mechanisms]. 39563 KOG4362: Transcriptional regulator BRCA1 [Replication, recombination and repair, Transcription]. 39564 KOG4363: Putative growth response protein [Signal transduction mechanisms]. 39565 KOG4364: Chromatin assembly factor-I [Chromatin structure and dynamics]. 39566 KOG4365: Uncharacterized conserved protein [Function unknown]. 39567 KOG4366: Predicted thioesterase [General function prediction only]. 39568 KOG4367: Predicted Zn-finger protein [Function unknown]. 39569 KOG4368: Predicted RNA binding protein, contains SWAP, RPR and G-patch domains [General function prediction only]. 39570 KOG4369: RTK signaling protein MASK/UNC-44 [Signal transduction mechanisms]. 39571 KOG4370: Ral-GTPase effector RLIP76 [Signal transduction mechanisms]. 39572 KOG4371: Membrane-associated protein tyrosine phosphatase PTP-BAS and related proteins, contain FERM domain [Signal transduction mechanisms]. 39573 KOG4372: Predicted alpha/beta hydrolase [General function prediction only]. 39574 KOG4373: Predicted 3'-5' exonuclease [General function prediction only]. 39575 KOG4374: RNA-binding protein Bicaudal-C [RNA processing and modification]. 39576 KOG4375: Scaffold protein Shank and related SAM domain proteins [Signal transduction mechanisms]. 39577 KOG4376: Uncharacterized conserved protein [Function unknown]. 
39578 KOG4377: Zn-finger protein [General function prediction only]. 39579 KOG4378: Nuclear protein COP1 [Signal transduction mechanisms]. 39580 KOG4379: Uncharacterized conserved protein (tumor antigen CML66 in humans) [Function unknown]. 39581 KOG4380: Carnitine deficiency associated protein [General function prediction only]. 39582 KOG4381: RUN domain-containing protein [Signal transduction mechanisms]. 39583 KOG4382: Uncharacterized conserved protein, contains DTW domain [Function unknown]. 39584 KOG4383: Uncharacterized conserved protein [Function unknown]. 39585 KOG4384: Uncharacterized SAM domain protein [General function prediction only]. 39586 KOG4385: Predicted forkhead transcription factor [Transcription]. 39587 KOG4386: Uncharacterized conserved protein [Function unknown]. 39588 KOG4387: Ornithine decarboxylase antizyme [Amino acid transport and metabolism]. 39589 KOG4388: Hormone-sensitive lipase HSL [Lipid transport and metabolism]. 39590 KOG4389: Acetylcholinesterase/Butyrylcholinesterase [Signal transduction mechanisms]. 39591 KOG4390: Voltage-gated A-type K+ channel KCND [Inorganic ion transport and metabolism]. 39592 KOG4391: Predicted alpha/beta hydrolase BEM46 [General function prediction only]. 39593 KOG4392: RNA polymerase, subunit L [Transcription]. 39594 KOG4393: Predicted pseudouridylate synthase [RNA processing and modification, Translation, ribosomal structure and biogenesis]. 39595 KOG4394: RNase P subunit that is not also a subunit of RNase MRP, involved in pre-tRNA processing [RNA processing and modification]. 39596 KOG4395: Transcription factor Atonal, contains HTH domain [Transcription]. 39597 KOG4396: Predicted membrane-anchored protein [Function unknown]. 39598 KOG4397: Uncharacterized conserved protein [Function unknown]. 39599 KOG4398: Predicted coiled-coil protein [General function prediction only]. 39600 KOG4399: C2HC-type Zn-finger protein [General function prediction only]. 
39601 KOG4400: E3 ubiquitin ligase interacting with arginine methyltransferase [Posttranslational modification, protein turnover, chaperones]. 39602 KOG4401: Uncharacterized conserved protein [Function unknown]. 39603 KOG4402: Uncharacterized conserved protein [Function unknown]. 39604 KOG4403: Cell surface glycoprotein STIM, contains SAM domain [General function prediction only]. 39605 KOG4404: Tandem pore domain K+ channel TASK3/THIK-1 [Inorganic ion transport and metabolism]. 39606 KOG4405: GDP dissociation inhibitor [Signal transduction mechanisms, Intracellular trafficking, secretion, and vesicular transport]. 39607 KOG4406: CDC42 Rho GTPase-activating protein [Signal transduction mechanisms, Cytoskeleton]. 39608 KOG4407: Predicted Rho GTPase-activating protein [General function prediction only]. 39609 KOG4408: Putative Mg2+ and Co2+ transporter CorD [Inorganic ion transport and metabolism]. 39610 KOG4409: Predicted hydrolase/acyltransferase (alpha/beta hydrolase superfamily) [General function prediction only]. 39611 KOG4410: 5-formyltetrahydrofolate cyclo-ligase [Coenzyme transport and metabolism]. 39612 KOG4411: Phytoene/squalene synthetase [Lipid transport and metabolism]. 39613 KOG4412: 26S proteasome regulatory complex, subunit PSMD10 [Posttranslational modification, protein turnover, chaperones]. 39614 KOG4413: 26S proteasome regulatory complex, subunit PSMD5 [Posttranslational modification, protein turnover, chaperones]. 39615 KOG4414: COP9 signalosome, subunit CSN8 [Posttranslational modification, protein turnover, chaperones, Signal transduction mechanisms]. 39616 KOG4415: Uncharacterized conserved protein [Function unknown]. 39617 KOG4416: Uncharacterized conserved protein [Function unknown]. 39618 KOG4417: Predicted endonuclease [General function prediction only]. 39619 KOG4418: Predicted membrane protein [Function unknown]. 39620 KOG4419: 5' nucleotidase [Nucleotide transport and metabolism]. 
39621 KOG4420: Uncharacterized conserved protein (Ganglioside-induced differentiation associated protein 1, GDAP1) [Function unknown]. 39622 KOG4421: Uncharacterized conserved protein [Function unknown]. 39623 KOG4422: Uncharacterized conserved protein [Function unknown]. 39624 KOG4423: GTP-binding protein-like, RAS superfamily [Signal transduction mechanisms]. 39625 KOG4424: Predicted Rho/Rac guanine nucleotide exchange factor/faciogenital dysplasia protein 3 [Signal transduction mechanisms]. 39626 KOG4425: Uncharacterized conserved protein [Function unknown]. 39627 KOG4426: Arginyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 39628 KOG4427: E3 ubiquitin protein ligase [Posttranslational modification, protein turnover, chaperones]. 39629 KOG4428: Inositol-polyphosphate 4-phosphatase [Signal transduction mechanisms]. 39630 KOG4429: Uncharacterized conserved protein, contains SH3 and FCH domains [General function prediction only]. 39631 KOG4430: Topoisomerase I-binding arginine-serine-rich protein [Transcription]. 39632 KOG4431: Uncharacterized protein, induced by hypoxia [General function prediction only]. 39633 KOG4432: Uncharacterized NUDIX family hydrolase [General function prediction only]. 39634 KOG4433: Tweety transmembrane/cell surface protein [General function prediction only]. 39635 KOG4434: Molecular chaperone SEC63, endoplasmic reticulum translocon component [Intracellular trafficking, secretion, and vesicular transport, Posttranslational modification, protein turnover, chaperones]. 39636 KOG4435: Predicted lipid kinase [Lipid transport and metabolism, Signal transduction mechanisms]. 39637 KOG4436: Predicted GTPase activator NB4S/EVI5 (contains TBC domain)/Calmodulin-binding protein Pollux (contains PTB and TBC domains) [General function prediction only]. 39638 KOG4437: ATP-dependent DNA ligase III [Replication, recombination and repair]. 
39639 KOG4438: Centromere-associated protein NUF2 [Cell cycle control, cell division, chromosome partitioning]. 39640 KOG4439: RNA polymerase II transcription termination factor TTF2/lodestar, DEAD-box superfamily [Transcription, Replication, recombination and repair]. 39641 KOG4440: NMDA selective glutamate-gated ion channel receptor subunit GRIN1 [Inorganic ion transport and metabolism, Amino acid transport and metabolism, Signal transduction mechanisms]. 39642 KOG4441: Proteins containing BTB/POZ and Kelch domains, involved in regulatory/signal transduction processes [Signal transduction mechanisms, General function prediction only]. 39643 KOG4442: Clathrin coat binding protein/Huntingtin interacting protein HIP1, involved in regulation of endocytosis [Intracellular trafficking, secretion, and vesicular transport]. 39644 KOG4443: Putative transcription factor HALR/MLL3, involved in embryonic development [General function prediction only]. 39645 KOG4444: Peroxisomal assembly protein PEX3 [Cell wall/membrane/envelope biogenesis, Intracellular trafficking, secretion, and vesicular transport]. 39646 KOG4445: Uncharacterized conserved protein, contains RWD domain [Function unknown]. 39647 KOG4446: Uncharacterized conserved protein [Function unknown]. 39648 KOG4447: Transcription factor TWIST [Transcription]. 39649 KOG4448: Uncharacterized conserved protein, contains phosphotyrosine interaction (PI) domain [Function unknown]. 39650 KOG4449: Translocase of outer mitochondrial membrane complex, subunit TOM7 [Intracellular trafficking, secretion, and vesicular transport]. 39651 KOG4450: Uncharacterized conserved protein [Function unknown]. 39652 KOG4451: Uncharacterized conserved protein (tumor-associated antigen HCA127 in humans) [Function unknown]. 39653 KOG4452: Predicted membrane protein [Function unknown]. 39654 KOG4453: Predicted ER membrane protein [Function unknown]. 39655 KOG4454: RNA binding protein (RRM superfamily) [General function prediction only]. 
39656 KOG4455: Uncharacterized conserved protein [Function unknown]. 39657 KOG4456: Inner centromere protein (INCENP), C-terminal domain [Cell cycle control, cell division, chromosome partitioning]. 39658 KOG4457: Uncharacterized conserved protein [Function unknown]. 39659 KOG4458: Nitric oxide synthase-binding protein, contains PTB domain [Signal transduction mechanisms]. 39660 KOG4459: Membrane-associated proteoglycan Leprecan [Function unknown]. 39661 KOG4460: Nuclear pore complex, Nup88/rNup84 component [Nuclear structure, Intracellular trafficking, secretion, and vesicular transport]. 39662 KOG4461: Mitochondrial 28S ribosomal protein S30 [Translation, ribosomal structure and biogenesis]. 39663 KOG4462: WASP-interacting protein VRP1/WIP, contains WH2 domain [Cytoskeleton]. 39664 KOG4463: Uncharacterized conserved protein [Function unknown]. 39665 KOG4464: Signaling protein RIC-8/synembryn (regulates neurotransmitter secretion) [Signal transduction mechanisms]. 39666 KOG4465: Uncharacterized conserved protein [Function unknown]. 39667 KOG4466: Component of histone deacetylase complex (breast carcinoma metastasis suppressor 1 protein in humans) [Cell cycle control, cell division, chromosome partitioning, Transcription]. 39668 KOG4467: Uncharacterized conserved protein [Function unknown]. 39669 KOG4468: Polycomb-group transcriptional regulator [Transcription]. 39670 KOG4469: Uncharacterized conserved protein [Function unknown]. 39671 KOG4470: Proteasome activator subunit [Posttranslational modification, protein turnover, chaperones]. 39672 KOG4471: Phosphatidylinositol 3-phosphate 3-phosphatase myotubularin MTM1 [Lipid transport and metabolism, Intracellular trafficking, secretion, and vesicular transport]. 39673 KOG4472: Glycolipid 2-alpha-mannosyltransferase (alpha-1,2-mannosyltransferase) [Carbohydrate transport and metabolism]. 39674 KOG4473: Uncharacterized membrane protein [Function unknown]. 
39675 KOG4474: Uncharacterized conserved protein [Function unknown]. 39676 KOG4476: Gluconate transport-inducing protein [Signal transduction mechanisms, Carbohydrate transport and metabolism]. 39677 KOG4477: RING1 interactor RYBP and related Zn-finger-containing proteins [Transcription]. 39678 KOG4478: Uncharacterized membrane protein [Function unknown]. 39679 KOG4479: Transcription factor e(y)2 [Transcription]. 39680 KOG4480: Heme oxygenase [Inorganic ion transport and metabolism]. 39681 KOG4481: Uncharacterized conserved protein [Function unknown]. 39682 KOG4482: Sarcoglycan complex, alpha/epsilon subunits [Function unknown]. 39683 KOG4483: Uncharacterized conserved protein [Function unknown]. 39684 KOG4484: Uncharacterized conserved protein [Function unknown]. 39685 KOG4485: Uncharacterized conserved protein, contains ankyrin and FN3 repeats [General function prediction only]. 39686 KOG4486: 3-methyladenine DNA glycosylase [Replication, recombination and repair]. 39687 KOG4487: Uncharacterized conserved protein [Function unknown]. 39688 KOG4488: Small EDRK-rich protein H4F5 [General function prediction only]. 39689 KOG4489: Uncharacterized conserved protein BC10 (implicated in bladder cancer in humans) [Function unknown]. 39690 KOG4490: Translocon-associated complex TRAP, gamma subunit [Intracellular trafficking, secretion, and vesicular transport]. 39691 KOG4491: Predicted membrane protein [Function unknown]. 39692 KOG4492: Chorismate synthase [Amino acid transport and metabolism]. 39693 KOG4493: Uncharacterized conserved protein [Function unknown]. 39694 KOG4494: Cell surface ATP diphosphohydrolase Apyrase [Nucleotide transport and metabolism]. 39695 KOG4495: RNA polymerase II transcription elongation factor Elongin/SIII, subunit elongin B [Transcription]. 39696 KOG4496: Predicted coiled-coil protein [Function unknown]. 39697 KOG4497: Uncharacterized conserved protein WDR8, contains WD repeats [General function prediction only]. 
39698 KOG4498: Uncharacterized conserved protein [Function unknown]. 39699 KOG4499: Ca2+-binding protein Regucalcin/SMP30 [Inorganic ion transport and metabolism, Signal transduction mechanisms]. 39700 KOG4500: Rho/Rac GTPase guanine nucleotide exchange factor smgGDS/Vimar [Signal transduction mechanisms]. 39701 KOG4501: Transcription coactivator complex, P100 component [Transcription]. 39702 KOG4502: Predicted membrane protein [Function unknown]. 39703 KOG4503: Uncharacterized conserved membrane protein [Function unknown]. 39704 KOG4504: Cation-independent mannose-6-phosphate receptor CI-MPR [Signal transduction mechanisms, Intracellular trafficking, secretion, and vesicular transport]. 39705 KOG4505: Na+/H+ antiporter [Inorganic ion transport and metabolism]. 39706 KOG4506: Uncharacterized conserved protein [Function unknown]. 39707 KOG4507: Uncharacterized conserved protein, contains TPR repeats [Function unknown]. 39708 KOG4508: Uncharacterized conserved protein [Function unknown]. 39709 KOG4509: Uncharacterized conserved protein [Function unknown]. 39710 KOG4510: Permease of the drug/metabolite transporter (DMT) superfamily [General function prediction only]. 39711 KOG4511: Uncharacterized conserved protein [Function unknown]. 39712 KOG4512: Vitamin D3 receptor interacting protein [Transcription, Signal transduction mechanisms]. 39713 KOG4513: Phosphoglycerate mutase [Carbohydrate transport and metabolism]. 39714 KOG4514: Uncharacterized conserved protein [Function unknown]. 39715 KOG4515: Uncharacterized conserved protein [Function unknown]. 39716 KOG4516: NADH:ubiquinone oxidoreductase, NDUFC2/B14.5B subunit [Energy production and conversion]. 39717 KOG4517: Uncharacterized conserved protein [Function unknown]. 39718 KOG4518: Hydroxypyruvate isomerase [Carbohydrate transport and metabolism]. 39719 KOG4519: Phosphomevalonate kinase [Lipid transport and metabolism]. 39720 KOG4520: Predicted coiled-coil protein [General function prediction only]. 
39721 KOG4521: Nuclear pore complex, Nup160 component [Nuclear structure, Intracellular trafficking, secretion, and vesicular transport]. 39722 KOG4522: RNA polymerase II transcription mediator [Transcription]. 39723 KOG4523: Uncharacterized conserved protein [Function unknown]. 39724 KOG4524: Uncharacterized conserved protein [Function unknown]. 39725 KOG4525: Jacalin-like lectin domain-containing protein [General function prediction only]. 39726 KOG4526: Predicted membrane protein [Function unknown]. 39727 KOG4527: Cytochrome c oxidase, subunit VIIc/COX8 [Energy production and conversion]. 39728 KOG4528: Uncharacterized conserved protein [Function unknown]. 39729 KOG4529: Uncharacterized conserved protein [Function unknown]. 39730 KOG4530: Predicted flavoprotein [General function prediction only]. 39731 KOG4531: M phase phosphoprotein 6 [Cell cycle control, cell division, chromosome partitioning]. 39732 KOG4532: WD40-like repeat containing protein [General function prediction only]. 39733 KOG4533: Uncharacterized conserved protein [Function unknown]. 39734 KOG4534: Uncharacterized conserved protein [Function unknown]. 39735 KOG4535: HEAT and armadillo repeat-containing protein [General function prediction only]. 39736 KOG4536: Predicted membrane protein [Function unknown]. 39737 KOG4537: Zn-ribbon-containing protein implicated in mitosis [Cell cycle control, cell division, chromosome partitioning, Defense mechanisms]. 39738 KOG4538: Predicted coiled-coil protein [General function prediction only]. 39739 KOG4539: Uncharacterized conserved protein [Function unknown]. 39740 KOG4540: Putative lipase essential for disintegration of autophagic bodies inside the vacuole [Intracellular trafficking, secretion, and vesicular transport, Lipid transport and metabolism]. 39741 KOG4541: Nuclear transport receptor exportin 4 (importin beta superfamily) [Nuclear structure, Intracellular trafficking, secretion, and vesicular transport]. 
39742 KOG4542: Predicted membrane protein [Function unknown]. 39743 KOG4543: Uncharacterized conserved protein [Function unknown]. 39744 KOG4544: Uncharacterized conserved protein [Function unknown]. 39745 KOG4545: Uncharacterized conserved protein [Function unknown]. 39746 KOG4546: Peroxisomal biogenesis protein (peroxin 16) [Intracellular trafficking, secretion, and vesicular transport]. 39747 KOG4547: WD40 repeat-containing protein [General function prediction only]. 39748 KOG4548: Mitochondrial ribosomal protein L17 [Translation, ribosomal structure and biogenesis]. 39749 KOG4549: Magnesium-dependent phosphatase [General function prediction only]. 39750 KOG4550: Predicted membrane protein [Function unknown]. 39751 KOG4551: GPI-GlcNAc transferase complex, PIG-H component, involved in glycosylphosphatidylinositol anchor biosynthesis [Cell wall/membrane/envelope biogenesis, Posttranslational modification, protein turnover, chaperones]. 39752 KOG4552: Vitamin-D-receptor interacting protein complex component [Transcription]. 39753 KOG4553: Uncharacterized conserved protein [Function unknown]. 39754 KOG4554: Protein involved in inorganic phosphate transport [Inorganic ion transport and metabolism]. 39755 KOG4555: TPR repeat-containing protein [Function unknown]. 39756 KOG4556: Predicted membrane protein [Function unknown]. 39757 KOG4557: Origin recognition complex, subunit 6 [Replication, recombination and repair]. 39758 KOG4558: Uncharacterized conserved protein [Function unknown]. 39759 KOG4559: Uncharacterized conserved protein [Function unknown]. 39760 KOG4560: Transcription factor IIIC box B binding (alpha) subunit [Transcription]. 39761 KOG4561: Uncharacterized conserved protein, contains TBC domain [Signal transduction mechanisms, General function prediction only]. 39762 KOG4562: Uncharacterized conserved protein (tumor-rejection antigen MAGE in humans) [Function unknown]. 
39763 KOG4563: Cell cycle-regulated histone H1-binding protein [Chromatin structure and dynamics, Cell cycle control, cell division, chromosome partitioning]. 39764 KOG4564: Adenylate cyclase-coupled calcitonin receptor [Signal transduction mechanisms]. 39765 KOG4565: E93 protein involved in programmed cell death, putative transcription regulator [Transcription]. 39766 KOG4566: Cytokine-inducible SH2 protein [Signal transduction mechanisms]. 39767 KOG4567: GTPase-activating protein [General function prediction only]. 39768 KOG4568: Cytoskeleton-associated protein and related proteins [Cytoskeleton, General function prediction only]. 39769 KOG4569: Predicted lipase [Lipid transport and metabolism]. 39770 KOG4570: Uncharacterized conserved protein [Function unknown]. 39771 KOG4571: Activating transcription factor 4 [Transcription]. 39772 KOG4572: Predicted DNA-binding transcription factor, interacts with stathmin [Transcription, General function prediction only, Signal transduction mechanisms]. 39773 KOG4573: Phosphoprotein involved in cytoplasm to vacuole targeting and autophagy [Intracellular trafficking, secretion, and vesicular transport]. 39774 KOG4574: RNA-binding protein (contains RRM and Pumilio-like repeats) [General function prediction only]. 39775 KOG4575: TGc (transglutaminase/protease-like) domain-containing protein involved in cytokinesis [Cell cycle control, cell division, chromosome partitioning]. 39776 KOG4576: Sulfite oxidase, heme-binding component [Energy production and conversion]. 39777 KOG4577: Transcription factor LIM3, contains LIM and HOX domains [Transcription]. 39778 KOG4578: Uncharacterized conserved protein, contains KAZAL and TY domains [General function prediction only]. 39779 KOG4579: Leucine-rich repeat (LRR) protein associated with apoptosis in muscle tissue [General function prediction only]. 
39780 KOG4580: Component of vacuolar transporter chaperone (Vtc) involved in vacuole fusion [Intracellular trafficking, secretion, and vesicular transport, Posttranslational modification, protein turnover, chaperones]. 39781 KOG4581: Predicted membrane protein [Function unknown]. 39782 KOG4582: Uncharacterized conserved protein, contains ZZ-type Zn-finger [General function prediction only]. 39783 KOG4583: Membrane-associated ER protein involved in stress response (contains ubiquitin-like domain) [Posttranslational modification, protein turnover, chaperones]. 39784 KOG4584: Uncharacterized conserved protein [General function prediction only]. 39785 KOG4585: Predicted transposase [Replication, recombination and repair]. 39786 KOG4586: CUB domain-containing protein [General function prediction only]. 39787 KOG4587: Predicted membrane protein [Function unknown]. 39788 KOG4588: Predicted ubiquitin-conjugating enzyme [Posttranslational modification, protein turnover, chaperones]. 39789 KOG4589: Cell division protein FtsJ [Cell cycle control, cell division, chromosome partitioning]. 39790 KOG4590: Signal transduction protein Enabled, contains WH1 domain [Signal transduction mechanisms]. 39791 KOG4591: Uncharacterized conserved protein, contains BTB/POZ domain [General function prediction only]. 39792 KOG4592: Uncharacterized conserved protein [Function unknown]. 39793 KOG4593: Mitotic checkpoint protein MAD1 [Cell cycle control, cell division, chromosome partitioning]. 39794 KOG4594: Sequence-specific single-stranded-DNA-binding protein [Replication, recombination and repair, Transcription, General function prediction only]. 39795 KOG4595: Uncharacterized conserved protein [Function unknown]. 39796 KOG4596: Uncharacterized conserved protein [Function unknown]. 39797 KOG4597: Serine proteinase inhibitor (KU family) with thrombospondin repeats [Posttranslational modification, protein turnover, chaperones]. 
39798 KOG4598: Putative ubiquitin-specific protease [Posttranslational modification, protein turnover, chaperones]. 39799 KOG4599: Putative mitochondrial/chloroplast ribosomal protein L45 [Translation, ribosomal structure and biogenesis]. 39800 KOG4600: Mitochondrial ribosomal protein MRP7 (L2) [Translation, ribosomal structure and biogenesis]. 39801 KOG4601: Uncharacterized conserved protein [Function unknown]. 39802 KOG4602: Nanos and related proteins [General function prediction only]. 39803 KOG4603: TBP-1 interacting protein [Signal transduction mechanisms]. 39804 KOG4604: Uncharacterized conserved protein [Function unknown]. 39805 KOG4605: Uncharacterized conserved protein containing CDGSH-type Zn-finger [Function unknown]. 39806 KOG4606: Uncharacterized conserved protein [Function unknown]. 39807 KOG4607: Mitochondrial ribosomal protein L9 [Translation, ribosomal structure and biogenesis]. 39808 KOG4608: Uncharacterized conserved protein [Function unknown]. 39809 KOG4609: Predicted phosphoglycerate mutase [General function prediction only]. 39810 KOG4610: Uncharacterized conserved protein [Function unknown]. 39811 KOG4611: Uncharacterized conserved protein [Function unknown]. 39812 KOG4612: Mitochondrial ribosomal protein L34 [Translation, ribosomal structure and biogenesis]. 39813 KOG4613: Predicted component of DNA replication checkpoint response mechanism (S-M checkpoint) [General function prediction only, Cell cycle control, cell division, chromosome partitioning]. 39814 KOG4614: Inner membrane protein required for assembly of the F0 sector of ATP synthase [Posttranslational modification, protein turnover, chaperones]. 39815 KOG4615: Uncharacterized conserved protein [Function unknown]. 39816 KOG4616: Mitochondrial ribosomal protein L55 [Translation, ribosomal structure and biogenesis]. 39817 KOG4617: Uncharacterized conserved protein [Function unknown]. 39818 KOG4618: Uncharacterized conserved protein [Function unknown]. 
39819 KOG4619: Uncharacterized conserved protein [Function unknown]. 39820 KOG4620: Uncharacterized conserved protein [Function unknown]. 39821 KOG4621: Uncharacterized conserved protein [Function unknown]. 39822 KOG4622: Predicted nucleotide kinase [General function prediction only]. 39823 KOG4623: Uncharacterized conserved protein [Function unknown]. 39824 KOG4624: Uncharacterized conserved protein [Function unknown]. 39825 KOG4625: Notch signaling protein Neuralized, Nuez domain [Signal transduction mechanisms]. 39826 KOG4626: O-linked N-acetylglucosamine transferase OGT [Carbohydrate transport and metabolism, Posttranslational modification, protein turnover, chaperones, Signal transduction mechanisms]. 39827 KOG4627: Kynurenine formamidase [Amino acid transport and metabolism]. 39828 KOG4628: Predicted E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 39829 KOG4629: Predicted mechanosensitive ion channel [Cell wall/membrane/envelope biogenesis]. 39830 KOG4630: NADH:ubiquinone oxidoreductase, NDUFA7/B14.5A subunit [Energy production and conversion]. 39831 KOG4631: NADH:ubiquinone oxidoreductase, NDUFB3/B12 subunit [Energy production and conversion]. 39832 KOG4632: NADH:ubiquinone oxidoreductase, NDUFB5/SGDH subunit [Energy production and conversion]. 39833 KOG4633: NADH:ubiquinone oxidoreductase, NDUFB6/B17 subunit [Energy production and conversion]. 39834 KOG4634: Mitochondrial F1F0-ATP synthase, subunit Cf6 (coupling factor 6) [Energy production and conversion]. 39835 KOG4635: Vacuolar import and degradation protein [Intracellular trafficking, secretion, and vesicular transport]. 39836 KOG4636: Uncharacterized conserved protein with TLDc domain [Function unknown]. 39837 KOG4637: Adaptor for phosphoinositide 3-kinase [Signal transduction mechanisms]. 39838 KOG4638: Uncharacterized conserved protein [Function unknown]. 39839 KOG4639: RNase P/RNase MRP subunit POP5 [Translation, ribosomal structure and biogenesis]. 
39840 KOG4640: Anaphase-promoting complex (APC), subunit 4 [Cell cycle control, cell division, chromosome partitioning, Posttranslational modification, protein turnover, chaperones]. 39841 KOG4642: Chaperone-dependent E3 ubiquitin protein ligase (contains TPR repeats) [Posttranslational modification, protein turnover, chaperones]. 39842 KOG4643: Uncharacterized coiled-coil protein [Function unknown]. 39843 KOG4644: L-fucose kinase [Carbohydrate transport and metabolism]. 39844 KOG4645: MAPKKK (MAP kinase kinase kinase) SSK2 and related serine/threonine protein kinases [Signal transduction mechanisms]. 39845 KOG4646: Uncharacterized conserved protein, contains ARM repeats [Function unknown]. 39846 KOG4647: Uncharacterized membrane protein [Function unknown]. 39847 KOG4648: Uncharacterized conserved protein, contains LRR repeats [Function unknown]. 39848 KOG4649: PQQ (pyrrolo-quinoline quinone) repeat protein [Secondary metabolites biosynthesis, transport and catabolism]. 39849 KOG4650: Predicted steroid reductase [General function prediction only]. 39850 KOG4651: Chondroitin 6-sulfotransferase and related sulfotransferases [Cell wall/membrane/envelope biogenesis, Extracellular structures]. 39851 KOG4652: HORMA domain [Chromatin structure and dynamics]. 39852 KOG4653: Uncharacterized conserved protein [Function unknown]. 39853 KOG4654: Uncharacterized conserved protein [Function unknown]. 39854 KOG4655: U3 small nucleolar ribonucleoprotein (snoRNP) component [RNA processing and modification]. 39855 KOG4656: Copper chaperone for superoxide dismutase [Inorganic ion transport and metabolism]. 39856 KOG4657: Uncharacterized conserved protein [Function unknown]. 39857 KOG4658: Apoptotic ATPase [Signal transduction mechanisms]. 39858 KOG4659: Uncharacterized conserved protein (Rhs family) [Function unknown]. 39859 KOG4660: Protein Mei2, essential for commitment to meiosis, and related proteins [Cell cycle control, cell division, chromosome partitioning]. 
39860 KOG4661: Hsp27-ERE-TATA-binding protein/Scaffold attachment factor (SAF-B) [Transcription]. 39861 KOG4662: NADH dehydrogenase subunit 3 and related proteins [Energy production and conversion]. 39862 KOG4663: Cytochrome b [Energy production and conversion]. 39863 KOG4664: Cytochrome oxidase subunit III and related proteins [Energy production and conversion]. 39864 KOG4665: ATP synthase F0 subunit 6 and related proteins [Energy production and conversion]. 39865 KOG4666: Predicted phosphate acyltransferase, contains PlsC domain [Lipid transport and metabolism]. 39866 KOG4667: Predicted esterase [Lipid transport and metabolism]. 39867 KOG4668: NADH dehydrogenase subunits 2, 5, and related proteins [Energy production and conversion]. 39868 KOG4669: NADH dehydrogenase subunit 4L and related proteins [Energy production and conversion]. 39869 KOG4670: Uncharacterized conserved membrane protein [Function unknown]. 39870 KOG4671: Brain cell membrane protein 1 (BCMP1) [General function prediction only]. 39871 KOG4672: Uncharacterized conserved low complexity protein [Function unknown]. 39872 KOG4673: Transcription factor TMF, TATA element modulatory factor [Transcription]. 39873 KOG4674: Uncharacterized conserved coiled-coil protein [Function unknown]. 39874 KOG4675: Uncharacterized conserved protein, contains ENT domain [General function prediction only]. 39875 KOG4676: Splicing factor, arginine/serine-rich [RNA processing and modification]. 39876 KOG4677: Golgi integral membrane protein [Intracellular trafficking, secretion, and vesicular transport, General function prediction only]. 39877 KOG4679: Uncharacterized protein PSP1 (suppressor of DNA polymerase alpha mutations in yeast) [General function prediction only]. 39878 KOG4680: Uncharacterized conserved protein, contains ML domain [General function prediction only]. 39879 KOG4681: Uncharacterized conserved protein [Function unknown]. 
39880 KOG4682: Uncharacterized conserved protein, contains BTB/POZ domain [General function prediction only]. 39881 KOG4683: Uncharacterized conserved protein [Function unknown]. 39882 KOG4684: Uncharacterized conserved protein, contains C4-type Zn-finger [General function prediction only]. 39883 KOG4685: tRNA splicing endonuclease SEN2 [Translation, ribosomal structure and biogenesis]. 39884 KOG4686: Predicted sugar transporter [Carbohydrate transport and metabolism]. 39885 KOG4687: Uncharacterized coiled-coil protein [Function unknown]. 39886 KOG4688: Putative beta-catenin-Tcf/Lef signaling pathway component DRCTNNB1A [Signal transduction mechanisms]. 39887 KOG4689: Predicted RNase [RNA processing and modification]. 39888 KOG4690: Uncharacterized conserved protein [Function unknown]. 39889 KOG4691: Uncharacterized conserved protein [Function unknown]. 39890 KOG4692: Predicted E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 39891 KOG4693: Uncharacterized conserved protein, contains kelch repeat [General function prediction only]. 39892 KOG4694: Predicted membrane protein [Function unknown]. 39893 KOG4695: Uncharacterized conserved protein [Function unknown]. 39894 KOG4696: Uncharacterized conserved protein [Function unknown]. 39895 KOG4697: Integral membrane protein involved in transport between the late Golgi and endosome [Intracellular trafficking, secretion, and vesicular transport]. 39896 KOG4698: Uncharacterized conserved protein [Function unknown]. 39897 KOG4699: Preprotein translocase subunit Sec66 [Intracellular trafficking, secretion, and vesicular transport]. 39898 KOG4700: Uncharacterized homolog of ribosome-binding factor A [General function prediction only]. 39899 KOG4701: Chitinase [Cell wall/membrane/envelope biogenesis]. 39900 KOG4702: Uncharacterized conserved protein [Function unknown]. 39901 KOG4703: Uncharacterized conserved protein [Function unknown]. 
39902 KOG4704: Uncharacterized conserved protein [Function unknown]. 39903 KOG4705: Uncharacterized conserved protein [Function unknown]. 39904 KOG4706: Uncharacterized conserved protein [Function unknown]. 39905 KOG4707: Mitochondrial/chloroplast ribosomal protein L20 [Translation, ribosomal structure and biogenesis]. 39906 KOG4708: Mitochondrial ribosomal protein MRP17 [Translation, ribosomal structure and biogenesis]. 39907 KOG4709: Uncharacterized conserved protein [Function unknown]. 39908 KOG4710: E3 ubiquitin ligase, VHL component (von Hippel-Lindau tumor suppressor in humans) [Posttranslational modification, protein turnover, chaperones]. 39909 KOG4711: Predicted membrane protein [General function prediction only]. 39910 KOG4712: Uncharacterized conserved protein [Function unknown]. 39911 KOG4713: Cyclin-dependent kinase 2-associated protein [Signal transduction mechanisms, Cell cycle control, cell division, chromosome partitioning]. 39912 KOG4714: Nucleoporin [Nuclear structure]. 39913 KOG4715: SWI/SNF-related matrix-associated actin-dependent regulator of chromatin [Chromatin structure and dynamics]. 39914 KOG4716: Thioredoxin reductase [Posttranslational modification, protein turnover, chaperones]. 39915 KOG4717: Serine/threonine protein kinase [Signal transduction mechanisms]. 39916 KOG4718: Non-SMC (structural maintenance of chromosomes) element 1 protein (NSE1) [Chromatin structure and dynamics]. 39917 KOG4719: Nuclear pore complex protein [Nuclear structure, Intracellular trafficking, secretion, and vesicular transport]. 39918 KOG4720: Ethanolamine kinase [Lipid transport and metabolism]. 39919 KOG4721: Serine/threonine protein kinase, contains leucine zipper domain [Signal transduction mechanisms]. 39920 KOG4722: Zn-finger protein [General function prediction only]. 39921 KOG4723: Uncharacterized conserved protein [Function unknown]. 39922 KOG4724: Predicted Rho GTPase-activating protein [Signal transduction mechanisms]. 
39923 KOG4725: Uncharacterized conserved protein [Function unknown]. 39924 KOG4726: Ultrahigh sulfur keratin-associated protein [Extracellular structures]. 39925 KOG4727: U1-like Zn-finger protein [General function prediction only]. 39926 KOG4728: Anti-apoptotic Bcl-2 family proteins, prevent opening of mitochondrial porin channel [Signal transduction mechanisms]. 39927 KOG4729: Galactoside-binding lectin [General function prediction only]. 39928 KOG4730: D-arabinono-1,4-lactone oxidase [Defense mechanisms]. 39929 KOG4731: Protein predicted to be involved in spindle matrix formation, contains DM13, DoH, and DOMON domains [Cell cycle control, cell division, chromosome partitioning]. 39930 KOG4732: Uncharacterized conserved protein [Function unknown]. 39931 KOG4734: Uncharacterized conserved protein [Function unknown]. 39932 KOG4735: Extracellular protein with conserved cysteines [Function unknown]. 39933 KOG4736: Uncharacterized conserved protein [Function unknown]. 39934 KOG4737: ATPase membrane sector associated protein [Energy production and conversion]. 39935 KOG4738: Predicted metallothionein [Inorganic ion transport and metabolism]. 39936 KOG4739: Uncharacterized protein involved in synaptonemal complex formation [Cell cycle control, cell division, chromosome partitioning, General function prediction only]. 39937 KOG4740: Uncharacterized conserved protein [Function unknown]. 39938 KOG4741: Uncharacterized conserved protein [Function unknown]. 39939 KOG4742: Predicted chitinase [General function prediction only]. 39940 KOG4743: Cyclin-dependent kinase inhibitor [Signal transduction mechanisms]. 39941 KOG4744: Uncharacterized conserved protein [Function unknown]. 39942 KOG4745: Metalloproteinase inhibitor TIMP and related proteins [General function prediction only]. 39943 KOG4746: Small nuclear RNA activating complex (SNAPc), subunit SNAP43 [RNA processing and modification]. 
39944 KOG4747: Two-component phosphorelay intermediate involved in MAP kinase cascade regulation [Signal transduction mechanisms]. 39945 KOG4748: Subunit of Golgi mannosyltransferase complex [Carbohydrate transport and metabolism, Cell wall/membrane/envelope biogenesis]. 39946 KOG4749: Inositol polyphosphate kinase [Signal transduction mechanisms]. 39947 KOG4750: Serine O-acetyltransferase [Amino acid transport and metabolism]. 39948 KOG4751: DNA recombinational repair protein BRCA2 [Replication, recombination and repair]. 39949 KOG4752: Ribosomal protein L41 [Translation, ribosomal structure and biogenesis]. 39950 KOG4753: Predicted membrane protein [Function unknown]. 39951 KOG4754: Predicted phosphoglycerate mutase [Carbohydrate transport and metabolism]. 39952 KOG4755: Predicted pyroglutamyl peptidase [Posttranslational modification, protein turnover, chaperones]. 39953 KOG4756: Mitochondrial ribosomal protein L27 [Translation, ribosomal structure and biogenesis]. 39954 KOG4757: Predicted telomere binding protein [General function prediction only]. 39955 KOG4758: Predicted membrane protein [General function prediction only]. 39956 KOG4759: Ribosome recycling factor [Translation, ribosomal structure and biogenesis]. 39957 KOG4760: Uncharacterized conserved protein [Function unknown]. 39958 KOG4761: Proteasome formation inhibitor PI31 [Posttranslational modification, protein turnover, chaperones]. 39959 KOG4762: DNA replication factor [Replication, recombination and repair]. 39960 KOG4763: Ubiquinol-cytochrome c reductase hinge protein [Energy production and conversion]. 39961 KOG4764: Uncharacterized conserved protein [Function unknown]. 39962 KOG4765: Uncharacterized conserved protein [Function unknown]. 39963 KOG4766: Uncharacterized conserved protein [Function unknown]. 39964 KOG4767: Cytochrome c oxidase, subunit II, and related proteins [Energy production and conversion]. 39965 KOG4768: Mitochondrial mRNA maturase [RNA processing and modification]. 
39966 KOG4769: Cytochrome c oxidase, subunit I [Energy production and conversion]. 39967 KOG4770: NADH dehydrogenase subunit 1 [Energy production and conversion]. 39968 KOG4771: Nucleolar protein (NOP16) involved in 60S ribosomal subunit biogenesis [Translation, ribosomal structure and biogenesis]. 39969 KOG4772: Predicted tRNA-splicing endonuclease subunit [Translation, ribosomal structure and biogenesis]. 39970 KOG4773: NADPH oxidase [Energy production and conversion]. 39971 KOG4774: Uncharacterized conserved protein [Function unknown]. 39972 KOG4775: Uncharacterized protein SFI1 involved in G(2)-M transition [Cell cycle control, cell division, chromosome partitioning]. 39973 KOG4776: Uncharacterized conserved protein BCNT [Function unknown]. 39974 KOG4777: Aspartate-semialdehyde dehydrogenase [Amino acid transport and metabolism]. 39975 KOG4778: Mitochondrial ribosomal protein L28 [Translation, ribosomal structure and biogenesis]. 39976 KOG4779: Predicted membrane protein [Function unknown]. 39977 KOG4780: Uncharacterized conserved protein [Function unknown]. 39978 KOG4781: Uncharacterized conserved protein [Function unknown]. 39979 KOG4782: Predicted membrane protein [Function unknown]. 39980 KOG4783: Uncharacterized conserved protein [Function unknown]. 39981 KOG4784: Uncharacterized conserved protein [Function unknown]. 39982 KOG4785: Transcription factor CBF, beta subunit [Transcription]. 39983 KOG4786: Ubinuclein, nuclear protein interacting with cellular and viral transcription factors [Transcription, Signal transduction mechanisms]. 39984 KOG4787: Uncharacterized conserved protein [Function unknown]. 39985 KOG4788: Members of chemokine-like factor super family and related proteins [Defense mechanisms]. 39986 KOG4789: Uncharacterized conserved protein [Function unknown]. 39987 KOG4790: Uncharacterized conserved protein [Function unknown]. 39988 KOG4791: Uncharacterized conserved protein [Function unknown]. 
39989 KOG4792: Crk family adapters [Signal transduction mechanisms]. 39990 KOG4793: Three prime repair exonuclease [Replication, recombination and repair]. 39991 KOG4794: Thymosin beta [Cell motility]. 39992 KOG4795: Protein associated with transcriptional elongation factor ELL [Transcription]. 39993 KOG4796: RNA polymerase II elongation factor [Transcription]. 39994 KOG4797: Transcriptional regulator [Transcription]. 39995 KOG4798: Uncharacterized conserved protein [Function unknown]. 39996 KOG4799: Mitochondrial ribosomal protein L30 [Translation, ribosomal structure and biogenesis]. 39997 KOG4800: Neuronal membrane glycoprotein/Myelin proteolipid protein [Function unknown]. 39998 KOG4801: Member of the steroid/thyroid receptor superfamily [Signal transduction mechanisms]. 39999 KOG4802: Adhesion-type protein [Extracellular structures]. 40000 KOG4803: Uncharacterized conserved protein [Function unknown]. 40001 KOG4804: Predicted membrane protein [General function prediction only]. 40002 KOG4805: Uncharacterized conserved protein [Function unknown]. 40003 KOG4806: Uncharacterized conserved protein [Function unknown]. 40004 KOG4807: F-actin binding protein, regulates actin cytoskeletal organization [Cytoskeleton]. 40005 KOG4808: Uncharacterized conserved protein [Function unknown]. 40006 KOG4809: Rab6 GTPase-interacting protein involved in endosome-to-TGN transport [Intracellular trafficking, secretion, and vesicular transport]. 40007 KOG4810: Uncharacterized conserved protein [Function unknown]. 40008 KOG4811: Uncharacterized conserved protein [Function unknown]. 40009 KOG4812: Golgi-associated protein/Nedd4 WW domain-binding protein [General function prediction only]. 40010 KOG4813: Translation initiation factor eIF3, p35 subunit [Translation, ribosomal structure and biogenesis]. 40011 KOG4814: Uncharacterized conserved protein [Function unknown]. 40012 KOG4815: Muscular protein implicated in muscular dystrophy phenotype [General function prediction only]. 
40013 KOG4816: Uncharacterized conserved protein [Function unknown]. 40014 KOG4817: Unnamed protein [Function unknown]. 40015 KOG4818: Lysosomal-associated membrane protein [General function prediction only]. 40016 KOG4819: Uncharacterized conserved protein [Function unknown]. 40017 KOG4820: Involved in anesthetic response in C. elegans [General function prediction only]. 40018 KOG4821: Predicted Na+-dependent cotransporter [General function prediction only]. 40019 KOG4822: Predicted nuclear membrane protein involved in mRNA transport and sex determination via splicing modulation [RNA processing and modification, Signal transduction mechanisms]. 40020 KOG4823: Uncharacterized conserved protein [Function unknown]. 40021 KOG4824: Apolipoprotein D/Lipocalin [Cell wall/membrane/envelope biogenesis]. 40022 KOG4825: Component of synaptic membrane glycine-, glutamate- and thienylcyclohexylpiperidine-binding glycoprotein (43kDa) [Signal transduction mechanisms]. 40023 KOG4826: C-8,7 sterol isomerase [Lipid transport and metabolism]. 40024 KOG4827: Uncharacterized conserved protein [Function unknown]. 40025 KOG4828: Uncharacterized conserved protein [Function unknown]. 40026 KOG4829: Uncharacterized conserved protein [Function unknown]. 40027 KOG4830: Predicted sugar transporter [Carbohydrate transport and metabolism]. 40028 KOG4831: Unnamed protein [Function unknown]. 40029 KOG4832: Uncharacterized conserved protein [Function unknown]. 40030 KOG4833: Uncharacterized conserved protein [Function unknown]. 40031 KOG4834: Predicted DNA-binding protein, contains SANT domain [General function prediction only]. 40032 KOG4835: DNA-binding protein C1D involved in regulation of double-strand break repair [Replication, recombination and repair]. 40033 KOG4836: Uncharacterized conserved protein [Function unknown]. 40034 KOG4837: Uncharacterized conserved protein [Function unknown]. 40035 KOG4838: Uncharacterized conserved protein [Function unknown]. 
40036 KOG4839: Uncharacterized conserved protein [Function unknown]. 40037 KOG4840: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) [General function prediction only]. 40038 KOG4841: Dolichol-phosphate mannosyltransferase, subunit 3 [Posttranslational modification, protein turnover, chaperones, Signal transduction mechanisms]. 40039 KOG4842: Protein involved in sister chromatid separation and/or segregation [Cell cycle control, cell division, chromosome partitioning]. 40040 KOG4843: Uncharacterized conserved protein [Function unknown]. 40041 KOG4844: Mitochondrial ribosomal protein S27 [Translation, ribosomal structure and biogenesis]. 40042 KOG4845: NADH dehydrogenase, subunit 4 [Energy production and conversion]. 40043 KOG4846: Nuclear receptor [Signal transduction mechanisms]. 40044 KOG4847: Coiled coil protein involved in linking membrane proteins to cytoskeleton [General function prediction only]. 40045 KOG4848: Extracellular matrix-associated peroxidase [Extracellular structures, Defense mechanisms]. 40046 KOG4849: mRNA cleavage factor I subunit/CPSF subunit [RNA processing and modification]. 40047 KOG4850: Uncharacterized conserved protein [Function unknown]. 40048 KOG4851: Uncharacterized conserved protein [Function unknown]. 40049 KOG4852: Uncharacterized conserved protein [Function unknown]. 40050 LOAD_ACT: small ligand binding domain. 40051 LOAD_ATP_cone: ATP-cone domain, ATP binding allosteric domain seen in nucleotide reductases and other proteins. 40052 LOAD_Ccd1: Conserved cysteine containing domain 1 (a predicted disulfide bond redox regulator). 40053 LOAD_Ccd2: Conserved cysteine containing domain 2 (a predicted disulfide bond redox regulator). 40054 LOAD_DAK2: predicted phosphatase domain of the Diacylglycerol kinase family. 40055 LOAD_DSBH: Double stranded beta helix domain involved in carbohydrate binding and protein-protein interactions in different contexts. 
40056 LOAD_Dak1: Kinase domain of the Diacylglycerol kinase family. 40057 LOAD_EF1B: Elongation factor 1B conserved domain. 40058 LOAD_ERCC4: ERCC4 domain, Predicted nuclease domain. 40059 LOAD_FCS: Zinc coordinating domain chromatinic proteins. 40060 LOAD_HD: HD Metal dependent (phospho) Hydrolase domain. 40061 LOAD_HisDeac1: RPD3 like Histone deacetylase domain. 40062 LOAD_Hrd: Conserved helical domain seen associated with recQ family helicases and RNASeD like nucleases. 40063 LOAD_LSAT: Late Stage Acyl transferases. 40064 LOAD_NTF2: Domain found in RNA transport proteins. 40065 LOAD_S2Pmetalloprt: S2P metal dependent membrane associated protease domain. 40066 LOAD_SKIP: Conserved domain found in chromatinic proteins. 40067 LOAD_Toprim: The Topoisomerase-primase domain, a nucleotidyl transferase/hydrolase domain. 40068 LOAD_USPA: An ATP binding domain seen as a stand alone in USPA. 40069 LOAD_W2: conserved protein-protein interaction domain in translation factors like eIF2B. 40070 LOAD_arc_metj: A small exclusively prokaryotic DNA binding domain with beta sheet DNA contact. 40071 LOAD_bir: Zn binding domain involved in protein-protein interactions in caspase inhibition and spindle assembly. 40072 LOAD_cdc45: Possible enzyme involved in DNA replication. 40073 LOAD_cold: Cold Shock RNA binding domain of the OB fold. 40074 LOAD_efts_N: Conserved 3 helical domain seen in the N-terminus of EF-TS and eukaryotic NAC proteins. 40075 LOAD_gas2groo: Conserved domain seen in Groovin and Gas2 proteins. 40076 LOAD_gyf: Protein-protein interaction domain involved in binding proline rich sequences. 40077 LOAD_hismacro: Domain found in Macro histone 2 (predicted phosphoesterase). 40078 LOAD_kelch: A beta propeller domain involved in protein-protein interactions and some enzymatic activities like glycolate oxidase. 40079 LOAD_ku: A family of DNA binding domains including those in Ku70/80. 40080 LOAD_ligase: ATP and NAD dependent DNA ligases and capping enzymes. 
40081 LOAD_little_fing: Zinc coordinating RNA binding domain. 40082 LOAD_mi: conserved alpha helical domain involved in protein-protein interactions in eukaryotic translation regulators. 40083 LOAD_minC: Bacterial proteins involved in chromosomal partitioning. 40084 LOAD_nac: Domain found in the archaeal and eukaryotic BTF proteins involved in translation and transcription. 40085 LOAD_nic: Large alpha helical domain involved in protein-protein interactions in eIF4G, NMD2 and CBP80. 40086 LOAD_nirV: Conserved bacterial and eukaryotic enzyme of unknown activity. 40087 LOAD_osmc: domain with Conserved cysteine found in OsmC like proteins (predicted oxidative damage protective proteins). 40088 LOAD_p29_RnaseP: subunit of RNA processing enzyme RNAse P. 40089 LOAD_pex11: peroxisomal development protein. 40090 LOAD_php: family of phosphoesterases. 40091 LOAD_pua: Novel RNA binding domain. 40092 LOAD_rna_Cyc: RNA 3' phosphate cyclase domain. 40093 LOAD_sir2: Catalytic domain of the SIR2 like Histone deacetylases and ADP ribosyl transferase. 40094 LOAD_sis: phosphoSugar isomerase domain similar to glucose phosphate isomerase. 40095 LOAD_surE: Conserved bacterial and eukaryotic enzyme of unknown activity. 40096 LOAD_swib: Conserved domain involved in chromatin structure regulation. 40097 LOAD_tam: DNA binding domain found in several chromosomal proteins including the Methyl CpG binding proteins. 40098 LOAD_taz: Zinc binding domain involved in protein-protein interactions. 40099 LOAD_tf2f: Conserved domain in TF2F, predicted chromatin modifying protein. 40100 LOAD_uvrC_endov: DNASE domain found in UvrC and EndoV proteins. 40101 LOAD_yugu: Conserved bacterial and eukaryotic enzyme of unknown activity. 40102 LOAD_zz: Zinc binding domain seen in both chromatinic and cytoskeletal proteins. 40103 pfam00001: 7 transmembrane receptor (rhodopsin family). 40104 pfam00002: 7 transmembrane receptor (Secretin family). 
40105 pfam00003: 7 transmembrane receptor (metabotropic glutamate family). 40106 pfam00004: ATPase family associated with various cellular activities (AAA). AAA family proteins often perform chaperone-like functions that assist in the assembly, operation, or disassembly of protein complexes. 40107 pfam00005: ABC transporter. ABC transporters form a large family of proteins responsible for translocation of a variety of compounds across biological membranes. ABC transporters are the largest family of proteins in many completely sequenced bacteria. ABC transporters are composed of two copies of this domain and two copies of a transmembrane domain pfam00664. These four domains may belong to a single polypeptide, or belong in different polypeptide chains. 40108 pfam00006: ATP synthase alpha/beta family, nucleotide-binding domain. This family includes the ATP synthase alpha and beta subunits and the ATP synthase associated with flagella. 40109 pfam00007: Cystine-knot domain. The family comprises glycoprotein hormones and the C-terminal domain of various extracellular proteins. It is believed to be involved in disulfide-linked dimerisation. 40110 pfam00008: EGF-like domain. There is no clear separation between noise and signal. pfam00053 is very similar, but has 8 instead of 6 conserved cysteines. Includes some cytokine receptors. 40111 pfam00009: Elongation factor Tu GTP binding domain. This domain contains a P-loop motif, also found in several other families such as pfam00071, pfam00025 and pfam00063. Elongation factor Tu consists of three structural domains, this plus two C-terminal beta barrel domains. 40112 pfam00010: Helix-loop-helix DNA-binding domain. 40113 pfam00011: Hsp20/alpha crystallin family. 40114 pfam00012: Hsp70 protein. Hsp70 chaperones help to fold many proteins. Hsp70 assisted folding involves repeated cycles of substrate binding and release. Hsp70 activity is ATP dependent. 
Hsp70 proteins are made up of two regions: the amino terminus is the ATPase domain and the carboxyl terminus is the substrate binding region. 40115 pfam00013: KH domain. KH motifs can bind RNA in vitro. Autoantibodies to Nova, a KH domain protein, cause paraneoplastic opsoclonus ataxia. 40116 pfam00014: Kunitz/Bovine pancreatic trypsin inhibitor domain. Indicative of a protease inhibitor, usually a serine protease inhibitor. Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. 40117 pfam00015: Methyl-accepting chemotaxis protein (MCP) signaling domain. This domain is thought to transduce the signal to CheA since it is highly conserved in very diverse MCPs. 40118 pfam00016: Ribulose bisphosphate carboxylase large chain, catalytic domain. The C-terminal domain of RuBisCO large chain is the catalytic domain adopting a TIM barrel fold. 40119 pfam00017: SH2 domain. 40120 pfam00018: SH3 domain. SH3 (Src homology 3) domains are often indicative of a protein involved in signal transduction related to cytoskeletal organisation. First described in the Src cytoplasmic tyrosine kinase. The structure is a partly opened beta barrel. 40121 pfam00019: Transforming growth factor beta like domain. 40122 pfam00020: TNFR/NGFR cysteine-rich region. 40123 pfam00021: u-PAR/Ly-6 domain. This extracellular disulphide bond rich domain is related to pfam00087. 40124 pfam00022: Actin. 40125 pfam00023: Ankyrin repeat. There's no clear separation between noise and signal on the HMM search. Ankyrin repeats generally consist of a beta, alpha, alpha, beta order of secondary structures. The repeats associate to form a higher order structure. 40126 pfam00024: PAN domain. The PAN domain contains a conserved core of three disulphide bridges. In some members of the family there is an additional fourth disulphide bridge that links the N and C termini of the domain. 
The domain is found in diverse proteins; in some it mediates protein-protein interactions, in others protein-carbohydrate interactions. 40127 pfam00025: ADP-ribosylation factor family. 40128 pfam00026: Eukaryotic aspartyl protease. Aspartyl (acid) proteases include pepsins, cathepsins, and renins. Two-domain structure, probably arising from ancestral duplication. This family does not include the retroviral or retrotransposon proteases (pfam00077), which are much smaller and appear to be homologous to a single domain of the eukaryotic asp proteases. 40129 pfam00027: Cyclic nucleotide-binding domain. 40130 pfam00028: Cadherin domain. 40131 pfam00029: Connexin. 40132 pfam00030: Beta/Gamma crystallin. The alignment comprises two Greek key motifs since the similarity between them is very low. 40133 pfam00031: Cystatin domain. Very diverse family. Attempts to define separate sub-families failed. Typically, either the N-terminal or C-terminal end is very divergent. But splitting into two domains would make very short families. Domains described by pfam00666 are related to this family but have not been included. 40134 pfam00032: Cytochrome b(C-terminal)/b6/petD. 40135 pfam00033: Cytochrome b(N-terminal)/b6/petB. 40136 pfam00034: Cytochrome c. The cytochrome 556 and cytochrome c' families are not included. 40137 pfam00035: Double-stranded RNA binding motif. Sequences gathered for seed by HMM_iterative_training. Putative motif shared by proteins that bind to dsRNA. At least some DSRM proteins seem to bind to specific RNA targets. Exemplified by Staufen, which is involved in localisation of at least five different mRNAs in the early Drosophila embryo. Also by interferon-induced protein kinase in humans, which is part of the cellular response to dsRNA. 40138 pfam00036: EF hand. The EF-hands can be divided into two classes: signaling proteins and buffering/transport proteins. 
The first group is the largest and includes the most well-known members of the family such as calmodulin, troponin C and S100B. These proteins typically undergo a calcium-dependent conformational change which opens a target binding site. The latter group is represented by calbindin D9k and does not undergo calcium dependent conformational changes. 40139 pfam00037: 4Fe-4S binding domain. Superfamily includes proteins containing domains which bind to iron-sulfur clusters. Members include bacterial ferredoxins, various dehydrogenases, and various reductases. Structure of the domain is an alpha-antiparallel beta sandwich. 40140 pfam00038: Intermediate filament protein. 40141 pfam00039: Fibronectin type I domain. 40142 pfam00040: Fibronectin type II domain. 40143 pfam00041: Fibronectin type III domain. 40144 pfam00042: Globin. 40145 pfam00043: Glutathione S-transferase, C-terminal domain. Function: conjugation of reduced glutathione to a variety of targets. Also included in the alignment, but are not GSTs: * S-crystallins from squid. Similarity to GST previously noted. * Eukaryotic elongation factors 1-gamma. Not known to have GST activity; similarity not previously recognised. * HSP26 family of stress-related proteins, including auxin-regulated proteins in plants and stringent starvation proteins in E. coli. Not known to have GST activity. Similarity not previously recognised. The glutathione molecule binds in a cleft between N and C-terminal domains - the catalytically important residues are proposed to reside in the N-terminal domain. 40146 pfam00044: Glyceraldehyde 3-phosphate dehydrogenase, NAD binding domain. GAPDH is a tetrameric NAD-binding enzyme involved in glycolysis and gluconeogenesis. N-terminal domain is a Rossmann NAD(P) binding fold. 40147 pfam00045: Hemopexin. Hemopexin is a heme-binding protein that transports heme to the liver. Hemopexin-like repeats occur in vitronectin and some members of the matrix metallopeptidase family (matrixins). 
The HX repeats of some matrixins bind tissue inhibitor of metallopeptidases (TIMPs). 40148 pfam00046: Homeobox domain. 40149 pfam00047: Immunoglobulin domain. Members of the immunoglobulin superfamily are found in hundreds of proteins of different functions. Examples include antibodies, the giant muscle kinase titin and receptor tyrosine kinases. Immunoglobulin-like domains may be involved in protein-protein and protein-ligand interactions. The Pfam alignments do not include the first and last strand of the immunoglobulin-like domain. 40150 pfam00048: Small cytokines (intercrine/chemokine), interleukin-8 like. Includes a number of secreted growth factors and interferons involved in mitogenic, chemotactic, and inflammatory activity. Structure contains two highly conserved disulfide bonds. 40151 pfam00049: Insulin/IGF/Relaxin family. Superfamily includes insulins; relaxins; insulin-like growth factor; and bombyxin. All are secreted regulatory hormones. Disulfide rich, all-alpha fold. Alignment includes B chain, linker (which is processed out of the final product), and A chain. 40152 pfam00050: Kazal-type serine protease inhibitor domain. Usually indicative of serine protease inhibitors. However, kazal-like domains are also seen in the extracellular part of agrins, which are not known to be protease inhibitors. Kazal domains often occur in tandem arrays. Small alpha+beta fold containing three disulphides. Alignment also includes a single domain from transporters in the OATP/PGT family. 40153 pfam00051: Kringle domain. Kringle domains have been found in plasminogen, hepatocyte growth factors, prothrombin, and apolipoprotein A. Structure is disulfide-rich, nearly all-beta. 40154 pfam00052: Laminin B (Domain IV). 40155 pfam00053: Laminin EGF-like (Domains III and V). This family is like pfam00008 but has 8 conserved cysteines instead of 6. 40156 pfam00054: Laminin G domain. 40157 pfam00055: Laminin N-terminal (Domain VI). 
40158 pfam00056: lactate/malate dehydrogenase, NAD binding domain. L-lactate dehydrogenases are metabolic enzymes which catalyse the conversion of L-lactate to pyruvate, the last step in anaerobic glycolysis. L-2-hydroxyisocaproate dehydrogenases are also members of the family. Malate dehydrogenases catalyse the interconversion of malate and oxaloacetate. The enzyme participates in the citric acid cycle. L-lactate dehydrogenase is also found as a lens crystallin in bird and crocodile eyes. N-terminus (this family) is a Rossmann NAD-binding fold. C-terminus is an unusual alpha+beta fold. 40159 pfam00057: Low-density lipoprotein receptor domain class A. 40160 pfam00058: Low-density lipoprotein receptor repeat class B. This domain is also known as the YWTD motif after the most conserved region of the repeat. The YWTD repeat is found in multiple tandem repeats and has been predicted to form a beta-propeller structure. 40161 pfam00059: Lectin C-type domain. This family includes both long and short form C-type. 40162 pfam00060: Ligand-gated ion channel. This family includes the four transmembrane regions of the ionotropic glutamate receptors and NMDA receptors. 40163 pfam00061: Lipocalin / cytosolic fatty-acid binding protein family. Lipocalins are transporters for small hydrophobic molecules, such as lipids, steroid hormones, bilins, and retinoids. Alignment subsumes both lipocalin and fatty acid binding protein signatures. This is supported on structural and functional grounds. Structure is an eight-stranded beta barrel. 40164 pfam00062: C-type lysozyme/alpha-lactalbumin family. Alpha-lactalbumin is the regulatory subunit of lactose synthase, changing the substrate specificity of galactosyltransferase from N-acetylglucosamine to glucose. C-type lysozymes are secreted bacteriolytic enzymes that cleave the peptidoglycan of bacterial cell walls. Structure is a multi-domain, mixed alpha and beta fold, containing four conserved disulfide bonds. 
40165 pfam00063: Myosin head (motor domain). 40166 pfam00064: Neuraminidase. Neuraminidases cleave sialic acid residues from glycoproteins. Belong to the sialidase family -- but this alignment does not generalise to the other sialidases. Structure is a 6-sheet beta propeller. 40167 pfam00066: Notch (DSL) domain. The Notch domain is also called the 'DSL' domain. The notch proteins are transmembrane proteins with extracellular domains of repeated EGF domains and the notch (or DSL) domain N-terminal to that. These proteins are generally involved in lateral inhibition in developmental processes. 40168 pfam00067: Cytochrome P450. Cytochrome P450s are involved in the oxidative degradation of various compounds. Particularly well known for their role in the degradation of environmental toxins and mutagens. Structure is mostly alpha, and binds a heme cofactor. 40169 pfam00068: Phospholipase A2. Phospholipase A2 releases fatty acids from the second carbon group of glycerol. Perhaps the best known members are secreted snake venoms, but also found in secreted pancreatic and membrane-associated forms. Structure is all-alpha, with two core disulfide-linked helices and a calcium-binding loop. This alignment represents the major family of PLA2s. A second minor family, defined by the honeybee venom PLA2 PDB:1POC and related sequences from Gila monsters (Heloderma), is not recognised. This minor family conserves the core helix pair but is substantially different elsewhere. 40170 pfam00069: Protein kinase domain. 40171 pfam00070: Pyridine nucleotide-disulphide oxidoreductase. This family includes both class I and class II oxidoreductases and also NADH oxidases and peroxidases. This domain is actually a small NADH binding domain within a larger FAD binding domain. 40172 pfam00071: Ras family. Includes sub-families Ras, Rab, Rac, Ral, Ran, Rap, Ypt1 and more. Shares P-loop motif with GTP_EFTU, arf and myosin_head. See pfam00009, pfam00025, pfam00063. 
The cutoff is set high to avoid overlaps with related families. 40173 pfam00072: Response regulator receiver domain. This domain receives the signal from the sensor partner in bacterial two-component systems. It is usually found N-terminal to a DNA binding effector domain. 40174 pfam00073: picornavirus capsid protein. CAUTION: This alignment is very weak. It cannot be generated by clustalw. If a representative set is used for a seed, many so-called members are not recognised. The family should probably be split up into sub-families. Capsid proteins of picornaviruses. Picornaviruses are non-enveloped plus-strand ssRNA animal viruses with icosahedral capsids. They include rhinovirus (common cold) and poliovirus. Common structure is an 8-stranded beta sandwich. Variations (one or two extra strands) occur. 40175 pfam00074: Pancreatic ribonuclease. Ribonucleases. Members include pancreatic RNase A and angiogenins. Structure is an alpha+beta fold -- long curved beta sheet and three helices. 40176 pfam00075: RNase H. RNase H digests the RNA strand of an RNA/DNA hybrid. Important enzyme in retroviral replication cycle, and often found as a domain associated with reverse transcriptases. Structure is a mixed alpha+beta fold with three a/b/a layers. 40177 pfam00076: RNA recognition motif. (a.k.a. RRM, RBD, or RNP domain). The RRM motif is probably diagnostic of an RNA binding protein. RRMs are found in a variety of RNA binding proteins, including various hnRNP proteins, proteins implicated in regulation of alternative splicing, and protein components of snRNPs. The motif also appears in a few single stranded DNA binding proteins. The RRM structure consists of four strands and two helices arranged in an alpha/beta sandwich, with a third helix present during RNA binding in some cases. The C-terminal beta strand (4th strand) and final helix are hard to align and have been omitted in the SEED alignment. The LA proteins have an N-terminal RRM which is included in the seed. 
There is a second region towards the C terminus that has some features of an RRM but does not appear to have the important structural core of an RRM. The LA proteins are one of the main autoantigens in Systemic lupus erythematosus (SLE), an autoimmune disease. 40178 pfam00077: Retroviral aspartyl protease. Single domain aspartyl proteases from retroviruses, retrotransposons, and badnaviruses (plant dsDNA viruses). These proteases are generally part of a larger polyprotein; usually pol, more rarely gag. Retroviral proteases appear to be homologous to a single domain of the two-domain eukaryotic aspartyl proteases such as pepsins, cathepsins, and renins (pfam00026). 40179 pfam00078: Reverse transcriptase (RNA-dependent DNA polymerase). A reverse transcriptase gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. Reverse transcriptases occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. 40180 pfam00079: Serpin (serine protease inhibitor). Structure is a multi-domain fold containing a bundle of helices and a beta sandwich. 40181 pfam00080: Copper/zinc superoxide dismutase (SODC). Superoxide dismutases (SODs) catalyse the conversion of superoxide radicals to hydrogen peroxide and molecular oxygen. Three evolutionarily distinct families of SODs are known, of which the copper/zinc-binding family is one. Defects in the human SOD1 gene cause familial amyotrophic lateral sclerosis (Lou Gehrig's disease). Structure is an eight-stranded beta sandwich, similar to the immunoglobulin fold. 40182 pfam00081: Iron/manganese superoxide dismutases, alpha-hairpin domain. Superoxide dismutases (SODs) catalyse the conversion of superoxide radicals to hydrogen peroxide and molecular oxygen. Three evolutionarily distinct families of SODs are known, of which the Mn/Fe-binding family is one. 
In humans, there is a cytoplasmic Cu/Zn SOD, and a mitochondrial Mn/Fe SOD. N-terminal domain is a long alpha antiparallel hairpin. A small fragment of YTRE_LEPBI matches well - sequencing error? 40183 pfam00082: Subtilase family. Subtilases are a family of serine proteases. They appear to have independently and convergently evolved an Asp/Ser/His catalytic triad, like that found in the trypsin serine proteases (see pfam00089). Structure is an alpha/beta fold containing a 7-stranded parallel beta sheet, order 2314567. 40184 pfam00083: Sugar (and other) transporter. 40185 pfam00084: Sushi domain (SCR repeat). 40186 pfam00085: Thioredoxin. Thioredoxins are small enzymes that participate in redox reactions, via the reversible oxidation of an active centre disulfide bond. Some members with only the active site are not separated from the noise. 40187 pfam00086: Thyroglobulin type-1 repeat. Thyroglobulin type 1 repeats are thought to be involved in the control of proteolytic degradation. The domain usually contains six conserved cysteines. These form three disulphide bridges. Cysteine 1 pairs with 2, 3 with 4 and 5 with 6. 40188 pfam00087: Snake toxin. A family of venomous neurotoxins and cytotoxins. Structure is small, disulfide-rich, nearly all beta sheet. 40189 pfam00088: Trefoil (P-type) domain. 40190 pfam00089: Trypsin. 40191 pfam00090: Thrombospondin type 1 domain. 40192 pfam00091: Tubulin/FtsZ family, GTPase domain. This family includes the tubulin alpha, beta and gamma chains, as well as the bacterial FtsZ family of proteins. Members of this family are involved in polymer formation. FtsZ is the polymer-forming protein of bacterial cell division. It is part of a ring in the middle of the dividing cell that is required for constriction of cell membrane and cell envelope to yield two daughter cells. FtsZ and tubulin are GTPases. FtsZ can polymerise into tubes, sheets, and rings in vitro and is ubiquitous in eubacteria and archaea. 
Tubulin is the major component of microtubules. 40193 pfam00092: von Willebrand factor type A domain. 40194 pfam00093: von Willebrand factor type C domain. The high cutoff was used to prevent overlap with pfam00094. 40195 pfam00094: von Willebrand factor type D domain. 40196 pfam00095: WAP-type (Whey Acidic Protein) 'four-disulfide core'. 40197 pfam00096: Zinc finger, C2H2 type. The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger: #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C], where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either His or Cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. 40198 pfam00097: Zinc finger, C3HC4 type (RING finger). The C3HC4 type zinc-finger (RING finger) is a cysteine-rich domain of 40 to 60 residues that coordinates two zinc ions, and has the consensus sequence: C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-C-X2-C-X(4-48)-C-X2-C, where X is any amino acid. Many proteins containing a RING finger play a key role in the ubiquitination pathway. 40199 pfam00098: Zinc knuckle. The zinc knuckle is a zinc binding motif composed of the following: CX2CX4HX4C, where X can be any amino acid. The motifs are mostly from retroviral gag proteins (nucleocapsid). Prototype structure is from HIV. Also contains members involved in eukaryotic gene regulation, such as C. elegans GLH-1. Structure is an 18-residue zinc finger. 40200 pfam00100: Zona pellucida-like domain. 40201 pfam00101: Ribulose bisphosphate carboxylase, small chain. 40202 pfam00102: Protein-tyrosine phosphatase. 40203 pfam00103: Somatotropin hormone family. 
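The residue patterns quoted in the zinc finger entries above (pfam00096, pfam00097, pfam00098) can be turned into regular expressions for a quick sequence scan. The following is a minimal sketch, not the method Pfam uses (Pfam families are defined by profile HMMs). It assumes that X means any residue, that the # fold positions can also be treated as any residue, and that the helper name `find_motifs` and the test sequence are purely illustrative.

```python
import re

# Regex translations of the motif notations quoted in the entries above.
# "X" (any residue) becomes "."; the "#" fold positions are also written
# as "." here, which is a simplifying assumption of this sketch.
ZINC_KNUCKLE = re.compile(r"C.{2}C.{4}H.{4}C")
RING_C3HC4 = re.compile(r"C.{2}C.{9,39}C.{1,3}H.{2,3}C.{2}C.{4,48}C.{2}C")
C2H2 = re.compile(r"..C.{1,5}C.{3}..{5}..{2}H.{3,6}[HC]")


def find_motifs(seq, pattern):
    """Return (start, matched_text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in pattern.finditer(seq.upper())]


# Made-up toy sequence (not a real protein) containing one CX2CX4HX4C run.
toy = "MKA" + "CAACAAAAHAAAAC" + "GG"
print(find_motifs(toy, ZINC_KNUCKLE))
```

A plain regex like this only checks spacing of the conserved Cys/His positions, so it will report many false positives on real proteomes; it is useful mainly as a first-pass filter before a profile-based search.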
40204 pfam00104: Ligand-binding domain of nuclear hormone receptor. This all-helical domain is involved in binding the hormone in these receptors. 40205 pfam00105: Zinc finger, C4 type (two domains). In nearly all cases, this is the DNA binding domain of a nuclear hormone receptor. The alignment contains two Zinc finger domains that are too dissimilar to be aligned with each other. 40206 pfam00106: short chain dehydrogenase. This family contains a wide variety of dehydrogenases. 40207 pfam00107: Zinc-binding dehydrogenase. 40208 pfam00108: Thiolase, N-terminal domain. Thiolase is reported to be structurally related to beta-ketoacyl synthase (pfam00109), and also chalcone synthase. 40209 pfam00109: Beta-ketoacyl synthase, N-terminal domain. The structure of beta-ketoacyl synthase is similar to that of the thiolase family (pfam00108) and also chalcone synthase. The active site of beta-ketoacyl synthase is located between the N and C-terminal domains. The N-terminal domain contains most of the structures involved in dimer formation and also the active site cysteine. 40210 pfam00110: wnt family. 40211 pfam00111: 2Fe-2S iron-sulfur cluster binding domain. 40212 pfam00112: Papain family cysteine protease. 40213 pfam00113: Enolase, C-terminal TIM barrel domain. 40214 pfam00114: Pilin (bacterial filament). Proteins with only the short N-terminal methylation site are not separated from the noise. 40215 pfam00115: Cytochrome C and Quinol oxidase polypeptide I. 40216 pfam00116: Cytochrome C oxidase subunit II, periplasmic domain. 40217 pfam00117: Glutamine amidotransferase class-I. 40218 pfam00118: TCP-1/cpn60 chaperonin family. This family includes members from the HSP60 chaperone family and the TCP-1 (T-complex protein) family. 40219 pfam00119: ATP synthase A chain. 40220 pfam00120: Glutamine synthetase, catalytic domain. 40221 pfam00121: Triosephosphate isomerase. 40222 pfam00122: E1-E2 ATPase. 40223 pfam00123: Peptide hormone. 
This family contains glucagon, GIP, secretin and VIP. 40224 pfam00124: Photosynthetic reaction centre protein. 40225 pfam00125: Core histone H2A/H2B/H3/H4. 40226 pfam00126: Bacterial regulatory helix-turn-helix protein, lysR family. 40227 pfam00127: Copper binding proteins, plastocyanin/azurin family. 40228 pfam00128: Alpha amylase, catalytic domain. Alpha amylase is classified as family 13 of the glycosyl hydrolases. The structure is an 8 stranded alpha/beta barrel containing the active site, interrupted by a ~70 a.a. calcium-binding domain protruding between beta strand 3 and alpha helix 3, and a carboxyl-terminal Greek key beta-barrel domain. 40229 pfam00129: Class I Histocompatibility antigen, domains alpha 1 and 2. 40230 pfam00130: Phorbol esters/diacylglycerol binding domain (C1 domain). This domain is also known as the Protein kinase C conserved region 1 (C1) domain. 40231 pfam00131: Metallothionein. 40232 pfam00132: Bacterial transferase hexapeptide (three repeats).. 40233 pfam00133: tRNA synthetases class I (I, L, M and V). Other tRNA synthetase sub-families are too dissimilar to be included. 40234 pfam00134: Cyclin, N-terminal domain. Cyclins regulate cyclin dependent kinases (CDKs). One member is a Uracil-DNA glycosylase, which is related to other cyclins. Cyclins contain two domains of similar all-alpha fold, this family corresponds to the N-terminal domain. 40235 pfam00135: Carboxylesterase. 40236 pfam00136: DNA polymerase family B. This region of DNA polymerase B appears to consist of more than one structural domain, possibly including elongation, DNA-binding and dNTP binding activities. 40237 pfam00137: ATP synthase subunit C. 40238 pfam00138: Legume lectins alpha domain. 40239 pfam00139: Legume lectins beta domain. 40240 pfam00140: Sigma-70 factor, region 1.2. 40241 pfam00141: Peroxidase. 40242 pfam00142: 4Fe-4S iron sulfur cluster binding proteins, NifH/frxC family. 40243 pfam00143: Interferon alpha/beta domain. 40244 pfam00144: Beta-lactamase. 
This family appears to be distantly related to pfam00905 and pfam00768 D-alanyl-D-alanine carboxypeptidase. 40245 pfam00145: C-5 cytosine-specific DNA methylase. 40246 pfam00146: NADH dehydrogenase. 40247 pfam00147: Fibrinogen beta and gamma chains, C-terminal globular domain. 40248 pfam00148: Nitrogenase component 1 type Oxidoreductase. 40249 pfam00149: Calcineurin-like phosphoesterase. This family includes a diverse range of phosphoesterases, including protein phosphoserine phosphatases, nucleotidases, sphingomyelin phosphodiesterases and 2'-3' cAMP phosphodiesterases as well as nucleases such as bacterial SbcD or yeast MRE11. The most conserved regions in this superfamily centre around the metal chelating residues. 40250 pfam00150: Cellulase (glycosyl hydrolase family 5).. 40251 pfam00151: Lipase. 40252 pfam00152: tRNA synthetases class II (D, K and N). Other tRNA synthetase sub-families are too dissimilar to be included. 40253 pfam00153: Mitochondrial carrier protein. 40254 pfam00154: recA bacterial DNA recombination protein. 40255 pfam00155: Aminotransferase class I and II. 40256 pfam00156: Phosphoribosyl transferase domain. This family includes a range of diverse phosphoribosyl transferase enzymes. This family includes: Adenine phosphoribosyltransferase EC:2.4.2.7. Hypoxanthine-guanine-xanthine phosphoribosyltransferase. Hypoxanthine phosphoribosyltransferase EC:2.4.2.8. Ribose-phosphate pyrophosphokinase i EC:2.7.6.1. Amidophosphoribosyltransferase EC:2.4.2.14. Orotate phosphoribosyltransferase EC:2.4.2.10. Uracil phosphoribosyltransferase EC:2.4.2.9. Xanthine-guanine phosphoribosyltransferase EC:2.4.2.22. 40257 pfam00157: Pou domain - N-terminal to homeobox domain. 40258 pfam00158: Sigma-54 interaction domain. 40259 pfam00159: Pancreatic hormone peptide. 40260 pfam00160: Cyclophilin type peptidyl-prolyl cis-trans isomerase. 40261 pfam00161: Ribosome inactivating protein. 40262 pfam00162: Phosphoglycerate kinase. 
40263 pfam00163: Ribosomal protein S4/S9 N-terminal domain. This family includes small ribosomal subunit S9 from prokaryotes and S16 from metazoans. This domain is predicted to bind to ribosomal RNA. This domain is composed of four helices in the known structure. However, the domain is discontinuous in sequence and the alignment for this family contains only the first three helices. 40264 pfam00164: Ribosomal protein S12. 40265 pfam00165: Bacterial regulatory helix-turn-helix proteins, araC family. Members of this family contain two structural repeats of this domain. 40266 pfam00166: Chaperonin 10 Kd subunit. 40267 pfam00167: Fibroblast growth factor. Fibroblast growth factors are a family of proteins involved in growth and differentiation in a wide range of contexts. These growth factors cause dimerisation of their tyrosine kinase receptors leading to intracellular signaling. There are currently four known tyrosine kinase receptors for fibroblast growth factors. These receptors can each bind several different members of this family. Members of this family have a beta trefoil structure. 40268 pfam00168: C2 domain. 40269 pfam00169: PH domain. PH stands for pleckstrin homology. 40270 pfam00170: bZIP transcription factor. The Pfam entry includes the basic region and the leucine zipper region. 40271 pfam00171: Aldehyde dehydrogenase family. This family of dehydrogenases acts on aldehyde substrates. Members use NADP as a cofactor. The family includes the following members: The prototypical members are the aldehyde dehydrogenases EC:1.2.1.3. Succinate-semialdehyde dehydrogenase EC:1.2.1.16. Lactaldehyde dehydrogenase EC:1.2.1.22. Benzaldehyde dehydrogenase EC:1.2.1.28. Methylmalonate-semialdehyde dehydrogenase EC:1.2.1.27. Glyceraldehyde-3-phosphate dehydrogenase EC:1.2.1.9. Delta-1-pyrroline-5-carboxylate dehydrogenase EC:1.5.1.12. Acetaldehyde dehydrogenase EC:1.2.1.10. Glutamate-5-semialdehyde dehydrogenase EC:1.2.1.41. 
This family also includes omega crystallin, an eye lens protein from squid and octopus that has little aldehyde dehydrogenase activity. 40272 pfam00172: Fungal Zn(2)-Cys(6) binuclear cluster domain. 40273 pfam00173: Cytochrome b5-like Heme/Steroid binding domain. This family includes heme binding domains from a diverse range of proteins. This family also includes proteins that bind to steroids. The family includes progesterone receptors. Many members of this subfamily are membrane anchored by an N-terminal transmembrane alpha helix. This family also includes a domain in some chitin synthases. There is no known ligand for this domain in the chitin synthases. 40274 pfam00174: Oxidoreductase molybdopterin binding domain. This domain is found in a variety of oxidoreductases. This domain binds to a molybdopterin cofactor. Xanthine dehydrogenases, which also bind molybdopterin, have essentially no similarity. 40275 pfam00175: Oxidoreductase NAD-binding domain. Xanthine dehydrogenases, which also bind FAD/NAD, have essentially no similarity. 40276 pfam00176: SNF2 family N-terminal domain. This domain is found in proteins involved in a variety of processes including transcription regulation (e.g., SNF2, STH1, brahma, MOT1), DNA repair (e.g., ERCC6, RAD16, RAD5), DNA recombination (e.g., RAD54), and chromatin unwinding (e.g., ISWI) as well as a variety of other proteins with little functional information (e.g., lodestar, ETL1). 40277 pfam00177: Ribosomal protein S7p/S5e. This family contains ribosomal protein S7 from prokaryotes and S5 from eukaryotes. 40278 pfam00178: Ets-domain. 40279 pfam00179: Ubiquitin-conjugating enzyme. Proteins destined for proteasome-mediated degradation may be ubiquitinated. Ubiquitination follows conjugation of ubiquitin to a conserved cysteine residue of UBC homologues. TSG101 is one of several UBC homologues that lacks this active site cysteine. 40280 pfam00180: Isocitrate/isopropylmalate dehydrogenase. 
40281 pfam00181: Ribosomal Proteins L2, RNA binding domain. 40282 pfam00182: Chitinase class I. 40283 pfam00183: Hsp90 protein. 40284 pfam00184: Neurohypophysial hormones, C-terminal Domain. N-terminal Domain is in hormone5. 40285 pfam00185: Aspartate/ornithine carbamoyltransferase, Asp/Orn binding domain. 40286 pfam00186: Dihydrofolate reductase. 40287 pfam00187: Chitin recognition protein. 40288 pfam00188: SCP-like extracellular protein. This domain is also found in prokaryotes. 40289 pfam00189: Ribosomal protein S3, C-terminal domain. This family contains a central domain pfam00013, hence the amino and carboxyl terminal domains are stored separately. This is a minimal carboxyl-terminal domain. Some are much longer. 40290 pfam00190: Cupin. This family represents the conserved barrel domain of the 'cupin' superfamily ('cupa' is the Latin term for a small barrel). This family contains 11S and 7S plant seed storage proteins, and germins. Plant seed storage proteins provide the major nitrogen source for the developing plant. . 40291 pfam00191: Annexin. This family of annexins also includes giardin that has been shown to function as an annexin. 40292 pfam00193: Extracellular link domain. 40293 pfam00194: Eukaryotic-type carbonic anhydrase. 40294 pfam00195: Chalcone and stilbene synthases, N-terminal domain. The C-terminal domain of Chalcone synthase is reported to be structurally similar to domains in thiolase and beta-ketoacyl synthase. The differences in activity are accounted for by differences in this N-terminal domain. 40295 pfam00196: Bacterial regulatory proteins, luxR family. 40296 pfam00197: Trypsin and protease inhibitor. 40297 pfam00198: 2-oxoacid dehydrogenases acyltransferase (catalytic domain). These proteins contain one to three copies of a lipoyl binding domain followed by the catalytic domain. 40298 pfam00199: Catalase. 40299 pfam00200: Disintegrin. 40300 pfam00201: UDP-glucoronosyl and UDP-glucosyl transferase. 
40301 pfam00202: Aminotransferase class-III. 40302 pfam00203: Ribosomal protein S19. 40303 pfam00204: DNA gyrase B. This family represents the second domain of DNA gyrase B which has a ribosomal S5 domain 2-like fold. This family is structurally related to pfam01119. 40304 pfam00205: Thiamine pyrophosphate enzyme, central domain. The central domain of TPP enzymes contains a 2-fold Rossmann fold. 40305 pfam00206: Lyase. 40306 pfam00207: Alpha-2-macroglobulin family. This family includes the C-terminal region of the alpha-2-macroglobulin family. 40307 pfam00208: Glutamate/Leucine/Phenylalanine/Valine dehydrogenase. 40308 pfam00209: Sodium:neurotransmitter symporter family. 40309 pfam00210: Ferritin-like domain. This family contains ferritins and other ferritin-like proteins such as members of the DPS family and bacterioferritins. 40310 pfam00211: Adenylate and Guanylate cyclase catalytic domain. 40311 pfam00212: Atrial natriuretic peptide. 40312 pfam00213: ATP synthase delta (OSCP) subunit. The ATP D subunit from E. coli is the same as the OSCP subunit, which this family represents. The ATP D subunit from metazoa is found in family pfam00401. 40313 pfam00214: Calcitonin / CGRP / IAPP family. 40314 pfam00215: Orotidine 5'-phosphate decarboxylase / HUMPS family. This family includes Orotidine 5'-phosphate decarboxylase enzymes EC:4.1.1.23 that are involved in the final step of pyrimidine biosynthesis. The family also includes enzymes such as hexulose-6-phosphate synthase. This family appears to be distantly related to pfam00834. 40315 pfam00216: Bacterial DNA-binding protein. 40316 pfam00217: ATP:guanido phosphotransferase, C-terminal catalytic domain. The substrate binding site is located in the cleft between N and C-terminal domains, but most of the catalytic residues are found in the larger C-terminal domain. 40317 pfam00218: Indole-3-glycerol phosphate synthase. 40318 pfam00219: Insulin-like growth factor binding protein. 
40319 pfam00220: Neurohypophysial hormones, N-terminal Domain. C-terminal is in hormone5. 40320 pfam00221: Phenylalanine and histidine ammonia-lyase. 40321 pfam00223: Photosystem I psaA/psaB protein. 40322 pfam00224: Pyruvate kinase, barrel domain. This domain is actually a small beta-barrel domain nested within a larger TIM barrel. The active site is found in a cleft between the two domains. 40323 pfam00225: Kinesin motor domain. 40324 pfam00226: DnaJ domain. DnaJ domains (J-domains) are associated with the hsp70 heat-shock system and it is thought that this domain mediates the interaction. DnaJ-domain is therefore part of a chaperone (protein folding) system. The T-antigens are confirmed as DnaJ containing domains from literature. 40325 pfam00227: Proteasome A-type and B-type. 40326 pfam00228: Bowman-Birk serine protease inhibitor family. 40327 pfam00229: TNF(Tumour Necrosis Factor) family. 40328 pfam00230: Major intrinsic protein. MIP (Major Intrinsic Protein) family proteins exhibit essentially two distinct types of channel properties: (1) specific water transport by the aquaporins, and (2) small neutral solute transport, such as glycerol by the glycerol facilitators. 40329 pfam00231: ATP synthase. 40330 pfam00232: Glycosyl hydrolase family 1. 40331 pfam00233: 3'5'-cyclic nucleotide phosphodiesterase. 40332 pfam00234: Protease inhibitor/seed storage/LTP family. This family is composed of trypsin-alpha amylase inhibitors, seed storage proteins and lipid transfer proteins from plants. 40333 pfam00235: Profilin. 40334 pfam00236: Glycoprotein hormone. 40335 pfam00237: Ribosomal protein L22p/L17e. This family includes L22 from prokaryotes and chloroplasts and L17 from eukaryotes. 40336 pfam00238: Ribosomal protein L14p/L23e. 40337 pfam00239: Resolvase, N terminal domain. The N-terminal domain of the resolvase family (this family) contains the active site and the dimer interface. 
The extended arm at the C-terminus of this domain connects to the C-terminal helix-turn-helix domain of resolvase - see pfam02796. 40338 pfam00240: Ubiquitin family. This family contains a number of ubiquitin-like proteins: SUMO (smt3 homologue), Nedd8, Elongin B, Rub1. 40339 pfam00241: Cofilin/tropomyosin-type actin-binding protein. Severs actin filaments and binds to actin monomers. 40340 pfam00242: DNA polymerase (viral) N-terminal domain. 40341 pfam00243: Nerve growth factor family. 40342 pfam00244: 14-3-3 protein. 40343 pfam00245: Alkaline phosphatase. 40344 pfam00246: Zinc carboxypeptidase. 40345 pfam00248: Aldo/keto reductase family. This family includes a number of K+ ion channel beta chain regulatory domains - these are reported to have oxidoreductase activity. 40346 pfam00249: Myb-like DNA-binding domain. This family contains the DNA binding domains from Myb proteins, as well as the SANT domain family. 40347 pfam00250: Fork head domain. 40348 pfam00251: Glycosyl hydrolases family 32. 40349 pfam00252: Ribosomal protein L16. 40350 pfam00253: Ribosomal protein S14p/S29e. This family includes both ribosomal S14 from prokaryotes and S29 from eukaryotes. 40351 pfam00254: FKBP-type peptidyl-prolyl cis-trans isomerase. 40352 pfam00255: Glutathione peroxidase. 40353 pfam00256: Ribosomal protein L15. 40354 pfam00257: Dehydrin. 40355 pfam00258: Flavodoxin. 40356 pfam00259: Xylose isomerase. 40357 pfam00260: Protamine P1. 40358 pfam00261: Tropomyosin. 40359 pfam00262: Calreticulin family. 40360 pfam00263: Bacterial type II and III secretion system protein. 40361 pfam00264: Common central domain of tyrosinase. This family also contains polyphenol oxidases and some hemocyanins. Binds two copper ions via two sets of three histidines. This family is related to pfam00372. 40362 pfam00265: Thymidine kinase. 40363 pfam00266: Aminotransferase class-V. 40364 pfam00267: Gram-negative porin. 40365 pfam00268: Ribonucleotide reductase, small chain. 
40366 pfam00269: Small, acid-soluble spore proteins, alpha/beta type. 40367 pfam00270: DEAD/DEAH box helicase. Members of this family include the DEAD and DEAH box helicases. Helicases are involved in unwinding nucleic acids. The DEAD box helicases are involved in various aspects of RNA metabolism, including nuclear transcription, pre-mRNA splicing, ribosome biogenesis, nucleocytoplasmic transport, translation, RNA decay and organellar gene expression. 40368 pfam00271: Helicase conserved C-terminal domain. This domain family is found in a wide variety of helicases and helicase related proteins. It may be that this is not an autonomously folding unit, but an integral part of the helicase. 40369 pfam00272: Cecropin family. 40370 pfam00273: Serum albumin family. 40371 pfam00274: Fructose-bisphosphate aldolase class-I. 40372 pfam00275: EPSP synthase (3-phosphoshikimate 1-carboxyvinyltransferase). 40373 pfam00276: Ribosomal protein L23. 40374 pfam00277: Serum amyloid A protein. 40375 pfam00278: Pyridoxal-dependent decarboxylase, C-terminal sheet domain. These pyridoxal-dependent decarboxylases act on ornithine, lysine, arginine and related substrates. 40376 pfam00280: Potato inhibitor I family. 40377 pfam00281: Ribosomal protein L5. 40378 pfam00282: Pyridoxal-dependent decarboxylase conserved domain. 40379 pfam00283: Cytochrome b559, alpha (gene psbE) and beta (gene psbF) subunits. 40380 pfam00284: Lumenal portion of Cytochrome b559, alpha (gene psbE) subunit. This family is the lumenal portion of cytochrome b559 alpha chain; matches to this family should be accompanied by a match to the pfam00283 family. 40381 pfam00285: Citrate synthase. 40382 pfam00286: Viral coat protein. Family includes coat proteins from Potexviruses and carlaviruses. 40383 pfam00287: Sodium / potassium ATPase beta chain. 40384 pfam00288: GHMP kinases putative ATP-binding protein. 40385 pfam00289: Carbamoyl-phosphate synthase L chain, N-terminal domain. 
Carbamoyl-phosphate synthase catalyses the ATP-dependent synthesis of carbamoyl phosphate from glutamine or ammonia and bicarbonate. This important enzyme initiates both the urea cycle and the biosynthesis of arginine and/or pyrimidines. The carbamoyl-phosphate synthase (CPS) enzyme in prokaryotes is a heterodimer of a small and large chain. The small chain promotes the hydrolysis of glutamine to ammonia, which is used by the large chain to synthesise carbamoyl phosphate. See pfam00988. The small chain has a GATase domain in the carboxyl terminus. See pfam00117. 40386 pfam00290: Tryptophan synthase alpha chain. 40387 pfam00291: Pyridoxal-phosphate dependent enzyme. Members of this family are all pyridoxal-phosphate dependent enzymes. This family includes: serine dehydratase EC:4.2.1.13 P20132, threonine dehydratase EC:4.2.1.16, tryptophan synthase beta chain EC:4.2.1.20, threonine synthase EC:4.2.99.2, cysteine synthase EC:4.2.99.8 P11096, cystathionine beta-synthase EC:4.2.1.22, 1-aminocyclopropane-1-carboxylate deaminase EC:4.1.99.4. 40388 pfam00292: 'Paired box' domain. 40389 pfam00293: NUDIX domain. 40390 pfam00294: pfkB family carbohydrate kinase. This family includes a variety of carbohydrate and pyrimidine kinases. The family includes phosphomethylpyrimidine kinase EC:2.7.4.7. This enzyme is part of the Thiamine pyrophosphate (TPP) synthesis pathway; TPP is an essential cofactor for many enzymes. 40391 pfam00295: Glycosyl hydrolases family 28. Glycosyl hydrolase family 28 includes polygalacturonase EC:3.2.1.15 as well as rhamnogalacturonase A (RGase A), EC:3.2.1.-. These enzymes are important in cell wall metabolism. 40392 pfam00296: Luciferase-like monooxygenase. 40393 pfam00297: Ribosomal protein L3. 40394 pfam00298: Ribosomal protein L11, RNA binding domain. 40395 pfam00299: Squash family serine protease inhibitor. 40396 pfam00300: Phosphoglycerate mutase family. 40397 pfam00301: Rubredoxin. 40398 pfam00302: Chloramphenicol acetyltransferase. 
40399 pfam00303: Thymidylate synthase. 40400 pfam00304: Gamma-thionins family. 40401 pfam00305: Lipoxygenase. 40402 pfam00306: ATP synthase alpha/beta chain, C terminal domain. 40403 pfam00307: Calponin homology (CH) domain. The CH domain is found in both cytoskeletal proteins and signal transduction proteins. The CH domain is involved in actin binding in some members of the family. However, in calponins there is evidence that the CH domain is not involved in its actin binding activity. Most proteins have two copies of the CH domain; however, some proteins such as calponin have a single copy only. 40404 pfam00308: Bacterial dnaA protein. 40405 pfam00309: Sigma-54 factor, Activator interacting domain (AID). The sigma-54 holoenzyme is an enhancer dependent form of the RNA polymerase. The AID is necessary for activator interaction. In addition, the AID also inhibits transcription initiation in the sigma-54 holoenzyme prior to interaction with the activator. 40406 pfam00310: Glutamine amidotransferases class-II. 40407 pfam00311: Phosphoenolpyruvate carboxylase. 40408 pfam00312: Ribosomal protein S15. 40409 pfam00313: 'Cold-shock' DNA-binding domain. 40410 pfam00314: Thaumatin family. 40411 pfam00316: Fructose-1-6-bisphosphatase. 40412 pfam00317: Ribonucleotide reductase, all-alpha domain. 40413 pfam00318: Ribosomal protein S2. 40414 pfam00319: SRF-type transcription factor (DNA-binding and dimerisation domain). 40415 pfam00320: GATA zinc finger. This domain uses four cysteine residues to coordinate a zinc ion. This domain binds to DNA. Two GATA zinc fingers are found in the GATA transcription factors. However, there are several proteins that contain only a single copy of the domain. 40416 pfam00321: Plant thionin. 40417 pfam00322: Endothelin family. 40418 pfam00323: Mammalian defensin. 40419 pfam00324: Amino acid permease. 40420 pfam00325: Bacterial regulatory proteins, crp family. 40421 pfam00326: Prolyl oligopeptidase family. 
40422 pfam00327: Ribosomal protein L30p/L7e. This family includes prokaryotic L30 and eukaryotic L7. 40423 pfam00328: Histidine acid phosphatase. 40424 pfam00329: Respiratory-chain NADH dehydrogenase, 30 Kd subunit. 40425 pfam00330: Aconitase family (aconitate hydratase).. 40426 pfam00331: Glycosyl hydrolase family 10. 40427 pfam00332: Glycosyl hydrolases family 17. 40428 pfam00333: Ribosomal protein S5, N-terminal domain. 40429 pfam00334: Nucleoside diphosphate kinase. 40430 pfam00335: Tetraspanin family. 40431 pfam00336: DNA polymerase (viral) C-terminal domain. 40432 pfam00337: Galactoside-binding lectin. 40433 pfam00338: Ribosomal protein S10p/S20e. This family includes small ribosomal subunit S10 from prokaryotes and S20 from eukaryotes. 40434 pfam00339: Arrestin (or S-antigen), N-terminal domain. Ig-like beta-sandwich fold. Scop reports duplication with C-terminal domain. 40435 pfam00340: Interleukin-1 / 18. This family includes interleukin-1 and interleukin-18. 40436 pfam00341: Platelet-derived growth factor (PDGF).. 40437 pfam00342: Phosphoglucose isomerase. Phosphoglucose isomerase catalyses the interconversion of glucose-6-phosphate and fructose-6-phosphate. 40438 pfam00343: Carbohydrate phosphorylase. The members of this family catalyse the formation of glucose 1-phosphate from one of the following polyglucoses; glycogen, starch, glucan or maltodextrin. 40439 pfam00344: eubacterial secY protein. 40440 pfam00345: Gram-negative pili assembly chaperone, N-terminal domain. C2 domain-like beta-sandwich fold. 40441 pfam00346: Respiratory-chain NADH dehydrogenase, 49 Kd subunit. 40442 pfam00347: Ribosomal protein L6. 40443 pfam00348: Polyprenyl synthetase. 40444 pfam00349: Hexokinase. Hexokinase (EC:2.7.1.1) contains two structurally similar domains represented by this family and pfam03727. Some members of the family have two copies of each of these domains. 40445 pfam00350: Dynamin family. 40446 pfam00351: Biopterin-dependent aromatic amino acid hydroxylase. 
This family includes phenylalanine-4-hydroxylase, the phenylketonuria disease protein. 40447 pfam00352: Transcription factor TFIID (or TATA-binding protein, TBP). 40448 pfam00353: Hemolysin-type calcium-binding repeat (2 copies). 40449 pfam00354: Pentaxin family. Pentaxins are also known as pentraxins. 40450 pfam00355: Rieske [2Fe-2S] domain. The rieske domain has a [2Fe-2S] centre. Two conserved cysteines coordinate one Fe ion while the other Fe ion is coordinated by two conserved histidines. 40451 pfam00356: Bacterial regulatory proteins, lacI family. 40452 pfam00357: Integrin alpha cytoplasmic region. This family contains the short intracellular region of integrin alpha chains. 40453 pfam00358: phosphoenolpyruvate-dependent sugar phosphotransferase system, EIIA 1. 40454 pfam00359: Phosphoenolpyruvate-dependent sugar phosphotransferase system, EIIA 2. 40455 pfam00360: Phytochrome region. This family contains a region specific to phytochrome proteins. 40456 pfam00361: NADH-Ubiquinone/plastoquinone (complex I), various chains. This family is part of complex I which catalyses the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane. 40457 pfam00362: Integrin, beta chain. Sequences cut off at repeats due to overlap with EGF. 40458 pfam00363: Casein. 40459 pfam00364: Biotin-requiring enzyme. This family covers two subgroups: the conserved lysine residue binds biotin in one group and lipoic acid in the other. 40460 pfam00365: Phosphofructokinase. 40461 pfam00366: Ribosomal protein S17. 40462 pfam00367: phosphotransferase system, EIIB. 40463 pfam00368: Hydroxymethylglutaryl-coenzyme A reductase. 40464 pfam00370: FGGY family of carbohydrate kinases, N-terminal domain. This domain adopts a ribonuclease H-like fold and is structurally related to the C-terminal domain. 40465 pfam00372: Hemocyanin, copper containing domain. This family includes arthropod hemocyanins and insect larval storage proteins. 
40466 pfam00373: FERM domain (Band 4.1 family). This domain has been renamed the FERM domain, which stands for F for 4.1, E for Ezrin, R for radixin and M for moesin. 40467 pfam00374: Nickel-dependent hydrogenase. 40468 pfam00375: Sodium:dicarboxylate symporter family. 40469 pfam00376: MerR family regulatory protein. 40470 pfam00377: Prion/Doppel alpha-helical domain. The prion protein is thought to be the infectious agent that causes transmissible spongiform encephalopathies, such as scrapie and BSE. It is thought that the prion protein can exist in two different forms: one is the normal cellular protein, and the other is the infectious form which can change the normal prion protein into the infectious form. It has been found that the prion alpha-helical domain is also found in the Doppel protein. 40471 pfam00378: Enoyl-CoA hydratase/isomerase family. This family contains a diverse set of enzymes including: Enoyl-CoA hydratase. Naphthoate synthase. Carnitine racemase. 3-hydroxybutyryl-CoA dehydratase. Dodecanoyl-CoA delta-isomerase. 40472 pfam00379: Insect cuticle protein. Many insect cuticular proteins include a 35-36 amino acid motif known as the R&R consensus. The extensive conservation of this region led to the suggestion that it functions to bind chitin. Provocatively, it has no sequence similarity to the well-known cysteine-containing chitin-binding domain found in chitinases and some peritrophic membrane proteins. Chitin binding has been shown experimentally for this region. Thus arthropods have two distinct classes of chitin binding proteins: those with the chitin-binding domain found in lectins, chitinases and peritrophic membranes (cysCBD) and those with the cuticular protein chitin-binding domain (non-cysCBD). 40473 pfam00380: Ribosomal protein S9/S16. This family includes small ribosomal subunit S9 from prokaryotes and S16 from eukaryotes. 40474 pfam00381: PTS HPr component phosphorylation site. 40475 pfam00382: Transcription factor TFIIB repeat. 
40476 pfam00383: Cytidine and deoxycytidylate deaminase zinc-binding region. 40477 pfam00384: Molybdopterin oxidoreductase. 40478 pfam00385: 'chromo' (CHRomatin Organisation MOdifier) domain. 40479 pfam00386: C1q domain. C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 40480 pfam00387: Phosphatidylinositol-specific phospholipase C, Y domain. This associates with pfam00388 to form a single structural unit. 40481 pfam00388: Phosphatidylinositol-specific phospholipase C, X domain. This associates with pfam00387 to form a single structural unit. 40482 pfam00389: D-isomer specific 2-hydroxyacid dehydrogenase, catalytic domain. This family represents the largest portion of the catalytic domain of 2-hydroxyacid dehydrogenases as the NAD binding domain is inserted within the structural domain. 40483 pfam00390: Malic enzyme, N-terminal domain. 40484 pfam00391: PEP-utilising enzyme, mobile domain. This domain is a "swivelling" beta/beta/alpha domain which is thought to be mobile in all proteins known to contain it. 40485 pfam00392: Bacterial regulatory proteins, gntR family. This domain comprises the N-terminal HTH-containing region of GntR-like bacterial transcription factors. At the C terminus there is usually an effector-binding/oligomerisation domain. The GntR-like proteins can be divided into six sub-families: MocR, YtrR, FadR, AraR, HutC and PlmA. 40486 pfam00393: 6-phosphogluconate dehydrogenase, C-terminal domain. This family represents the C-terminal all-alpha domain of 6-phosphogluconate dehydrogenase. The domain contains two structural repeats of 5 helices each. 40487 pfam00394: Multicopper oxidase. Many of the proteins in this family contain multiple similar copies of this plastocyanin-like domain. 40488 pfam00395: S-layer homology domain. 40489 pfam00396: Granulin. 40490 pfam00397: WW domain. The WW domain is a protein module with two highly conserved tryptophans that binds proline-rich peptide motifs in vitro. 
40491 pfam00398: Ribosomal RNA adenine dimethylase. 40492 pfam00399: Yeast PIR protein repeat. 40493 pfam00400: WD domain, G-beta repeat. 40494 pfam00401: ATP synthase, Delta/Epsilon chain, long alpha-helix domain. Part of the ATP synthase CF(1). These subunits are part of the head unit of the ATP synthase. This subunit is called epsilon in bacteria and delta in mitochondria. In bacteria the delta (D) subunit is equivalent to the mitochondrial Oligomycin sensitive subunit, OSCP (pfam00213). 40495 pfam00402: Calponin family repeat. 40496 pfam00403: Heavy-metal-associated domain. 40497 pfam00404: Dockerin type I repeat. The dockerin repeat is the binding partner of the cohesin domain pfam00963. The cohesin-dockerin interaction is the crucial interaction for complex formation in the cellulosome. The dockerin repeats, each bearing homology to the EF-hand calcium-binding loop, bind calcium. 40498 pfam00405: Transferrin. 40499 pfam00406: Adenylate kinase. 40500 pfam00407: Pathogenesis-related protein Bet v I family. 40501 pfam00408: Phosphoglucomutase/phosphomannomutase, C-terminal domain. 40502 pfam00410: Ribosomal protein S8. 40503 pfam00411: Ribosomal protein S11. 40504 pfam00412: LIM domain. This family represents two copies of the LIM structural domain. 40505 pfam00413: Matrixin. The members of this family are enzymes that cleave peptides. These proteases require zinc for catalysis. 40506 pfam00414: Neuraxin and MAP1B repeat. 40507 pfam00415: Regulator of chromosome condensation (RCC1). 40508 pfam00416: Ribosomal protein S13/S18. This family includes ribosomal protein S13 from prokaryotes and S18 from eukaryotes. 40509 pfam00417: Ribosomal protein S3, N-terminal domain. This family contains a central domain pfam00013, hence the amino and carboxyl-terminal domains are stored separately. 40510 pfam00418: Tau and MAP protein, tubulin-binding repeat. 40511 pfam00419: Fimbrial protein. 40512 pfam00420: NADH-ubiquinone/plastoquinone oxidoreductase chain 4L. 
40513 pfam00421: Photosystem II protein. 40514 pfam00423: Hemagglutinin-neuraminidase. 40515 pfam00424: REV protein (anti-repression trans-activator protein). 40516 pfam00425: chorismate binding enzyme. This family includes the catalytic regions of the chorismate binding enzymes anthranilate synthase, isochorismate synthase, aminodeoxychorismate synthase and para-aminobenzoate synthase. 40517 pfam00426: Outer Capsid protein VP4 (Hemagglutinin). 40518 pfam00427: Phycobilisome Linker polypeptide. 40519 pfam00428: 60s Acidic ribosomal protein. This family includes archaebacterial L12, eukaryotic P0, P1 and P2. 40520 pfam00429: ENV polyprotein (coat polyprotein). 40521 pfam00430: ATP synthase B/B' CF(0). Part of the CF(0) (base unit) of the ATP synthase. The base unit is thought to translocate protons through the membrane (inner membrane in mitochondria, thylakoid membrane in plants, cytoplasmic membrane in bacteria). The B subunits are thought to interact with the stalk of the CF(1) subunits. This domain should not be confused with the ab CF(1) proteins (in the head of the ATP synthase) which are found in pfam00006. 40522 pfam00431: CUB domain. 40523 pfam00432: Prenyltransferase and squalene oxidase repeat. 40524 pfam00433: Protein kinase C terminal domain. 40525 pfam00434: Glycoprotein VP7. 40526 pfam00435: Spectrin repeat. Spectrin repeats are found in several proteins involved in cytoskeletal structure. These include spectrin, alpha-actinin and dystrophin. The sequence repeat used in this family is taken from the structural repeat in reference. The spectrin repeat forms a three helix bundle. The second helix is interrupted by proline in some sequences. The repeats are defined by a characteristic tryptophan (W) residue at position 17 in helix A and a leucine (L) at 2 residues from the carboxyl end of helix C. 40527 pfam00436: Single-strand binding protein family. 
This family includes single stranded binding proteins and also the primosomal replication protein N (PriB). PriB forms a complex with PriA, PriC and ssDNA. 40528 pfam00437: Type II/IV secretion system protein. This family contains both type II and type IV pathway secretion proteins from bacteria. VirB11 ATPase is a subunit of the Agrobacterium tumefaciens transfer DNA (T-DNA) transfer system, a type IV secretion pathway required for delivery of T-DNA and effector proteins to plant cells during infection. 40529 pfam00438: S-adenosylmethionine synthetase, N-terminal domain. The three domains of S-adenosylmethionine synthetase have the same alpha+beta fold. 40530 pfam00439: Bromodomain. Bromodomains are 110 amino acid long domains that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine. 40531 pfam00440: Bacterial regulatory proteins, tetR family. 40532 pfam00441: Acyl-CoA dehydrogenase, C-terminal domain. The C-terminal domain of Acyl-CoA dehydrogenase is an all-alpha, four helical up-and-down bundle. 40533 pfam00443: Ubiquitin carboxyl-terminal hydrolase. 40534 pfam00444: Ribosomal protein L36. 40535 pfam00445: Ribonuclease T2 family. 40536 pfam00446: Gonadotropin-releasing hormone. 40537 pfam00447: HSF-type DNA-binding. 40538 pfam00448: SRP54-type protein, GTPase domain. This family includes relatives of the G-domain of the SRP54 family of proteins. 40539 pfam00449: Urease alpha-subunit, N-terminal domain. The N-terminal domain is a composite domain and plays a major trimer stabilising role by contacting the catalytic domain of the symmetry related alpha-subunit. 40540 pfam00450: Serine carboxypeptidase. 40541 pfam00451: Scorpion short toxin. 40542 pfam00452: Apoptosis regulator proteins, Bcl-2 family. 40543 pfam00453: Ribosomal protein L20. 40544 pfam00454: Phosphatidylinositol 3- and 4-kinase. 40545 pfam00455: Bacterial regulatory proteins, deoR family. 
40546 pfam00456: Transketolase, thiamine diphosphate binding domain. This family includes transketolase enzymes EC:2.2.1.1. and also partially matches to 2-oxoisovalerate dehydrogenase beta subunit EC:1.2.4.4. Both these enzymes utilise thiamine pyrophosphate as a cofactor, suggesting there may be common aspects in their mechanism of catalysis. 40547 pfam00457: Glycosyl hydrolases family 11. 40548 pfam00458: WHEP-TRS domain. 40549 pfam00459: Inositol monophosphatase family. 40550 pfam00460: Flagella basal body rod protein. 40551 pfam00461: Signal peptidase I. 40552 pfam00462: Glutaredoxin. 40553 pfam00463: Isocitrate lyase family. 40554 pfam00464: Serine hydroxymethyltransferase. 40555 pfam00465: Iron-containing alcohol dehydrogenase. 40556 pfam00466: Ribosomal protein L10. 40557 pfam00467: KOW motif. The KOW (Kyprides, Ouzounis, Woese) motif is found in a variety of ribosomal proteins and NusG. 40558 pfam00468: Ribosomal protein L34. 40559 pfam00469: Negative factor, (F-Protein) or Nef. Nef protein accelerates virulent progression of AIDS by its interaction with cellular proteins involved in signal transduction and host cell activation. Nef has been shown to bind specifically to a subset of the Src kinase family. 40560 pfam00471: Ribosomal protein L33. 40561 pfam00472: Peptidyl-tRNA hydrolase domain. This domain is found in peptide chain release factors such as RF-1 and RF-2, and a number of smaller proteins of unknown function. This domain contains the peptidyl-tRNA hydrolase activity. The domain contains a highly conserved motif GGQ, where the glutamine is thought to coordinate the water that mediates the hydrolysis. 40562 pfam00473: Corticotropin-releasing factor family. 40563 pfam00474: Sodium:solute symporter family. Membership of this family is supported by a significant blast score. 40564 pfam00475: Imidazoleglycerol-phosphate dehydratase. 40565 pfam00476: DNA polymerase family A. 40566 pfam00477: Small hydrophilic plant seed protein. 
40568 pfam00479: Glucose-6-phosphate dehydrogenase, NAD binding domain. 40569 pfam00480: ROK family. 40570 pfam00481: Protein phosphatase 2C. Protein phosphatase 2C is a Mn++ or Mg++ dependent protein serine/threonine phosphatase. 40571 pfam00482: Bacterial type II secretion system protein F domain. The original family covered both the regions found by the current model. The splitting of the family has allowed the related FlaJ_arch (archaeal FlaJ family) to be merged with it. 40572 pfam00483: Nucleotidyl transferase. This family includes a wide range of enzymes which transfer nucleotides onto phosphosugars. 40573 pfam00484: Carbonic anhydrase. 40574 pfam00485: Phosphoribulokinase / Uridine kinase family. 40575 pfam00486: Transcriptional regulatory protein, C terminal. 40576 pfam00487: Fatty acid desaturase. 40577 pfam00488: MutS domain V. This domain is found in proteins of the MutS family (DNA mismatch repair proteins) and is found associated with pfam01624, pfam05188, pfam05192 and pfam05190. The MutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair; other members of the family include the eukaryotic MSH 1, 2, 3, 4, 5 and 6 proteins. These have various roles in DNA repair and recombination. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein. The aligned region corresponds with domain V of Thermus aquaticus MutS, which contains a Walker A motif, and is structurally similar to the ATPase domain of ABC transporters. 40578 pfam00489: Interleukin-6/G-CSF/MGF family. 40579 pfam00490: Delta-aminolevulinic acid dehydratase. 40580 pfam00491: Arginase family. 40581 pfam00493: MCM2/3/5 family. 40582 pfam00494: Squalene/phytoene synthase. 40583 pfam00496: Bacterial extracellular solute-binding proteins, family 5. 40584 pfam00497: Bacterial extracellular solute-binding proteins, family 3. 40585 pfam00498: FHA domain. 
The FHA (Forkhead-associated) domain is a phosphopeptide binding motif. 40586 pfam00499: NADH-ubiquinone/plastoquinone oxidoreductase chain 6. 40587 pfam00500: L1 (late) protein. 40588 pfam00501: AMP-binding enzyme. 40589 pfam00502: Phycobilisome protein. 40590 pfam00503: G-protein alpha subunit. G proteins couple receptors of extracellular signals to intracellular signaling pathways. The G protein alpha subunit binds guanyl nucleotide and is a weak GTPase. 40591 pfam00504: Chlorophyll A-B binding protein. 40592 pfam00505: HMG (high mobility group) box. 40593 pfam00506: Influenza virus nucleoprotein. 40594 pfam00507: NADH-ubiquinone/plastoquinone oxidoreductase, chain 3. 40595 pfam00508: E2 (early) protein, N terminal. 40596 pfam00509: Hemagglutinin. Hemagglutinin from influenza virus causes membrane fusion of the viral membrane with the host membrane. Fusion occurs after the host cell internalises the virus by endocytosis. The drop of pH causes release of a hydrophobic fusion peptide and a large conformational change leading to membrane fusion. 40597 pfam00510: Cytochrome c oxidase subunit III. 40598 pfam00511: E2 (early) protein, C terminal. 40599 pfam00512: His Kinase A (phosphoacceptor) domain. Dimerisation and phosphoacceptor domain of histidine kinases. 40600 pfam00513: Late Protein L2. 40601 pfam00514: Armadillo/beta-catenin-like repeat. Approx. 40 amino acid repeat. Tandem repeats form super-helix of helices that is proposed to mediate interaction of beta-catenin with its ligands. CAUTION: This family does not contain all known armadillo repeats. 40602 pfam00515: TPR Domain. 40603 pfam00516: Envelope glycoprotein GP120. The entry of HIV requires interaction of viral GP120 with CD4 and a chemokine receptor on the cell surface. 40604 pfam00517: Envelope Polyprotein GP41. The GP41 subunit of the envelope protein complex from human and simian immunodeficiency viruses (HIV and SIV) mediates membrane fusion during viral entry. 
40605 pfam00518: Early Protein (E6). 40606 pfam00519: Papillomavirus helicase. This protein is a DNA helicase that is required for initiation of viral DNA replication. This protein forms a complex with the E2 protein pfam00508. 40607 pfam00520: Ion transport protein. This family contains sodium, potassium and calcium ion channels. Members of this family have six transmembrane helices, of which the last two flank a loop that determines ion selectivity. In some sub-families (e.g. Na channels) the domain is repeated four times, whereas in others (e.g. K channels) the protein forms a tetramer in the membrane. A bacterial structure of the protein is known for the last two helices but is not in the Pfam family because it lacks the first four helices. 40608 pfam00521: DNA gyrase/topoisomerase IV, subunit A. 40609 pfam00522: VPR/VPX protein. 40610 pfam00523: Fusion glycoprotein F0. 40611 pfam00524: E1 Protein, N terminal domain. 40612 pfam00525: Alpha crystallin A chain, N terminal. 40613 pfam00526: Dictyostelium (slime mold) repeat. 40614 pfam00527: E7 protein, Early protein. 40615 pfam00528: Binding-protein-dependent transport system inner membrane component. The alignments cover the most conserved region of the proteins, which is thought to be located in a cytoplasmic loop between two transmembrane domains. The members of this family have a variable number of transmembrane helices. 40616 pfam00529: HlyD family secretion protein. 40617 pfam00530: Scavenger receptor cysteine-rich domain. These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. 40618 pfam00531: Death domain. 40619 pfam00532: Periplasmic binding proteins and sugar binding domain of the LacI family. This family includes the periplasmic binding proteins, and the LacI family transcriptional regulators. 
The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, generally the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain (pfam00356). 40620 pfam00533: BRCA1 C Terminus (BRCT) domain. The BRCT domain is found predominantly in proteins involved in cell cycle checkpoint functions responsive to DNA damage. It has been suggested that the Retinoblastoma protein contains a divergent BRCT domain; this has not been included in this family. The BRCT domain of XRCC1 forms a homodimer in the crystal structure. This suggests that pairs of BRCT domains associate as homo- or heterodimers. 40621 pfam00534: Glycosyl transferases group 1. Mutations in this domain may lead to disease (Paroxysmal Nocturnal haemoglobinuria). Members of this family transfer activated sugars to a variety of substrates, including glycogen, Fructose-6-phosphate and lipopolysaccharides. Members of this family transfer UDP, ADP, GDP or CMP linked sugars. The eukaryotic glycogen synthases may be distant members of this family. 40622 pfam00535: Glycosyl transferase. Diverse family, transferring sugar from UDP-glucose, UDP-N-acetyl-galactosamine, GDP-mannose or CDP-abequose, to a range of substrates including cellulose, dolichol phosphate and teichoic acids. 40623 pfam00536: SAM domain (Sterile alpha motif). It has been suggested that SAM is an evolutionarily conserved protein binding domain that is involved in the regulation of numerous developmental processes in diverse eukaryotes. The SAM domain can potentially function as a protein interaction module through its ability to homo- and heterooligomerise with other SAM domains. 40624 pfam00537: Scorpion toxin-like domain. This family contains both neurotoxins and plant defensins. The mustard trypsin inhibitor, MTI-2, is a plant defensin. 
It is a potent inhibitor of trypsin with no activity towards chymotrypsin. MTI-2 is toxic for Lepidopteran insects, but has low activity against aphids. Brazzein is a plant defensin-like protein. It is a pH-stable, heat-stable and intensely sweet protein. Scorpion toxins (neurotoxins) bind to sodium channels and inhibit the activation mechanisms of the channels, thereby blocking neuronal transmission. 40625 pfam00538: linker histone H1 and H5 family. Linker histone H1 is an essential component of chromatin structure. H1 links nucleosomes into higher order structures. Histone H1 is replaced by histone H5 in some cell types. 40626 pfam00539: Transactivating regulatory protein (Tat). The retroviral Tat protein binds to the Tar RNA. This activates transcriptional initiation and elongation from the LTR promoter. Binding is mediated by an arginine rich region. 40627 pfam00540: gag gene protein p17 (matrix protein). The matrix protein forms an icosahedral shell associated with the inner membrane of the mature immunodeficiency virus. 40628 pfam00541: Adenoviral fibre protein (knob domain). Specific attachment of adenovirus is achieved through interactions between host-cell receptors and the adenovirus fibre protein and is mediated by the globular carboxy-terminal domain of the adenovirus fibre protein, termed the carboxy-terminal knob domain. 40629 pfam00542: Ribosomal protein L7/L12 C-terminal domain. 40630 pfam00543: Nitrogen regulatory protein P-II. P-II modulates the activity of glutamine synthetase. 40631 pfam00544: Pectate lyase. This enzyme forms a right handed beta helix structure. Pectate lyase is an enzyme involved in the maceration and soft rotting of plant tissue. 40632 pfam00545: ribonuclease. This enzyme hydrolyses RNA and oligoribonucleotides. 40633 pfam00547: Urease, gamma subunit. 
Urease is a nickel-binding enzyme that catalyses the hydrolysis of urea to carbon dioxide and ammonia. 40634 pfam00548: 3C cysteine protease (picornain 3C). Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease. 40635 pfam00549: CoA-ligase. This family includes the CoA ligases Succinyl-CoA synthetase alpha and beta chains, malate CoA ligase and ATP-citrate lyase. Some members of the family utilise ATP, others use GTP. 40636 pfam00550: Phosphopantetheine attachment site. A 4'-phosphopantetheine prosthetic group is attached through a serine. This prosthetic group acts as a 'swinging arm' for the attachment of activated fatty acid and amino-acid groups. This domain forms a four helix bundle. The attachment serine is replaced by an alanine in some members. 40637 pfam00551: Formyl transferase. This family includes the following members. Glycinamide ribonucleotide transformylase catalyses the third step in de novo purine biosynthesis, the transfer of a formyl group to 5'-phosphoribosylglycinamide. Formyltetrahydrofolate deformylase produces formate from formyl-tetrahydrofolate. Methionyl-tRNA formyltransferase transfers a formyl group onto the amino terminus of the acyl moiety of the methionyl aminoacyl-tRNA. Inclusion of the following members is supported by PSI-blast. HOXX_BRAJA (P31907) contains a related domain of unknown function. PRTH_PORGI (P46071) contains a related domain of unknown function. Y09P_MYCTU (Q50721) contains a related domain of unknown function. 40638 pfam00552: Integrase DNA binding domain. Integrase mediates integration of a DNA copy of the viral genome into the host chromosome. Integrase is composed of three domains. The amino-terminal domain is a zinc binding domain. The central domain is the catalytic domain pfam00665. This domain is the carboxyl terminal domain that is a non-specific DNA binding domain. 40639 pfam00553: Cellulose binding domain. 
Two tryptophan residues are involved in cellulose binding. Cellulose binding domain found in bacteria. 40640 pfam00554: Rel homology domain (RHD). Proteins containing the Rel homology domain (RHD) are eukaryotic transcription factors. The RHD is composed of two structural domains. This is the N-terminal domain that is similar to that found in P53. The C-terminal domain has an immunoglobulin-like fold (See pfam01833) that binds to DNA. 40641 pfam00555: delta endotoxin. This family contains insecticidal toxins produced by Bacillus species of bacteria. During spore formation the bacteria produce crystals of this protein. When an insect ingests these proteins they are activated by proteolytic cleavage. The N terminus is cleaved in all of the proteins and a C terminal extension is cleaved in some members. Once activated the endotoxin binds to the gut epithelium and causes cell lysis leading to death. This activated region of the delta endotoxin is composed of three structural domains. The N-terminal helical domain is involved in membrane insertion and pore formation. The second and third domains are involved in receptor binding. 40642 pfam00556: Antenna complex alpha/beta subunit. 40643 pfam00557: metallopeptidase family M24. 40644 pfam00558: Vpu protein. The Vpu protein contains an N-terminal transmembrane spanning region and a C-terminal cytoplasmic region. The HIV-1 Vpu protein stimulates virus production by enhancing the release of viral particles from infected cells. The VPU protein binds specifically to CD4. 40645 pfam00559: Retroviral Vif (Viral infectivity) protein. Human immunodeficiency virus type 1 (HIV-1) Vif is required for productive infection of T lymphocytes and macrophages. Virions produced in the absence of Vif have abnormal core morphology and those produced in primary T cells carry immature core proteins and low levels of mature capsid. 40646 pfam00560: Leucine Rich Repeat. CAUTION: This Pfam may not find all Leucine Rich Repeats in a protein. 
Leucine Rich Repeats are short sequence motifs present in a number of proteins with diverse functions and cellular locations. These repeats are usually involved in protein-protein interactions. Each Leucine Rich Repeat is composed of a beta-alpha unit. These units form elongated non-globular structures. Leucine Rich Repeats are often flanked by cysteine rich domains. 40647 pfam00561: alpha/beta hydrolase fold. This catalytic domain is found in a very wide range of enzymes. 40648 pfam00562: RNA polymerase Rpb2, domain 6. RNA polymerases catalyse the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain represents the hybrid binding domain and the wall domain. The hybrid binding domain binds the nascent RNA strand / template DNA strand in the Pol II transcription elongation complex. This domain contains the important structural motifs, switch 3 and the flap loop, and binds an active site metal ion. This domain is also involved in binding to Rpb1 and Rpb3. Many of the bacterial members contain large insertions within this domain, a region known as dispensable region 2 (DRII). 40649 pfam00563: EAL domain. This domain is found in diverse bacterial signaling proteins. It is called EAL after its conserved residues. The EAL domain is a good candidate for a diguanylate phosphodiesterase function. The domain contains many conserved acidic residues that could participate in metal binding and might form the phosphodiesterase active site. 40650 pfam00564: PB1 domain. 40651 pfam00565: Staphylococcal nuclease homologue. Present in all three domains of cellular life. Four copies in the transcriptional coactivator p100: these, however, appear to lack the active site residues of Staphylococcal nuclease. 
Positions 14 (Asp-21), 34 (Arg-35), 39 (Asp-40), 42 (Glu-43) and 110 (Arg-87) [SNase numbering in parentheses] are thought to be involved in substrate-binding and catalysis. 40652 pfam00566: TBC domain. Identification of a TBC domain in GYP6_YEAST and GYP7_YEAST, which are GTPase activator proteins of yeast Ypt6 and Ypt7, implies that these domains are GTPase activator proteins of Rab-like small GTPases. 40653 pfam00567: Tudor domain. Domain of unknown function present in several RNA-binding proteins. copies in the Drosophila Tudor protein. 40654 pfam00568: WH1 domain. WASp Homology domain 1 (WH1) domain. WASP is the protein that is defective in Wiskott-Aldrich syndrome (WAS). The majority of point mutations occur within the amino-terminal WH1 domain. The metabotropic glutamate receptors mGluR1alpha and mGluR5 bind a protein called homer, which is a WH1 domain homologue. A subset of WH1 domains has been termed an "EVH1" domain and appears to bind a polyproline motif. 40655 pfam00569: Zinc finger, ZZ type. Zinc finger present in dystrophin, CBP/p300. ZZ in dystrophin binds calmodulin. Putative zinc finger; binding not yet shown. 40656 pfam00570: HRDC domain. The HRDC (Helicase and RNase D C-terminal) domain has a putative role in nucleic acid binding. Mutations in the HRDC domain cause human disease. 40657 pfam00571: CBS domain. CBS domains are small intracellular modules of unknown function. They are mostly found in two or four copies within a protein. Pairs of CBS domains dimerise to form a stable globular domain. Two CBS domains are found in inosine-monophosphate dehydrogenase from all species; however, the CBS domains are not needed for activity. CBS domains are found attached to a wide range of other protein domains, suggesting that CBS domains may play a regulatory role. The region containing the CBS domains in Cystathionine-beta synthase is involved in regulation by S-AdoMet. 40658 pfam00572: Ribosomal protein L13. 40659 pfam00573: Ribosomal protein L4/L1 family. 
This family includes Ribosomal L4/L1 from eukaryotes and archaebacteria and L4 from eubacteria. L4 from yeast has been shown to bind rRNA. 40660 pfam00574: Clp protease. The Clp protease has an active site catalytic triad. In E. coli Clp protease, ser-111, his-136 and asp-185 form the catalytic triad. One member has lost all of these active site residues and is therefore inactive. Some members contain one or two large insertions. 40661 pfam00575: S1 RNA binding domain. The S1 domain occurs in a wide range of RNA associated proteins. It is structurally similar to cold shock protein which binds nucleic acids. The S1 domain has an OB-fold structure. 40662 pfam00576: Transthyretin precursor (formerly prealbumin). Transthyretin is a thyroid hormone-binding protein that transports thyroxine from the bloodstream to the brain. Mutations in the human transthyretin are associated with several genetic disorders. 40663 pfam00577: Fimbrial Usher protein. This protein is involved in biogenesis of gram negative bacterial pili. 40664 pfam00578: AhpC/TSA family. This family contains proteins related to alkyl hydroperoxide reductase (AhpC) and thiol specific antioxidant (TSA). 40665 pfam00579: tRNA synthetases class I (W and Y). 40666 pfam00580: UvrD/REP helicase. The Rep family helicases are composed of four structural domains. The Rep family function as dimers. REP helicases catalyse ATP dependent unwinding of double stranded DNA to single stranded DNA. Some members have large insertions near to the carboxy-terminus relative to other members of the family. 40667 pfam00581: Rhodanese-like domain. Rhodanese has an internal duplication. This Pfam represents a single copy of this duplicated domain. The domain is found as a single copy in other proteins, including phosphatases and ubiquitin C-terminal hydrolases. 40668 pfam00582: Universal stress protein family. 
The universal stress protein UspA is a small cytoplasmic bacterial protein whose expression is enhanced when the cell is exposed to stress agents. UspA enhances the rate of cell survival during prolonged exposure to such conditions, and may provide a general "stress endurance" activity. The crystal structure of Haemophilus influenzae UspA reveals an alpha/beta fold similar to that of the Methanococcus jannaschii MJ0577 protein, which binds ATP, though UspA lacks ATP-binding activity. 40669 pfam00583: Acetyltransferase (GNAT) family. This family contains proteins with N-acetyltransferase functions. 40670 pfam00584: SecE/Sec61-gamma subunits of protein translocation complex. SecE is part of the SecYEG complex in bacteria which translocates proteins from the cytoplasm. In eukaryotes the complex, made from Sec61-gamma and Sec61-alpha, translocates protein from the cytoplasm to the ER. Archaea have a similar complex. 40671 pfam00585: C-terminal regulatory domain of Threonine dehydratase. Threonine dehydratases pfam00291 all contain a carboxy terminal region. This region may have a regulatory role. Some members contain two copies of this region. This family is homologous to the pfam01842 domain. 40672 pfam00586: AIR synthase related protein, N-terminal domain. This family includes Hydrogen expression/formation protein HypE, AIR synthases EC:6.3.3.1, FGAM synthase EC:6.3.5.3 and selenide, water dikinase EC:2.7.9.3. The N-terminal domain of AIR synthase forms the dimer interface of the protein, and is suggested as a putative ATP binding domain. 40673 pfam00587: tRNA synthetase class II core domain (G, H, P, S and T). Other tRNA synthetase sub-families are too dissimilar to be included. This domain is the core catalytic domain of tRNA synthetases and includes glycyl, histidyl, prolyl, seryl and threonyl tRNA synthetases. 40674 pfam00588: SpoU rRNA Methylase family. This family of proteins probably use S-AdoMet. 40675 pfam00589: Phage integrase family. 
Members of this family cleave DNA substrates by a series of staggered cuts, during which the protein becomes covalently linked to the DNA through a catalytic tyrosine residue at the carboxy end of the alignment. The catalytic site residues in CRE recombinase are Arg-173, His-289, Arg-292 and Tyr-324. 40676 pfam00590: Tetrapyrrole (Corrin/Porphyrin) Methylases. This family uses S-AdoMet in the methylation of diverse substrates. This family includes a related group of bacterial proteins of unknown function. This family includes the methylase Diphthine synthase. 40677 pfam00591: Glycosyl transferase family, a/b domain. This family includes anthranilate phosphoribosyltransferase (TrpD) and thymidine phosphorylase. All these proteins can transfer a phosphorylated ribose substrate. 40679 pfam00594: Vitamin K-dependent carboxylation/gamma-carboxyglutamic (GLA) domain. This domain is responsible for the high-affinity binding of calcium ions. This domain contains post-translational modifications of many glutamate residues by Vitamin K-dependent carboxylation to form gamma-carboxyglutamate (Gla). 40680 pfam00595: PDZ domain (Also known as DHR or GLGF). PDZ domains are found in diverse signaling proteins. 40681 pfam00596: Class II Aldolase and Adducin N-terminal domain. This family includes class II aldolases and adducins which have not been ascribed any enzymatic function. 40682 pfam00597: DedA family. This family combines the DedA related proteins and YIAN/YGIK family. Members of this family are not functionally characterised. These proteins contain multiple predicted transmembrane regions. 40683 pfam00598: Influenza Matrix protein (M1). This protein forms a continuous shell on the inner side of the lipid bilayer, but its function is unclear. 40684 pfam00599: Influenza Matrix protein (M2). This protein spans the viral membrane with an extracellular amino-terminus and a cytoplasmic carboxy-terminus. 40685 pfam00600: Influenza non-structural protein (NS1). 
NS1 is a homodimeric RNA-binding protein that is required for viral replication. NS1 binds polyA tails of mRNA, keeping them in the nucleus. NS1 inhibits pre-mRNA splicing by tightly binding to a specific stem-bulge of U6 snRNA. 40686 pfam00601: Influenza non-structural protein (NS2). NS2 may play a role in promoting normal replication of the genomic RNAs by preventing the replication of short-length RNA species. 40687 pfam00602: Influenza RNA-dependent RNA polymerase subunit PB1. Two GTP binding sites exist in this protein. 40688 pfam00603: Influenza RNA-dependent RNA polymerase subunit PA. 40689 pfam00604: Influenza RNA-dependent RNA polymerase subunit PB2. PB2 can bind the 5' end cap structure of RNA. 40690 pfam00605: Interferon regulatory factor transcription factor. This family of transcription factors is important in the regulation of interferons in response to infection by virus and in the regulation of interferon-inducible genes. Three of the five conserved tryptophan residues bind to DNA. 40691 pfam00606: Herpesvirus Glycoprotein B. This family of proteins contains a transmembrane region. 40692 pfam00607: gag gene protein p24 (core nucleocapsid protein). p24 forms the inner protein layer of the nucleocapsid. ELISA tests for p24 are the most commonly used method to demonstrate virus replication both in vivo and in vitro. 40693 pfam00608: Adenoviral fibre protein (repeat/shaft region). There is no separation between signal and noise. Specific attachment of adenovirus is achieved through interactions between host-cell receptors and the adenovirus fibre protein and is mediated by the globular carboxy-terminal domain of the adenovirus fibre protein, rather than the 'shaft' region represented by this family. The alignment of this family contains two copies of a fifteen residue repeat found in the 'shaft' region of adenoviral fibre proteins. 40694 pfam00609: Diacylglycerol kinase accessory domain (presumed). 
Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. This domain is assumed to be an accessory domain: its function is unknown. 40695 pfam00610: Domain found in Dishevelled, Egl-10, and Pleckstrin. Domain of unknown function present in signaling proteins that contain pfam00169, rasGEF, rhoGEF, rhoGAP, pfam00615, pfam00595 domains. 40696 pfam00611: Fes/CIP4 homology domain. Alignment extended from. Highly alpha-helical. 40697 pfam00612: IQ calmodulin-binding motif. Calmodulin-binding motif. 40698 pfam00613: Phosphoinositide 3-kinase family, accessory domain (PIK domain). PIK domain is conserved in all PI3 and PI4-kinases. Its role is unclear but it has been suggested to be involved in substrate presentation. 40699 pfam00614: Phospholipase D. Active site motif. Phosphatidylcholine-hydrolysing phospholipase D (PLD) isoforms are activated by ADP-ribosylation factors (ARFs). PLD produces phosphatidic acid from phosphatidylcholine, which may be essential for the formation of certain types of transport vesicles or may be constitutive vesicular transport to signal transduction pathways. PC-hydrolysing PLD is a homologue of cardiolipin synthase, phosphatidylserine synthase, bacterial PLDs, and viral proteins. Each of these appears to possess a domain duplication which is apparent by the presence of two motifs containing well-conserved histidine, lysine, and/or asparagine residues which may contribute to the active site. aspartic acid. An E. coli endonuclease (nuc) and similar proteins appear to be PLD homologues but possess only one of these motifs. The profile contained here represents only the putative active site regions, since an accurate multiple alignment of the repeat units has not been achieved. 40700 pfam00615: Regulator of G protein signaling domain. RGS family members are GTPase-activating proteins for heterotrimeric G-protein alpha-subunits. 40701 pfam00616: GTPase-activator protein for Ras-like GTPase. 
All alpha-helical domain that accelerates the GTPase activity of Ras, thereby "switching" it into an "off" position. 40702 pfam00617: RasGEF domain. Guanine nucleotide exchange factor for Ras-like small GTPases. 40703 pfam00618: Guanine nucleotide exchange factor for Ras-like GTPases; N-terminal motif. A subset of guanine nucleotide exchange factors for Ras-like small GTPases appears to possess this motif/domain N-terminal to the RasGef (Cdc25-like) domain. 40704 pfam00619: Caspase recruitment domain. Motif contained in proteins involved in apoptotic signaling. Predicted to possess a DEATH (pfam00531) domain-like fold. 40705 pfam00620: RhoGAP domain. GTPase activator proteins towards Rho/Rac/Cdc42-like small GTPases. 40706 pfam00621: RhoGEF domain. Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases. Also called Dbl-homologous (DH) domain. It appears that pfam00169 domains invariably occur C-terminal to RhoGEF/DH domains. 40707 pfam00622: SPRY domain. SPRY Domain is named from SPla and the RYanodine Receptor. Domain of unknown function. Distant homologues are domains in butyrophilin/marenostrin/pyrin homologues. 40708 pfam00623: RNA polymerase Rpb1, domain 2. RNA polymerases catalyse the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 2, contains the active site. The invariant motif -NADFDGD- binds the active site magnesium ion. 40709 pfam00624: Flocculin repeat. This short repeat is rich in serine and threonine residues. 40710 pfam00625: Guanylate kinase. 40711 pfam00626: Gelsolin repeat. 40712 pfam00627: UBA/TS-N domain. This small domain is composed of three alpha helices. This family includes the previously defined UBA and TS-N domains. The UBA-domain (ubiquitin associated domain) is a novel sequence motif found in several proteins having connections to ubiquitin and the ubiquitination pathway. 
The structure of the UBA domain consists of a compact three helix bundle. This domain is found at the N terminus of EF-TS hence the name TS-N. The structure of EF-TS is known and this domain is implicated in its interaction with EF-TU. The domain has been found in non EF-TS proteins such as alpha-NAC and MJ0280. 40713 pfam00628: PHD-finger. PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. 40714 pfam00629: MAM domain. An extracellular domain found in many receptors. 40715 pfam00630: Filamin/ABP280 repeat. 40716 pfam00631: GGL domain. G-protein gamma like domains (GGL) are found in the gamma subunit of the heterotrimeric G protein complex and in regulators of G protein signaling (RGS) proteins. 40717 pfam00632: HECT-domain (ubiquitin-transferase). The name HECT comes from Homologous to the E6-AP Carboxyl Terminus. 40718 pfam00633: Helix-hairpin-helix motif. The helix-hairpin-helix DNA-binding motif is found to be duplicated in the central domain of RuvA. 40719 pfam00634: BRCA2 repeat. The alignment covers only the most conserved region of the repeat. 40720 pfam00635: MSP (Major sperm protein) domain. Major sperm proteins are involved in sperm motility. These proteins oligomerise to form filaments. This family contains many other proteins. 40721 pfam00636: RNase3 domain. 40722 pfam00637: Region in Clathrin and VPS. Each region is about 140 amino acids long. The regions are composed of multiple alpha helical repeats. They occur in the arm region of the Clathrin heavy chain. 40723 pfam00638: RanBP1 domain. 40724 pfam00639: PPIC-type PPIASE domain. Rotamases increase the rate of protein folding by catalysing the interconversion of cis-proline and trans-proline. 40725 pfam00640: Phosphotyrosine interaction domain (PTB/PID). 40726 pfam00641: Zn-finger in Ran binding protein and others. 40727 pfam00642: Zinc finger C-x8-C-x5-C-x3-H type (and similar). 40728 pfam00643: B-box zinc finger. 
40729 pfam00644: Poly(ADP-ribose) polymerase catalytic domain. Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active. 40730 pfam00645: Poly(ADP-ribose) polymerase and DNA-Ligase Zn-finger region. Poly(ADP-ribose) polymerase is an important regulatory component of the cellular response to DNA damage. The amino-terminal region of Poly(ADP-ribose) polymerase consists of two PARP-type zinc fingers. This region acts as a DNA nick sensor. 40731 pfam00646: F-box domain. 40732 pfam00647: Elongation factor 1 gamma, conserved domain. 40733 pfam00648: Calpain family cysteine protease. 40734 pfam00649: Copper fist DNA binding domain. 40735 pfam00650: CRAL/TRIO domain. The original profile has been extended to include the carboxyl domain from the known structure of Sec14. 40736 pfam00651: BTB/POZ domain. The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. 
POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN. 40737 pfam00652: QXW lectin repeat. 40738 pfam00653: Inhibitor of Apoptosis domain. BIR stands for 'Baculovirus Inhibitor of apoptosis protein Repeat'. Also known as IAP repeat. 40739 pfam00654: Voltage gated chloride channel. This family of ion channels contains 10 or 12 transmembrane helices. Each protein forms a single pore. It has been shown that some members of this family form homodimers. These proteins contain two pfam00571 domains. 40740 pfam00656: Caspase domain. 40741 pfam00657: GDSL-like Lipase/Acylhydrolase. 40742 pfam00658: Poly-adenylate binding protein, unique domain. 40743 pfam00659: POLO box duplicated region. 40744 pfam00660: Seripauperin and TIP1 family. 40745 pfam00661: Viral matrix protein. Found in Morbillivirus, paramyxovirus and pneumovirus. 40746 pfam00662: NADH-Ubiquinone oxidoreductase (complex I), chain 5 N-terminus. This sub-family represents an amino terminal extension of pfam00361. Only NADH-Ubiquinone chain 5 and eubacterial chain L are in this family. This sub-family is part of complex I which catalyses the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane. 40747 pfam00664: ABC transporter transmembrane region. This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. 40748 pfam00665: Integrase core domain. Integrase mediates integration of a DNA copy of the viral genome into the host chromosome. Integrase is composed of three domains. The amino-terminal domain is a zinc binding domain (pfam02022). This domain is the central catalytic domain. The carboxyl terminal domain is a non-specific DNA binding domain (pfam00552). 
The catalytic domain acts as an endonuclease when two nucleotides are removed from the 3' ends of the blunt-ended viral DNA made by reverse transcription. This domain also catalyses the DNA strand transfer reaction of the 3' ends of the viral DNA to the 5' ends of the integration site. 40749 pfam00666: Cathelicidin. A novel protein family, showing a conserved proregion and a variable carboxyl-terminal antimicrobial domain. This region shows similarity to cystatins. 40750 pfam00667: FAD binding domain. This domain is found in sulfite reductase, NADPH cytochrome P450 reductase, Nitric oxide synthase and methionine synthase reductase. 40751 pfam00668: Condensation domain. This domain is found in many multi-domain enzymes which synthesise peptide antibiotics. This domain catalyses a condensation reaction to form peptide bonds in non-ribosomal peptide biosynthesis. It is usually found to the carboxy side of a phosphopantetheine binding domain (pfam00550). It has been shown that mutations in the HHXXXDG motif abolish activity suggesting this is part of the active site. 40752 pfam00669: Bacterial flagellin N-terminus. Flagellins polymerise to form bacterial flagella. This family includes flagellins and hook associated protein 3. 40753 pfam00670: S-adenosyl-L-homocysteine hydrolase, NAD binding domain. 40754 pfam00672: HAMP domain. 40755 pfam00673: ribosomal L5P family C-terminus. This region is found associated with pfam00281. 40756 pfam00674: DUP family. This family consists of several yeast proteins of unknown functions. Several members of this family contain an internal duplication of this region. 40757 pfam00675: Insulinase (Peptidase family M16). 40758 pfam00676: Dehydrogenase E1 component. This family uses thiamine pyrophosphate as a cofactor. This family includes pyruvate dehydrogenase, 2-oxoglutarate dehydrogenase and 2-oxoisovalerate dehydrogenase. 40759 pfam00677: Lumazine binding domain. This domain binds to derivatives of lumazine in some proteins. 
Some proteins have lost the residues involved in binding lumazine. 40760 pfam00679: Elongation factor G C-terminus. This domain includes the carboxyl terminal regions of Elongation factor G, elongation factor 2 and some tetracycline resistance proteins, and adopts a ferredoxin-like fold. 40761 pfam00680: RNA dependent RNA polymerase. 40762 pfam00681: Plectin repeat. This family includes repeats from plectin, desmoplakin, envoplakin and bullous pemphigoid antigen. 40763 pfam00682: HMGL-like. This family contains a diverse set of enzymes. These include various aldolases and a region of pyruvate carboxylase. 40764 pfam00683: TB domain. This domain is also known as the 8 cysteine domain. This family includes the hybrid domains. This cysteine rich repeat is found in TGF binding protein and fibrillin. 40765 pfam00684: DnaJ central domain (4 repeats). The central cysteine-rich (CR) domain of DnaJ proteins contains four repeats of the motif CXXCXGXG where X is any amino acid. The isolated cysteine rich domain folds in a zinc dependent fashion. Each set of two repeats binds one unit of zinc. Although this domain has been implicated in substrate binding, no evidence of specific interaction between the isolated DNAJ cysteine rich domain and various hydrophobic peptides has been found. 40766 pfam00685: Sulfotransferase domain. 40767 pfam00686: Starch binding domain. 40768 pfam00687: Ribosomal protein L1p/L10e family. This family includes prokaryotic L1 and eukaryotic L10. 40769 pfam00688: TGF-beta propeptide. This propeptide is known as latency associated peptide (LAP) in TGF-beta. LAP is a homodimer which is disulfide linked to TGF-beta binding protein. 40770 pfam00689: Cation transporting ATPase, C-terminus. Members of this family are involved in Na+/K+, H+/K+, Ca++ and Mg++ transport. 40771 pfam00690: Cation transporter/ATPase, N-terminus. Members of this family are involved in Na+/K+, H+/K+, Ca++ and Mg++ transport. 40772 pfam00691: OmpA family. 
The Pfam entry also includes MotB and related proteins. 40773 pfam00692: dUTPase. dUTPase hydrolyses dUTP to dUMP and pyrophosphate. 40774 pfam00693: Thymidine kinase from herpesvirus. 40775 pfam00694: Aconitase C-terminal domain. Members of this family usually also match to pfam00330. This domain undergoes conformational change in the enzyme mechanism. 40776 pfam00695: Major surface antigen from hepadnavirus. 40777 pfam00696: Amino acid kinase family. This family includes kinases that phosphorylate a variety of amino acid substrates, as well as uridylate kinase and carbamate kinase. This family includes: Aspartokinase EC:2.7.2.4. Acetylglutamate kinase EC:2.7.2.8. Glutamate 5-kinase EC:2.7.2.11. Uridylate kinase EC:2.7.4.-. Carbamate kinase EC:2.7.2.2. 40778 pfam00697: N-(5'phosphoribosyl)anthranilate (PRA) isomerase. 40779 pfam00698: Acyl transferase domain. 40780 pfam00699: Urease beta subunit. This subunit is known as alpha in Helicobacter. 40781 pfam00700: Bacterial flagellin C-terminus. Flagellins polymerise to form bacterial flagella. There is some similarity between this family and pfam00669, particularly the motif NRFXSXIXXL. It has been suggested that these two regions associate. 40782 pfam00701: Dihydrodipicolinate synthetase family. This family has a TIM barrel structure. 40783 pfam00702: haloacid dehalogenase-like hydrolase. This family is structurally different from the alpha/beta hydrolase family (pfam00561). This family includes L-2-haloacid dehalogenase, epoxide hydrolases and phosphatases. The structure of the family consists of two domains. One is an inserted four helix bundle, which is the least well conserved region of the alignment. The rest of the fold is composed of the core alpha/beta domain. 40784 pfam00703: Glycosyl hydrolases family 2, immunoglobulin-like beta-sandwich domain. This family contains beta-galactosidase, beta-mannosidase and beta-glucuronidase activities. 40785 pfam00704: Glycosyl hydrolases family 18. 
40786 pfam00705: Proliferating cell nuclear antigen, N-terminal domain. N-terminal and C-terminal domains of PCNA are topologically identical. Three PCNA molecules are tightly associated to form a closed ring encircling duplex DNA. 40787 pfam00706: Anemone neurotoxin. 40788 pfam00707: Translation initiation factor IF-3, C-terminal domain. 40789 pfam00708: Acylphosphatase. 40790 pfam00709: Adenylosuccinate synthetase. 40791 pfam00710: Asparaginase. 40792 pfam00711: Beta defensin. The beta defensins are antimicrobial peptides implicated in the resistance of epithelial surfaces to microbial colonisation. 40793 pfam00712: DNA polymerase III beta subunit, N-terminal domain. A dimer of the beta subunit of DNA polymerase III forms a ring which encircles duplex DNA. Each monomer contains three domains of identical topology and DNA clamp fold. 40794 pfam00713: Hirudin. 40795 pfam00714: Interferon gamma. 40796 pfam00715: Interleukin 2. 40797 pfam00716: Assemblin (Peptidase family S21). 40798 pfam00717: Peptidase family S24. 40799 pfam00718: Polyomavirus coat protein. 40800 pfam00719: Inorganic pyrophosphatase. 40801 pfam00720: Subtilisin inhibitor-like. 40802 pfam00721: Virus coat protein (TMV like). This family contains coat proteins from tobamoviruses, hordeiviruses, Tobraviruses, Furoviruses and Potyviruses. 40803 pfam00722: Glycosyl hydrolases family 16. 40804 pfam00723: Glycosyl hydrolases family 15. 40805 pfam00724: NADH:flavin oxidoreductase / NADH oxidase family. 40806 pfam00725: 3-hydroxyacyl-CoA dehydrogenase, C-terminal domain. This family also includes lambda crystallin. Some proteins include two copies of this domain. 40807 pfam00726: Interleukin 10. 40808 pfam00727: Interleukin 4. 40809 pfam00728: Glycosyl hydrolase family 20, catalytic domain. This domain has a TIM barrel fold. 40810 pfam00729: Viral coat protein (S domain). 40812 pfam00731: AIR carboxylase. 
Members of this family catalyse the decarboxylation of 1-(5-phosphoribosyl)-5-amino-4-imidazole-carboxylate (AIR). This family catalyses the sixth step of de novo purine biosynthesis. Some members of this family contain two copies of this domain. 40813 pfam00732: GMC oxidoreductase. This family of proteins binds FAD as a cofactor. 40814 pfam00733: Asparagine synthase. This family is always found associated with pfam00310. Members of this family catalyse the conversion of aspartate to asparagine. 40815 pfam00734: Fungal cellulose binding domain. 40816 pfam00735: Cell division protein. Members of this family include CDC3, CDC10, CDC11 and CDC12/Septin. Members of this family bind GTP. 40817 pfam00736: EF-1 guanine nucleotide exchange domain. This family is the guanine nucleotide exchange domain of EF-1 beta and EF-1 delta chains. 40818 pfam00737: Photosystem II 10 kDa phosphoprotein. This protein is phosphorylated in a light dependent reaction. 40819 pfam00738: Polyhedrin. These proteins are found in occlusion bodies in various viruses. The polyhedrin protein protects the virus. 40820 pfam00739: Trans-activation protein X. This protein is found in hepadnaviruses where it is indispensable for replication. 40821 pfam00740: Parvovirus coat protein VP2. This protein, together with VP1, forms a capsomer. Both of these proteins are formed from the same transcript using alternative splicing. As a result, VP1 and VP2 differ only in the N-terminal region of VP1. VP2 is involved in packaging the viral DNA. 40822 pfam00741: Gas vesicle protein. 40823 pfam00742: Homoserine dehydrogenase. 40824 pfam00743: Flavin-binding monooxygenase-like. This family includes FMO proteins and cyclohexanone monooxygenase. 40825 pfam00745: Glutamyl-tRNAGlu reductase, dimerisation domain. 40826 pfam00746: Gram positive anchor. 40827 pfam00747: ssDNA binding protein. This protein is found in herpesviruses and is needed for replication. 40828 pfam00748: Calpain inhibitor. 
This region is found multiple times in calpain inhibitor proteins. 40829 pfam00749: tRNA synthetases class I (E and Q), catalytic domain. Other tRNA synthetase sub-families are too dissimilar to be included. This family includes only glutamyl and glutaminyl tRNA synthetases. In some organisms, a single glutamyl-tRNA synthetase aminoacylates both tRNA(Glu) and tRNA(Gln). 40830 pfam00750: tRNA synthetases class I (R). Other tRNA synthetase sub-families are too dissimilar to be included. This family includes only arginyl tRNA synthetase. 40831 pfam00751: DM DNA binding domain. The DM domain is named after dsx and mab-3. dsx contains a single amino-terminal DM domain, whereas mab-3 contains two amino-terminal domains. The DM domain has a pattern of conserved zinc chelating residues C2H2C4. The dsx DM domain has been shown to dimerise and bind palindromic DNA. 40832 pfam00752: XPG N-terminal domain. 40833 pfam00753: Metallo-beta-lactamase superfamily. 40834 pfam00754: F5/8 type C domain. This domain is also known as the discoidin (DS) domain family. The bacterial examples are not yet included in the SEED alignment and are only found with low scores. 40835 pfam00755: Choline/Carnitine o-acyltransferase. 40836 pfam00756: Putative esterase. This family contains Esterase D. However it is not clear if all members of the family have the same function. This family is related to the pfam00135 family. 40837 pfam00757: Furin-like cysteine rich region. 40838 pfam00758: Erythropoietin/thrombopoietin. 40839 pfam00759: Glycosyl hydrolase family 9. 40840 pfam00760: Cucumovirus coat protein. 40841 pfam00761: Polyomavirus coat protein. 40842 pfam00762: Ferrochelatase. 40843 pfam00763: Tetrahydrofolate dehydrogenase/cyclohydrolase, catalytic domain. 40844 pfam00764: Arginosuccinate synthase. This family contains a PP-loop motif. 40845 pfam00765: Autoinducer synthetase. 40846 pfam00766: Electron transfer flavoprotein alpha subunit. 
This protein is distantly related to and forms a heterodimer with pfam01012. 40847 pfam00767: Potyvirus coat protein. 40848 pfam00768: D-alanyl-D-alanine carboxypeptidase. 40849 pfam00769: Ezrin/radixin/moesin family. This family of proteins contains a band 4.1 domain (pfam00373) at their amino terminus. This family represents the rest of these proteins. 40850 pfam00770: Adenovirus endoprotease. This family of adenovirus thiol endoproteases specifically cleaves Gly-Ala peptides in viral precursor peptides. 40851 pfam00771: FHIPEP family. 40852 pfam00772: DnaB-like helicase N terminal domain. The hexameric helicase DnaB unwinds the DNA duplex at the Escherichia coli chromosome replication fork. Although the mechanism by which DnaB both couples ATP hydrolysis to translocation along DNA and denatures the duplex is unknown, a change in the quaternary structure of the protein involving dimerisation of the N-terminal domain has been observed and may occur during the enzymatic cycle. This N-terminal domain is required both for interaction with other proteins in the primosome and for DnaB helicase activity. 40853 pfam00773: RNB-like protein. The function of this region of similarity is uncertain. 40854 pfam00774: Dihydropyridine sensitive L-type calcium channel (Beta subunit). 40855 pfam00775: Dioxygenase. 40856 pfam00777: Glycosyltransferase family 29 (sialyltransferase). Members of this family belong to glycosyltransferase family 29. 40857 pfam00778: DIX domain. The DIX domain is present in Dishevelled and axin. This domain is involved in homo- and hetero-oligomerisation. It is involved in the homo-oligomerisation of mouse axin. The axin DIX domain also interacts with the dishevelled DIX domain. The DIX domain has also been called the DAX domain. 40858 pfam00779: BTK motif. Zinc-binding motif containing conserved cysteines and a histidine. Always found C-terminal to PH domains. The crystal structure shows this motif packs against the PH domain. 
The PH+Btk module pair has been called the Tec homology (TH) region. 40859 pfam00780: CNH domain. Domain found in NIK1-like kinase, mouse citron and yeast ROM1, ROM2. Unpublished observations. 40860 pfam00781: Diacylglycerol kinase catalytic domain (presumed). Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. The catalytic domain is assumed from the finding of bacterial homologues. 40861 pfam00782: Dual specificity phosphatase, catalytic domain. Ser/Thr and Tyr protein phosphatases. The enzyme's tertiary fold is highly similar to that of tyrosine-specific phosphatases, except for a "recognition" region. 40862 pfam00784: MyTH4 domain. Domain in myosin and kinesin tails, present twice in myosin-VIIa, and also present in 3 other myosins. 40863 pfam00785: PAC motif. PAC motif occurs C-terminal to a subset of all known PAS motifs. It is proposed to contribute to the PAS domain fold. 40864 pfam00786: P21-Rho-binding domain. Small domains that bind Cdc42p- and/or Rho-like small GTPases. Also known as the Cdc42/Rac interactive binding (CRIB). 40865 pfam00787: PX domain. PX domains bind to phosphoinositides. 40866 pfam00788: Ras association (RalGDS/AF-6) domain. RasGTP effectors (in cases of AF6, canoe and RalGDS); putative RasGTP effectors in other cases. Recent evidence (not yet in MEDLINE) shows that some RA domains do NOT bind RasGTP. Predicted structure similar to that determined, and that of the RasGTP-binding domain of Raf kinase. 40867 pfam00789: UBX domain. Domain present in ubiquitin-regulatory proteins. Present in FAF1 and Shp1p. 40868 pfam00790: VHS domain. Domain present in VPS-27, Hrs and STAM. 40869 pfam00791: ZU5 domain. Domain present in ZO-1 and Unc5-like netrin receptors. Domain of unknown function. 40870 pfam00792: Phosphoinositide 3-kinase C2. Phosphoinositide 3-kinase region postulated to contain a C2 domain. Outlier of pfam00168 family. 40871 pfam00793: DAHP synthetase I family. 
Members of this family catalyse the first step in aromatic amino acid biosynthesis from chorismate. E. coli has three related synthetases, which are inhibited by different aromatic amino acids. This family also includes KDSA which has very similar catalytic activity but is involved in the first step of lipopolysaccharide biosynthesis. 40872 pfam00794: PI3-kinase family, ras-binding domain. Certain members of the PI3K family possess Ras-binding domains in their N-termini. These regions show some similarity (although not highly significant similarity) to Ras-binding pfam00788 domains. 40873 pfam00795: Carbon-nitrogen hydrolase. This family contains hydrolases that break carbon-nitrogen bonds. The family includes: Nitrilase EC:3.5.5.1, Aliphatic amidase EC:3.5.1.4, Biotinidase EC:3.5.1.12, Beta-ureidopropionase EC:3.5.1.6. 40874 pfam00796: Photosystem I reaction centre subunit VIII. 40875 pfam00797: N-acetyltransferase. Arylamine N-acetyltransferase (NAT) is a cytosolic enzyme of approximately 30 kDa. It facilitates the transfer of an acetyl group from Acetyl Coenzyme A on to a wide range of arylamines, N-hydroxyarylamines and hydrazines. Acetylation of these compounds generally results in inactivation. NAT is found in many species from Mycobacteria (M. tuberculosis, M. smegmatis etc) to man. It was the first enzyme to be observed to have polymorphic activity amongst human individuals. NAT is responsible for the inactivation of Isoniazid (a drug used to treat Tuberculosis) in humans. The NAT protein has also been shown to be involved in the breakdown of folic acid. 40876 pfam00798: Arenavirus glycoprotein. 40877 pfam00799: Geminivirus AL1 protein. 40878 pfam00800: Prephenate dehydratase. This protein is involved in Phenylalanine biosynthesis. This protein catalyses the decarboxylation of prephenate to phenylpyruvate. 40879 pfam00801: PKD domain. This domain was first identified in the Polycystic kidney disease protein PKD1. 
This domain has been predicted to contain an Ig-like fold. 40880 pfam00802: Pneumovirus attachment glycoprotein G. This family includes attachment proteins from respiratory syncytial virus. Glycoprotein G has not been shown to have any neuraminidase or hemagglutinin activity. The amino terminus is thought to be cytoplasmic, and the carboxyl terminus extracellular. The extracellular region contains four completely conserved cysteine residues. 40881 pfam00803: 3A/RNA2 movement protein family. This family includes movement proteins from various viruses. The 3A protein is found in bromoviruses and Cucumoviruses. The genome of these viruses contains 3 RNA segments. The third segment (RNA 3) encodes two proteins, the coat protein and the 3A protein. The function of the 3A protein is uncertain but it has been shown to be involved in cell-to-cell movement of the virus. The family also includes movement proteins from Dianthoviruses. 40882 pfam00804: Syntaxin. Syntaxins are t-SNARES. 40883 pfam00805: Pentapeptide repeats (8 copies). These repeats are found in many cyanobacterial proteins. The repeats were first identified in hglK. The function of these repeats is unknown. The structure of this repeat has been predicted to be a beta-helix. The repeat can be approximately described as A(D/N)LXX, where X can be any amino acid. 40884 pfam00806: Pumilio-family RNA binding repeat. Puf repeats (aka PUM-HD, Pumilio homology domain) are necessary and sufficient for sequence specific RNA binding in fly Pumilio and worm FBF-1 and FBF-2. Both proteins function as translational repressors in early embryonic development by binding sequences in the 3' UTR of target mRNAs (e.g. the nanos response element (NRE) in fly Hunchback mRNA, or the point mutation element (PME) in worm fem-3 mRNA). Other proteins that contain Puf domains are also plausible RNA binding proteins. Puf domains usually occur as a tandem repeat of 8 domains. 
The Pfam model does not necessarily recognise all 8 repeats in all sequences; some sequences appear to have 5 or 6 repeats on initial analysis, but further analysis suggests the presence of additional divergent repeats. Structures of PUF repeat proteins show they consist of a two helix structure. 40885 pfam00807: Apidaecin. These antibacterial peptides are found in bees. These heat-stable, non-helical peptides are active against a wide range of plant-associated bacteria and some human pathogens. The Pfam alignment includes the propeptide and apidaecin sequence. 40886 pfam00808: Histone-like transcription factor (CBF/NF-Y) and archaeal histone. This family includes archaebacterial histones and histone like transcription factors from eukaryotes. 40887 pfam00809: Pterin binding enzyme. This family includes a variety of pterin binding enzymes that all adopt a TIM barrel fold. The family includes dihydropteroate synthase EC:2.5.1.15 as well as a group methyltransferase enzymes including methyltetrahydrofolate, corrinoid iron-sulfur protein methyltransferase (MeTr) that catalyses a key step in the Wood-Ljungdahl pathway of carbon dioxide fixation. It transfers the N5-methyl group from methyltetrahydrofolate (CH3-H4folate) to a cob(I)amide centre in another protein, the corrinoid iron-sulfur protein. MeTr is a member of a family of proteins that includes methionine synthase and methanogenic enzymes that activate the methyl group of methyltetra-hydromethano(or -sarcino)pterin. 40888 pfam00810: ER lumen protein retaining receptor. 40889 pfam00811: Ependymin. 40890 pfam00812: Ephrin. 40891 pfam00813: FliP family. 40892 pfam00814: Glycoprotease family. 40893 pfam00815: Histidinol dehydrogenase. 40894 pfam00816: H-NS histone family. 40895 pfam00817: impB/mucB/samB family. These proteins are involved in UV protection. 40896 pfam00818: Ice nucleation protein repeat. 40897 pfam00819: Myotoxin. 40898 pfam00820: Borrelia lipoprotein. 
This family of lipoproteins is found in Borrelia spirochetes. The function of these proteins is uncertain. 40899 pfam00821: Phosphoenolpyruvate carboxykinase. Catalyses the formation of phosphoenolpyruvate by decarboxylation of oxaloacetate. 40900 pfam00822: PMP-22/EMP/MP20/Claudin family. 40901 pfam00823: PPE family. This family is named after a PPE motif near to the amino terminus of the domain. The PPE family of proteins all contain an amino-terminal region of about 180 amino acids. The carboxyl termini of this family are variable, and on the basis of this region members fall into at least three groups. The MPTR subgroup has tandem copies of a motif NXGXGNXG. The second subgroup contains a conserved motif at about position 350. The third group are only related in the amino terminal region. The function of these proteins is uncertain but it has been suggested that they may be related to antigenic variation of Mycobacterium tuberculosis. 40902 pfam00825: Ribonuclease P. 40903 pfam00826: Ribosomal L10. 40904 pfam00827: Ribosomal L15. 40905 pfam00828: Eukaryotic ribosomal protein L18. 40906 pfam00829: Ribosomal prokaryotic L21 protein. 40907 pfam00830: Ribosomal L28 family. The ribosomal L28 family includes L28 proteins from bacteria and chloroplasts. The L24 protein from yeast also contains a region of similarity to prokaryotic L28 proteins. L24 from yeast is also found in the large ribosomal subunit. 40908 pfam00831: Ribosomal L29 protein. 40909 pfam00832: Ribosomal L39 protein. 40910 pfam00833: Ribosomal S17. 40911 pfam00834: Ribulose-phosphate 3 epimerase family. This enzyme catalyses the conversion of D-ribulose 5-phosphate into D-xylulose 5-phosphate. 40912 pfam00835: SNAP-25 family. SNAP-25 (synaptosome-associated protein 25 kDa) proteins are components of SNARE complexes. Members of this family contain a cluster of cysteine residues that can be palmitoylated for membrane attachment. 40913 pfam00836: Stathmin family. 40914 pfam00837: Iodothyronine deiodinase. 
40915 pfam00838: Translationally controlled tumour protein. 40916 pfam00839: Cysteine rich repeat. This cysteine rich repeat contains four cysteines. It is found in multiple copies in a protein that binds to fibroblast growth factors. The repeat is also found in MG160 and E-selectin ligand (ESL-1). 40917 pfam00840: Glycosyl hydrolase family 7. 40918 pfam00841: Sperm histone P2. This protein, also known as protamine P2, can substitute for histones in the chromatin of sperm. The alignment contains both the sequence of the mature P2 protein and its propeptide. 40919 pfam00842: Alanine racemase, C-terminal domain. 40920 pfam00843: Arenavirus nucleocapsid protein. 40921 pfam00844: Geminivirus coat protein. It has been shown that the 104 N-terminal amino acids of the maize streak virus coat protein bind DNA non-specifically. 40922 pfam00845: Geminivirus BL1 movement protein. Geminiviruses encode two movement proteins that are essential for systemic infection of their host but dispensable for replication and encapsidation. 40923 pfam00846: Hantavirus nucleocapsid protein. 40924 pfam00847: AP2 domain. This 60 amino acid residue domain can bind to DNA. This domain is plant specific. Members of this family are suggested to be related to pyridoxal phosphate-binding domains such as found in pfam00222. 40925 pfam00848: Ring hydroxylating alpha subunit (catalytic domain). This family is the catalytic domain of aromatic-ring-hydroxylating dioxygenase systems. The active site contains a non-heme ferrous ion coordinated by three ligands. 40926 pfam00849: RNA pseudouridylate synthase. Members of this family are involved in modifying bases in RNA molecules. They carry out the conversion of uracil bases to pseudouridine. This family includes RluD, a pseudouridylate synthase that converts specific uracils to pseudouridine in 23S rRNA. RluA from E. coli converts bases in both rRNA and tRNA. 40927 pfam00850: Histone deacetylase family.
Histones can be reversibly acetylated on several lysine residues. Regulation of transcription is caused in part by this mechanism. Histone deacetylases catalyse the removal of the acetyl group. Histone deacetylases are related to other proteins. 40928 pfam00851: Helper component proteinase. This protein is found in genome polyproteins of potyviruses. 40929 pfam00852: Glycosyltransferase family 10 (fucosyltransferase). This family of fucosyltransferases comprises the enzymes that transfer fucose from GDP-fucose to GlcNAc in an alpha1,3 linkage. This family is known as glycosyltransferase family 10. 40930 pfam00853: Runt domain. 40931 pfam00854: POT family. Members of the POT (proton-dependent oligopeptide transport) family all appear to be proton-dependent transporters. 40932 pfam00855: PWWP domain. The PWWP domain is named after a conserved Pro-Trp-Trp-Pro motif. The function of the domain is currently unknown. 40933 pfam00856: SET domain. SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. 40934 pfam00857: Isochorismatase family. Members of this family are hydrolase enzymes. 40935 pfam00858: Amiloride-sensitive sodium channel. 40936 pfam00859: CTF/NF-I family transcription modulation region. 40937 pfam00860: Permease family. This family includes permeases for diverse substrates such as xanthine, uracil and vitamin C. However, many members of this family are functionally uncharacterised and may transport other substrates. Members of this family have ten predicted transmembrane helices. 40938 pfam00861: Ribosomal L18p/L5e family. This family includes ribosomal proteins from the large subunit.
This family includes L18 from bacteria and L5 from eukaryotes. It has been shown for one member that the amino terminal 93 amino acids are necessary and sufficient to bind 5S rRNA in vitro, suggesting that the entire family has a function in rRNA binding. 40939 pfam00862: Sucrose synthase. Sucrose synthases catalyse the synthesis of sucrose from UDP-glucose and fructose. This family includes the bulk of the sucrose synthase protein. However, the carboxyl terminal region of the sucrose synthases belongs to the glycosyl transferase family pfam00534. 40940 pfam00863: Peptidase family C4. This peptidase is present in the nuclear inclusion protein of potyviruses. 40941 pfam00864: ATP P2X receptor. 40942 pfam00865: Osteopontin. 40943 pfam00866: Ring hydroxylating beta subunit. This subunit has a similar structure to NTF-2 and scytalone dehydratase. 40944 pfam00867: XPG I-region. 40945 pfam00868: Transglutaminase family. 40946 pfam00869: Flavivirus glycoprotein, central and dimerisation domains. 40947 pfam00870: P53. 40948 pfam00871: Acetokinase family. This family includes acetate kinase, butyrate kinase and 2-methylpropanoate kinase. 40949 pfam00872: Transposase, Mutator family. 40950 pfam00873: AcrB/AcrD/AcrF family. Members of this family are integral membrane proteins. Some are involved in drug resistance. AcrB cooperates with a membrane fusion protein, AcrA, and an outer membrane channel TolC. The structure shows that AcrB forms a homotrimer. 40951 pfam00874: PRD domain. The PRD domain (for PTS Regulation Domain) is the phosphorylatable regulatory domain found in bacterial transcriptional antiterminators of the BglG family as well as in activators such as MtlR and LevR. The PRD domain is phosphorylated on a conserved histidine residue.
PRD-containing proteins are involved in the regulation of catabolic operons in Gram+ and Gram- bacteria and are often characterised by a short N-terminal effector domain that binds to either RNA (CAT-RBD for antiterminators (pfam03123, see also comments for this family)) or DNA (for activators), and a duplicated PRD module which is phosphorylated on conserved histidines by the sugar phosphotransferase system (PTS) in response to the availability of carbon source. The phosphorylations are thought to modify the stability of the dimeric proteins and thereby the RNA- or DNA-binding activity of the effector domain. 40952 pfam00875: DNA photolyase. This domain binds a light harvesting cofactor. 40953 pfam00876: Innexin. This family includes the drosophila proteins Ogre and shaking-B, and the C. elegans proteins Unc-7 and Unc-9. Members of this family are integral membrane proteins which are involved in the formation of gap junctions. This family has been named the Innexins. 40954 pfam00877: NlpC/P60 family. The function of this domain is unknown. It is found in several lipoproteins. 40955 pfam00878: Cation-independent mannose-6-phosphate receptor repeat. The cation-independent mannose-6-phosphate receptor contains 15 copies of a repeat. 40956 pfam00879: Defensin propeptide. 40957 pfam00880: Nebulin repeat. 40958 pfam00881: Nitroreductase family. Members of this family utilise FMN as a cofactor. 40959 pfam00882: Zinc dependent phospholipase C. 40960 pfam00883: Cytosol aminopeptidase family, catalytic domain. The two associated zinc ions and the active site are entirely enclosed within the C-terminal catalytic domain in leucine aminopeptidase. 40961 pfam00884: Sulfatase. 40962 pfam00885: 6,7-dimethyl-8-ribityllumazine synthase. This family includes the beta chain of 6,7-dimethyl-8- ribityllumazine synthase EC:2.5.1.9, an enzyme involved in riboflavin biosynthesis. The family also includes a subfamily of distant archaebacterial proteins that may also have the same function. 
40963 pfam00886: Ribosomal protein S16. 40964 pfam00887: Acyl CoA binding protein. 40965 pfam00888: Cullin family. 40966 pfam00889: Elongation factor TS. 40967 pfam00890: FAD binding domain. This family includes members that bind FAD. This family includes the flavoprotein subunits from succinate and fumarate dehydrogenase, aspartate oxidase and the alpha subunit of adenylylsulphate reductase. 40968 pfam00891: O-methyltransferase. This family includes a range of O-methyltransferases. These enzymes utilise S-adenosyl methionine. 40969 pfam00892: Integral membrane protein DUF6. This family includes many hypothetical membrane proteins of unknown function. Many of the proteins contain two copies of the aligned region. 40970 pfam00893: Small Multidrug Resistance protein. This family is the Small Multidrug Resistance (SMR) family. Several members have been shown to export a range of toxins, including ethidium bromide and quaternary ammonium compounds, through coupling with proton influx. 40971 pfam00894: Luteovirus coat protein. 40972 pfam00895: ATP synthase protein 8. 40973 pfam00896: Phosphorylase family 2. 40974 pfam00897: Orbivirus inner capsid protein VP7. In BTV, 260 trimers of VP7 are found in the core. The major proteins of the core are VP7 and VP3. VP7 forms an outer layer around VP3. 40975 pfam00898: Orbivirus outer capsid protein VP2. VP2 acts as an anchor for VP1 and VP3. VP2 contains a non-specific DNA and RNA binding domain in the N-terminus. 40976 pfam00899: ThiF family. This family contains a repeated domain in ubiquitin activating enzyme E1 and members of the bacterial ThiF/MoeB/HesA family. 40977 pfam00900: Ribosomal family S4e. 40978 pfam00901: Orbivirus outer capsid protein VP5. Cryoelectron microscopy indicates that VP5 is a trimer, implying that there are 360 copies of VP5 per virion. 40979 pfam00902: Sec-independent protein translocase protein (TatC).
The bacterial Tat system has a remarkable ability to transport folded proteins even enzyme complexes across the cytoplasmic membrane. It is structurally and mechanistically similar to the Delta pH-driven thylakoidal protein import pathway. A functional Tat system or Delta pH-dependent pathway requires three integral membrane proteins: TatA/Tha4, TatB/Hcf106 and TatC/cpTatC. The TatC protein is essential for the function of both pathways. It might be involved in twin-arginine signal peptide recognition, protein translocation and proton translocation. Sequence analysis predicts that TatC contains six transmembrane helices (TMHs), and experimental data confirmed that N- and C-termini of TatC or cpTatC are exposed to the cytoplasmic or stromal face of the membrane. The cytoplasmic N-terminus and the first cytoplasmic loop region of the Escherichia coli TatC protein are essential for protein export. At least two TatC molecules co-exist within each Tat translocon. 40980 pfam00903: Glyoxalase/Bleomycin resistance protein/Dioxygenase superfamily. 40981 pfam00904: Involucrin repeat. 40982 pfam00905: Penicillin binding protein transpeptidase domain. The active site serine is conserved in all members of this family. 40983 pfam00906: Hepatitis core antigen. The core antigen of hepatitis viruses possesses a carboxyl terminus rich in arginine. On this basis it was predicted that the core antigen would bind DNA. There is some experimental evidence to support this. 40984 pfam00907: T-box. The T-box encodes a 180 amino acid domain that binds to DNA. 40985 pfam00908: dTDP-4-dehydrorhamnose 3,5-epimerase. This family catalyse the isomerisation of dTDP-4-dehydro-6-deoxy -D-glucose with dTDP-4-dehydro-6-deoxy-L-mannose. The EC number of this enzyme is 5.1.3.13. 40986 pfam00909: Ammonium Transporter Family. 40987 pfam00910: RNA helicase. This family includes RNA helicases thought to be involved in duplex unwinding during viral RNA replication. 
Members of this family are found in a variety of single stranded RNA viruses. 40988 pfam00912: Transglycosylase. The penicillin-binding proteins are bifunctional proteins consisting of transglycosylase and transpeptidase in the N- and C-terminus respectively. The transglycosylase domain catalyses the polymerisation of murein glycan chains. 40989 pfam00913: Trypanosome variant surface glycoprotein. The trypanosome parasite expresses these proteins to evade the immune response. 40990 pfam00915: Calicivirus coat protein. 40991 pfam00916: Sulfate transporter family. Mutations in the human diastrophic dysplasia protein lead to several diseases. 40992 pfam00917: MATH domain. This motif has been called the Meprin And TRAF-Homology (MATH) domain. This domain is hugely expanded in the nematode C. elegans. 40993 pfam00918: Gastrin/cholecystokinin family. 40994 pfam00919: Uncharacterized protein family UPF0004. This family is the N terminal half of proteins with a C-terminal half, which has been shown to be related to MiaB proteins. This domain is nearly always found in conjunction with pfam04055 and pfam01938, although its function is uncertain. 40995 pfam00920: Dehydratase family. 40996 pfam00921: Borrelia lipoprotein. This family of lipoproteins is found in Borrelia spirochetes. The function of these proteins is uncertain. 40997 pfam00922: Vesiculovirus phosphoprotein. 40998 pfam00923: Transaldolase. 40999 pfam00924: Mechanosensitive ion channel. Two members of this protein family in M. jannaschii have been functionally characterised. Both proteins form mechanosensitive (MS) ion channels upon reconstitution into liposomes and functional examination by the patch-clamp technique. Therefore, members of this family are also likely to be MS channel proteins. 41000 pfam00925: GTP cyclohydrolase II. GTP cyclohydrolase II catalyses the first committed step in the biosynthesis of riboflavin. 41001 pfam00926: 3,4-dihydroxy-2-butanone 4-phosphate synthase.
3,4-Dihydroxy-2-butanone 4-phosphate is biosynthesised from ribulose 5-phosphate and serves as the biosynthetic precursor for the xylene ring of riboflavin. Sometimes found as a bifunctional enzyme with pfam00925. 41002 pfam00927: Transglutaminase family, C-terminal ig like domain. 41003 pfam00928: Adaptor complexes medium subunit family. This family also contains members which are coatomer subunits. 41004 pfam00929: Exonuclease. This family includes a variety of exonuclease proteins, such as ribonuclease T and the epsilon subunit of DNA polymerase III. 41005 pfam00930: Dipeptidyl peptidase IV (DPP IV) N-terminal region. This family is an alignment of the region to the N-terminal side of the active site. 41006 pfam00931: NB-ARC domain. 41007 pfam00932: Intermediate filament tail domain. 41008 pfam00933: Glycosyl hydrolase family 3 N terminal domain. 41009 pfam00934: PE family. This family is named after a PE motif near the amino terminus of the domain. The PE family of proteins all contain an amino-terminal region of about 110 amino acids. The carboxyl termini of this family are variable and fall into several classes. The largest class of PE proteins is the highly repetitive PGRS class, whose members have a high glycine content. The function of these proteins is uncertain but it has been suggested that they may be related to antigenic variation of Mycobacterium tuberculosis. 41010 pfam00935: Ribosomal protein L44. 41011 pfam00936: Bacterial microcompartments protein family. 41012 pfam00937: Coronavirus nucleocapsid protein. 41013 pfam00938: Lipoprotein. This family of lipoproteins is Mycoplasma specific. 41014 pfam00939: Sodium:sulfate symporter transmembrane region. There are also some members in this family which belong to the subfamily SODIT1. 41015 pfam00940: DNA-dependent RNA polymerase. This is a family of single chain RNA polymerases. 41016 pfam00941: FAD binding domain in molybdopterin dehydrogenase. 41017 pfam00942: Cellulose binding domain.
41018 pfam00943: Alphavirus E2 glycoprotein. E2 forms a heterodimer with E1. The virus spikes are made up of 80 trimers of these heterodimers (Sindbis virus). 41019 pfam00944: Alphavirus core protein. Also known as coat protein C and capsid protein C. This makes the literature very confusing. Alphaviruses consist of a nucleoprotein core, a lipid membrane which envelopes the core, and glycoprotein spikes protruding from the lipid membrane. 41020 pfam00945: Rhabdovirus nucleocapsid protein. The Nucleocapsid (N) Protein is said to have a "tight" structure. The carboxyl end of the N-terminal domain possesses an RNA binding domain. Sequence alignments show 2 regions of reasonable conservation, approx. 64-103 and 201-329. A whole functional protein is required for encapsidation to take place. 41021 pfam00946: Paramyxovirus RNA dependent RNA polymerase. Paramyxoviridae, like other non-segmented negative strand RNA viruses, have an RNA-dependent RNA polymerase composed of two subunits, a large protein L and a phosphoprotein P. This is a protein family of the L protein. The L protein confers the RNA polymerase activity on the complex. The P protein acts as a transcription factor. 41022 pfam00947: Picornavirus core protein 2A. This protein is a protease, involved in cleavage of the polyprotein. 41023 pfam00948: Flavivirus non-structural Protein NS1. The NS1 protein is well conserved amongst the flaviviruses. It contains 12 cysteines, and undergoes glycosylation in a similar manner to other NS proteins. Mutational analysis has strongly implied a role for NS1 in the early stages of RNA replication. 41024 pfam00949: Flavivirus helicase (NS3). Flaviviruses produce a polyprotein from the ssRNA genome. The N-terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and DEAD family of RNA helicase. 41025 pfam00950: ABC 3 transport family.
41026 pfam00951: Arterivirus GL envelope glycoprotein. Arteriviruses encode 4 envelope proteins, Gl, Gs, M and N. The Gl envelope protein is encoded in ORF5 and is 30-45 kDa in size. Gl is heterogeneously glycosylated with N-acetyllactosamine in a cell-type-specific manner. The Gl glycoprotein expresses the neutralisation determinants. 41027 pfam00952: Bunyavirus nucleocapsid (N) protein. The bunyaviruses are enveloped viruses with a genome consisting of 3 ssRNA segments (called L, M and S). The nucleocapsid protein is encoded on the small (S) genomic RNA. The N protein is the major component of the nucleocapsids. This protein is thought to interact with the L protein, virus RNA and/or other N proteins. 41028 pfam00953: Glycosyl transferase. 41029 pfam00954: S-locus glycoprotein family. In Brassicaceae, self-incompatible plants have a self/non-self recognition system. This is sporophytically controlled by multiple alleles at a single locus (S). S-locus glycoproteins, as well as S-receptor kinases, are in linkage with the S-alleles. 41030 pfam00955: HCO3- transporter family. This family contains Band 3 anion exchange proteins that exchange Cl-/HCO3-. This family also includes cotransporters of Na+/HCO3-. 41031 pfam00956: Nucleosome assembly protein (NAP). It is thought that NAPs may be involved in regulating gene expression as a result of histone accessibility. 41032 pfam00957: Synaptobrevin. 41033 pfam00958: GMP synthase C terminal domain. GMP synthetase is a glutamine amidotransferase from the de novo purine biosynthetic pathway. This family is the C-terminal domain specific to the GMP synthases EC:6.3.5.2. In prokaryotes this domain mediates dimerisation. Eukaryotic GMP synthases are monomers. This domain in eukaryotes includes several large insertions that may form globular domains. 41034 pfam00959: Phage lysozyme. This family includes lambda phage lysozyme and E. coli endolysin. 41035 pfam00960: Neocarzinostatin family. 41036 pfam00961: LAGLIDADG endonuclease.
41037 pfam00962: Adenosine/AMP deaminase. 41038 pfam00963: Cohesin domain. Cohesin domains interact with a complementary domain, termed the dockerin domain. The cohesin-dockerin interaction is the crucial interaction for complex formation in the cellulosome. 41039 pfam00964: Elicitin. Elicitins form a novel class of plant necrotic proteins which are secreted by Phytophthora and Pythium fungi, parasites of many economically important crops. These proteins induce leaf necrosis in infected plants and elicit an incompatible hypersensitive-like reaction, leading to the development of a systemic acquired resistance against a range of fungal and bacterial plant pathogens. 41040 pfam00965: Tissue inhibitor of metalloproteinase. Members of this family are common in extracellular regions of vertebrate species. 41041 pfam00967: Barwin family. 41042 pfam00969: Class II histocompatibility antigen, beta domain. 41043 pfam00970: Oxidoreductase FAD-binding domain. 41044 pfam00971: EIAV coat protein, gp90. Equine infectious anaemia virus (EIAV) belongs to the family Retroviridae. EIAV gp90 is hypervariable in the carboxyl-end region and more stable in the amino-end region. This variability is a pathogenicity factor that allows the evasion of the host's immune response. 41045 pfam00972: Flavivirus RNA-directed RNA polymerase. Flaviviruses produce a polyprotein from the ssRNA genome. This protein is also known as NS5. This RNA-directed RNA polymerase possesses a number of short regions and motifs homologous to other RNA-directed RNA polymerases. 41046 pfam00973: Paramyxovirus nucleocapsid protein. The nucleocapsid protein is referred to as NP. NP is the major structural component of the nucleocapsid. The protein is approx. 58 kDa. 2600 NP molecules tightly encapsidate the RNA. NP interacts with several other viral encoded proteins, all of which are involved in controlling replication (NP-NP, NP-P, NP-(PL), and NP-V). 41047 pfam00974: Rhabdovirus spike glycoprotein.
Frequently abbreviated to G protein. The glycoprotein spike is made up of a trimer of G proteins. Channel formed by glycoprotein spike is thought to function in a similar manner to Influenza virus M2 protein channel, thus allowing a signal to pass across the viral membrane to signal for viral uncoating. 41048 pfam00975: Thioesterase domain. Peptide synthetases are involved in the non-ribosomal synthesis of peptide antibiotics. Next to the operons encoding these enzymes, in almost all cases, are genes that encode proteins that have similarity to the type II fatty acid thioesterases of vertebrates. There are also modules within the peptide synthetases that also share this similarity. With respect to antibiotic production, thioesterases are required for the addition of the last amino acid to the peptide antibiotic, thereby forming a cyclic antibiotic. Thioesterases (non-integrated) have molecular masses of 25-29 kDa. 41049 pfam00976: Corticotropin ACTH domain. 41050 pfam00977: Histidine biosynthesis protein. Proteins involved in steps 4 and 6 of the histidine biosynthesis pathway are contained in this family. Histidine is formed by several complex and distinct biochemical reactions catalysed by eight enzymes. The enzymes in this Pfam entry are called His6 and His7 in eukaryotes and HisA and HisF in prokaryotes. The structure of HisA is known to be a TIM barrel fold. In some archaeal HisA proteins the TIM barrel is composed of two tandem repeats of a half barrel. This family belong to the common phosphate binding site TIM barrel family. 41051 pfam00978: RNA dependent RNA polymerase. This family may represent an RNA dependent RNA polymerase. The family also contains the following proteins: 2A protein from bromoviruses putative RNA dependent RNA polymerase from tobamoviruses Non structural polyprotein from togaviruses. 41052 pfam00979: Reovirus outer capsid protein, Sigma 3. Sigma 3 is the major outer capsid protein of reovirus. Sigma 3 is encoded by genome segment 4. 
Sigma 3 binds to double stranded RNA and associates with polypeptide u1 and its cleavage product u1C to form the outer shell of the virion. The Sigma 3 protein possesses a zinc-finger motif and an RNA-binding domain in the N and C termini respectively. This protein is also thought to play a role in pathogenesis. 41053 pfam00980: Rotavirus major capsid protein VP6. Rotaviruses consist of three concentric protein shells. The intermediate (middle) protein layer consists of 260 trimers of VP6. VP6 is the most abundant protein in the virion. VP6 is also involved in virion assembly, and possesses the ability to interact with VP2, VP4 and VP7. 41054 pfam00981: Rotavirus RNA-binding Protein 53 (NS53). This protein is also known as NSP1. NS53 is encoded by gene 5. It is made in low levels in the infected cells and is a component of early replication. The protein is known to accumulate on the cytoskeleton of the infected cell. NS53 is an RNA binding protein that contains a characteristic cysteine rich region. 41055 pfam00982: Glycosyltransferase family 20. Members of this family belong to glycosyl transferase family 20. OtsA (Trehalose-6-phosphate synthase) is homologous to regions in the subunits of yeast trehalose-6-phosphate synthase/phosphate complex. 41056 pfam00983: Tymovirus coat protein. 41057 pfam00984: UDP-glucose/GDP-mannose dehydrogenase family, central domain. The UDP-glucose/GDP-mannose dehydrogenases are a small group of enzymes which possess the ability to catalyse the NAD-dependent 2-fold oxidation of an alcohol to an acid without the release of an aldehyde intermediate. 41058 pfam00985: Merozoite Surface Antigen 2 (MSA-2) family. 41059 pfam00986: DNA gyrase B subunit, carboxyl terminus. The amino terminus of eukaryotic and prokaryotic DNA topoisomerase II are similar, but they have a different carboxyl terminus. The amino-terminal portion of the DNA gyrase B protein is thought to catalyse the ATP-dependent super-coiling of DNA. See pfam00204.
The carboxyl-terminal end supports the complexation with the DNA gyrase A protein and the ATP-independent relaxation. This family also contains Topoisomerase IV. This is a bacterial enzyme that is closely related to DNA gyrase. 41060 pfam00988: Carbamoyl-phosphate synthase small chain, CPSase domain. The carbamoyl-phosphate synthase domain is in the amino terminus of protein. Carbamoyl-phosphate synthase catalyses the ATP-dependent synthesis of carbamyl-phosphate from glutamine or ammonia and bicarbonate. This important enzyme initiates both the urea cycle and the biosynthesis of arginine and/or pyrimidines. The carbamoyl-phosphate synthase (CPS) enzyme in prokaryotes is a heterodimer of a small and large chain. The small chain promotes the hydrolysis of glutamine to ammonia, which is used by the large chain to synthesise carbamoyl phosphate. See pfam00289. The small chain has a GATase domain in the carboxyl terminus. See pfam00117. 41061 pfam00989: PAS domain. CAUTION. This family does not currently match all known examples of PAS domains. PAS motifs appear in archaea, eubacteria and eukarya. Probably the most surprising identification of a PAS domain was that in EAG-like K+-channels. 41062 pfam00990: GGDEF domain. This domain is found linked to a wide range of non-homologous domains in a variety of bacteria. The function of this domain is unknown, however it has been shown to be homologous to the adenylyl cyclase catalytic domain. This prediction correlates with the functional information available on two GGDEF-containing proteins, namely diguanylate cyclase and phosphodiesterase A of Acetobacter xylinum, both of which regulate the turnover of cyclic diguanosine monophosphate. 41063 pfam00991: ParA family ATPase. 41064 pfam00992: Troponin. Troponin (Tn) contains three subunits, Ca2+ binding (TnC), inhibitory (TnI), and tropomyosin binding (TnT). this Pfam contains members of the TnT subunit. 
Troponin is a complex of three proteins, Ca2+ binding (TnC), inhibitory (TnI), and tropomyosin binding (TnT). The troponin complex regulates Ca2+ induced muscle contraction. This family includes troponin T and troponin I. Troponin I binds to actin and troponin T binds to tropomyosin. 41065 pfam00993: Class II histocompatibility antigen, alpha domain. 41066 pfam00994: Probable molybdopterin binding domain. This domain is found in a variety of proteins involved in biosynthesis of molybdopterin cofactor. The domain is presumed to bind molybdopterin. The structure of this domain is known, and it forms an alpha/beta structure. In the known structure of Gephyrin this domain mediates trimerisation. 41067 pfam00995: Sec1 family. 41068 pfam00996: GDP dissociation inhibitor. 41069 pfam00997: Kappa casein. Kappa-casein is a mammalian milk protein involved in a number of important physiological processes. In the gut, the ingested protein is split into an insoluble peptide (para kappa-casein) and a soluble hydrophilic glycopeptide (caseinomacropeptide). Caseinomacropeptide is responsible for increased efficiency of digestion, prevention of neonate hypersensitivity to ingested proteins, and inhibition of gastric pathogens. 41070 pfam00998: Viral RNA dependent RNA polymerase. This family includes viral RNA dependent RNA polymerase enzymes from hepatitis C virus and various plant viruses. 41071 pfam00999: Sodium/hydrogen exchanger family. Na/H antiporters are key transporters in maintaining the pH of actively metabolising cells. The molecular mechanisms of antiport are unclear. These antiporters contain 10-12 transmembrane regions (M) at the amino-terminus and a large cytoplasmic region at the carboxyl terminus. The transmembrane regions M3-M12 share identity with other members of the family. The M6 and M7 regions are highly conserved. Thus, this is thought to be the region that is involved in the transport of sodium and hydrogen ions.
The cytoplasmic region has little similarity throughout the family. 41072 pfam01000: RNA polymerase Rpb3/RpoA insert domain. Members of this family include: alpha subunit from eubacteria; alpha subunits from chloroplasts; Rpb3 subunits from eukaryotes; RpoD subunits from archaea. 41073 pfam01001: Hepatitis C virus non-structural protein NS4b. No precise function has been assigned to NS4b. However, it is known that NS4b interacts with NS4a and NS3 to form a large replicase complex to direct the viral RNA replication. 41074 pfam01002: Flavivirus non-structural protein NS2B. Flaviviruses encode a single polyprotein. This is cleaved into three structural and seven non-structural proteins. All but two are cleaved by the NS2B-NS3 protease complex. 41075 pfam01003: Flavivirus capsid protein C. Flaviviruses are small enveloped viruses with virions comprised of 3 proteins called C, M and E. Multiple copies of the C protein form the nucleocapsid, which contains the ssRNA molecule. 41076 pfam01004: Flavivirus envelope glycoprotein M. Flaviviruses are small enveloped viruses with virions comprised of 3 proteins called C, M and E. The envelope glycoprotein M is made as a precursor, called prM. The precursor portion of the protein is the signal peptide for the protein's entry into the membrane. prM is cleaved to form M in a late-stage cleavage event. Associated with this cleavage is a change in the infectivity and fusion activity of the virus. 41077 pfam01005: Flavivirus non-structural protein NS2A. NS2A is a hydrophobic protein about 25 kDa in size. NS2A is cleaved from NS1 by a membrane bound host protease. NS2A has been found to associate with the dsRNA within the vesicle packages. It has also been found that NS2A associates with the known replicase components and so NS2A has been postulated to be part of this replicase complex. 41078 pfam01006: Hepatitis C virus non-structural protein NS4a.
NS4a forms an integral part of the NS3 serine protease, as it is required in a number of cases as a cofactor of cleavage. It has also been reported that NS4a interacts with NS4b and NS3 to form a multi-subunit replicase complex. 41079 pfam01007: Inward rectifier potassium channel. 41080 pfam01008: Initiation factor 2 subunit family. This family includes initiation factor 2B alpha, beta and delta subunits from eukaryotes, initiation factor 2B subunits 1 and 2 from archaebacteria and some proteins of unknown function from prokaryotes. Initiation factor 2 binds to Met-tRNA, GTP and the small ribosomal subunit. 41081 pfam01010: NADH-Ubiquinone oxidoreductase (complex I), chain 5 C-terminus. This sub-family represents a carboxyl terminal extension of pfam00361. Only NADH-Ubiquinone chain 5 from chloroplasts are in this family. This sub-family is part of complex I which catalyses the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane. 41082 pfam01011: PQQ enzyme repeat. The family represent a single repeat of a beta propeller. This propeller has been found in several enzymes which utilise pyrrolo-quinoline quinone as a prosthetic group. 41083 pfam01012: Electron transfer flavoprotein beta subunit. This protein is distantly related to and forms a heterodimer with pfam00766. 41084 pfam01014: Uricase. 41085 pfam01015: Ribosomal S3Ae family. 41086 pfam01016: Ribosomal L27 protein. 41087 pfam01017: STAT protein, all-alpha domain. STAT proteins (Signal Transducers and Activators of Transcription) are a family of transcription factors that are specifically activated to regulate gene transcription when cells encounter cytokines and growth factors. STAT proteins also include an SH2 domain pfam00017. 41088 pfam01018: GTP1/OBG family. 41089 pfam01019: Gamma-glutamyltranspeptidase. 41090 pfam01020: Ribosomal L40e family. Bovine L40 has been identified as a secondary RNA binding protein. 
L40 is fused to a ubiquitin protein. 41091 pfam01021: TYA transposon protein. Ty are yeast transposons. A 5.7kb transcript codes for p3, a fusion protein of TYA and TYB. The TYA protein is analogous to the gag protein of retroviruses. TYA is cleaved to form a 46 kDa protein which can form mature virion-like particles. 41092 pfam01022: Bacterial regulatory protein, arsR family. Members of this family contain a DNA binding 'helix-turn-helix' motif. 41093 pfam01023: S-100/ICaBP type calcium binding domain. The S-100 domain is a subfamily of the EF-hand calcium binding proteins. 41094 pfam01024: Colicin pore forming domain. 41095 pfam01025: GrpE. 41096 pfam01026: TatD related DNase. This family of proteins is related to a large superfamily of metalloenzymes. TatD, a member of this family, has been shown experimentally to be a DNase enzyme. 41097 pfam01027: Uncharacterised protein family UPF0005. 41098 pfam01028: Eukaryotic DNA topoisomerase I, catalytic core. Topoisomerase I promotes the relaxation of DNA superhelical tension by introducing a transient single-stranded break in duplex DNA and is vital for the processes of replication, transcription, and recombination. 41099 pfam01029: NusB family. The NusB protein is involved in the regulation of rRNA biosynthesis by transcriptional antitermination. 41100 pfam01030: Receptor L domain. The L domains from these receptors make up the bilobal ligand binding site. Each L domain consists of a single-stranded right hand beta-helix. This Pfam entry is missing the first 50 amino acid residues of the domain. 41101 pfam01031: Dynamin central region. This region lies between the GTPase domain, see pfam00350, and the pleckstrin homology (PH) domain, see pfam00169. 41102 pfam01032: FecCD transport family. This is a sub-family of the bacterial binding protein-dependent transport systems family. This Pfam entry contains the inner components of this multicomponent transport system. 41103 pfam01033: Somatomedin B domain. 
41104 pfam01034: Syndecan domain. Syndecans are transmembrane heparan sulfate proteoglycans which are implicated in the binding of extracellular matrix components and growth factors. 41105 pfam01035: 6-O-methylguanine DNA methyltransferase, DNA binding domain. This domain is a 3 helical bundle. 41106 pfam01036: Bacteriorhodopsin. 41107 pfam01037: AsnC family. The AsnC family is a family of similar bacterial transcription regulatory proteins. 41108 pfam01039: Carboxyl transferase domain. All of the members in this family are biotin dependent carboxylases. The carboxyl transferase domain carries out the following reaction: transcarboxylation from biotin to an acceptor molecule. There are two recognised types of carboxyl transferase. One of them uses acyl-CoA and the other uses 2-oxoacid as the acceptor molecule of carbon dioxide. All of the members in this family utilise acyl-CoA as the acceptor molecule. 41109 pfam01040: UbiA prenyltransferase family. 41110 pfam01041: DegT/DnrJ/EryC1/StrS aminotransferase family. The members of this family are probably all pyridoxal-phosphate-dependent aminotransferase enzymes with a variety of molecular functions. The family includes StsA, StsC and StsS. The aminotransferase activity was demonstrated for purified StsC protein as the L-glutamine:scyllo-inosose aminotransferase EC:2.6.1.50, which catalyses the first amino transfer in the biosynthesis of the streptidine subunit of streptomycin. 41111 pfam01042: Endoribonuclease L-PSP. Endoribonuclease active on single-stranded mRNA. Inhibits protein synthesis by cleavage of mRNA. Previously thought to inhibit protein synthesis initiation. This protein may also be involved in the regulation of purine biosynthesis. 41112 pfam01043: SecA protein, amino terminal region. SecA protein binds to the plasma membrane where it interacts with proOmpA to support translocation of proOmpA through the membrane. 
SecA protein achieves this translocation, in association with SecY protein, in an ATP dependent manner. SecA possesses the ATPase activity. The carboxyl terminus has similarity with the helicase carboxyl terminus. See pfam00281. 41113 pfam01044: Vinculin family. 41114 pfam01045: EIAV glycoprotein, gp45. Equine infectious anaemia virus (EIAV). gp45 is a transmembrane envelope glycoprotein from EIAV. 41115 pfam01047: MarR family. The Mar proteins are involved in multiple antibiotic resistance, a non-specific resistance system. The expression of the mar operon is controlled by a repressor, MarR. A large number of compounds induce transcription of the mar operon. This is thought to be due to the compound binding to MarR, and the resulting complex stops MarR binding to the DNA. With the MarR repression lost, transcription of the operon proceeds. The structure of MarR is known and shows MarR as a dimer with each subunit containing a winged-helix DNA binding motif. 41116 pfam01048: Phosphorylase family. Members of this family include: purine nucleoside phosphorylase (PNP), uridine phosphorylase (UdRPase) and 5'-methylthioadenosine phosphorylase (MTA phosphorylase). 41117 pfam01049: Cadherin cytoplasmic region. Cadherins are vital in cell-cell adhesion during tissue differentiation. Cadherins are linked to the cytoskeleton by catenins. Catenins bind to the cytoplasmic tail of the cadherin. Cadherins cluster to form foci of homophilic binding units. A key determinant of the strength of the binding mediated by cadherins is the juxtamembrane region of the cadherin. This region induces clustering and also binds to the protein p120ctn. 41118 pfam01050: Mannose-6-phosphate isomerase. All of the members of this Pfam entry belong to family 2 of the mannose-6-phosphate isomerases. The type II phosphomannose isomerases are bifunctional enzymes. This Pfam entry covers the isomerase domain. 
The guanosine diphospho-D-mannose pyrophosphorylase domain is in another Pfam entry, see pfam00483. 41119 pfam01051: Initiator Replication protein. This protein is an initiator of plasmid replication. RepB possesses nicking-closing (topoisomerase I) like activity. It is also able to perform a strand transfer reaction on ssDNA that contains its target. This family also includes RepA which is an E. coli protein involved in plasmid replication. The RepA protein binds to DNA repeats that flank the repA gene. 41120 pfam01052: Surface presentation of antigens (SPOA) protein. 41121 pfam01053: Cys/Met metabolism PLP-dependent enzyme. This family includes enzymes involved in cysteine and methionine metabolism. The following are members: Cystathionine gamma-lyase, Cystathionine gamma-synthase, Cystathionine beta-lyase, Methionine gamma-lyase, OAH/OAS sulfhydrylase, O-succinylhomoserine sulfhydrylase. All of these members participate in slightly different reactions. All these enzymes use PLP (pyridoxal-5'-phosphate) as a cofactor. 41122 pfam01054: Mouse mammary tumour virus superantigen. The mouse mammary tumour virus (MMTV) is a milk-transmitted type B retrovirus. The superantigen (SAg) is encoded by the long terminal repeat. The SAgs are also called PR73. 41123 pfam01055: Glycosyl hydrolases family 31. Glycosyl hydrolases are key enzymes of carbohydrate metabolism. Family 31 comprises enzymes that are, or are similar to, alpha-galactosidases. 41124 pfam01056: Myc amino-terminal region. The myc family belongs to the basic helix-loop-helix leucine zipper class of transcription factors, see pfam00010. Myc forms a heterodimer with Max, and this complex regulates cell growth through direct activation of genes involved in cell replication. 41125 pfam01057: Parvovirus non-structural protein NS1. This family also contains the NS2 protein. Parvoviruses encode two non-structural proteins, NS1 and NS2. 
The mRNA for NS2 contains the coding sequence for the first 87 amino acids of NS1; then, by an alternative splicing mechanism, mRNA from a different reading frame, encoding the last 78 amino acids, makes up the full length of the NS2 mRNA. NS1 is the major non-structural protein. It is essential for DNA replication. It is an 83-kDa nuclear phosphoprotein. It has DNA helicase and ATPase activity. 41126 pfam01058: NADH ubiquinone oxidoreductase, 20 Kd subunit. 41127 pfam01059: NADH-ubiquinone oxidoreductase chain 4, amino terminus. 41128 pfam01060: Transthyretin-like family. This family has weak similarity to transthyretin (formerly called prealbumin) which transports thyroid hormones. The specific function of this protein is unknown. 41129 pfam01061: ABC-2 type transporter. 41130 pfam01062: Bestrophin. Bestrophin is a 68-kDa basolateral plasma membrane protein expressed in retinal pigment epithelial cells (RPE). It is encoded by the VMD2 gene, which is mutated in Best macular dystrophy, a disease characterised by a depressed light peak in the electrooculogram. VMD2 encodes a 585-amino acid protein with an approximate mass of 68 kDa which has been designated bestrophin. Bestrophin shares homology with the Caenorhabditis elegans RFP gene family, named for the presence of a conserved arginine (R), phenylalanine (F), proline (P) amino acid sequence motif. Bestrophin is a plasma membrane protein, localised to the basolateral surface of RPE cells, consistent with a role for bestrophin in the generation or regulation of the EOG light peak. Bestrophin and other RFP family members represent a new class of chloride channels, indicating a direct role for bestrophin in generating the light peak. The VMD2 gene underlying Best disease was shown to represent the first human member of the RFP-TM protein family. More than 97% of the disease-causing mutations are located in the N-terminal RFP-TM domain, implying important functional properties. 
41131 pfam01063: Aminotransferase class IV. The D-amino acid transferases (D-AAT) are required by bacteria to catalyse the synthesis of D-glutamic acid and D-alanine, which are essential constituents of bacterial cell wall and are the building blocks for other D-amino acids. Despite the difference in the structure of the substrates, D-AATs and L-AATs have strong similarity. 41132 pfam01064: Activin types I and II receptor domain. This Pfam entry consists of both TGF-beta receptor types. This is an alignment of the hydrophilic cysteine-rich ligand-binding domains. Both receptor types (type I and II) possess a 9 amino acid cysteine box, with the consensus CCX{4-5}CN. The type I receptors also possess 7 extracellular residues preceding the cysteine box. 41133 pfam01065: Hexon, adenovirus major coat protein, N-terminal domain. Hexon is the major coat protein from adenovirus type 2. Hexon forms a homo-trimer. The 240 copies of the hexon trimer are organised so that 12 lie on each of the 20 facets. The central 9 hexons in a facet are cemented together by 12 copies of polypeptide IX. The penton complex, formed by the peripentonal hexons and base hexon (holding in place a fibre), lies at each of the 12 vertices. The N and C-terminal domains adopt the same PNGase F-like fold although they are significantly different in length. 41134 pfam01066: CDP-alcohol phosphatidyltransferase. All of these members have the ability to catalyse the displacement of CMP from a CDP-alcohol by a second alcohol with formation of a phosphodiester bond and concomitant breaking of a phosphoric anhydride bond. 41135 pfam01067: Calpain large subunit, domain III. The functions of domains III and I are currently unknown. Domain II is a cysteine protease and domain IV is a calcium binding domain. Calpains are believed to participate in intracellular signaling pathways mediated by calcium ions. 41136 pfam01068: ATP dependent DNA ligase domain. 
This domain belongs to a more diverse superfamily, including pfam01331 and pfam01653. 41137 pfam01070: FMN-dependent dehydrogenase. 41138 pfam01071: Phosphoribosylglycinamide synthetase, ATP-grasp (A) domain. Phosphoribosylglycinamide synthetase catalyses the second step in the de novo biosynthesis of purine. The reaction catalysed by Phosphoribosylglycinamide synthetase is the ATP-dependent addition of 5-phosphoribosylamine to glycine to form 5'-phosphoribosylglycinamide. This domain is related to the ATP-grasp domain of biotin carboxylase/carbamoyl phosphate synthetase (see pfam02786). 41139 pfam01073: 3-beta hydroxysteroid dehydrogenase/isomerase family. The enzyme 3 beta-hydroxysteroid dehydrogenase/5-ene-4-ene isomerase (3 beta-HSD) catalyses the oxidation and isomerisation of 5-ene-3 beta-hydroxypregnene and 5-ene-hydroxyandrostene steroid precursors into the corresponding 4-ene-ketosteroids necessary for the formation of all classes of steroid hormones. 41140 pfam01074: Glycosyl hydrolases family 38. Glycosyl hydrolases are key enzymes of carbohydrate metabolism. 41141 pfam01075: Glycosyltransferase family 9 (heptosyltransferase). Members of this family belong to glycosyltransferase family 9. Lipopolysaccharide is a major component of the outer leaflet of the outer membrane in Gram-negative bacteria. It is composed of three domains: lipid A, core oligosaccharide and the O-antigen. All of these enzymes transfer heptose to the lipopolysaccharide core. 41142 pfam01076: Plasmid recombination enzyme. With some plasmids, recombination can occur in a site specific manner that is independent of RecA. In such cases, the recombination event requires another protein called Pre. Pre is a plasmid recombination enzyme. This protein is also known as Mob (conjugative mobilisation). 41143 pfam01077: Nitrite and sulphite reductase 4Fe-4S domain. Sulphite and nitrite reductases are vital in the biosynthetic assimilation of sulphur and nitrogen, respectively. 
They are also both important for the dissimilation of oxidised anions for energy transduction. 41144 pfam01078: Magnesium chelatase, subunit ChlI. Magnesium-chelatase is a three-component enzyme that catalyses the insertion of Mg2+ into protoporphyrin IX. This is the first unique step in the synthesis of (bacterio)chlorophyll. Due to this, it is thought that Mg-chelatase has an important role in channelling intermediates into the (bacterio)chlorophyll branch in response to conditions suitable for photosynthetic growth. ChlI and BchD have molecular weights between 38 and 42 kDa. 41145 pfam01079: Hint module. This is an alignment of the Hint module in the Hedgehog proteins. It does not include any Inteins which also possess the Hint module. 41146 pfam01080: Presenilin. Mutations in presenilin-1 are a major cause of early onset Alzheimer's disease. It has been found that presenilin-1 binds to beta-catenin in vivo. This family also contains SPE proteins from C. elegans. 41147 pfam01081: KDPG and KHG aldolase. This family includes the following members: 4-hydroxy-2-oxoglutarate aldolase (KHG-aldolase) and Phospho-2-dehydro-3-deoxygluconate aldolase (KDPG-aldolase). 41148 pfam01082: Copper type II ascorbate-dependent monooxygenase, N-terminal domain. The N and C-terminal domains of members of this family adopt the same PNGase F-like fold. 41149 pfam01083: Cutinase. 41150 pfam01084: Ribosomal protein S18. 41151 pfam01085: Hedgehog amino-terminal signaling domain. For the carboxyl Hint module, see pfam01079. Hedgehog is a family of secreted signal molecules required for embryonic cell differentiation. 41152 pfam01086: Clathrin light chain. 41153 pfam01087: Galactose-1-phosphate uridyl transferase, N-terminal domain. SCOP reports fold duplication with C-terminal domain. Both involved in Zn and Fe binding. 41154 pfam01088: Ubiquitin carboxyl-terminal hydrolase, family 1. 41155 pfam01089: Delta 1-pyrroline-5-carboxylate reductase. 41156 pfam01090: Ribosomal protein S19e. 
41157 pfam01091: PTN/MK heparin-binding protein family, C-terminal domain. 41158 pfam01092: Ribosomal protein S6e. 41159 pfam01093: Clusterin. 41160 pfam01094: Receptor family ligand binding region. This family includes extracellular ligand binding domains of a wide range of receptors. This family also includes the bacterial amino acid binding proteins of known structure. 41161 pfam01095: Pectinesterase. 41162 pfam01096: Transcription factor S-II (TFIIS). 41163 pfam01097: Arthropod defensin. 41164 pfam01098: Cell cycle protein. This entry includes the following members: FtsW, RodA, SpoVE. 41165 pfam01099: Uteroglobin family. Uteroglobin is a homodimer of two identical 70 amino acid polypeptides linked by two disulphide bridges. The precise role of uteroglobin has still to be elucidated. 41166 pfam01101: HMG14 and HMG17. 41167 pfam01102: Glycophorin A. 41168 pfam01103: Surface antigen. This entry includes the following surface antigens: D15 antigen from H.influenzae, OMA87 from P.multocida, OMP85 from N.meningitidis and N.gonorrhoeae. The family also includes a number of eukaryotic proteins that are members of the UPF0140 family. There also appears to be a relationship to pfam03865 (personal obs: C Yeats). 41169 pfam01104: Bunyavirus non-structural protein NS-s. The NS-s protein is encoded by the S RNA. This segment also encodes the N protein. These two proteins are encoded by overlapping reading frames. 41170 pfam01105: emp24/gp25L/p24 family. Members of this family are implicated in bringing cargo forward from the ER and binding to coat proteins by their cytoplasmic domains. 41171 pfam01106: NifU-like domain. This is an alignment of the carboxy-terminal domain. This is the only common region between the NifU protein from nitrogen-fixing bacteria and rhodobacterial species. The biochemical function of NifU is unknown. 41172 pfam01107: Tobamovirus movement protein (MP). The movement protein (MP) of Tobamoviruses is a 30kDa protein. 
It is translated from the I2-subgenomic RNA. The MP is necessary for the initial cell-to-cell movement during the early stages of a viral infection. This movement is active, and it is known that the MP interacts with the plasmodesmata and possesses the ability to bind to RNA to achieve its role. 41173 pfam01108: Tissue factor. 41174 pfam01109: Granulocyte-macrophage colony-stimulating factor. 41175 pfam01110: Ciliary neurotrophic factor. 41176 pfam01111: Cyclin-dependent kinase regulatory subunit. 41177 pfam01112: Asparaginase. 41178 pfam01113: Dihydrodipicolinate reductase, N-terminus. Dihydrodipicolinate reductase (DapB) reduces the alpha,beta-unsaturated cyclic imine, dihydro-dipicolinate. This reaction is the second committed step in the biosynthesis of L-lysine and its precursor meso-diaminopimelate, which are critical for both protein and cell wall biosynthesis. The N-terminal domain of DapB binds the dinucleotide NADPH. 41179 pfam01114: Colipase, N-terminal domain. SCOP reports duplication of common fold with Colipase C-terminal domain. 41180 pfam01115: F-actin capping protein, beta subunit. 41181 pfam01116: Fructose-bisphosphate aldolase class-II. 41182 pfam01117: Aerolysin/Hemolysin/Leukocidin toxin. This family represents the pore forming lobe of aerolysin, and the related toxins hemolysin and the leukocidin S subunit. 41183 pfam01118: Semialdehyde dehydrogenase, NAD binding domain. This Pfam entry contains the following members: N-acetyl-glutamine semialdehyde dehydrogenase (AgrC) Aspartate-semialdehyde dehydrogenase. 41184 pfam01119: DNA mismatch repair protein, C-terminal domain. This family represents the C-terminal domain of the mutL/hexB/PMS1 family. This domain has a ribosomal S5 domain 2-like fold. 41185 pfam01120: Alpha-L-fucosidase. 41186 pfam01121: Dephospho-CoA kinase. This family catalyses the phosphorylation of the 3'-hydroxyl group of dephosphocoenzyme A to form Coenzyme A EC:2.7.1.24. This enzyme uses ATP in its reaction. 
41187 pfam01122: Eukaryotic cobalamin-binding protein. 41188 pfam01123: Staphylococcal/Streptococcal toxin, OB-fold domain. 41189 pfam01124: MAPEG family. This family has been called MAPEG (Membrane Associated Proteins in Eicosanoid and Glutathione metabolism). It includes proteins such as Prostaglandin E synthase. This enzyme catalyses the synthesis of PGE2 from PGH2 (produced by cyclooxygenase from arachidonic acid). Because of structural similarities in the active sites of FLAP, LTC4 synthase and PGE synthase, substrates for each enzyme can compete with one another and modulate synthetic activity. 41190 pfam01125: G10 protein. 41191 pfam01126: Heme oxygenase. 41192 pfam01127: Succinate dehydrogenase cytochrome b subunit. 41193 pfam01128: Uncharacterized protein family UPF0007. 41194 pfam01129: NAD:arginine ADP-ribosyltransferase. 41195 pfam01130: CD36 family. The CD36 family is thought to be a novel class of scavenger receptors. There is also evidence suggesting a possible role in signal transduction. CD36 is involved in cell adhesion. 41196 pfam01131: DNA topoisomerase. This subfamily of topoisomerase is divided on the basis that these enzymes preferentially relax negatively supercoiled DNA, form a 5' phospho-tyrosine linkage in the enzyme-DNA covalent intermediate and have high affinity for single stranded DNA. 41197 pfam01132: Elongation factor P (EF-P). 41198 pfam01133: Enhancer of rudimentary. Enhancer of rudimentary is a protein of unknown function that is highly conserved in plants and animals. This protein is found to be an enhancer of the rudimentary gene. 41199 pfam01134: Glucose inhibited division protein A. 41200 pfam01135: Protein-L-isoaspartate(D-aspartate) O-methyltransferase (PCMT). 41201 pfam01136: Peptidase family U32. 41203 pfam01138: 3' exoribonuclease family, domain 1. This family includes 3'-5' exoribonucleases. Ribonuclease PH contains a single copy of this domain, and removes nucleotide residues following the -CCA terminus of tRNA. 
Polyribonucleotide nucleotidyltransferase (PNPase) contains two tandem copies of the domain. PNPase is involved in mRNA degradation in a 3'-5' direction. The exosome is a 3'-5' exoribonuclease complex that is required for 3' processing of the 5.8S rRNA. Three of its five protein components contain a copy of this domain. A hypothetical protein from S. pombe appears to belong to an uncharacterised subfamily. This subfamily is found in both eukaryotes and archaebacteria. 41204 pfam01139: Uncharacterized protein family UPF0027. 41205 pfam01140: Matrix protein (MA), p15. The matrix protein, p15, is encoded by the gag gene. MA is involved in pathogenicity. 41206 pfam01141: Gag polyprotein, inner coat protein p12. The retroviral p12 is a virion structural protein. p12 is proline rich. The function carried out by p12 in assembly and replication is unknown. p12 is associated with pathogenicity of the virus. 41207 pfam01142: tRNA pseudouridine synthase D (TruD). TruD is responsible for synthesis of pseudouridine from uracil-13 in transfer RNAs. 41208 pfam01144: Coenzyme A transferase. 41209 pfam01145: SPFH domain / Band 7 family. This family also includes proteins with high blast scores to known Band 7 proteins: HflC from E. coli, HflK from E. coli, Prohibitin family members. 41210 pfam01146: Caveolin. 41211 pfam01147: Crustacean CHH/MIH/GIH neurohormone family. 41212 pfam01148: Cytidylyltransferase family. The members of this family are integral membrane protein cytidylyltransferases. The family includes phosphatidate cytidylyltransferase EC:2.7.7.41 as well as Sec59 from yeast. Sec59 is a dolichol kinase EC:2.7.1.108. 41213 pfam01149: Formamidopyrimidine-DNA glycosylase N-terminal domain. Formamidopyrimidine-DNA glycosylase (Fpg) is a DNA repair enzyme that excises oxidised purines from damaged DNA. This family is the N-terminal domain, which contains eight beta-strands, forming a beta-sandwich with two alpha-helices parallel to its edges. 
41214 pfam01150: GDA1/CD39 (nucleoside phosphatase) family. 41215 pfam01151: GNS1/SUR4 family. Members of this family are involved in long chain fatty acid elongation systems that produce the 26-carbon precursors for ceramide and sphingolipid synthesis. Predicted to be integral membrane proteins, in eukaryotes they are probably located on the endoplasmic reticulum. Yeast ELO3 affects plasma membrane H+-ATPase activity, and may act on a glucose-signaling pathway that controls the expression of several genes that are transcriptionally regulated by glucose such as PMA1. 41216 pfam01152: Bacterial-like globin. This family of heme binding proteins is found mainly in bacteria. However, they can also be found in some protozoa and plants. 41217 pfam01153: Glypican. 41218 pfam01154: Hydroxymethylglutaryl-coenzyme A synthase. 41219 pfam01155: Hydrogenase expression/synthesis hypA family. Four conserved cysteines lie either side of the least conserved region. 41220 pfam01156: Inosine-uridine preferring nucleoside hydrolase. 41221 pfam01157: Ribosomal protein L21e. 41222 pfam01158: Ribosomal protein L36e. 41223 pfam01159: Ribosomal protein L6e. 41224 pfam01160: Vertebrate endogenous opioids neuropeptide. 41225 pfam01161: Phosphatidylethanolamine-binding protein. 41226 pfam01162: PET112 family, C terminal region. 41227 pfam01163: RIO1 family. This family of proteins is related to eukaryotic type protein kinases. 41228 pfam01165: Ribosomal protein S21. 41229 pfam01166: TSC-22/dip/bun family. 41230 pfam01167: Tub family. 41231 pfam01168: Alanine racemase, N-terminal domain. 41232 pfam01169: Uncharacterized protein family UPF0016. This family contains integral membrane proteins of unknown function. Most members of the family contain two copies of a region that contains an EXGD motif. Each of these regions contains three predicted transmembrane regions. 41233 pfam01170: Putative RNA methylase family UPF0020. This domain is probably a methylase. 
It is associated with the THUMP domain that also occurs with RNA modification domains. 41234 pfam01171: PP-loop family. This family of proteins belongs to the PP-loop superfamily. 41235 pfam01172: Uncharacterized protein family UPF0023. This family is highly conserved in species ranging from archaea to vertebrates and plants. The family contains several Shwachman-Bodian-Diamond syndrome (SBDS) proteins from both mouse and humans. Shwachman-Diamond syndrome is an autosomal recessive disorder with clinical features that include pancreatic exocrine insufficiency, haematological dysfunction and skeletal abnormalities. It has been speculated that this family may function in RNA metabolism although there is no evidence to support this theory. 41236 pfam01174: SNO glutamine amidotransferase family. This family and its amidotransferase domain was first described in. It is predicted that members of this family are involved in the pyridoxine biosynthetic pathway, based on the proximity and co-regulation of the corresponding genes and physical interaction between the members of pfam01174 and pfam01680. 41237 pfam01175: Urocanase. 41238 pfam01176: Eukaryotic initiation factor 1A. 41239 pfam01177: Asp/Glu/Hydantoin racemase. This family contains aspartate racemase, glutamate racemase, hydantoin racemase and arylmalonate decarboxylase. 41240 pfam01179: Copper amine oxidase, enzyme domain. Copper amine oxidases are a ubiquitous and novel group of quinoenzymes that catalyse the oxidative deamination of primary amines to the corresponding aldehydes, with concomitant reduction of molecular oxygen to hydrogen peroxide. The enzymes are dimers of identical 70-90 kDa subunits, each of which contains a single copper ion and a covalently bound cofactor formed by the post-translational modification of a tyrosine side chain to 2,4,5-trihydroxyphenylalanine quinone (TPQ). This family corresponds to the catalytic domain of the enzyme. 41241 pfam01180: Dihydroorotate dehydrogenase. 
41242 pfam01182: Glucosamine-6-phosphate isomerases/6-phosphogluconolactonase. 41243 pfam01183: Glycosyl hydrolases family 25. 41244 pfam01184: GPR1/FUN34/yaaH family. 41245 pfam01185: Fungal hydrophobin. 41246 pfam01186: Lysyl oxidase. 41247 pfam01187: Macrophage migration inhibitory factor (MIF). 41248 pfam01188: Mandelate racemase / muconate lactonizing enzyme, C-terminal domain. C-terminal domain is TIM barrel fold, dehydratase-like domain. Manganese is associated with this domain. 41249 pfam01189: NOL1/NOP2/sun family. 41250 pfam01190: Pollen proteins Ole e I family. 41251 pfam01191: RNA polymerase Rpb5, C-terminal domain. The assembly domain of Rpb5. The archaeal equivalent to this domain is subunit H. Subunit H lacks the N-terminal domain. 41252 pfam01192: RNA polymerase Rpb6. Rpb6 is an essential subunit in the eukaryotic polymerases Pol I, II and III. The bacterial equivalent to Rpb6 is the omega subunit. Rpb6 and omega are structurally conserved and both function in polymerase assembly. 41254 pfam01194: RNA polymerases N / 8 kDa subunit. 41255 pfam01195: Peptidyl-tRNA hydrolase. 41256 pfam01196: Ribosomal protein L17. 41257 pfam01197: Ribosomal protein L31. 41258 pfam01198: Ribosomal protein L31e. 41259 pfam01199: Ribosomal protein L34e. 41260 pfam01200: Ribosomal protein S28e. 41261 pfam01201: Ribosomal protein S8e. 41262 pfam01202: Shikimate kinase. 41263 pfam01203: Bacterial type II secretion system protein N. 41264 pfam01204: Trehalase. 41265 pfam01205: Uncharacterized protein family UPF0029. 41266 pfam01206: Uncharacterized protein family UPF0033. 41267 pfam01207: Dihydrouridine synthase (Dus). Members of this family catalyse the reduction of the 5,6-double bond of a uridine residue on tRNA. Dihydrouridine modification of tRNA is widely observed in prokaryotes and eukaryotes, and also in some archaea. Most dihydrouridines are found in the D loop of tRNAs. 
The role of dihydrouridine in tRNA is currently unknown, but may increase conformational flexibility of the tRNA. It is likely that different family members have different substrate specificities, which may overlap. Dus 1 from Saccharomyces cerevisiae acts on pre-tRNA-Phe, while Dus 2 acts on pre-tRNA-Tyr and pre-tRNA-Leu. Dus 1 is active as a single subunit, requiring NADPH or NADH, and is stimulated by the presence of FAD. Some family members may be targeted to the mitochondria and even have a role in mitochondria. 41268 pfam01208: Uroporphyrinogen decarboxylase (URO-D). 41269 pfam01209: ubiE/COQ5 methyltransferase family. 41270 pfam01210: NAD-dependent glycerol-3-phosphate dehydrogenase. 41271 pfam01211: BTG1 family. A novel family of anti-proliferative proteins. 41272 pfam01212: Beta-eliminating lyase. 41273 pfam01213: CAP protein. 41274 pfam01214: Casein kinase II regulatory subunit. 41275 pfam01215: Cytochrome c oxidase subunit Vb. 41276 pfam01216: Calsequestrin. 41277 pfam01217: Clathrin adaptor complex small chain. 41278 pfam01218: Coproporphyrinogen III oxidase. 41279 pfam01219: Prokaryotic diacylglycerol kinase. 41280 pfam01220: Dehydroquinase class II. 41281 pfam01221: Dynein light chain type 1. 41282 pfam01222: Ergosterol biosynthesis ERG4/ERG24 family. 41283 pfam01223: DNA/RNA non-specific endonuclease. 41284 pfam01225: Mur ligase family, catalytic domain. This family contains a number of related ligase enzymes which have EC numbers 6.3.2.*. This family includes: MurC, MurD, MurE, MurF, Mpl and FolC. MurC, MurD, MurE and MurF catalyse consecutive steps in the synthesis of peptidoglycan. Peptidoglycan consists of a sheet of two sugar derivatives, with one of these N-acetylmuramic acid attaching to a small pentapeptide. The pentapeptide is made of L-alanine, D-glutamic acid, meso-diaminopimelic acid and D-alanyl alanine. The peptide moiety is synthesised by successively adding these amino acids to UDP-N-acetylmuramic acid. 
MurC transfers the L-alanine, MurD transfers the D-glutamate, MurE transfers the diaminopimelic acid, and MurF transfers the D-alanyl alanine. This family also includes Folylpolyglutamate synthase that transfers glutamate to folylpolyglutamate. 41285 pfam01226: Formate/nitrite transporter. 41286 pfam01227: GTP cyclohydrolase I. This family includes GTP cyclohydrolase enzymes and a family of related bacterial proteins. 41287 pfam01228: Glycine radical. 41288 pfam01229: Glycosyl hydrolases family 39. 41289 pfam01230: HIT domain. 41290 pfam01231: Indoleamine 2,3-dioxygenase. 41291 pfam01232: Mannitol dehydrogenase. 41292 pfam01233: Myristoyl-CoA:protein N-myristoyltransferase, N-terminal domain. The N and C-terminal domains of NMT are structurally similar, each adopting an acyl-CoA N-acyltransferase-like fold. 41293 pfam01234: NNMT/PNMT/TEMT family. 41294 pfam01235: Sodium:alanine symporter family. 41295 pfam01237: Oxysterol-binding protein. 41296 pfam01238: Phosphomannose isomerase type I. 41297 pfam01239: Protein prenyltransferase alpha subunit repeat. 41298 pfam01241: Photosystem I psaG / psaK. 41299 pfam01242: 6-pyruvoyl tetrahydropterin synthase. 6-Pyruvoyl tetrahydrobiopterin synthase catalyses the conversion of dihydroneopterin triphosphate to 6-pyruvoyl tetrahydropterin, the second of three enzymatic steps in the synthesis of tetrahydrobiopterin from GTP. The functional enzyme is a hexamer of identical subunits. 41300 pfam01243: Pyridoxamine 5'-phosphate oxidase. 41301 pfam01244: Renal dipeptidase. 41302 pfam01245: Ribosomal protein L19. 41303 pfam01246: Ribosomal protein L24e. 41304 pfam01247: Ribosomal protein L35Ae. 41305 pfam01248: Ribosomal protein L7Ae/L30e/S12e/Gadd45 family. This family includes: Ribosomal L7A from metazoa, Ribosomal L8-A and L8-B from fungi, 30S ribosomal protein HS6 from archaebacteria, 40S ribosomal protein S12 from eukaryotes, Ribosomal protein L30 from eukaryotes and archaebacteria. Gadd45 and MyD118. 
41306 pfam01249: Ribosomal protein S21e. 41307 pfam01250: Ribosomal protein S6. 41308 pfam01251: Ribosomal protein S7e. 41309 pfam01252: Signal peptidase (SPase) II. 41310 pfam01253: Translation initiation factor SUI1. 41311 pfam01254: Nuclear transition protein 2. 41312 pfam01255: Putative undecaprenyl diphosphate synthase. Previously known as uncharacterized protein family UPF0015, a single member of this family has been identified as an undecaprenyl diphosphate synthase. 41313 pfam01256: Carbohydrate kinase. This family is related to pfam02110 and pfam00294 implying that it also is a carbohydrate kinase. (personal obs Yeats C).. 41314 pfam01257: Respiratory-chain NADH dehydrogenase 24 Kd subunit. 41315 pfam01258: Prokaryotic dksA/traR C4-type zinc finger. 41316 pfam01259: SAICAR synthetase. Also known as Phosphoribosylaminoimidazole-succinocarboxamide synthase. 41317 pfam01261: AP endonuclease family 2. 41318 pfam01262: Alanine dehydrogenase/PNT, C-terminal domain. This family now also contains the lysine 2-oxoglutarate reductases. 41319 pfam01263: Aldose 1-epimerase. 41320 pfam01264: Chorismate synthase. 41321 pfam01265: Cytochrome c/c1 heme lyase. 41322 pfam01266: FAD dependent oxidoreductase. This family includes various FAD dependent oxidoreductases: Glycerol-3-phosphate dehydrogenase EC:1.1.99.5, Sarcosine oxidase beta subunit EC:1.5.3.1, D-alanine oxidase EC:1.4.99.1, D-aspartate oxidase EC:1.4.3.1. 41323 pfam01267: F-actin capping protein alpha subunit. 41324 pfam01268: Formate--tetrahydrofolate ligase. 41325 pfam01269: Fibrillarin. 41326 pfam01270: Glycosyl hydrolases family 8. 41327 pfam01271: Granin (chromogranin or secretogranin).. 41328 pfam01272: Prokaryotic transcription elongation factor, GreA/GreB, C-terminal domain. This domain has a FKBP-like fold. 41329 pfam01273: LBP / BPI / CETP family, N-terminal domain. The N and C terminal domains of the LBP/BPI/CETP family are structurally similar. 41330 pfam01274: Malate synthase. 
41331 pfam01275: Myelin proteolipid protein (PLP or lipophilin).. 41332 pfam01276: Orn/Lys/Arg decarboxylase, major domain. 41333 pfam01277: Oleosin. 41334 pfam01278: Omptin family. The omptin family is a family of serine proteases. 41335 pfam01279: Parathyroid hormone family. 41336 pfam01280: Ribosomal protein L19e. 41337 pfam01281: Ribosomal protein L9, N-terminal domain. 41338 pfam01282: Ribosomal protein S24e. 41339 pfam01283: Ribosomal protein S26e. 41340 pfam01284: Membrane-associating domain. MARVEL domain-containing proteins are often found in lipid-associating proteins - such as Occludin and MAL family proteins. It may be part of the machinery of membrane apposition events, such as transport vesicle biogenesis. 41341 pfam01285: TEA/ATTS domain family. 41342 pfam01286: XPA protein N-terminal. 41343 pfam01287: Eukaryotic initiation factor 5A hypusine, DNA-binding OB fold. 41344 pfam01288: 7,8-dihydro-6-hydroxymethylpterin-pyrophosphokinase (HPPK).. 41345 pfam01289: Thiol-activated cytolysin. 41346 pfam01290: Thymosin beta-4 family. 41347 pfam01291: LIF / OSM family. 41348 pfam01292: Cytochrome b561 family. This family includes cytochrome b561 and related proteins as well as the nickel-dependent hydrogenases b-type cytochrome subunit. 41349 pfam01293: Phosphoenolpyruvate carboxykinase. 41350 pfam01294: Ribosomal protein L13e. 41351 pfam01295: Adenylate cyclase, class-I. 41352 pfam01296: Galanin. 41353 pfam01297: Periplasmic solute binding protein family. This family includes periplasmic solute binding proteins such as TroA that interacts with an ATP-binding cassette transport system in Treponema pallidum. 41354 pfam01298: Transferrin binding protein-like solute binding protein. This family of proteins are distantly related to other families of solute binding proteins. 41355 pfam01299: Lysosome-associated membrane glycoprotein (Lamp).. 41356 pfam01300: yrdC domain. This domain has been shown to preferentially bind to dsRNA. 
The domain is found in SUA5 as well as HypF and YrdC. 41357 pfam01301: Glycosyl hydrolases family 35. 41358 pfam01302: CAP-Gly domain. Cytoskeleton-associated proteins (CAPs) are involved in the organisation of microtubules and transportation of vesicles and organelles along the cytoskeletal network. A conserved motif, CAP-Gly, has been identified in a number of CAPs, including CLIP-170 and dynactins. The crystal structure of Caenorhabditis elegans F53F4.3 protein CAP-Gly domain was recently solved. The domain contains three beta-strands. The most conserved sequence, GKNDG, is located in two consecutive sharp turns on the surface, forming the entrance to a groove. 41359 pfam01303: Egg lysin (Sperm-lysin). Egg lysin creates a hole in the envelope of the egg thereby allowing the sperm to pass through the envelope and fuse with the egg. 41360 pfam01304: Gas vesicles protein GVPc repeated domain. 41361 pfam01305: Ribosomal protein L15 amino terminal region. This family is always associated with pfam00256. This family is diagnostic of ribosomal L15 proteins. 41362 pfam01306: LacY proton/sugar symporter. This family is closely related to the sugar transporter family. 41363 pfam01307: Plant virus coat protein. This membrane/coat protein is found in a number of different virus families. Its function is not known. 41364 pfam01308: Chlamydia major outer membrane protein. The major outer membrane protein of Chlamydia contains four symmetrically spaced variable domains (VDs I to IV). This protein is believed to be integral to pathogenesis, possibly in adhesion. Along with the lipopolysaccharide, the major outer membrane protein (MOMP) makes up the surface of the elementary body cell. The MOMP is the protein used to determine the different serotypes. 41365 pfam01309: Equine arteritis virus small envelope glycoprotein. Equine arteritis virus small envelope glycoprotein (Gs) is a class I transmembrane protein which adopts a number of different conformations. 
41366 pfam01310: Adenovirus hexon associated protein, protein VIII. See pfam01065. This family represents Hexon. 41367 pfam01311: Bacterial export proteins, family 1. This family includes the following members: FliR, MopE, SsaT, YopT, Hrp, HrcT and SpaR. All of these members export proteins that do not possess signal peptides through the membrane. Although the proteins that these exporters move may be different, the exporters are thought to function in similar ways. 41368 pfam01312: FlhB HrpN YscU SpaS Family. This family includes the following members: FlhB, HrpN, YscU, SpaS, HrcU, SsaU and YopU. All of these proteins export peptides using the type III secretion system. The peptides exported are quite diverse. 41369 pfam01313: Bacterial export proteins, family 3. This family includes the following members: FliQ, MopD, HrcS, Hrp, YopS and SpaQ. All of these members export proteins that do not possess signal peptides through the membrane. Although the proteins that these exporters move may be different, the exporters are thought to function in similar ways. 41370 pfam01314: Aldehyde ferredoxin oxidoreductase, domains 2 & 3. Aldehyde ferredoxin oxidoreductase (AOR) catalyses the reversible oxidation of aldehydes to their corresponding carboxylic acids with the accompanying reduction of the redox protein ferredoxin. This family is composed of two structural domains that bind the tungsten cofactor via DXXGL(C/D) motifs. In addition to maintaining specific binding interactions with the cofactor, another role for domains 2 and 3 may be to regulate substrate access to AOR. 41371 pfam01315: Aldehyde oxidase and xanthine dehydrogenase, a/b hammerhead domain. 41372 pfam01316: Arginine repressor, DNA binding domain. 41373 pfam01318: Bromovirus coat protein. 41374 pfam01320: Colicin immunity protein / pyocin immunity protein. 41375 pfam01321: Creatinase. 41376 pfam01322: Cytochrome C'. 41378 pfam01324: Diphtheria toxin, R domain. 
C-terminal receptor binding (R) domain - binds to cell surface receptor, permitting the toxin to enter the cell by receptor mediated endocytosis. 41379 pfam01325: Iron dependent repressor, N-terminal DNA binding domain. This family includes the Diphtheria toxin repressor. DNA binding is through a helix-turn-helix motif. 41380 pfam01326: Pyruvate phosphate dikinase, PEP/pyruvate binding domain. This enzyme catalyses the reversible conversion of ATP to AMP, pyrophosphate and phosphoenolpyruvate (PEP).. 41381 pfam01327: Polypeptide deformylase. 41382 pfam01328: Peroxidase, family 2. The peroxidases in this family do not have similarity to other peroxidases. 41383 pfam01329: Pterin 4 alpha carbinolamine dehydratase. Pterin 4 alpha carbinolamine dehydratase is also known as DCoH (dimerisation cofactor of hepatocyte nuclear factor 1-alpha).. 41384 pfam01330: RuvA N terminal domain. The N terminal domain of RuvA has an OB-fold structure. This domain forms the RuvA tetramer contacts. 41385 pfam01331: mRNA capping enzyme, catalytic domain. This family represents the ATP binding catalytic domain of the mRNA capping enzyme. 41386 pfam01333: Apocytochrome F, C-terminal. This is a sub-family of cytochrome C. See pfam00034. 41387 pfam01335: Death effector domain. 41388 pfam01336: OB-fold nucleic acid binding domain. This family contains OB-fold domains that bind to nucleic acids. The family includes the anti-codon binding domain of lysyl, aspartyl, and asparaginyl -tRNA synthetases (See pfam00152). Aminoacyl-tRNA synthetases catalyse the addition of an amino acid to the appropriate tRNA molecule EC:6.1.1.-. This family also includes part of RecG helicase involved in DNA repair. Replication factor A is a heterotrimeric complex, that contains a subunit in this family. This domain is also found at the C-terminus of bacterial DNA polymerase III alpha chain. 41389 pfam01337: Barstar (barnase inhibitor).. 41390 pfam01338: Bacillus thuringiensis toxin. 
41391 pfam01339: CheB methylesterase. 41392 pfam01340: Met Apo-repressor, MetJ. 41393 pfam01341: Glycosyl hydrolases family 6. 41394 pfam01342: SAND domain. The DNA binding activity of two proteins has been mapped to the SAND domain. The conserved KDWK motif is necessary for DNA binding, and it appears to be important for dimerisation. 41395 pfam01343: Peptidase family S49. 41396 pfam01344: Kelch motif. The kelch motif was initially discovered in Kelch. In this protein there are six copies of the motif. It has been shown for one member that it is related to Galactose Oxidase for which a structure has been solved. The kelch motif forms a beta sheet. Several of these sheets associate to form a beta propeller structure as found in pfam00064, pfam00400 and pfam00415. 41397 pfam01345: Domain of unknown function DUF11. A domain of unknown function found in multiple copies in several archaebacterial proteins. 41398 pfam01346: Domain amino terminal to FKBP-type peptidyl-prolyl isomerase. This family is only found at the amino terminus of pfam00254. This domain is of unknown function. 41399 pfam01347: Lipoprotein amino terminal region. This family contains regions from: Vitellogenin, Microsomal triglyceride transfer protein and apolipoprotein B-100. These proteins are all involved in lipid transport. This family contains the LV1n chain from lipovitellin, that contains two structural domains. 41400 pfam01348: Type II intron maturase. Group II introns use intron-encoded reverse transcriptase, maturase and DNA endonuclease activities for site-specific insertion into DNA. Although this type of intron is self splicing in vitro they require a maturase protein for splicing in vivo. It has been shown that a specific region of the aI2 intron is needed for the maturase function. This region was found to be conserved in group II introns and called domain X. 41401 pfam01349: Flavivirus non-structural protein NS4B. Flaviviruses encode a single polyprotein. 
This is cleaved into three structural and seven non-structural proteins. The NS4B protein is small and poorly conserved among the Flaviviruses. NS4B contains multiple hydrophobic potential membrane spanning regions. NS4B may form membrane components of the viral replication complex and could be involved in membrane localisation of NS3 and pfam00972. 41402 pfam01350: Flavivirus non-structural protein NS4A. Flaviviruses encode a single polyprotein. This is cleaved into three structural and seven non-structural proteins. The NS4A protein is small and poorly conserved among the Flaviviruses. NS4A contains multiple hydrophobic potential membrane spanning regions. NS4A has only been found in cells infected by Kunjin virus. 41403 pfam01351: Ribonuclease HII. 41404 pfam01352: KRAB box. The KRAB domain (or Kruppel-associated box) is present in about a third of zinc finger proteins containing C2H2 fingers. The KRAB domain is found to be involved in protein-protein interactions. The KRAB domain is generally encoded by two exons. The regions coded by the two exons are known as KRAB-A and KRAB-B. 41405 pfam01353: Green fluorescent protein. 41406 pfam01354: Antifreeze-like domain. This family contains type III antifreeze proteins as well as a variety of enzymes. This domain is presumed to be involved in sugar binding in the enzyme proteins. 41407 pfam01355: High potential iron-sulfur protein. 41408 pfam01356: Alpha amylase inhibitor. 41409 pfam01357: Pollen allergen. This family contains allergens lol PI, PII and PIII from Lolium perenne. 41410 pfam01358: Poly A polymerase regulatory subunit. 41411 pfam01359: Transposase. This family includes the mariner transposase. 41412 pfam01360: Monooxygenase. This family includes diverse enzymes that utilise FAD. 41413 pfam01361: Tautomerase enzyme. This family includes the enzyme 4-oxalocrotonate tautomerase that catalyses the ketonisation of 2-hydroxymuconate to 2-oxo-3-hexenedioate. 41414 pfam01363: FYVE zinc finger. 
The FYVE zinc finger is named after four proteins that it has been found in: Fab1, YOTB/ZK632.12, Vac1, and EEA1. The FYVE finger has been shown to bind two Zn++ ions. The FYVE finger has eight potential zinc coordinating cysteine positions. Many members of this family also include two histidines in a motif R+HHC+XCG, where + represents a charged residue and X any residue. We have included members which do not conserve these histidine residues but are clearly related. 41415 pfam01364: Peptidase family C25. 41416 pfam01365: RIH domain. The RIH (RyR and IP3R Homology) domain is an extracellular domain from two types of calcium channels. This region is found in the ryanodine receptor and the inositol-1,4,5-trisphosphate receptor. This domain may form a binding site for IP3. 41417 pfam01366: Herpesvirus processing and transport protein. The members of this family are associated with capsid intermediates during packaging of the virus. 41418 pfam01367: 5'-3' exonuclease, C-terminal SAM fold. 41419 pfam01368: DHH family. It is predicted that this family of proteins all perform a phosphoesterase function. It includes the single stranded DNA exonuclease RecJ. 41420 pfam01369: Sec7 domain. The Sec7 domain is a guanine-nucleotide-exchange-factor (GEF) for the pfam00025 family. 41421 pfam01370: NAD dependent epimerase/dehydratase family. This family of proteins utilise NAD as a cofactor. The proteins in this family use nucleotide-sugar substrates for a variety of chemical reactions. 41422 pfam01371: Trp repressor protein. This protein binds to tryptophan and represses transcription of the Trp operon. 41423 pfam01372: Melittin. 41424 pfam01373: Glycosyl hydrolase family 14. Members of this family are beta amylases. 41425 pfam01374: Glycosyl hydrolase family 46. Members of this family are chitosanase enzymes. 41426 pfam01375: Heat-labile enterotoxin alpha chain. 41427 pfam01376: Heat-labile enterotoxin beta chain. 41428 pfam01378: B domain. 
This domain is found as a tandem repeat in Streptococcal cell surface proteins, such as the IgG binding protein G. 41429 pfam01379: Porphobilinogen deaminase, dipyromethane cofactor binding domain. 41430 pfam01380: SIS domain. SIS (Sugar ISomerase) domains are found in many phosphosugar isomerases and phosphosugar binding proteins. SIS domains are also found in proteins that regulate the expression of genes involved in synthesis of phosphosugars. Presumably the SIS domains bind to the end-product of the pathway. 41431 pfam01381: Helix-turn-helix. This large family of DNA binding helix-turn-helix proteins includes Cro and CI. 41432 pfam01382: Avidin family. 41433 pfam01383: CpcD/allophycocyanin linker domain. 41434 pfam01384: Phosphate transporter family. This family includes PHO-4 from Neurospora crassa, which is a Na(+)-phosphate symporter. This family also contains the leukaemia virus receptor. 41435 pfam01385: Probable transposase. This family includes IS891, IS1136 and IS1341. 41436 pfam01386: Ribosomal L25p family. Ribosomal protein L25 is an RNA binding protein that binds 5S rRNA. This family includes Ctc from B. subtilis, which is induced by stress. 41437 pfam01387: Synuclein. There are three types of synucleins in humans, called alpha, beta and gamma. Alpha synuclein has been found mutated in families with autosomal dominant Parkinson's disease. A peptide of alpha synuclein has also been found in amyloid plaques in Alzheimer's patients. 41438 pfam01388: ARID/BRIGHT DNA binding domain. This domain is known as ARID for AT-Rich Interaction Domain, and also known as the BRIGHT domain. 41439 pfam01389: OmpA-like transmembrane domain. The structure of OmpA transmembrane domain shows that it consists of an eight stranded beta barrel. This family includes some other distantly related outer membrane proteins with low scores. 41440 pfam01390: SEA domain. Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA). 
Proposed function of regulating or binding carbohydrate side chains. 41441 pfam01391: Collagen triple helix repeat (20 copies). Members of this family belong to the collagen superfamily. Collagens are generally extracellular structural proteins involved in formation of connective tissue structure. The alignment contains 20 copies of the G-X-Y repeat that forms a triple helix. The first position of the repeat is glycine, the second and third positions can be any residue but are frequently proline and hydroxyproline. Collagens are post translationally modified by proline hydroxylase to form the hydroxyproline residues. Defective hydroxylation is the cause of scurvy. Some members of the collagen superfamily are not involved in connective tissue structure but share the same triple helical structure. 41442 pfam01392: Fz domain. Also known as the CRD (cysteine rich domain), the C6 box in MuSK receptor. This domain of unknown function has been independently identified by several groups. The domain contains 10 conserved cysteines. 41443 pfam01393: Chromo shadow domain. This domain is distantly related to pfam00385. This domain is always found in association with a chromo domain. 41444 pfam01394: Clathrin propeller repeat. Clathrin is the scaffold protein of the basket-like coat that surrounds coated vesicles. The soluble assembly unit, a triskelion, contains three heavy chains and three light chains in an extended three-legged structure. Each leg contains one heavy and one light chain. The N-terminus of the heavy chain is known as the globular domain, and is composed of seven repeats which form a beta propeller. 41445 pfam01395: PBP/GOBP family. The olfactory receptors of terrestrial animals exist in an aqueous environment, yet detect odorants that are primarily hydrophobic. The aqueous solubility of hydrophobic odorants is thought to be greatly enhanced via odorant binding proteins which exist in the extracellular fluid surrounding the odorant receptors. 
This family is composed of pheromone binding proteins (PBP), which are male-specific and associate with pheromone-sensitive neurons and general-odorant binding proteins (GOBP). 41446 pfam01396: Topoisomerase DNA binding C4 zinc finger. 41447 pfam01397: Terpene synthase, N-terminal domain. It has been suggested that this gene family be designated tps (for terpene synthase). It has been split into six subgroups on the basis of phylogeny, called tpsa-tpsf. tpsa includes vetispiridiene synthase, 5-epi-aristolochene synthase, and (+)-delta-cadinene synthase. tpsb includes (-)-limonene synthase. tpsc includes kaurene synthase A. tpsd includes taxadiene synthase, pinene synthase, and myrcene synthase. tpse includes kaurene synthase B. tpsf includes linalool synthase. 41448 pfam01398: Mov34/MPN/PAD-1 family. Members of this family are found in proteasome regulatory subunits, eukaryotic initiation factor 3 (eIF3) subunits and regulators of transcription factors. This family is also known as the MPN domain and PAD-1-like domain. It has been shown that this domain occurs in prokaryotes. 41449 pfam01399: PCI domain. This domain has also been called the PINT motif (Proteasome, Int-6, Nip-1 and TRIP-15). 41450 pfam01400: Astacin (Peptidase family M12A). The members of this family are enzymes that cleave peptides. These proteases require zinc for catalysis. Members of this family contain two conserved disulphide bridges, which are joined 1-4 and 2-3. Members of this family have an amino terminal propeptide which is cleaved to give the active protease domain. All other linked domains are found to the carboxyl terminus of this domain. This family includes: astacin, a digestive enzyme from crayfish; meprin, a multiple domain membrane component that is constructed from a homologous alpha and beta chain; proteins involved in morphogenesis, and Tolloid from Drosophila. 41451 pfam01401: Angiotensin-converting enzyme. 
Members of this family are dipeptidyl carboxydipeptidases (cleave carboxyl dipeptides) and most notably convert angiotensin I to angiotensin II. Many members of this family contain a tandem duplication of the 600 amino acid peptidase domain, both of these are catalytically active. Most members are secreted membrane bound ectoenzymes. 41452 pfam01402: Ribbon-helix-helix protein, copG family. The structure of this protein repressor, which is the shortest reported to date and the first isolated from a plasmid, has a homodimeric ribbon-helix-helix arrangement. The helix-turn-helix-like structure is involved in dimerisation and not DNA binding as might have been expected. 41453 pfam01403: Sema domain. The Sema domain occurs in semaphorins, which are a large family of secreted and transmembrane proteins, some of which function as repellent signals during axon guidance. Sema domains also occur in the hepatocyte growth factor receptor. 41454 pfam01404: Ephrin receptor ligand binding domain. The Eph receptors, which bind to ephrins pfam00812 are a large family of receptor tyrosine kinases. This family represents the amino terminal domain which binds the ephrin ligand. 41455 pfam01405: Photosystem II reaction centre T protein. The exact function of this protein is unknown. It probably consists of a single transmembrane spanning helix. The protein from Chlamydomonas reinhardtii, appears to be (i) a novel photosystem II subunit and (ii) required for maintaining optimal photosystem II activity under adverse growth conditions. 41456 pfam01406: tRNA synthetases class I (C) catalytic domain. This family includes only cysteinyl tRNA synthetases. 41457 pfam01407: Geminivirus AL3 protein. Geminiviruses are small, ssDNA-containing plant viruses. Geminiviruses contain three ORFs (designated AL1, AL2, and AL3) that overlap and are specified by multiple polycistronic mRNAs. 
The AL3 protein comprises approximately 0.05% of the cellular proteins and is present in the soluble and organelle fractions. AL3 may form oligomers. Immunoprecipitation of AL3 from extracts of a baculovirus expression system expressing both AL1 (pfam00799) and AL3 showed that the two proteins also complex with each other. The AL3 protein is involved in viral replication. 41458 pfam01408: Oxidoreductase family, NAD-binding Rossmann fold. This family of enzymes utilise NADP or NAD. This family is called the GFO/IDH/MOCA family. 41459 pfam01409: tRNA synthetases class II core domain (F). Other tRNA synthetase sub-families are too dissimilar to be included. This family includes only phenylalanyl-tRNA synthetases. This is the core catalytic domain. 41460 pfam01410: Fibrillar collagen C-terminal domain. Found at C-termini of fibrillar collagens: Ephydatia muelleri procollagen EMF1 alpha, vertebrate collagens alpha(1)III, alpha(1)II, alpha(2)V etc. 41461 pfam01411: tRNA synthetases class II (A). Other tRNA synthetase sub-families are too dissimilar to be included. This family includes only alanyl-tRNA synthetases. 41462 pfam01412: Putative GTPase activating protein for Arf. Putative zinc fingers with GTPase activating proteins (GAPs) towards the small GTPase, Arf. The GAP of ARD1 stimulates GTPase hydrolysis for ARD1 but not ARFs. 41463 pfam01413: C-terminal tandem repeated domain in type 4 procollagen. Duplicated domain in C-terminus of type 4 collagens. Mutations in alpha-5 collagen IV are associated with X-linked Alport syndrome. 41464 pfam01414: Delta serrate ligand. 41465 pfam01415: Interleukin 7/9 family. IL-7 is a cytokine that acts as a growth factor for early lymphoid cells of both B- and T-cell lineages. IL-9 is a multi-functional cytokine that was originally described as a T-cell growth factor, although its function in T-cell response remains unclear. 41466 pfam01416: tRNA pseudouridine synthase. 
Involved in the formation of pseudouridine at the anticodon stem and loop of transfer-RNAs. Pseudouridine is an isomer of uridine (5-(beta-D-ribofuranosyl) uracil), and is the most abundant modified nucleoside found in all cellular RNAs. The TruA-like proteins also exhibit a conserved sequence with a strictly conserved aspartic acid, likely involved in catalysis. 41467 pfam01417: ENTH domain. The ENTH (Epsin N-terminal homology) domain is found in proteins involved in endocytosis and cytoskeletal machinery. The function of the ENTH domain is unknown. 41468 pfam01418: Helix-turn-helix domain, rpiR family. This domain contains a helix-turn-helix motif. The best characterised member of this family is RpiR, a regulator of the expression of rpiB gene. 41469 pfam01419: Jacalin-like lectin domain. Proteins containing this domain are lectins. It is found in 1 to 6 copies in these proteins. The domain is also found in the animal prostatic spermine-binding protein. 41470 pfam01420: Type I restriction modification DNA specificity domain. This domain is also known as the target recognition domain (TRD). Restriction-modification (R-M) systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity subunit (this family), two modification (M) subunits and two restriction (R) subunits. 41471 pfam01421: Reprolysin (M12B) family zinc metalloprotease. The members of this family are enzymes that cleave peptides. These proteases require zinc for catalysis. Members of this family are also known as adamalysins. Most members of this family are snake venom endopeptidases, but there are also some mammalian proteins such as fertilin. 
Fertilin and closely related proteins appear to lack some active site residues and may not be active enzymes. 41472 pfam01422: NF-X1 type zinc finger. This domain is presumed to be a zinc binding domain. The following pattern describes the zinc finger: C-X(1-6)-H-X-C-X3-C(H/C)-X(3-4)-(H/C)-X(1-10)-C, where X can be any amino acid, and numbers in brackets indicate the number of residues. Two positions can be either His or Cys. In one member the zinc fingers have been shown to bind to DNA. 41473 pfam01423: LSM domain. The LSM domain contains Sm proteins as well as other related LSM (Like Sm) proteins. The U1, U2, U4/U6, and U5 small nuclear ribonucleoprotein particles (snRNPs) involved in pre-mRNA splicing contain seven Sm proteins (B/B', D1, D2, D3, E, F and G) in common, which assemble around the Sm site present in four of the major spliceosomal small nuclear RNAs. The U6 snRNP binds to the LSM (Like Sm) proteins. Sm proteins are also found in archaebacteria, which do not have any splicing apparatus, suggesting a more general role for Sm proteins. All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. This family also includes the bacterial Hfq (host factor Q) proteins. Hfq proteins are also RNA-binding proteins that form hexameric rings. 41474 pfam01424: R3H domain. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to be binding ssDNA. 41475 pfam01425: Amidase. 41476 pfam01426: BAH domain. This domain has been called BAH (Bromo adjacent homology) domain and has also been called ELM1 and BAM (Bromo adjacent motif) domain. The function of this domain is unknown but may be involved in protein-protein interaction. 41477 pfam01427: D-ala-D-ala dipeptidase. 41478 pfam01428: AN1-like Zinc finger. Zinc finger at the C-terminus of An1, a ubiquitin-like protein in Xenopus laevis. 
The following pattern describes the zinc finger. C-X2-C-X(9-12)-C-X(1-2)-C-X4-C-X2-H-X5-H-X-C Where X can be any amino acid, and numbers in brackets indicate the number of residues. 41479 pfam01429: Methyl-CpG binding domain. The Methyl-CpG binding domain (MBD) binds to DNA that contains one or more symmetrically methylated CpGs. DNA methylation in animals is associated with alterations in chromatin structure and silencing of gene expression. MBD has negligible non-specific affinity for DNA. In vitro foot-printing with MeCP2 showed the MBD can protect a 12 nucleotide region surrounding a methyl CpG pair. MBDs are found in several Methyl-CpG binding proteins and also DNA demethylase. 41480 pfam01430: Hsp33 protein. Hsp33 is a molecular chaperone, distinguished from all other known chaperones by its mode of functional regulation. Its activity is redox regulated. Hsp33 is a cytoplasmically localised protein with highly reactive cysteines that respond quickly to changes in the redox environment. Oxidising conditions like H2O2 cause disulfide bonds to form in Hsp33, a process that leads to the activation of its chaperone function. 41481 pfam01431: Peptidase family M13. Mammalian enzymes are typically type-II membrane anchored enzymes which are known, or believed to activate or inactivate oligopeptide (pro)-hormones such as opioid peptides. The family also contains a bacterial member believed to be involved with milk protein cleavage. 41482 pfam01432: Peptidase family M3. This is the Thimet oligopeptidase family, large family of mammalian and bacterial oligopeptidases that cleave medium sized peptides. The group also contains mitochondrial intermediate peptidase which is encoded by nuclear DNA but functions within the mitochondria to remove the leader sequence. 41483 pfam01433: Peptidase family M1. Members of this family are aminopeptidases. The members differ widely in specificity, hydrolysing acidic, basic or neutral N-terminal residues. 
This family includes leukotriene-A4 hydrolase, which also has aminopeptidase activity. 41484 pfam01434: Peptidase family M41. 41485 pfam01435: Peptidase family M48. 41486 pfam01436: NHL repeat. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats may have a catalytic activity: proteolysis of one member has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. One member interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats. 41487 pfam01437: Plexin repeat. A cysteine rich repeat found in several different extracellular receptors. The function of the repeat is unknown. Three copies of the repeat are found in Plexin. Two copies of the repeat are found in mahogany protein. A related C. elegans protein contains four copies of the repeat. The Met receptor contains a single copy of the repeat. The Pfam alignment shows 6 conserved cysteine residues that may form three conserved disulphide bridges, whereas another reference shows 8 conserved cysteines. The pattern of conservation suggests that cysteines 5 and 7 (that are not absolutely conserved) form a disulphide bridge. 41488 pfam01439: Metallothionein. Members of this family are metallothioneins. These proteins are cysteine rich proteins that bind to heavy metals. Members of this family appear to be closest to Class II metallothioneins, see pfam00131. 41489 pfam01440: Geminivirus AL2 protein. Geminiviruses are small, ssDNA-containing plant viruses. Geminiviruses contain three ORFs (designated AL1, AL2, and AL3) that overlap and are specified by multiple polycistronic mRNAs. The AL2 gene product transactivates expression of TGMV coat protein gene, and BR1 movement protein. 41490 pfam01441: Lipoprotein. Members of this family are lipoproteins that are probably involved in evasion of the host immune system by pathogens. 
41491 pfam01442: Apolipoprotein A1/A4/E family. These proteins contain several 22 residue repeats which form a pair of alpha helices. This family includes: Apolipoprotein A-I. Apolipoprotein A-IV. Apolipoprotein E. 41492 pfam01443: Viral (Superfamily 1) RNA helicase. Helicase and NTPase activities have been demonstrated for this family. This helicase has multiple roles at different stages of viral RNA replication, as dissected by mutational analysis. 41493 pfam01445: Viral small hydrophobic protein. The SH (small hydrophobic) protein is a membrane protein of uncertain function. 41494 pfam01446: Replication protein. Replication proteins (rep) are involved in plasmid replication. The Rep protein binds to the plasmid DNA and nicks it at the double strand origin (dso) of replication. The 3'-hydroxyl end created is extended by the host DNA replicase, and the 5' end is displaced during synthesis. At the end of one replication round, Rep introduces a second single stranded break at the dso and ligates the ssDNA extremities, generating one double-stranded plasmid and one circular ssDNA form. Complementary strand synthesis of the circular ssDNA is usually initiated at the single-stranded origin by the host RNA polymerase. 41495 pfam01447: Thermolysin metallopeptidase, catalytic domain. 41496 pfam01448: ELM2 domain. The ELM2 (Egl-27 and MTA1 homology 2) domain is a small domain of unknown function. It is found in the MTA1 protein that is part of the NuRD complex. The domain is usually found N-terminal to a myb-like DNA binding domain pfam00249. ELM2 is also found associated with an ARID DNA binding domain pfam01388. This suggests that ELM2 may also be involved in DNA binding, or perhaps is a protein-protein interaction domain. 41497 pfam01450: Acetohydroxy acid isomeroreductase, catalytic domain. Acetohydroxy acid isomeroreductase catalyses the conversion of acetohydroxy acids into dihydroxy valerates. 
This reaction is the second in the synthetic pathway of the essential branched side chain amino acids valine and isoleucine. 41498 pfam01451: Low molecular weight phosphotyrosine protein phosphatase. 41499 pfam01452: Rotavirus non structural protein. This protein has been called NSP4, NSP5, NS28, and NCVP5. The final steps in the assembly of rotavirus occur in the lumen of the endoplasmic reticulum (ER). Targeting of the immature inner capsid particle (ICP) to this compartment is mediated by the cytoplasmic tail of NSP4, located in the ER membrane. 41500 pfam01453: D-mannose binding lectin. These proteins include mannose-specific lectins from plants as well as bacteriocins from bacteria. 41501 pfam01454: MAGE family. The MAGE (melanoma antigen-encoding gene) family is expressed in a wide variety of tumours but not in normal cells, with the exception of the male germ cells, placenta, and, possibly, cells of the developing embryo. The cellular function of this family is unknown. 41502 pfam01455: HupF/HypC family. 41503 pfam01456: Mucin-like glycoprotein. This family of trypanosomal proteins resembles vertebrate mucins. The protein consists of three regions. The N and C termini are conserved between all members of the family, whereas the central region is not well conserved and contains a large number of threonine residues which can be glycosylated. Indirect evidence suggested that these genes might encode the core protein of parasite mucins, glycoproteins that were proposed to be involved in the interaction with, and invasion of, mammalian host cells. 41504 pfam01457: Leishmanolysin. 41505 pfam01458: Uncharacterized protein family (UPF0051). 41506 pfam01459: Eukaryotic porin. 41507 pfam01461: 7TM chemoreceptor. This large family of proteins is related to pfam00001. They are 7 transmembrane receptors. This family does not include all known members, as there are problems with overlapping specificity with pfam00001. 
This family is greatly expanded in the nematode worm C. elegans. 41508 pfam01462: Leucine rich repeat N-terminal domain. Leucine Rich Repeats pfam00560 are short sequence motifs present in a number of proteins with diverse functions and cellular locations. Leucine Rich Repeats are often flanked by cysteine rich domains. This domain is often found at the N-terminus of tandem leucine rich repeats. 41509 pfam01463: Leucine rich repeat C-terminal domain. Leucine Rich Repeats pfam00560 are short sequence motifs present in a number of proteins with diverse functions and cellular locations. Leucine Rich Repeats are often flanked by cysteine rich domains. This domain is often found at the C-terminus of tandem leucine rich repeats. 41510 pfam01464: Transglycosylase SLT domain. This family is distantly related to pfam00062. 41511 pfam01465: GRIP domain. The GRIP (golgin-97, RanBP2alpha, Imh1p and p230/golgin-245) domain is found in many large coiled-coil proteins. It has been shown to be sufficient for targeting to the Golgi. The GRIP domain contains a completely conserved tyrosine residue. 41512 pfam01466: Skp1 family, dimerisation domain. 41513 pfam01467: Cytidylyltransferase. This family includes: Cholinephosphate cytidylyltransferase, Glycerol-3-phosphate cytidylyltransferase. 41514 pfam01468: GA module. The GA (protein G-related Albumin-binding) module is composed of three alpha helices. This module is found in a range of bacterial cell surface proteins. The GA module from one member shows a strong affinity for albumin. 41515 pfam01469: Pentapeptide repeats (8 copies). These repeats are found in many mycobacterial proteins. These repeats are most common in the pfam00823 family of proteins, where they are found in the MPTR subfamily of PPE proteins. The function of these repeats is unknown. The repeat can be approximately described as XNXGX, where X can be any amino acid. 
These repeats are similar to pfam00805; however, it is not clear if these two families are structurally related. 41516 pfam01470: Pyroglutamyl peptidase. 41517 pfam01471: Putative peptidoglycan binding domain. This domain is composed of three alpha helices. This domain is found at the N or C terminus of a variety of enzymes involved in bacterial cell wall degradation. This domain may have a general peptidoglycan binding function. 41518 pfam01472: PUA domain. The PUA domain, named after Pseudouridine synthase and Archaeosine transglycosylase, was detected in archaeal and eukaryotic pseudouridine synthases, archaeal archaeosine synthases, a family of predicted ATPases that may be involved in RNA modification, and a family of predicted archaeal and bacterial rRNA methylases. Additionally, the PUA domain was detected in a family of eukaryotic proteins that also contain a domain homologous to the translation initiation factor eIF1/SUI1; these proteins may comprise a novel type of translation factors. Unexpectedly, the PUA domain was detected also in bacterial and yeast glutamate kinases; this is compatible with the demonstrated role of these enzymes in the regulation of the expression of other genes. It is predicted that the PUA domain is an RNA binding domain. 41519 pfam01473: Putative cell wall binding repeat. These repeats, characterised by conserved aromatic residues and glycines, are found in multiple tandem copies in a number of proteins. The CW repeat is 20 amino acid residues long. The exact domain boundaries may not be correct. It has been suggested that in some members these repeats might be responsible for the specific recognition of choline-containing cell walls. Similar but longer repeats are found in the glucosyltransferases and glucan-binding proteins of oral streptococci, where they have been shown to be involved in glucan binding, as well as in the related dextransucrases of Leuconostoc mesenteroides. 
Repeats also occur in toxins of Clostridium difficile and other clostridia, though the ligands are not always known. 41520 pfam01474: Class-II DAHP synthetase family. Members of this family are aldolase enzymes that catalyse the first step of the shikimate pathway. 41521 pfam01475: Ferric uptake regulator family. This family includes metal ion uptake regulator proteins that bind to the operator DNA and control transcription of metal ion-responsive genes. This family is also known as the FUR family. 41522 pfam01476: LysM domain. The LysM (lysin motif) domain is about 40 residues long. It is found in a variety of enzymes involved in bacterial cell wall degradation. This domain may have a general peptidoglycan binding function. The structure of this domain is known. 41523 pfam01477: PLAT/LH2 domain. This domain is found in a variety of membrane or lipid associated proteins. It is called the PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology) domain. The known structure of pancreatic lipase shows this domain binds to procolipase pfam01114, which mediates membrane association. So it appears possible that this domain mediates membrane attachment via other protein binding partners. The structure of this domain is known for many members of the family and is composed of a beta sandwich. 41524 pfam01478: Type IV leader peptidase family. Peptidase A24, or the prepilin peptidase as it is also known, processes the N-terminus of the prepilins. The processing is essential for the correct formation of the pseudopili of type IV bacterial protein secretion. The enzyme is found across eubacteria and archaea. 41525 pfam01479: S4 domain. 
The S4 domain is a small domain consisting of 60-65 amino acid residues that was detected in the bacterial ribosomal protein S4, eukaryotic ribosomal S9, two families of pseudouridine synthases, a novel family of predicted RNA methylases, a yeast protein containing a pseudouridine synthetase and a deaminase domain, bacterial tyrosyl-tRNA synthetases, and a number of uncharacterized, small proteins that may be involved in translation regulation. The S4 domain probably mediates binding to RNA. 41526 pfam01480: PWI domain. 41527 pfam01481: Arterivirus nucleocapsid protein. 41528 pfam01482: Domain of unknown function DUF13. This domain is found in nematode proteins. It is currently of unknown function. 41529 pfam01483: Proprotein convertase P-domain. A unique feature of the eukaryotic subtilisin-like proprotein convertases is the presence of an additional highly conserved sequence of approximately 150 residues (P domain) located immediately downstream of the catalytic domain. 41530 pfam01484: Nematode cuticle collagen N-terminal domain. The function of this domain is unknown. It is found in the N-terminal region of nematode cuticle collagens, see pfam01391. Cuticle is a tough elastic structure secreted by hypodermal cells and is primarily composed of collagen proteins. 41531 pfam01485: IBR domain. The IBR (In Between Ring fingers) domain is found to occur between pairs of ring fingers (pfam00097). The function of this domain is unknown. This domain has also been called the C6HC domain and DRIL (for double RING finger linked) domain. 41532 pfam01486: K-box region. The K-box region is commonly found associated with SRF-type transcription factors (see pfam00319). The K-box is a possible coiled-coil structure. It has a possible role in multimer formation. 41533 pfam01487: Type I 3-dehydroquinase. Type I 3-dehydroquinase (3-dehydroquinate dehydratase or DHQase) catalyses the cis-dehydration of 3-dehydroquinate via a covalent imine intermediate, giving dehydroshikimate. 
Dehydroquinase functions in the shikimate pathway, which is involved in the biosynthesis of aromatic amino acids. Type II 3-dehydroquinase catalyses the trans-dehydration of 3-dehydroquinate (see pfam01220). 41534 pfam01488: Shikimate / quinate 5-dehydrogenase. This family contains both shikimate and quinate dehydrogenases. Shikimate 5-dehydrogenase catalyses the conversion of shikimate to 5-dehydroshikimate. This reaction is part of the shikimate pathway, which is involved in the biosynthesis of aromatic amino acids. Quinate 5-dehydrogenase catalyses the conversion of quinate to 5-dehydroquinate. This reaction is part of the quinate pathway, where quinic acid is exploited as a source of carbon in prokaryotes and microbial eukaryotes. The shikimate and quinate pathways share two common pathway metabolites, 3-dehydroquinate and dehydroshikimate. 41535 pfam01489: Geminivirus nuclear export factor BR1. This family consists of various geminivirus movement proteins that are nuclear export factors or shuttles. One member, BR1, facilitates the export of both ds and ss DNA from the nucleus. 41536 pfam01490: Transmembrane amino acid transporter protein. This transmembrane region is found in many amino acid transporters including UNC-47 and MTR. UNC-47 encodes a vesicular gamma-aminobutyric acid (GABA) transporter (VGAT). UNC-47 is predicted to have 10 transmembrane domains. MTR is an N system amino acid transporter protein involved in methyltryptophan resistance. Other members of this family include proline transporters and amino acid permeases. 41537 pfam01491: Frataxin-like domain. This family contains proteins that have a domain related to the globular C-terminus of Frataxin, the protein that is mutated in Friedreich's ataxia. This domain is found in a family of bacterial proteins. The function of this domain is currently unknown. 41538 pfam01492: Geminivirus C4 protein. This family consists of the N terminal region of geminivirus C4 or AC4 proteins. 
In Tomato yellow leaf curl geminivirus (TYLCV) the C4 protein is necessary for efficient spreading of the virus in tomato plants. 41539 pfam01493: GXGXG motif. This domain is found in glutamate synthase, tungsten formylmethanofuran dehydrogenase subunit c (FwdC) and molybdenum formylmethanofuran dehydrogenase subunit c (FmdC). A repeated G-XX-G-XXX-G motif is seen in the alignment. 41540 pfam01494: FAD binding domain. This domain is involved in FAD binding in a number of enzymes. 41541 pfam01495: HypB/UreG nucleotide-binding domain. This domain is found in HypB, a hydrogenase expression / formation protein, and UreG, a urease accessory protein. Both these proteins contain a P-loop nucleotide binding motif. HypB has GTPase activity and is a guanine nucleotide binding protein. It is not known whether UreG binds GTP or some other nucleotide. Both enzymes are involved in nickel binding. HypB can store nickel and is required for nickel dependent hydrogenase expression. UreG is required for functional incorporation of the urease nickel metallocenter. GTP hydrolysis may be required by these proteins for nickel incorporation into other nickel proteins. 41542 pfam01496: V-type ATPase 116kDa subunit family. This family consists of the 116kDa V-type ATPase (vacuolar (H+)-ATPases) subunits, as well as V-type ATP synthase subunit i. The V-type ATPases are proton pumps that acidify intracellular compartments in eukaryotic cells, for example yeast central vacuoles, clathrin-coated vesicles and synaptic vesicles. They have important roles in membrane trafficking processes. The 116kDa subunit (subunit a) in the V-type ATPase is part of the V0 functional domain responsible for proton transport. The a subunit is a transmembrane glycoprotein with multiple putative transmembrane helices; it has a hydrophilic amino terminal and a hydrophobic carboxy terminal. It has roles in proton transport and assembly of the V-type ATPase complex. 
This subunit is encoded by two homologous genes in yeast, VPH1 and STV1. 41543 pfam01497: Periplasmic binding protein. This family includes bacterial periplasmic binding proteins, several of which are involved in iron transport. 41544 pfam01498: Transposase. Transposase proteins are necessary for efficient DNA transposition. This family includes the amino-terminal region of Tc1, Tc1A, Tc1B and Tc2B transposases of C. elegans. The region encompasses the specific DNA binding and second DNA recognition domains as well as an amino-terminal region of the catalytic domain of Tc3. Tc3 is a member of the Tc1/mariner family of transposable elements. 41545 pfam01499: Herpesvirus UL25 family. The herpesvirus UL25 gene product is a virion component involved in virus penetration and capsid assembly. The product of the UL25 gene is required for packaging but not cleavage of replicated viral DNA. This family includes a number of herpesvirus proteins: EHV-1 36, EBV BVRF1, HCMV UL77, ILTV ORF2, and VZV gene 34. 41546 pfam01500: Keratin, high sulfur B2 protein. High sulfur proteins are cysteine-rich proteins synthesised during the differentiation of hair matrix cells, and form hair fibres in association with hair keratin intermediate filaments. This family has been divided up into four regions, with the second region containing 8 copies of a short repeat. This family is also known as B2 or KAP1. 41547 pfam01501: Glycosyl transferase family 8. This family includes enzymes that transfer sugar residues to acceptor molecules. Members of this family are involved in lipopolysaccharide biosynthesis and glycogen synthesis. This family includes lipopolysaccharide galactosyltransferase, lipopolysaccharide glucosyltransferase 1, and glycogenin glucosyltransferase. 41548 pfam01502: Phosphoribosyl-AMP cyclohydrolase. This enzyme catalyses the third step in the histidine biosynthetic pathway. It requires Zn ions for activity. 41549 pfam01503: Phosphoribosyl-ATP pyrophosphohydrolase. 
This enzyme catalyses the second step in the histidine biosynthetic pathway. 41550 pfam01504: Phosphatidylinositol-4-phosphate 5-Kinase. This family contains a region from the common kinase core found in the type I phosphatidylinositol-4-phosphate 5-kinase (PIP5K) family. The family consists of various type I, II and III PIP5K enzymes. PIP5K catalyses the formation of phosphatidylinositol-4,5-bisphosphate via the phosphorylation of phosphatidylinositol-4-phosphate, a precursor in the phosphoinositide signaling pathway. 41551 pfam01505: Major Vault Protein repeat. The vault is a ubiquitous and highly conserved ribonucleoprotein particle of approximately 13 MDa of unknown function. This family corresponds to a repeat found in the amino terminal half of the major vault protein. 41552 pfam01506: Hepatitis C virus non-structural 5a protein. The molecular function of the non-structural 5a protein is uncertain. The NS5a protein is phosphorylated when expressed in mammalian cells. It is thought to interact with the ds RNA dependent (interferon inducible) kinase PKR. The N-terminal region of the NS5a protein has been used in the construction of the alignment for this family. The C-terminal region has not been included because it is too heterogeneous. 41553 pfam01507: Phosphoadenosine phosphosulfate reductase family. This domain is found in phosphoadenosine phosphosulfate (PAPS) reductase enzymes or PAPS sulfotransferase. PAPS reductase is part of the adenine nucleotide alpha hydrolases superfamily, which also includes N type ATP PPases and ATP sulphurylases. The enzyme uses thioredoxin as an electron donor for the reduction of PAPS to phospho-adenosine-phosphate (PAP). It is also found in NodP nodulation protein P from Rhizobium, which has ATP sulfurylase activity (sulfate adenylate transferase). 41554 pfam01508: Paramecium surface antigen domain. This domain is a cysteine rich extracellular repeat found in surface antigens of Paramecium. 
The domain contains 8 cysteine residues. 41555 pfam01509: TruB family pseudouridylate synthase (N terminal domain). Members of this family are involved in modifying bases in RNA molecules. They carry out the conversion of uracil bases to pseudouridine. This family includes TruB, a pseudouridylate synthase that specifically converts uracil 55 to pseudouridine in most tRNAs. This family also includes Cbf5p, which modifies rRNA. 41556 pfam01510: N-acetylmuramoyl-L-alanine amidase. This family includes zinc amidases that have N-acetylmuramoyl-L-alanine amidase activity EC:3.5.1.28. This enzyme domain cleaves the amide bond between N-acetylmuramoyl and L-amino acids in bacterial cell walls (preferentially: D-lactyl-L-Ala). The structure is known for the bacteriophage T7 enzyme and shows that two of the conserved histidines are zinc binding. 41557 pfam01512: Respiratory-chain NADH dehydrogenase 51 Kd subunit. 41558 pfam01513: ATP-NAD kinase. Members of this family are ATP-NAD kinases EC:2.7.1.23. They catalyse the phosphorylation of NAD to NADP, utilising ATP and other nucleoside triphosphates as well as inorganic polyphosphate as a source of phosphorus. 41559 pfam01514: Secretory protein of YscJ/FliF family. This family includes proteins that are related to the YscJ lipoprotein, and the amino terminus of FliF, the flagellar M-ring protein. The members of the YscJ family are thought to be involved in secretion of several proteins. The FliF protein ring is thought to be part of the export apparatus for flagellar proteins, based on the similarity to YscJ proteins. 41560 pfam01515: Phosphate acetyl/butaryl transferase. This family contains both phosphate acetyltransferase and phosphate butyryltransferase. These enzymes catalyse the transfer of an acetyl or butyryl group to orthophosphate. 41561 pfam01516: Orbivirus helicase VP6. The VP6 protein, a minor protein in the core of the virion, is probably the viral helicase. 41562 pfam01517: Hepatitis delta virus delta antigen. 
The hepatitis delta virus (HDV) encodes a single protein, the hepatitis delta antigen (HDAg). The central region of this protein has been shown to bind RNA. Several interactions are also mediated by a coiled-coil region at the N terminus of the protein. 41563 pfam01518: Sigma NS protein. This viral protein has a poly(C)-dependent poly(G) polymerase activity. 41564 pfam01519: Protein of unknown function DUF16. The function of this protein is unknown. It appears to only occur in Mycoplasma pneumoniae. 41565 pfam01520: N-acetylmuramoyl-L-alanine amidase. This enzyme domain cleaves the amide bond between N-acetylmuramoyl and L-amino acids in bacterial cell walls. 41566 pfam01521: HesB-like domain. This family includes HesB, which may be involved in nitrogen fixation; the hesB gene is expressed only under nitrogen fixation conditions. Other members of this family include various hypothetical proteins, some of which also contain NifU-like domains (pfam01106), which are also involved in nitrogen fixation. In the gram-negative soil bacterium Rhizobium etli, the hesB-like gene iscN is required for nitrogen fixation. 41567 pfam01522: Polysaccharide deacetylase. This domain is found in polysaccharide deacetylase. This family of polysaccharide deacetylases includes NodB (nodulation protein B from Rhizobium), which is a chitooligosaccharide deacetylase. It also includes chitin deacetylase from yeast, and endoxylanases, which hydrolyse glycosidic bonds in xylan. 41568 pfam01523: Putative modulator of DNA gyrase. tldD and pmbA were found to suppress mutations in letD, an inhibitor of DNA gyrase. Therefore it has been hypothesised that the TldD and PmbA proteins modulate the activity of DNA gyrase. It has also been suggested that PmbA may be involved in secretion. 41569 pfam01524: Geminivirus V1 protein. 
Disruption of the V1 gene in Tomato yellow leaf curl virus (TYLCV) stopped its ability to systemically infect tomato plants, suggesting that the V1 gene product is required for successful infection of the host. 41570 pfam01525: Rotavirus NS26. Gene 11 product is a non-structural phosphoprotein designated as NS26. 41571 pfam01526: Transposase. This family includes transposases of Tn3, Tn21, Tn1721, Tn2501, Tn3926 transposons from E. coli. The specific binding of the Tn3 transposase to DNA has been demonstrated. Sequence analysis has suggested that the invariant triad of Asp689, Asp765, Glu895 (numbering as in Tn3) may correspond to the D-D-35-E motif previously implicated in the catalysis of numerous transposases. 41572 pfam01527: Transposase. Transposase proteins are necessary for efficient DNA transposition. This family consists of various E. coli insertion elements and other bacterial transposases, some of which are members of the IS3 family. 41573 pfam01528: Herpesvirus glycoprotein M. The herpesvirus glycoprotein M (gM) is an integral membrane protein predicted to contain 8 transmembrane segments. Glycoprotein M is not essential for viral replication. 41574 pfam01529: DHHC zinc finger domain. This domain is also known as NEW1. This domain is predicted to be a zinc binding domain. The function of this domain is unknown, but it has been predicted to be involved in protein-protein or protein-DNA interactions, and palmitoyltransferase activity. 41575 pfam01530: Zinc finger, C2HC type. This is a DNA binding zinc finger domain. 41576 pfam01531: Glycosyl transferase family 11. This family contains several fucosyl transferase enzymes. 41577 pfam01532: Glycosyl hydrolase family 47. Members of this family are alpha-mannosidases that catalyse the hydrolysis of the terminal 1,2-linked alpha-D-mannose residues in the oligo-mannose oligosaccharide Man(9)(GlcNAc)(2). 41578 pfam01533: Tospovirus nucleocapsid protein. 
The tospovirus genome consists of three linear ssRNA segments, denoted L, M and S, complexed with the nucleocapsid protein. The S RNA encodes the nucleocapsid protein and another non-structural protein. 41579 pfam01534: Frizzled/Smoothened family membrane region. This family contains the membrane spanning region of frizzled and smoothened receptors. This membrane region is predicted to contain seven transmembrane alpha helices. Proteins related to Drosophila frizzled are receptors for the Wnt signaling molecules. The smoothened receptor mediates hedgehog signaling. 41580 pfam01535: PPR repeat. This repeat has no known function. It is about 35 amino acids long and found in up to 18 copies in some proteins. This family appears to be greatly expanded in plants. This repeat occurs in PET309, which may be involved in RNA stabilisation. This domain occurs in crp1, which is involved in RNA processing. This repeat is associated with a predicted plant protein that has a domain organisation similar to the human BRCA1 protein. The repeat has been called PPR. 41581 pfam01536: Adenosylmethionine decarboxylase. This is a family of S-adenosylmethionine decarboxylase (SAMDC) proenzymes. In the biosynthesis of polyamines SAMDC produces decarboxylated S-adenosylmethionine, which serves as the source of the aminopropyl moiety necessary for spermidine and spermine biosynthesis from putrescine. The Pfam alignment contains both the alpha and beta chains that are cleaved to form the active enzyme. 41582 pfam01537: Herpesvirus glycoprotein D. Herpesviruses are dsDNA viruses with no RNA stage. This family consists of glycoprotein-D (gD or gIV), which is common to herpes simplex virus types 1 and 2, as well as equine herpes, bovine herpes and Marek's disease virus. Glycoprotein-D has been found on the viral envelope and the plasma membrane of infected cells, and gD immunisation can produce an immune response to bovine herpes virus (BHV-1). 
This response is stronger than that of the other major glycoproteins gB (gI) and gC (gIII) in BHV-1. 41583 pfam01538: Hepatitis C virus non-structural protein NS2. The viral genome is translated into a single polyprotein of about 3000 amino acids. Generation of the mature non-structural proteins relies on the activity of viral proteases. Cleavage at the NS2/NS3 junction is accomplished by a metal-dependent autoprotease encoded within NS2 and the N-terminus of NS3. 41584 pfam01539: Hepatitis C virus envelope glycoprotein E1. 41585 pfam01540: Adhesin lipoprotein. This family consists of the p50 and variable adherence-associated antigen (Vaa) adhesins from Mycoplasma hominis. M. hominis is a mycoplasma associated with human urogenital diseases, pneumonia, and septic arthritis. An adhesin is a cell surface molecule that mediates adhesion to other cells or to the surrounding surface or substrate. The Vaa antigen is a 50-kDa surface lipoprotein that has four tandem repetitive DNA sequences encoding a periodic peptide structure, and is highly immunogenic in the human host. p50 is also a 50-kDa lipoprotein, having three repeats A, B and C, that may be a tetramer of 191-kDa in its native environment. 41586 pfam01541: GIY-YIG catalytic domain. This domain, called GIY-YIG, is found in the amino terminal region of excinuclease abc subunit c (uvrC) and bacteriophage T4 endonucleases segA, segB, segC, segD and segE; it is also found in putative endonucleases encoded by group I introns of fungi and phage. The structure of I-TevI, a GIY-YIG endonuclease, reveals a novel alpha/beta-fold with a central three-stranded antiparallel beta-sheet flanked by three helices. The most conserved and putative catalytic residues are located on a shallow, concave surface and include a metal coordination site. 41587 pfam01542: Hepatitis C virus core protein. The viral core protein forms the internal viral coat that encapsidates the genomic RNA and is enveloped in a host cell-derived lipid membrane. 
The core protein has been shown, by yeast two-hybrid assay, to interact with cellular DEAD box helicases. The N terminus of the core protein is involved in transcriptional repression. 41588 pfam01543: Hepatitis C virus capsid protein. 41589 pfam01544: CorA-like Mg2+ transporter protein. The CorA transport system is the primary Mg2+ influx system of Salmonella typhimurium and Escherichia coli. CorA is virtually ubiquitous in the Bacteria and Archaea. There are also eukaryotic relatives of this protein. The family includes the MRS2 protein from yeast that is thought to be an RNA splicing protein. However, its membership of this family suggests that its effect on splicing is due to altered magnesium levels in the cell. 41590 pfam01545: Cation efflux family. Members of this family are integral membrane proteins that are found to increase tolerance to divalent metal ions such as cadmium, zinc, and cobalt. These proteins are thought to be efflux pumps that remove these ions from cells. 41591 pfam01546: Peptidase family M20/M25/M40. This family includes a range of zinc metallopeptidases belonging to several families in the peptidase classification. Family M20 are Glutamate carboxypeptidases. Peptidase family M25 contains X-His dipeptidases. 41592 pfam01547: Bacterial extracellular solute-binding protein. This family also includes the bacterial extracellular solute-binding protein family POTD/POTF. 41593 pfam01548: Transposase. Transposase proteins are necessary for efficient DNA transposition. This family includes an amino-terminal region of the pilin gene inverting protein (PIVML) and members of the IS111A/IS1328/IS1533 family of transposases. 41594 pfam01549: ShTK domain. This domain of unknown function is found in several C. elegans proteins. The domain is 30 amino acids long and rich in cysteine residues. There are 6 conserved cysteine positions in the domain that probably form three disulphide bridges. The domain is named (by SMART) after ShK toxin. 
41595 pfam01550: Virion host shutoff protein. This family consists of virion host shutoff (VHS) proteins from various herpes viruses as well as varicella zoster virus and pseudorabies virus. The VHS proteins inhibit cellular gene expression in infected cells. The VHS polypeptide destabilises preexisting host mRNAs and ensures rapid turnover of viral mRNAs. 41596 pfam01551: Peptidase family M23/M37. Members of this family are zinc metallopeptidases with a range of specificities. The peptidase family M37 is included in this family; these are Gly-Gly endopeptidases. Peptidase family M23 are also endopeptidases. This family also includes some bacterial lipoproteins for which no proteolytic activity has been demonstrated. 41597 pfam01552: Picornavirus 2B protein. Poliovirus infection leads to drastic alterations in membrane permeability late during infection. Proteins 2B and 2BC enhance membrane permeability. 41598 pfam01553: Acyltransferase. This family contains acyltransferases involved in phospholipid biosynthesis and other proteins of unknown function. This family also includes tafazzin, the Barth syndrome gene. 41599 pfam01554: MatE. The MatE domain. 41600 pfam01555: DNA methylase. Members of this family are DNA methylases. The family contains both N-4 cytosine-specific DNA methylases and N-6 Adenine-specific DNA methylases. 41601 pfam01556: DnaJ C terminal region. This family consists of the C terminal region from the DnaJ protein. Although the function of this region is unknown, it is always found associated with pfam00226 and pfam00684. DnaJ is a chaperone associated with the Hsp70 heat-shock system involved in protein folding and renaturation after stress. 41602 pfam01557: Fumarylacetoacetate (FAA) hydrolase family. This family consists of fumarylacetoacetate (FAA) hydrolase, or fumarylacetoacetate hydrolase (FAH), and it also includes HHDD isomerase/OPET decarboxylase from E. coli strain W. 
FAH is the last enzyme in the tyrosine catabolic pathway; it hydrolyses fumarylacetoacetate into fumarate and acetoacetate, which then join the citric acid cycle. Mutations in FAH cause type I tyrosinemia in humans; this is an inherited disorder mainly affecting the liver, leading to liver cirrhosis, hepatocellular carcinoma, renal tubular damage and neurologic crises, amongst other symptoms. The enzymatic defect causes the toxic accumulation of phenylalanine/tyrosine catabolites. The E. coli W enzyme HHDD isomerase/OPET decarboxylase contains two copies of this domain and functions in the fourth and fifth steps of the homoprotocatechuate pathway; here it decarboxylates OPET to HHDD and isomerises this to OHED. The final products of this pathway are pyruvic acid and succinic semialdehyde. 41603 pfam01558: Pyruvate ferredoxin/flavodoxin oxidoreductase. This family includes a region of the large protein pyruvate-flavodoxin oxidoreductase and the whole pyruvate ferredoxin oxidoreductase gamma subunit protein. It is not known whether the gamma subunit has a catalytic or regulatory role. Pyruvate oxidoreductase (POR) catalyses the final step in the fermentation of carbohydrates in anaerobic microorganisms. This involves the oxidative decarboxylation of pyruvate with the participation of thiamine, followed by the transfer of an acetyl moiety to coenzyme A for the synthesis of acetyl-CoA. The family also includes pyruvate flavodoxin oxidoreductase as encoded by the nifJ gene in cyanobacteria, which is required for growth on molecular nitrogen when iron is limited. 41604 pfam01559: Zein seed storage protein. Zeins are seed storage proteins. They are unusually rich in glutamine, proline, alanine, and leucine residues and their sequences show a series of tandem repeats. 41605 pfam01560: Hepatitis C virus non-structural protein E2/NS1. The hypervariable region of the E2/NS1 region of hepatitis C virus varies greatly between viral isolates. 
E2 is thought to encode a structurally unconstrained envelope protein. 41606 pfam01561: Hantavirus glycoprotein G2. The medium (M) genome segment of hantaviruses (family Bunyaviridae) encodes the two virion glycoproteins, G1 and G2, as a precursor protein in the complementary sense RNA. 41607 pfam01562: Reprolysin family propeptide. This region is the propeptide for members of peptidase family M12B. The propeptide contains a sequence motif similar to the "cysteine switch" of the matrixins. This motif is found at the C terminus of the alignment but is not well aligned. 41608 pfam01563: Alphavirus E3 glycoprotein. This protein is found in some alphaviruses as a virion associated spike protein. 41609 pfam01564: Spermine/spermidine synthase. Spermine and spermidine are polyamines. This family includes spermidine synthase, which catalyses the fifth (last) step in the biosynthesis of spermidine from arginine, and spermine synthase. 41610 pfam01565: FAD binding domain. This family consists of various enzymes that use FAD as a co-factor; most of the enzymes are similar to oxygen oxidoreductase. One of the enzymes, vanillyl-alcohol oxidase (VAO), has a solved structure; the alignment includes the FAD binding site, called the PP-loop, between residues 99-110. The FAD molecule is covalently bound in the known structure; however, the residue that links to the FAD is not in the alignment. VAO catalyses the oxidation of a wide variety of substrates, ranging from aromatic amines to 4-alkylphenols. Other members of this family include D-lactate dehydrogenase, which catalyses the conversion of D-lactate to pyruvate using FAD as a co-factor, and mitomycin radical oxidase, which oxidises the reduced form of mitomycins and is involved in mitomycin resistance. This family includes MurB, a UDP-N-acetylenolpyruvoylglucosamine reductase (EC:1.1.1.158). This enzyme is involved in the biosynthesis of peptidoglycan. 41611 pfam01566: Natural resistance-associated macrophage protein. 
The natural resistance-associated macrophage protein (NRAMP) family consists of Nramp1, Nramp2, and yeast proteins Smf1 and Smf2. The NRAMP family is a novel family of functionally related proteins defined by a conserved hydrophobic core of ten transmembrane domains. The members of this family of membrane proteins are divalent cation transporters. Nramp1 is an integral membrane protein expressed exclusively in cells of the immune system and is recruited to the membrane of a phagosome upon phagocytosis. By controlling divalent cation concentrations, Nramp1 may regulate the intraphagosomal replication of bacteria. Mutations in Nramp1 may genetically predispose an individual to susceptibility to diseases including leprosy and tuberculosis; conversely, this might provide protection from rheumatoid arthritis. Nramp2 is a multiple divalent cation transporter for Fe2+, Mn2+ and Zn2+, amongst others; it is expressed at high levels in the intestine and is the major transferrin-independent iron uptake system in mammals. The yeast proteins Smf1 and Smf2 may also transport divalent cations. 41612 pfam01567: Hantavirus glycoprotein G1. The medium (M) genome segment of hantaviruses (family Bunyaviridae) encodes the two virion glycoproteins, G1 and G2, as a precursor protein in the complementary sense RNA. 41613 pfam01568: Molybdopterin dinucleotide binding domain. This domain is found in various molybdopterin-containing oxidoreductases and in tungsten formylmethanofuran dehydrogenase subunit d (FwdD) and molybdenum formylmethanofuran dehydrogenase subunit (FmdD), where the domain constitutes almost the entire subunit. The formylmethanofuran dehydrogenase catalyses the first step in methane formation from CO2 in methanogenic archaea and has a molybdopterin dinucleotide cofactor. This domain corresponds to the C-terminal domain IV in dimethyl sulfoxide (DMSO) reductase, which interacts with the 2-amino pyrimidone ring of both molybdopterin guanine dinucleotide molecules. 
41614 pfam01569: PAP2 superfamily. This family includes the enzyme type 2 phosphatidic acid phosphatase (PAP2), glucose-6-phosphatase EC:3.1.3.9, phosphatidylglycerophosphatase B EC:3.1.3.27 and bacterial acid phosphatase EC:3.1.3.2. 41615 pfam01570: Flavivirus polyprotein propeptide. The flaviviruses are small enveloped animal viruses containing a single positive strand genomic RNA. The genome encodes one large ORF, a polyprotein, which undergoes proteolytic processing into mature viral peptide chains. This family consists of a propeptide region approximately 90 amino acids in length. 41616 pfam01571: Glycine cleavage T-protein (aminomethyl transferase). This is a family of glycine cleavage T-proteins, part of the glycine cleavage multienzyme complex (GCV) found in bacteria and the mitochondria of eukaryotes. GCV catalyses the catabolism of glycine in eukaryotes. The T-protein is an aminomethyl transferase. 41617 pfam01573: Bromovirus movement protein. 41618 pfam01575: MaoC like domain. The MaoC protein is found to share similarity with a wide variety of enzymes: estradiol 17 beta-dehydrogenase 4, peroxisomal hydratase-dehydrogenase-epimerase, and fatty acid synthase beta subunit. All these enzymes contain other domains. This domain is also present in the NodN nodulation protein N. No specific function has been assigned to this region of any of these proteins. The maoC gene is part of an operon with maoA, which is involved in the synthesis of monoamine oxidase. 41619 pfam01576: Myosin tail. The myosin molecule is a multi-subunit complex made up of two heavy chains and four light chains; it is a fundamental contractile protein found in all eukaryote cell types. This family consists of the coiled-coil myosin heavy chain tail region. The coiled-coil is composed of the tail from two molecules of myosin. These can then assemble into the macromolecular thick filament. The coiled-coil region provides the structural backbone of the thick filament. 
41620 pfam01577: Potyvirus P1 protease. The Potyviridae family comprises positive strand RNA viruses with a genome encoding a polyprotein. Members include zucchini yellow mosaic virus and turnip mosaic virus, which cause considerable losses of crops worldwide. This family consists of a C terminus region from various plant potyvirus P1 proteins (found at the N terminus of the polyprotein). The C terminus of P1 is a serine-type protease responsible for autocatalytic cleavage between P1 and the helper component protease pfam00851. The entire P1 protein may be involved in virus-host interactions. 41621 pfam01578: Cytochrome C assembly protein. This family consists of various proteins involved in cytochrome c assembly from mitochondria and bacteria: CycK from Rhizobium, CcmC from E. coli and Paracoccus denitrificans, and orf240 from wheat mitochondria. The members of this family are probably integral membrane proteins with six predicted transmembrane helices. It has been proposed that members of this family comprise a membrane component of an ABC (ATP binding cassette) transporter complex. It is also proposed that this transporter is necessary for transport of some component needed for cytochrome c assembly. One member, CycK, contains a putative heme-binding motif; orf240 also contains a putative heme-binding motif and is a proposed ABC transporter with c-type heme as its proposed substrate. However, it seems unlikely that all members of this family transport heme or c-type apocytochromes, because CcmC in the putative CcmABC transporter transports neither. 41622 pfam01579: Domain of unknown function DUF19. This domain has no known function. It is found in one or two copies in several Caenorhabditis elegans proteins. It is roughly 130 amino acids long. The domain contains 12 conserved cysteines, which suggests that the domain is an extracellular domain and that these cysteines form six intradomain disulphide bridges. 41623 pfam01580: FtsK/SpoIIIE family. 
FtsK has extensive sequence similarity to a wide variety of proteins from prokaryotes and plasmids, termed the FtsK/SpoIIIE family. This domain contains a putative ATP binding P-loop motif. It is found in the FtsK cell division protein from E. coli and the stage III sporulation protein E SpoIIIE, which has roles in regulation of prespore specific gene expression in B. subtilis. A mutation in FtsK causes a temperature sensitive block in cell division, and FtsK is involved in peptidoglycan synthesis or modification. The SpoIIIE protein is implicated in intercellular chromosomal DNA transfer. 41624 pfam01581: FMRFamide related peptide family. The neuroactive peptide Phe-Met-Arg-Phe-NH2 (FMRF-amide) has a variety of effects on both mammalian and invertebrate tissues. 41625 pfam01582: TIR domain. The TIR domain is an intracellular signaling domain found in MyD88, interleukin 1 receptor and the Toll receptor. Called TIR (by SMART?) for Toll-Interleukin-Resistance. 41626 pfam01583: Adenylylsulphate kinase. Enzyme that catalyses the phosphorylation of adenylylsulphate to 3'-phosphoadenylylsulfate. This domain contains an ATP binding P-loop motif. 41627 pfam01584: CheW-like domain. CheW proteins are part of the chemotaxis signaling mechanism in bacteria. CheW interacts with the methyl accepting chemotaxis proteins (MCPs) and relays signals to CheY, which affects flagellar rotation. This family includes CheW and other related proteins that are involved in chemotaxis. The CheW-like regulatory domain in CheA binds to CheW, suggesting that these domains can interact with each other. 41628 pfam01585: G-patch domain. This domain is found in a number of RNA binding proteins, and is also found in proteins that contain RNA binding domains. This suggests that this domain may have an RNA binding function. This domain has seven highly conserved glycines. 41629 pfam01586: Myogenic Basic domain. 
This basic domain is found in the MyoD family of muscle specific proteins that control muscle development. The bHLH region of the MyoD family includes the basic domain and the Helix-loop-helix (HLH) motif. The bHLH region mediates specific DNA binding, with 12 residues of the basic domain involved in DNA binding. The basic domain forms an extended alpha helix in the structure. 41630 pfam01588: Putative tRNA binding domain. This domain is found in prokaryotic methionyl-tRNA synthetases, prokaryotic phenylalanyl tRNA synthetases, the yeast GU4 nucleic-binding protein (G4p1 or p42, ARC1), human tyrosyl-tRNA synthetase, and endothelial-monocyte activating polypeptide II. G4p1 binds specifically to tRNA to form a complex with methionyl-tRNA synthetases. In human tyrosyl-tRNA synthetase this domain may direct tRNA to the active site of the enzyme. This domain may perform a common function in tRNA aminoacylation. 41631 pfam01589: Alphavirus E1 glycoprotein. E1 forms a heterodimer with E2 pfam00943. The virus spikes are made up of 80 trimers of these heterodimers (Sindbis virus). 41633 pfam01591: 6-phosphofructo-2-kinase. This enzyme occurs as a bifunctional enzyme with fructose-2,6-bisphosphatase. The bifunctional enzyme catalyses both the synthesis and degradation of fructose-2,6-bisphosphate, a potent regulator of glycolysis. This enzyme contains a P-loop motif. 41634 pfam01592: NifU-like N terminal domain. This domain is found in NifU in combination with pfam01106. This domain is found on isolated proteins in several bacterial species. The nif genes are responsible for nitrogen fixation. However, this domain is found in bacteria that do not fix nitrogen, so it may have a broader significance in the cell than nitrogen fixation. 41635 pfam01593: Flavin containing amine oxidoreductase. This family consists of various amine oxidases, including maize polyamine oxidase (PAO) and various flavin containing monoamine oxidases (MAO). 
The aligned region includes the flavin binding site of these enzymes. The family also contains phytoene dehydrogenases and related enzymes. In vertebrates MAO plays an important role in regulating the intracellular levels of amines via their oxidation; these include various neurotransmitters, neurotoxins and trace amines. In lower eukaryotes such as Aspergillus and in bacteria the main role of amine oxidases is to provide a source of ammonium. PAOs in plants, bacteria and protozoa oxidise spermidine and spermine to an aminobutyral, diaminopropane and hydrogen peroxide and are involved in the catabolism of polyamines. Other members of this family include tryptophan 2-monooxygenase, putrescine oxidase, corticosteroid binding proteins and antibacterial glycoproteins. 41636 pfam01594: Domain of unknown function DUF20. This transmembrane region is found in putative permeases and predicted transmembrane proteins; it has no known function. It is not clear what source suggested that these proteins may be permeases, and this information should be treated with caution. 41637 pfam01595: Domain of unknown function DUF21. This transmembrane region has no known function. Many of the sequences in this family are annotated as hemolysins; however, this is due to a similarity to a protein which does not contain this domain. This domain is found in the N-terminus of the proteins adjacent to two intracellular CBS domains pfam00571. 41638 pfam01596: O-methyltransferase. Members of this family are O-methyltransferases. The family includes catechol O-methyltransferase, caffeoyl-CoA O-methyltransferase and a family of bacterial O-methyltransferases that may be involved in antibiotic production. 41639 pfam01597: Glycine cleavage H-protein. This is a family of glycine cleavage H-proteins, part of the glycine cleavage multienzyme complex (GCV) found in bacteria and the mitochondria of eukaryotes. GCV catalyses the catabolism of glycine in eukaryotes. 
A lipoyl group is attached to a completely conserved lysine residue. The H protein shuttles the methylamine group of glycine from the P protein to the T protein. 41640 pfam01598: Sterol desaturase. This family includes C-5 sterol desaturase and C-4 sterol methyl oxidase. Members of this family are involved in cholesterol biosynthesis and biosynthesis of plant cuticular wax. These enzymes contain many conserved histidine residues. Members of this family are integral membrane proteins. 41641 pfam01599: Ribosomal protein S27a. This family of ribosomal proteins consists mainly of the 40S ribosomal protein S27a, which is synthesised as a C-terminal extension of ubiquitin (CEP). The S27a domain comprises the C-terminal half of the protein. The synthesis of ribosomal proteins as extensions of ubiquitin promotes their incorporation into nascent ribosomes by a transient metabolic stabilisation and is required for efficient ribosome biogenesis. The ribosomal extension protein S27a contains a basic region that is proposed to form a zinc finger; its fusion gene is proposed as a mechanism to maintain a fixed ratio between ubiquitin, necessary for degrading proteins, and ribosomes, a source of proteins. 41642 pfam01600: Coronavirus S1 glycoprotein. The coronavirus spike glycoprotein forms the characteristic 'corona' after which the group is named. The spike glycoprotein is translated as a large polypeptide that is subsequently cleaved to S1 and S2 pfam01601. 41643 pfam01601: Coronavirus S2 glycoprotein. The coronavirus spike glycoprotein forms the characteristic 'corona' after which the group is named. The spike glycoprotein is translated as a large polypeptide that is subsequently cleaved to S1 pfam01600 and S2. 41644 pfam01602: Adaptin N terminal region. This family consists of the N terminal region of various alpha, beta and gamma subunits of the AP-1, AP-2 and AP-3 adaptor protein complexes. 
The adaptor protein (AP) complexes are involved in the formation of clathrin-coated pits and vesicles. The N-terminal region of the various adaptor proteins (APs) is constant by comparison to the C-terminal region, which is variable within members of the AP-2 family, and it has been proposed that this constant region interacts with another uniform component of the coated vesicles. 41645 pfam01603: Protein phosphatase 2A regulatory B subunit (B56 family). Protein phosphatase 2A (PP2A) is a major intracellular protein phosphatase that regulates multiple aspects of cell growth and metabolism. The ability of this widely distributed heterotrimeric enzyme to act on a diverse array of substrates is largely controlled by the nature of its regulatory B subunit. There are multiple families of B subunits (see also pfam01240); this family is called the B56 family. 41646 pfam01604: 7TM chemoreceptor. This large family of proteins is related to pfam00001. They are 7 transmembrane receptors. This family does not include all known members, as there are problems with overlapping specificity with pfam00001. This family is greatly expanded in the nematode worm C. elegans. 41647 pfam01606: Arterivirus envelope protein. This family consists of viral envelope proteins from the arterivirus genus; this includes porcine reproductive and respiratory virus (PRRSV) envelope protein GP3 and lactate dehydrogenase elevating virus (LDV) structural glycoprotein. Arteriviruses consist of positive ssRNA and do not have a DNA stage. 41648 pfam01607: Chitin binding Peritrophin-A domain. This domain is called the Peritrophin-A domain and is found in chitin binding proteins, particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. 
Chitin binding has been demonstrated for a protein containing only two of these domains. 41649 pfam01608: I/LWEQ domain. I/LWEQ domains bind to actin. It has been shown that the I/LWEQ domains from mouse talin and yeast Sla2p interact with F-actin. I/LWEQ domains can be placed into four major groups based on sequence similarity: (1) metazoan talin; (2) Dictyostelium TalA/TalB and SLA110; (3) metazoan Hip1p; and (4) yeast Sla2p. The domain has four conserved blocks; the name of the domain is derived from the initial conserved amino acid of each of the four blocks. 41651 pfam01610: Transposase. Transposase proteins are necessary for efficient DNA transposition. Contains transposases for IS204, IS1001, IS1096 and IS1165. 41652 pfam01611: Filovirus glycoprotein. This family includes an extracellular region from the envelope glycoprotein of Ebola and Marburg viruses. This region is also produced as a separate transcript that gives rise to a non-structural, secreted glycoprotein, which is produced in large amounts and has an unknown function. Processing of this protein may be involved in viral pathogenicity. 41653 pfam01612: 3'-5' exonuclease. This domain is responsible for the 3'-5' exonuclease proofreading activity of E. coli DNA polymerase I (polI) and other enzymes; it catalyses the hydrolysis of unpaired or mismatched nucleotides. This domain consists of the amino-terminal half of the Klenow fragment in E. coli polI; it is also found in the Werner syndrome helicase (WRN), focus forming activity 1 protein (FFA-1) and ribonuclease D (RNase D). Werner syndrome is a human genetic disorder causing premature aging; the WRN protein has helicase activity in the 3'-5' direction. The FFA-1 protein is required for formation of replication foci and also has helicase activity; it is a homologue of the WRN protein. RNase D is a 3'-5' exonuclease involved in tRNA processing. 
Also found in this family is the autoantigen PM/Scl, thought to be involved in polymyositis-scleroderma overlap syndrome. 41654 pfam01613: Flavin reductase like domain. This is a flavin reductase family consisting of enzymes known to be flavin reductases as well as various oxidoreductase and monooxygenase components. VlmR is a flavin reductase that functions in a two-component enzyme system to provide isobutylamine N-hydroxylase with reduced flavin and may be involved in the synthesis of valanimycin. SnaC is a flavin reductase that provides reduced flavin for the oxidation of pristinamycin IIB to pristinamycin IIA as catalysed by the SnaA, SnaB heterodimer. This flavin reductase region characterised by enzymes of the family is present in the C-terminus of potential FMN proteins from Synechocystis sp., suggesting it is a flavin reductase domain. 41655 pfam01614: Bacterial transcriptional regulator. This family of bacterial transcriptional regulators includes the glycerol operon regulatory protein and acetate operon repressor, both of which are members of the iclR family. These proteins have a Helix-Turn-Helix motif at the N-terminus. However, this family covers the C-terminal region that may bind to the regulatory substrate. 41656 pfam01616: Orbivirus NS3. The function of this Orbivirus non structural protein is uncertain. However, it may play a role in release of the virus from infected cells. 41657 pfam01617: Surface antigen. This family includes a number of bacterial surface antigens expressed on the surface of pathogens. 41658 pfam01618: MotA/TolQ/ExbB proton channel family. This family groups together integral membrane proteins that appear to be involved in translocation of proteins across a membrane. These proteins are probably proton channels. MotA is an essential component of the flagellar motor that uses a proton gradient to generate rotational motion in the flagellum. ExbB is part of the TonB-dependent transduction complex. 
The TonB complex uses the proton gradient across the inner bacterial membrane to transport large molecules across the outer bacterial membrane. 41659 pfam01619: Proline dehydrogenase. 41660 pfam01620: Ribonuclease (pollen allergen). This family contains grass pollen proteins of group V. One member has been shown to possess ribonuclease activity. 41661 pfam01621: Cell fusion glycoprotein K. This protein is probably an integral membrane bound glycoprotein that is involved in viral fusion with the host cell. 41662 pfam01623: Carlavirus putative nucleic acid binding protein. This family of carlavirus nucleic acid binding proteins includes a motif for a potential C-4 type zinc finger; this has four highly conserved cysteine residues and is a conserved feature of the carlaviruses' 3' terminal ORF. These proteins may function as viral transcriptional regulators. The carlavirus family includes garlic latent virus and potato virus S and M; these viruses are positive strand, ssRNA with no DNA stage. 41663 pfam01624: MutS domain I. This domain is found in proteins of the MutS family (DNA mismatch repair proteins) and is found associated with pfam00488, pfam05188, pfam05192 and pfam05190. The MutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair; other members of the family include the eukaryotic MSH 1, 2, 3, 4, 5 and 6 proteins. These have various roles in DNA repair and recombination. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein. The aligned region corresponds with globular domain I, which is involved in DNA binding, in Thermus aquaticus MutS. 41664 pfam01625: Peptide methionine sulfoxide reductase. This enzyme repairs damaged proteins. Methionine sulfoxide in proteins is reduced to methionine. 41665 pfam01627: Hpt domain. 
The histidine-containing phosphotransfer (HPt) domain is a novel protein module with an active histidine residue that mediates phosphotransfer reactions in the two-component signaling systems. A multistep phosphorelay involving the HPt domain has been suggested for these signaling pathways. The crystal structure of the HPt domain of the anaerobic sensor kinase ArcB has been determined. The domain consists of six alpha helices containing a four-helix bundle-folding. The pattern of sequence similarity of the HPt domains of ArcB and components in other signaling systems can be interpreted in light of the three-dimensional structure and supports the conclusion that the HPt domains have a common structural motif both in prokaryotes and eukaryotes. In S. cerevisiae ypd1p this domain has been shown to contain a binding surface for Ssk1p (response regulator receiver domain containing protein pfam00072). 41666 pfam01628: HrcA protein C terminal domain. HrcA is found to negatively regulate the transcription of heat shock genes. HrcA contains an amino terminal helix-turn-helix domain; however, this family corresponds to the carboxy terminal domain. 41667 pfam01629: Domain of unknown function DUF22. This domain is found in 1 to 3 copies in archaebacterial proteins. The function of the domain is unknown. This family appears to be expanded in Archaeoglobus fulgidus. 41668 pfam01630: Hyaluronidase. 41669 pfam01632: Ribosomal protein L35. 41670 pfam01633: Choline/ethanolamine kinase. Choline kinase catalyses the committed step in the synthesis of phosphatidylcholine by the CDP-choline pathway. 41671 pfam01634: ATP phosphoribosyltransferase. 41672 pfam01635: Coronavirus M matrix/glycoprotein. This family consists of various coronavirus matrix proteins which are transmembrane glycoproteins. The coronavirus M protein (or E1 glycoprotein) is implicated in virus assembly. 
The E1 viral membrane protein is required for formation of the viral envelope and is transported via the Golgi complex. 41673 pfam01636: Phosphotransferase enzyme family. This family consists of bacterial antibiotic resistance proteins, which confer resistance to various aminoglycosides; they include aminoglycoside 3'-phosphotransferase or kanamycin kinase / neomycin-kanamycin phosphotransferase and streptomycin 3''-kinase or streptomycin 3''-phosphotransferase. The aminoglycoside phosphotransferases inactivate aminoglycoside antibiotics via phosphorylation. This family also includes homoserine kinase. This family is related to fructosamine kinase pfam03881. 41674 pfam01637: Archaeal ATPase. This family contains a conserved P-loop motif that is involved in binding ATP. This family is almost exclusively found in archaebacteria and particularly in Methanococcus jannaschii, which encodes sixteen members of this family. 41675 pfam01638: Transcriptional regulator. Members of this family are predicted to be transcriptional regulators that are related to the pfam01047 family. 41676 pfam01639: Viral family 110. This family of viral proteins is known as the 110 family. The function of members of this family is unknown. The family contains a central cysteine rich region with eight conserved cysteines. Some members of the family contain two copies of the cysteine rich region. 41677 pfam01640: Peptidase C10 family. 41678 pfam01641: SelR domain. Methionine sulfoxide reduction is an important process by which cells regulate biological processes and cope with oxidative stress. MsrA, a protein involved in the reduction of methionine sulfoxides in proteins, has been known for four decades and has been extensively characterised with respect to structure and function. However, recent studies revealed that MsrA is only specific for methionine-S-sulfoxides. 
Because oxidised methionines occur in a mixture of R and S isomers in vivo, it was unclear how stereo-specific MsrA could be responsible for the reduction of all protein methionine sulfoxides. It appears that a second methionine sulfoxide reductase, SelR, evolved that is specific for methionine-R-sulfoxides, an activity that is different from but complementary to that of MsrA. Thus, these proteins, working together, could reduce both stereoisomers of methionine sulfoxide. This domain is found both in SelR proteins and fused with the peptide methionine sulfoxide reductase enzymatic domain pfam01625. The domain has two conserved cysteines and histidines. The domain binds both selenium and zinc. The final cysteine is found to be replaced by the rare amino acid selenocysteine in some members of the family. This family has methionine-R-sulfoxide reductase activity. 41679 pfam01642: Methylmalonyl-CoA mutase. The enzyme methylmalonyl-CoA mutase is a member of a class of enzymes that uses coenzyme B12 (adenosylcobalamin) as a cofactor. The enzyme induces the formation of an adenosyl radical from the cofactor. This radical then initiates a free-radical rearrangement of its substrate, succinyl-CoA, to methylmalonyl-CoA. 41680 pfam01643: Acyl-ACP thioesterase. This family consists of various acyl-acyl carrier protein (ACP) thioesterases (TE); these terminate fatty acyl group extension via hydrolysing an acyl group on a fatty acid. 41681 pfam01644: Chitin synthase. This region is found commonly in chitin synthases classes I, II and III. Chitin, a linear homopolymer of GlcNAc residues, is an important component of the cell wall of fungi and is synthesised on the cytoplasmic surface of the cell membrane by membrane bound chitin synthases. 41682 pfam01645: Conserved region in glutamate synthase. This family represents a region of the glutamate synthase protein. 
This region is expressed as a separate subunit in the glutamate synthase alpha subunit from archaebacteria, or part of a large multidomain enzyme in other organisms. The aligned region of these proteins contains a putative FMN binding site and Fe-S cluster. 41683 pfam01646: Herpes virus protein UL24. This family consists of various herpes virus proteins: the gene 20 product, U49 protein, UL24 protein and BXRF1. The UL24 gene (product of the 24th ORF) is not essential for virus replication, but mutants with lesions in UL24 show a reduced ability to replicate in tissue culture and have reduced thymidine kinase activity, as the UL24 gene overlaps with thymidine kinase. 41684 pfam01647: Morbillivirus RNA polymerase alpha subunit. This family consists of morbillivirus RNA polymerase alpha subunit and non structural protein V. The P gene of morbillivirus is cotranscriptionally edited leading to the N-terminal half of the P protein being appended to the C-terminal of the P protein, and a cysteine rich region in the V fusion protein which has been shown to bind zinc. Morbilliviruses are negative strand ssRNA viruses and a part of the Paramyxoviridae family; members include measles virus and phocine distemper virus. 41685 pfam01648: 4'-phosphopantetheinyl transferase superfamily. Members of this family transfer the 4'-phosphopantetheine (4'-PP) moiety from coenzyme A (CoA) to the invariant serine of pfam00550. This post-translational modification renders holo-ACP capable of acyl group activation via thioesterification of the cysteamine thiol of 4'-PP. This superfamily consists of two subtypes: the ACPS type and the Sfp type. The structure of the Sfp type is known, which shows the active site accommodates a magnesium ion. The most highly conserved regions of the alignment are involved in binding the magnesium ion. 41686 pfam01649: Ribosomal protein S20. Bacterial ribosomal protein S20 interacts with 16S rRNA. 41687 pfam01650: Peptidase C13 family. 
This family of peptidases is known as the hemoglobinase family because it contains a globin degrading enzyme from blood parasites. However, relatives are found in plants and other organisms that have other functions. Members of this family are asparaginyl peptidases. 41688 pfam01652: Eukaryotic initiation factor 4E. 41689 pfam01653: NAD-dependent DNA ligase adenylation domain. DNA ligases catalyse the crucial step of joining the breaks in duplex DNA during DNA replication, repair and recombination, utilising either ATP or NAD(+) as a cofactor. This domain is the catalytic adenylation domain. The NAD+ group is covalently attached to this domain at the lysine in the KXDG motif of this domain. This enzyme-adenylate intermediate is an important feature of the proposed catalytic mechanism. 41690 pfam01654: Bacterial Cytochrome Ubiquinol Oxidase. This family consists of the alternative oxidases found in many bacteria which oxidise ubiquinol and reduce oxygen as part of the electron transport chain. This family is the subunit I of the oxidase. E. coli has two copies of the oxidase, bo and bd', both of which are represented here. In some nitrogen fixing bacteria, e.g. Klebsiella pneumoniae, this oxidase is responsible for removing oxygen in microaerobic conditions, making the oxidase required for nitrogen fixation. This subunit binds a single b-haem, through ligands at His186 and Met393 (using SW:P11026 numbering). In addition His19 is a ligand for the haem b found in subunit II. 41691 pfam01655: Ribosomal protein L32. This family includes ribosomal protein L32 from eukaryotes and archaebacteria. 41692 pfam01656: Cobyrinic acid a,c-diamide synthase. This family consists of various cobyrinic acid a,c-diamide synthases. These include CbiA and CbiP from S. typhimurium, and CobQ from R. capsulatus. These amidases catalyse amidations to various side chains of hydrogenobyrinic acid or cobyrinic acid a,c-diamide in the biosynthesis of cobalamin (vitamin B12) from uroporphyrinogen III. 
Vitamin B12 is an important cofactor and an essential nutrient for many plants and animals and is primarily produced by bacteria. The family also contains dethiobiotin synthetases. 41693 pfam01657: Domain of unknown function DUF26. This domain has no known function. It is found in serine/threonine kinases, associated with the Eukaryotic protein kinase domain pfam00069. In the 33kDa secretory protein and many other proteins this domain is duplicated. The domain is probably extracellular and contains four conserved cysteines that probably form two disulphide bridges. 41694 pfam01658: Myo-inositol-1-phosphate synthase. This is a family of myo-inositol-1-phosphate synthases. Inositol-1-phosphate synthase catalyses the conversion of glucose-6-phosphate to inositol-1-phosphate, which is then dephosphorylated to inositol. Inositol phosphates play an important role in signal transduction. 41695 pfam01659: Luteovirus putative VPg genome linked protein. This family consists of several putative genome linked proteins. The genomic RNA of luteoviruses is linked to virally encoded genome proteins (VPg). Open reading frame 4 is thought to encode the VPg in Soybean dwarf luteovirus. Luteoviruses have isometric capsids that contain a positive strand ssRNA genome; they have no DNA stage during their replication. 41696 pfam01660: Viral methyltransferase. This RNA methyltransferase domain is found in a wide range of ssRNA viruses, including Hordei-, Tobra-, Tobamo-, Bromo-, Clostero- and Caliciviruses. This methyltransferase is involved in mRNA capping. Capping of mRNA enhances its stability. This usually occurs in the nucleus. Therefore, many viruses that replicate in the cytoplasm encode their own capping enzymes. This is a specific guanine-7-methyltransferase domain involved in viral mRNA cap0 synthesis. Specificity for the guanine 7 position has been shown by NMR, and an in vivo role in cap synthesis has also been shown. 
Based on secondary structure prediction, the basic fold is believed to be similar to the common AdoMet-dependent methyltransferase fold. A curious feature of this methyltransferase domain is that it, together with flanking sequences, seems to have guanylyltransferase activity coupled to the methyltransferase activity. The domain is found throughout the so-called Alphavirus superfamily (including alphaviruses and several other groups). It forms the defining, unique feature of this superfamily. 41697 pfam01661: Appr-1"-p processing enzyme family. This domain is found in a number of otherwise unrelated proteins. This domain is found at the C-terminus of the macro-H2A histone protein. This domain is found in the non-structural proteins of several types of ssRNA viruses such as NSP3 from alphaviruses. This domain is also found on its own in a family of proteins from bacteria, archaebacteria and eukaryotes, suggesting that it is involved in an important and ubiquitous cellular process. 41698 pfam01663: Type I phosphodiesterase / nucleotide pyrophosphatase. This family consists of phosphodiesterases, including human plasma-cell membrane glycoprotein PC-1 / alkaline phosphodiesterase I / nucleotide pyrophosphatase (nppase). These enzymes catalyse the cleavage of phosphodiester and phosphosulfate bonds in NAD, deoxynucleotides and nucleotide sugars. Also in this family is ATX, an autotaxin, a tumour cell motility-stimulating protein which exhibits type I phosphodiesterase activity. The alignment encompasses the active site. Also present within this family is the 60-kDa Ca2+-ATPase from F. odoratum. 41699 pfam01664: Reovirus viral attachment protein sigma 1. This family consists of the reovirus sigma 1 hemagglutinin, cell attachment protein. This glycoprotein is a minor capsid protein and also determines the serotype-specific humoral immune response. Sigma 1 consists of a fibrous tail and a globular head. 
The head has important roles in the cell attachment function of sigma 1 and is a determinant of the type-specific humoral immune response. Reovirus belongs to the orthoreovirus group of the Reoviridae, which have a dsRNA genome. Also present in this family is bacteriophage SF6 Lysozyme. 41700 pfam01665: Rotavirus non-structural protein NSP3. This family consists of rotaviral non-structural RNA binding protein 34 (NS34 or NSP3). The NSP3 protein has been shown to bind viral RNA. The NSP3 protein consists of 3 conserved functional domains: a basic region which binds ssRNA, a region containing heptapeptide repeats mediating oligomerisation and a leucine zipper motif. NSP3 may play a central role in replication and assembly of genomic RNA structures. Rotaviruses have a dsRNA genome and are a major cause of acute gastroenteritis in the young of many species. The rotavirus non-structural protein NSP3 is a sequence-specific RNA binding protein that binds the nonpolyadenylated 3' end of the rotavirus mRNAs. NSP3 also interacts with the translation initiation factor eIF4GI and competes with the poly(A) binding protein. 41701 pfam01666: DX module. This domain has no known function. It is found in several C. elegans proteins. The domain contains 6 conserved cysteines that probably form three disulphide bridges. 41702 pfam01667: Ribosomal protein S27. 41703 pfam01668: SmpB protein. 41704 pfam01669: Myelin basic protein. 41705 pfam01670: Glycosyl hydrolase family 12. 41706 pfam01671: African swine fever virus multigene family 360 protein. The multigene family 360 proteins are found within the African swine fever virus (ASF) genome, which consists of dsDNA and has similar structural features to the poxviruses. The biological function of this family is not known, although one member is a major structural protein. 41707 pfam01672: Putative plasmid partition protein. 
This family consists of conserved hypothetical proteins from Borrelia burgdorferi, the Lyme disease spirochaete, some of which are putative plasmid partition proteins. 41708 pfam01673: Herpesvirus putative major envelope glycoprotein. This family consists of probable major envelope glycoproteins from members of the herpesviridae including herpes simplex virus, human cytomegalovirus and varicella-zoster virus. Members of the herpesviridae have a dsDNA genome and do not have an RNA stage during their replication. 41709 pfam01674: Lipase (class 2). This family consists of hypothetical C. elegans proteins and lipases. Lipases or triacylglycerol acylhydrolases hydrolyse ester bonds in triacylglycerol giving diacylglycerol, monoacylglycerol, glycerol and free fatty acids. One member is an extracellular lipase from B. subtilis 168. 41710 pfam01676: Metalloenzyme superfamily. This family includes phosphopentomutase and 2,3-bisphosphoglycerate-independent phosphoglycerate mutase. This family is also related to pfam00245. The alignment contains the most conserved residues that are probably involved in metal binding and catalysis. 41711 pfam01677: Herpesvirus UL7 like. This family consists of various functionally undefined proteins from the herpesviridae and UL7 from bovine herpes virus. UL7 is not essential for virus replication in cell culture, and is found localised in the cytoplasm of infected cells accumulated around the nucleus but could not be detected in purified virions. Members of the herpesviridae have a dsDNA genome and do not have an RNA stage during their replication. 41712 pfam01678: Diaminopimelate epimerase. Diaminopimelate epimerase contains two domains of the same alpha/beta fold, both contained in this family. 41713 pfam01679: Uncharacterized protein family UPF0057. 41714 pfam01680: SOR/SNZ family. Members of this family are enzymes involved in a new pathway of pyridoxine/pyridoxal 5-phosphate biosynthesis. This family was formerly known as UPF0019. 
41715 pfam01681: C6 domain. This domain of unknown function is found in a C. elegans protein. It is presumed to be an extracellular domain. The C6 domain contains six conserved cysteine residues in most copies of the domain. However, some copies of the domain are missing cysteine residues 1 and 3, suggesting that these form a disulphide bridge. 41716 pfam01682: DB module. This domain has no known function. It is found in several C. elegans proteins. The domain contains 12 conserved cysteines that probably form six disulphide bridges. This domain is found associated with ig pfam00047 and fn3 pfam00041 domains, as well as in some lipases pfam00657. 41717 pfam01683: EB module. This domain has no known function. It is found in several C. elegans proteins. The domain contains 8 conserved cysteines that probably form four disulphide bridges. This domain is found associated with kunitz domains pfam00014. 41718 pfam01684: ET module. This domain has no known function. It is found in several C. elegans proteins. The domain contains 8-10 conserved cysteines that probably form 4-5 disulphide bridges. By inspection of the conservation of cysteines it looks like cysteines 1, 2, 3, 4, 9 and 10 are always present and that sometimes the pair 5 and 8 or the pair 6 and 7 are missing. This suggests that cysteines 5/8 and 6/7 make disulphide bridges. 41719 pfam01686: Adenovirus penton base protein. This family consists of various adenovirus penton base proteins, from both the Mastadenoviradae having mammalian hosts and the Aviadenoviradae having avian hosts. The penton base is a major structural protein forming part of the penton, which consists of a base and a fibre; the pentons hold a morphologically prominent position at the vertex capsomer in the adenovirus particle. In mammalian adenovirus there is only one tail on each base whereas in avian adenovirus there are two. 41720 pfam01687: Riboflavin kinase / FAD synthetase. 
This family consists of part of the bifunctional enzyme riboflavin kinase / FAD synthetase. These enzymes have both ATP:riboflavin 5'-phosphotransferase and ATP:FMN-adenylyltransferase activities. They catalyse the 5'-phosphorylation of riboflavin to FMN and the adenylylation of FMN to FAD. CAUTION: It is not clear if this region of the enzymes catalyses either or both of the enzymatic reactions. 41721 pfam01688: Alphaherpesvirus glycoprotein I. This family consists of glycoprotein I from various members of the alphaherpesvirinae; these include herpesvirus, varicella-zoster virus and pseudorabies virus. Glycoprotein I (gI) is important during natural infection; mutants lacking gI produce smaller lesions at the site of infection and show reduced neuronal spread. gI forms a heterodimeric complex with gE; this complex displays Fc receptor activity (binds to the Fc region of immunoglobulin). Glycoproteins are also important in the production of virus-neutralising antibodies and cell mediated immunity. The alphaherpesvirinae have a dsDNA genome and have no RNA stage during viral replication. 41722 pfam01689: Hydratase/decarboxylase. This family consists of various hydratases and 4-oxalocrotonate decarboxylases which are involved in the bacterial meta-cleavage pathways for degradation of aromatic compounds. 2-hydroxypentadienoic acid hydratase encoded by mhpD in E. coli is involved in the phenylpropionic acid pathway of E. coli and catalyses the conversion of 2-hydroxy pentadienoate to 4-hydroxy-2-keto-pentanoate and uses a Mn2+ co-factor. OHED hydratase encoded by hpcG in E. coli is involved in the homoprotocatechuic acid (HPC) catabolism. XylI in P. putida is a 4-oxalocrotonate decarboxylase. 41723 pfam01690: Potato leaf roll virus readthrough protein. This family consists mainly of the potato leaf roll virus readthrough protein. 
This is generated via a readthrough of open reading frame 3, a coat protein, allowing translation of open reading frame 5 to give an extended coat protein with a large C-terminal addition or readthrough domain. The readthrough protein is thought to play a role in the circulative aphid transmission of potato leaf roll virus. Also in the family is open reading frame 6 from beet western yellows virus and potato leaf roll virus, both luteoviruses, and an unknown protein from cucurbit aphid-borne yellows virus, a closterovirus. 41724 pfam01691: Adenovirus E1B 19K protein / small t-antigen. This family consists of adenovirus E1B 19K protein or small t-antigen. The E1B 19K protein inhibits E1A induced apoptosis and hence prolongs the viability of the host cell. It can also inhibit apoptosis mediated by tumour necrosis factor alpha and Fas antigen. E1B 19K blocks apoptosis by interacting with and inhibiting the p53-inducible and death-promoting Bax protein. The E1B region of adenovirus encodes two proteins: E1B 19K, the small t-antigen as found in this family, and E1B 55K, the large t-antigen, which is not found in this family; both of these proteins inhibit E1A induced apoptosis. 41725 pfam01692: Paramyxovirus non-structural protein c. This family consists of the C proteins (C', C, Y1, Y2) found in paramyxoviruses: human parainfluenza and sendai virus. The C proteins affect viral RNA synthesis, having both a positive and negative effect during the course of infection. Paramyxoviruses have a negative strand ssRNA genome of 15.3kb from which six mRNAs are transcribed; five of these are monocistronic. The P/C mRNA is polycistronic and has two overlapping open reading frames, P and C; C encodes the nested C proteins C', C, Y1 and Y2. 41726 pfam01693: Caulimovirus viroplasmin. This family consists of various caulimovirus viroplasmin proteins. The viroplasmin protein is encoded by gene VI and is the main component of viral inclusion bodies or viroplasms. 
Inclusions are the site of viral assembly, DNA synthesis and accumulation. Two domains exist within gene VI corresponding approximately to the 5' third and middle third of gene VI; these influence systemic infection in a light-dependent manner. 41727 pfam01694: Rhomboid family. This family contains integral membrane proteins that are related to Drosophila rhomboid protein. Members of this family are found in bacteria and eukaryotes. Rhomboid promotes the cleavage of the membrane-anchored TGF-alpha-like growth factor Spitz, allowing it to activate the Drosophila EGF receptor. Analysis suggests that Rhomboid-1 is a novel intramembrane serine protease that directly cleaves Spitz. These proteins contain three strongly conserved histidines in the putative transmembrane regions that may be involved in the peptidase function. 41728 pfam01695: IstB-like ATP binding protein. This protein contains an ATP/GTP binding P-loop motif. It is found associated with IS21 family insertion sequences. The function of this protein is unknown, but it may perform a transposase function. 41729 pfam01696: Adenovirus E1B 55K protein / large t-antigen. This family consists of adenovirus E1B 55K protein or large t-antigen. E1B 55K binds p53, the tumour suppressor protein, converting it from a transcriptional activator which responds to damaged DNA into an unregulated repressor of genes with a p53 binding site. This protects the virus against p53 induced host antiviral responses and prevents apoptosis as induced by the adenovirus E1A protein. The E1B region of adenovirus encodes two proteins: E1B 55K, the large t-antigen as found in this family, and E1B 19K pfam01691, the small t-antigen, which is not found in this family; both of these proteins inhibit E1A induced apoptosis. 41731 pfam01698: Floricaula / Leafy protein. This family consists of various plant development proteins which are homologues of floricaula (FLO) and Leafy (LFY) proteins which are floral meristem identity proteins. 
Mutations in the sequences of these proteins affect flower and leaf development. 41732 pfam01699: Sodium/calcium exchanger protein. This is a family of sodium/calcium exchanger integral membrane proteins. This family covers the integral membrane regions of the proteins. Sodium/calcium exchangers regulate intracellular Ca2+ concentrations in many cells: cardiac myocytes, epithelial cells, neurons, retinal rod photoreceptors and smooth muscle cells. Ca2+ is moved into or out of the cytosol depending on Na+ concentration. In humans and rats there are 3 isoforms: NCX1, NCX2 and NCX3. 41733 pfam01700: Orbivirus VP3 (T2) protein. The orbivirus VP3 protein is part of the virus core and makes a 'subcore' shell made up of 120 copies of the 100K protein. VP3 particles can also bind RNA and are fundamental in the early stages of viral core formation. Also found in the family is structural core protein VP2 from broadhaven virus which is similar to VP3 in bluetongue virus. Orbiviruses are part of the larger reoviridae which have a dsRNA genome of 10-12 linear segments; orbiviruses found in this family include bluetongue virus and epizootic hemorrhagic disease virus. 41734 pfam01701: Photosystem I reaction centre subunit IX / PsaJ. This family consists of the photosystem I reaction centre subunit IX or PsaJ from various organisms including Synechocystis sp. (strain pcc 6803), Pinus thunbergii (green pine) and Zea mays (maize). PsaJ is a small (4.4kDa), chloroplast-encoded, hydrophobic subunit of the photosystem I reaction complex; its function is not yet fully understood. PsaJ can be cross-linked to PsaF and has a single predicted transmembrane domain; it has a proposed role in maintaining PsaF in the correct orientation to allow for fast electron transfer from soluble donor proteins to P700+. 41735 pfam01702: Queuine tRNA-ribosyltransferase. This is a family of queuine tRNA-ribosyltransferases EC:2.4.2.29, also known as tRNA-guanine transglycosylase and guanine insertion enzyme. 
Queuine tRNA-ribosyltransferase modifies tRNAs for asparagine, aspartic acid, histidine and tyrosine with queuine. It catalyses the exchange of guanine-34 at the wobble position with 7-aminomethyl-7-deazaguanine, and the addition of a cyclopentenediol moiety to 7-aminomethyl-7-deazaguanine-34 tRNA, giving a hypermodified base queuine in the wobble position. The aligned region contains a zinc binding motif C-x-C-x2-C-x29-H, and important tRNA and 7-aminomethyl-7-deazaguanine binding residues. 41736 pfam01704: UTP--glucose-1-phosphate uridylyltransferase. This family consists of UTP--glucose-1-phosphate uridylyltransferases, EC:2.7.7.9, also known as UDP-glucose pyrophosphorylase (UDPGP) and Glucose-1-phosphate uridylyltransferase. UTP--glucose-1-phosphate uridylyltransferase catalyses the interconversion of MgUTP + glucose-1-phosphate and UDP-glucose + MgPPi. UDP-glucose is an important intermediate in mammalian carbohydrate interconversion involved in various metabolic roles depending on tissue type. In Dictyostelium (slime mold) mutants in this enzyme abort the development cycle. Also within the family is UDP-N-acetylglucosamine or AGX1 and two hypothetical proteins from Borrelia burgdorferi, the Lyme disease spirochaete. 41737 pfam01705: CX module. This domain has no known function. It is found in several C. elegans proteins. The domain contains 6 conserved cysteines that probably form three disulphide bridges. 41738 pfam01706: FliG C-terminal domain. FliG is a component of the flagellar rotor, present in about 25 copies per flagellum. This domain functions specifically in motor rotation. 41739 pfam01707: Peptidase family C9. 41740 pfam01708: Geminivirus putative movement protein. This family consists of putative movement proteins from Maize streak and wheat dwarf virus. 41741 pfam01709: Domain of unknown function DUF28. 
This domain is found in bacterial and yeast proteins; it comprises the entire length or central region of most of the proteins in the family, all of which are hypothetical with no known function. The average length of this domain is approximately 230 amino acids. 41742 pfam01710: Transposase. Transposase proteins are necessary for efficient DNA transposition. This family includes insertion sequences from Synechocystis PCC 6803, three of which are characterised as homologous to bacterial IS5- and IS4- and to several members of the IS630-Tc1-mariner superfamily. 41743 pfam01712: Deoxynucleoside kinase. This family consists of various deoxynucleoside kinases: cytidine EC:2.7.1.74, guanosine EC:2.7.1.113, adenosine EC:2.7.1.76 and thymidine kinase EC:2.7.1.21 (which also phosphorylates deoxyuridine and deoxycytosine). These enzymes catalyse the production of deoxynucleotide 5'-monophosphate from a deoxynucleoside, using ATP and yielding ADP in the process. 41744 pfam01713: Smr domain. This family includes the Smr (Small MutS Related) proteins, and the C-terminal region of the MutS2 protein. It has been suggested that this domain interacts with the MutS1 protein in the case of Smr proteins and with the N-terminal MutS related region of MutS2. 41745 pfam01714: Caulimovirus movement protein. This family consists of plant virus movement proteins from the caulimovirus family. These proteins are required for transmission of the virus from cell to cell. It has been suggested in cauliflower mosaic virus that these proteins mediate viral movement by modifying plasmodesmata and forming tubules in the channel that can accommodate the virus particles. The aligned region comprises almost the entire length of the caulimovirus sequences. Also in this family is ORF1 from cassava vein mosaic virus, a distinct plant pararetrovirus; here the 300 amino acids occupy only part of a longer polyprotein. 41746 pfam01715: IPP transferase. 
This is a family of IPP transferases EC:2.5.1.8 also known as tRNA delta(2)-isopentenylpyrophosphate transferase. These enzymes modify both cytoplasmic and mitochondrial tRNAs at A(37) to give isopentenyl A(37). 41747 pfam01716: Manganese-stabilising protein / photosystem II polypeptide. This family consists of the 33 kDa photosystem II polypeptide from the oxygen evolving complex (OEC) of plants and cyanobacteria. The protein is also known as the manganese-stabilising protein as it is associated with the manganese complex of the OEC and may provide the ligands for the complex. 41748 pfam01717: Methionine synthase, vitamin-B12 independent. This is a family of vitamin-B12 independent methionine synthases or 5-methyltetrahydropteroyltriglutamate--homocysteine methyltransferases, EC:2.1.1.14 from bacteria and plants. Plants are the only higher eukaryotes that have the required enzymes for methionine synthesis. This enzyme catalyses the last step in the production of methionine by transferring a methyl group from 5-methyltetrahydrofolate to homocysteine. The aligned region makes up the carboxy region of the approximately 750 amino acid protein except in some hypothetical archaeal proteins present in the family, where this region corresponds to the entire length. 41749 pfam01718: Orbivirus non-structural protein NS1, or hydrophobic tubular protein. This family consists of orbivirus non-structural protein NS1, or hydrophobic tubular protein. NS1 has no specific function in virus replication; it is however thought to play a role in transport of mature virus particles from virus inclusion bodies to the cell membrane. Orbiviruses are part of the larger reoviridae which have a dsRNA genome of at least 10 segments encoding at least 10 viral proteins; orbiviruses found in this family include bluetongue virus and African horsesickness virus. 41750 pfam01719: Plasmid replication protein. This family consists of various bacterial plasmid replication (Rep) proteins. 
These proteins are essential for replication of plasmids; the Rep proteins are topoisomerases that nick the positive strand at the plus origin of replication and also at the single-strand conversion sequence. 41751 pfam01721: Class II bacteriocin. The bacteriocins are small peptides that inhibit the growth of various bacteria. Bacteriocins of lactic acid bacteria may inhibit their target cells by permeabilising the cell membrane. 41752 pfam01722: BolA-like protein. This family consists of the morphoprotein BolA from E. coli and its various homologues. In E. coli, overexpression of this protein causes round morphology, and the protein may be involved in switching the cell between elongation and septation systems during cell division. The expression of BolA is growth rate regulated and is induced during the transition into the stationary phase. BolA is also induced by stress during early stages of growth and may have a general role in stress response. It has also been suggested that BolA can induce the transcription of penicillin binding proteins 6 and 5. 41753 pfam01723: Chorion protein. This family consists of the chorion superfamily proteins classes A, B, CA, CB and high-cysteine HCB from silk, gypsy and polyphemus moths. The chorion proteins make up the moth's egg shell, a complex extracellular structure. 41754 pfam01724: Domain of unknown function DUF29. This family consists of various hypothetical proteins from cyanobacteria, none of which are functionally described. The aligned region is approximately 120-140 amino acids long, corresponding to almost the entire length of the proteins in the family. 41755 pfam01725: Ham1 family. This family consists of the HAM1 protein, and hypothetical archaeal, bacterial and C. elegans proteins. HAM1 controls 6-N-hydroxylaminopurine (HAP) sensitivity and mutagenesis in S. cerevisiae. The HAM1 protein protects the cell from HAP, either on the level of deoxynucleoside triphosphate or the DNA level, by an as yet unidentified set of reactions. 
41756 pfam01726: LexA DNA binding domain. This is the DNA binding domain of the LexA SOS regulon repressor which prevents expression of DNA repair proteins. The aligned region contains a variant form of the helix-turn-helix DNA binding motif. This domain is found associated with pfam00717, the auto-proteolytic domain of LexA EC:3.4.21.88. 41757 pfam01727: Domain of unknown function DUF30. This family of domains is found in several putative lipoproteins from mycoplasmas. The domain is also found in isolation in some proteins. The function of this domain is unknown. 41758 pfam01728: FtsJ-like methyltransferase. This family consists of FtsJ from various bacterial and archaeal sources. FtsJ is a methyltransferase, but actually has no effect on cell division. FtsJ's substrate is the 23S rRNA. The 1.5 Å crystal structure of FtsJ in complex with its cofactor S-adenosylmethionine revealed that FtsJ has a methyltransferase fold. This family also includes the N terminus of flaviviral NS5 protein. It has been hypothesised that the N-terminal domain of NS5 is a methyltransferase involved in viral RNA capping. 41759 pfam01729: Quinolinate phosphoribosyl transferase, C-terminal domain. Quinolinate phosphoribosyl transferase (QPRTase) or nicotinate-nucleotide pyrophosphorylase EC:2.4.2.19 is involved in the de novo synthesis of NAD in both prokaryotes and eukaryotes. It catalyses the reaction of quinolinic acid (QA) with 5-phosphoribosyl-1-pyrophosphate (PRPP) in the presence of Mg2+ to give rise to nicotinic acid mononucleotide (NaMN), pyrophosphate and carbon dioxide. The QA substrate is bound between the C-terminal domain of one subunit, and the N-terminal domain of the other. The C-terminal domain has a 7 beta-stranded TIM barrel-like fold. 41760 pfam01730: UreF. This family consists of the Urease accessory protein UreF. The urease enzyme (urea amidohydrolase) hydrolyses urea into ammonia and carbamic acid. 
UreF is proposed to modulate the activation process of urease by eliminating the binding of nickel ions to noncarbamylated protein. 41761 pfam01731: Arylesterase. This family consists of arylesterases (also known as serum paraoxonase) EC:3.1.1.2. These enzymes hydrolyse organophosphorus esters such as paraoxon and are found in the liver and blood. They confer resistance to organophosphate toxicity. Human arylesterase (PON1) is associated with HDL and may protect against LDL oxidation. 41762 pfam01732: Domain of unknown function DUF31. This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas. 41763 pfam01733: Nucleoside transporter. This is a family of nucleoside transporters. In mammalian cells nucleoside transporters transport nucleosides across the plasma membrane and are essential for nucleotide synthesis via the salvage pathways for cells that lack their own de novo synthesis pathways. Also in this family is mouse and human nucleolar protein HNP36, a protein of unknown function, although it has been hypothesised to be a plasma membrane nucleoside transporter. 41764 pfam01734: Patatin-like phospholipase. This family consists of various patatin glycoproteins from plants. The patatin protein accounts for up to 40% of the total soluble protein in potato tubers. Patatin is a storage protein but it also has the enzymatic activity of lipid acyl hydrolase, catalysing the cleavage of fatty acids from membrane lipids. Members of this family have been found also in vertebrates. 41765 pfam01735: Lysophospholipase catalytic domain. This family consists of Lysophospholipase / phospholipase B EC:3.1.1.5 and cytosolic phospholipase A2 EC:3.1.4 which also has a C2 domain pfam00168. Phospholipase B enzymes catalyse the release of fatty acids from lysophospholipids and are capable in vitro of hydrolysing all phospholipids extractable from yeast cells. 
Cytosolic phospholipase A2 associates with natural membranes in response to physiological increases in Ca2+ and selectively hydrolyses arachidonyl phospholipids. The aligned region corresponds to the carboxy-terminal Ca2+-independent catalytic domain of the protein. 41766 pfam01736: Polyomavirus agnoprotein. This family consists of the DNA binding protein or agnoprotein from various polyomaviruses. This protein is highly basic and can bind single stranded and double stranded DNA. Mutations in the agnoprotein produce smaller viral plaques; hence its function is not essential for growth in tissue culture cells, but some step in the normal replication cycle is slowed. There is also evidence suggesting that the agnogene and agnoprotein act as regulators of structural protein synthesis. 41767 pfam01737: YCF9. This family consists of the hypothetical protein product of the YCF9 gene from chloroplasts and cyanobacteria. These proteins have no known function. 41768 pfam01738: Dienelactone hydrolase family. 41769 pfam01739: CheR methyltransferase, SAM binding domain. CheR proteins are part of the chemotaxis signaling mechanism in bacteria. CheR methylates the chemotaxis receptor at specific glutamate residues. CheR is an S-adenosylmethionine-dependent methyltransferase; the C-terminal domain (this one) binds SAM. 41770 pfam01740: STAS domain. The STAS (after Sulphate Transporter and AntiSigma factor antagonist) domain is found in the C terminal region of Sulphate transporters and bacterial antisigma factor antagonists. It has been suggested that this domain may have a general NTP binding function. 41771 pfam01741: Large-conductance mechanosensitive channel, MscL. 41772 pfam01742: Clostridial neurotoxin zinc protease. These toxins are zinc proteases that block neurotransmitter release by proteolytic cleavage of synaptic proteins such as synaptobrevins, syntaxin and SNAP-25. 41773 pfam01743: Poly A polymerase family. 
This family includes nucleic acid independent RNA polymerases, such as Poly(A) polymerase, which adds the poly(A) tail to mRNA EC:2.7.7.19. This family also includes the tRNA nucleotidyltransferase that adds the CCA to the 3' of the tRNA EC:2.7.7.25. 41774 pfam01744: GLTT repeat (6 copies). This short repeat of unknown function is found in multiple copies in several C. elegans proteins. The repeat is five residues long and consists of XGLTT where X can be any amino acid. 41775 pfam01745: Isopentenyl transferase. Isopentenyl transferase / dimethylallyl transferase synthesises isopentenyladenosine 5'-monophosphate, a cytokinin that induces shoot formation on host plants infected with the Ti plasmid. 41776 pfam01746: tRNA (Guanine-1)-methyltransferase. This is a family of tRNA (Guanine-1)-methyltransferases EC:2.1.1.31. In E. coli K12 this enzyme catalyses the conversion of a guanosine residue to N1-methylguanine in position 37, next to the anticodon, in tRNA. 41777 pfam01747: ATP-sulfurylase. This family consists of ATP-sulfurylase or sulfate adenylyltransferase EC:2.7.7.4 some of which are part of a bifunctional polypeptide chain associated with adenosyl phosphosulphate (APS) kinase pfam01583. Both enzymes are required for PAPS (phosphoadenosine-phosphosulfate) synthesis from inorganic sulphate. ATP sulfurylase catalyses the synthesis of adenosine-phosphosulfate APS from ATP and inorganic sulphate. 41778 pfam01748: Domain of unknown function DUF32. This domain is found in hypothetical C. elegans proteins, all of which are of unknown function. The aligned region is approximately 160 amino acids long. 41779 pfam01749: Importin beta binding domain. This family consists of the importin alpha (karyopherin alpha), importin beta (karyopherin beta) binding domain. The domain mediates formation of the importin alpha beta complex; required for classical NLS import of proteins into the nucleus, through the nuclear pore complex and across the nuclear envelope. 
Also in the alignment is the NLS of importin alpha which overlaps with the IBB domain. 41780 pfam01750: Hydrogenase maturation protease. The family consists of hydrogenase maturation proteases. In E. coli HypI the hydrogenase maturation protease is involved in processing of HypE, the large subunit of hydrogenase 3, by cleavage of its C-terminus. 41781 pfam01751: Toprim domain. This is a conserved region from DNA primase. This corresponds to the Toprim domain common to DnaG primases, topoisomerases, OLD family nucleases and RecR proteins. Both DnaG motifs IV and V are present in the alignment; the DxD (V) motif may be involved in Mg2+ binding and mutations to the conserved glutamate (IV) completely abolish DnaG type primase activity. DNA primase EC:2.7.7.6 is a nucleotidyltransferase; it synthesises the oligoribonucleotide primers required for DNA replication on the lagging strand of the replication fork; it can also prime the leading strand and has been implicated in cell division. This family also includes the atypical archaeal A subunit from type II DNA topoisomerases. Type II DNA topoisomerases catalyse the relaxation of DNA supercoiling by causing transient double strand breaks. 41782 pfam01752: Collagenase. This family of enzymes breaks down collagens. 41783 pfam01753: MYND finger. 41784 pfam01754: A20-like zinc finger. A20- (an inhibitor of cell death)-like zinc fingers. The zinc finger mediates self-association in A20. These fingers also mediate IL-1-induced NF-kappa B activation. 41785 pfam01755: Glycosyltransferase family 25 (LPS biosynthesis protein). Members of this family belong to Glycosyltransferase family 25. This is a family of glycosyltransferases involved in lipopolysaccharide (LPS) biosynthesis. These enzymes catalyse the transfer of various sugars onto the growing LPS chain during its biosynthesis. 41786 pfam01756: Acyl-CoA oxidase. This is a family of Acyl-CoA oxidases EC:1.3.3.6. Acyl-CoA oxidase converts acyl-CoA into trans-2-enoyl-CoA. 
41788 pfam01758: Sodium Bile acid symporter family. This family consists of Na+/bile acid co-transporters. These transmembrane proteins function in the liver in the uptake of bile acids from portal blood plasma, a process mediated by the co-transport of Na+. Also in the family is ARC3 from S. cerevisiae; this is a putative transmembrane protein involved in resistance to arsenic compounds. 41789 pfam01759: NTR/C345C module. We have not included the related pfam00965 family. It has been suggested that the common function of these modules is binding to metzincins. A subset of this family is known as the C345C domain because it occurs in complement C3, C4 and C5. 41790 pfam01761: 3-dehydroquinate synthase. The 3-dehydroquinate synthase EC:4.6.1.3 domain is present in isolation in various bacterial 3-dehydroquinate synthases and also present as a domain in the pentafunctional AROM polypeptide. 3-dehydroquinate (DHQ) synthase catalyses the formation of dehydroquinate (DHQ) and orthophosphate from 3-deoxy-D-arabino heptulosonic 7 phosphate. This reaction is part of the shikimate pathway which is involved in the biosynthesis of aromatic amino acids. 41791 pfam01762: Galactosyltransferase. This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltransferase. Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2. 41792 pfam01763: Herpesvirus UL6 like. This family consists of various proteins from the herpesviridae that are similar to herpes simplex virus type I UL6 virion protein. UL6 is essential for cleavage and packaging of the viral genome. 41793 pfam01764: Lipase (class 3). 41794 pfam01765: Ribosome recycling factor. 
The ribosome recycling factor (RRF / ribosome release factor) dissociates the ribosome from the mRNA after termination of translation, and is essential for bacterial growth. Thus ribosomes are "recycled" and ready for another round of protein synthesis. 41795 pfam01766: Birnavirus VP2 protein. VP2 is the major structural protein of birnaviruses. The large RNA segment of birnaviruses codes for a polyprotein (N-VP2-VP4-VP3-C). 41796 pfam01767: Birnavirus VP3 protein. VP3 is a minor structural component of the virus. The large RNA segment of birnaviruses codes for a polyprotein (N-VP2-VP4-VP3-C). 41797 pfam01768: Birnavirus VP4 protein. VP4 is a viral protease. The large RNA segment of birnaviruses codes for a polyprotein (N-VP2-VP4-VP3-C). 41798 pfam01769: Divalent cation transporter. This region is the integral membrane part of the eubacterial MgtE family of magnesium transporters. Related regions are found also in archaebacterial and eukaryotic proteins. All the archaebacterial and eukaryotic examples have two copies of the region. This suggests that the eubacterial examples may act as dimers. Members of this family probably transport Mg2+ or other divalent cations into the cell. The alignment contains two highly conserved aspartates that may be involved in cation binding (Bateman A unpubl.). 41799 pfam01770: Reduced folate carrier. The reduced folate carrier (a transmembrane glycoprotein) transports reduced folate into mammalian cells via the carrier mediated mechanism (as opposed to the receptor mediated mechanism); it also transports cytotoxic folate analogues used in chemotherapy, such as methotrexate (MTX). Mammalian cells have an absolute requirement for exogenous folates, which are needed for growth and biosynthesis of macromolecules. 41800 pfam01771: Herpesvirus alkaline exonuclease. This family includes various alkaline exonucleases from members of the herpesviridae. 
Alkaline exonuclease appears to have an important role in the replication of herpes simplex virus. 41801 pfam01773: Na+ dependent nucleoside transporter. This family consists of nucleoside transport proteins. One member is a purine-specific Na+-nucleoside cotransporter localised to the bile canalicular membrane. One member is a Na+-dependent nucleoside transporter selective for pyrimidine nucleosides and adenosine; it also transports the anti-viral nucleoside analogues AZT and ddC. 41802 pfam01774: UreD urease accessory protein. UreD is a urease accessory protein. Urease pfam00449 hydrolyses urea into ammonia and carbamic acid. UreD is involved in activation of the urease enzyme via the UreD-UreF-UreG-urease complex and is required for urease nickel metallocenter assembly. See also UreF pfam01730, UreG pfam01495. 41803 pfam01775: Ribosomal L18ae protein family. 41804 pfam01776: Ribosomal L22e protein family. 41805 pfam01777: Ribosomal L27e protein family. The N-terminal region of the eukaryotic ribosomal L27 has the KOW motif. The C-terminal region is represented by this family. 41806 pfam01778: Ribosomal L28e protein family. 41807 pfam01779: Ribosomal L29e protein family. 41808 pfam01780: Ribosomal L37ae protein family. This ribosomal protein is found in archaebacteria and eukaryotes. It contains four conserved cysteine residues that may bind to zinc. 41809 pfam01781: Ribosomal L38e protein family. 41810 pfam01782: RimM N-terminal domain. The RimM protein is essential for efficient processing of 16S rRNA. The RimM protein was shown to have affinity for free ribosomal 30S subunits but not for 30S subunits in the 70S ribosomes. This N-terminal domain is found associated with a PRC-barrel domain. 41811 pfam01783: Ribosomal L32p protein family. 41812 pfam01784: NIF3 (NGG1p interacting factor 3). This family contains several NIF3 (NGG1p interacting factor 3) protein homologues. 
NIF3 interacts with the yeast transcriptional coactivator NGG1p which is part of the ADA complex; the exact function of this interaction is unknown. 41813 pfam01785: Closterovirus coat protein. This family consists of coat proteins from closteroviruses, members of the closteroviridae. The viral coat protein encapsulates and protects the viral genome. Both the large cp1 and smaller cp2 coat protein originate from the same primary transcript. Members of the closteroviridae include Sugar beet yellow virus and Grapevine leafroll-associated virus; closteroviruses have a positive strand ssRNA genome with no DNA stage during replication. 41814 pfam01786: Alternative oxidase. The alternative oxidase is used as a second terminal oxidase in the mitochondria; electrons are transferred directly from reduced ubiquinol to oxygen forming water. This is not coupled to ATP synthesis and is not inhibited by cyanide; this pathway is a single step process. In rice the transcript levels of the alternative oxidase are increased by low temperature. 41815 pfam01787: Ilarvirus coat protein. This family consists of various coat proteins from the ilarviruses, part of the Bromoviridae; members include apple mosaic virus and prune dwarf virus. The ilarvirus coat protein is required to initiate replication of the viral genome in host plants. Members of the Bromoviridae have a positive strand ssRNA genome with no DNA stage in their replication. 41816 pfam01788: PsbJ. This family consists of the photosystem II reaction centre protein PsbJ from plants and Cyanobacteria. In Synechocystis sp. PCC 6803 PsbJ regulates the number of photosystem II centres in thylakoid membranes; it is a predicted 4 kDa protein with one membrane spanning domain. 41817 pfam01789: PsbP. This family consists of the 23 kDa subunit of oxygen evolving system of photosystem II or PsbP from various plants (where it is encoded by the nuclear genome) and Cyanobacteria. 
The 23 kDa PsbP protein is required for PSII to be fully operational in vivo; it increases the affinity of the water oxidation site for Cl- and provides the conditions required for high affinity binding of Ca2+. 41818 pfam01790: Prolipoprotein diacylglyceryl transferase. 41819 pfam01791: Deoxyribose-phosphate aldolase. This family includes the enzyme deoxyribose-phosphate aldolase EC:4.1.2.4, which is involved in nucleotide metabolism. The family also includes a group of related bacterial proteins of unknown function. 41820 pfam01793: Glycolipid 2-alpha-mannosyltransferase. This is a family of alpha-1,2 mannosyl-transferases involved in N-linked and O-linked glycosylation of proteins. Some of the enzymes in this family have been shown to be involved in O- and N-linked glycan modifications in the Golgi. 41821 pfam01794: Ferric reductase like transmembrane component. This family includes a common region in the transmembrane proteins mammalian cytochrome B-245 heavy chain (gp91-phox), ferric reductase transmembrane component in yeast and respiratory burst oxidase from mouse-ear cress. This may be a family of flavocytochromes capable of moving electrons across the plasma membrane. The Frp1 protein from S. pombe is a ferric reductase component and is required for cell surface ferric reductase activity; mutants in frp1 are deficient in ferric iron uptake. Cytochrome B-245 heavy chain is a FAD-dependent dehydrogenase; it also has electron transferase activity, which reduces molecular oxygen to superoxide anion, a precursor in the production of microbicidal oxidants. Mutations in the sequence of cytochrome B-245 heavy chain (gp91-phox) lead to the X-linked chronic granulomatous disease. In this disease the bactericidal ability of phagocytic cells is reduced, and the disease is characterised by the absence of a functional plasma membrane associated NADPH oxidase. 
The chronic granulomatous disease gene codes for the beta chain of cytochrome B-245 and cytochrome B-245 is missing from patients with the disease. The aligned region includes a potential FAD binding domain. 41822 pfam01795: MraW methylase family. Members of this family are probably SAM dependent methyltransferases. This family appears to be related to pfam01596. 41823 pfam01796: Domain of unknown function DUF35. This domain has no known function and is found in conserved hypothetical archaeal and bacterial proteins. The domain is approximately 120 amino acids long. The domain is duplicated in one member. 41824 pfam01797: Transposase IS200 like. Transposases are needed for efficient transposition of the insertion sequence or transposon DNA. This family includes transposases for IS200 from E. coli. 41825 pfam01798: Putative snoRNA binding domain. This family consists of various Pre RNA processing ribonucleoproteins. The function of the aligned region is unknown; however, it may be a common RNA or snoRNA or Nop1p binding domain. Nop5p (Nop58p) from yeast is the protein component of a ribonucleoprotein required for pre-18s rRNA processing and is suggested to function with Nop1p in a snoRNA complex. Nop56p and Nop5p interact with Nop1p and are required for ribosome biogenesis. Prp31p is required for pre-mRNA splicing in S. cerevisiae. 41826 pfam01799: [2Fe-2S] binding domain. 41827 pfam01801: Cytomegalovirus glycoprotein L. Glycoprotein L from cytomegalovirus serves as a chaperone for the correct folding and surface expression of glycoprotein H (gH). Glycoprotein L is a member of the heterotrimeric gCIII complex of glycoproteins, which also includes gH and gO and has an essential role in viral fusion. 41828 pfam01802: Herpesvirus VP23 like capsid protein. This family consists of various capsid proteins from members of the herpesviridae. 
The capsid protein VP23 in herpes simplex virus forms a triplex together with VP19C; these triplexes fit between and link together adjacent capsomers as formed by VP5 and VP26. VP3 along with the scaffolding proteins helps to form normal capsids by defining the curvature of the shell and size of the particle. 41829 pfam01803: LIM-domain binding protein. The LIM-domain binding protein binds to the LIM domain pfam00412 of LIM homeodomain proteins, which are transcriptional regulators of development. Nuclear LIM interactor (NLI) / LIM domain-binding protein 1 (LDB1) is located in the nuclei of neuronal cells during development; it is co-expressed with Isl1 in early motor neuron differentiation and has a suggested role in the Isl1 dependent development of motor neurons. It is suggested that these proteins act synergistically to enhance transcriptional efficiency by acting as co-factors for LIM homeodomain and Otx class transcription factors, both of which have essential roles in development. The Drosophila protein Chip is required for segmentation and activity of a remote wing margin enhancer. Chip is a ubiquitous chromosomal factor required for normal expression of diverse genes at many stages of development. It is suggested that Chip cooperates with different LIM domain proteins and other factors to structurally support remote enhancer-promoter interactions. 41830 pfam01804: Penicillin amidase. Penicillin amidase or penicillin acylase EC:3.5.1.11 catalyses the hydrolysis of benzylpenicillin to phenylacetic acid and 6-aminopenicillanic acid (6-APA), a key intermediate in the synthesis of penicillins. Also in the family are cephalosporin acylase and aculeacin A acylase, which are involved in the synthesis of related peptide antibiotics. 41831 pfam01805: Surp module. This domain is also known as the SWAP domain. SWAP stands for Suppressor-of-White-APricot. It has been suggested that these domains may be RNA binding. 41832 pfam01806: Paramyxovirus P phosphoprotein. 
This family consists of paramyxovirus P phosphoprotein from sendai virus and human and bovine parainfluenza viruses. The P protein is an essential part of the viral RNA polymerase complex formed from the P and L proteins. The exact role of the P protein in this complex is unknown but it is involved in multiple protein-protein interactions and binding the polymerase complex to the nucleocapsid or ribonucleoprotein template. It also appears to be important for the proper folding of the L protein. The paramyxoviruses have a negative sense ssRNA genome. 41833 pfam01807: CHC2 zinc finger. This domain is principally involved in DNA binding in DNA primases. 41834 pfam01808: AICARFT/IMPCHase bienzyme. This is a family of bifunctional enzymes catalysing the last two steps in de novo purine biosynthesis. The bifunctional enzyme is found in both prokaryotes and eukaryotes. The second last step is catalysed by 5-aminoimidazole-4-carboxamide ribonucleotide formyltransferase EC:2.1.2.3 (AICARFT); this enzyme catalyses the formylation of AICAR with 10-formyl-tetrahydrofolate to yield FAICAR and tetrahydrofolate. The last step is catalysed by IMP (Inosine monophosphate) cyclohydrolase EC:3.5.4.10 (IMPCHase), cyclizing FAICAR (5-formylaminoimidazole-4-carboxamide ribonucleotide) to IMP. 41835 pfam01809: Domain of unknown function DUF37. This domain is found in short (70 amino acid) hypothetical proteins from various bacteria. The domain contains three conserved cysteine residues. A member from Aeromonas hydrophila has been found to have hemolytic activity (unpublished). 41836 pfam01810: LysE type translocator. This family consists of various hypothetical proteins and an l-lysine exporter LysE from Corynebacterium glutamicum which is proposed to be the first of a novel family of translocators. LysE exports l-lysine from the cell into the surrounding medium and is predicted to span the membrane six times. 
The physiological function of the exporter is to excrete excess l-lysine as a result of natural flux imbalances or peptide hydrolysis, and also after artificial deregulation of l-lysine biosynthesis as used by the biotechnology industry for the production of l-lysine. 41837 pfam01812: 5-formyltetrahydrofolate cyclo-ligase family. 5-formyltetrahydrofolate cyclo-ligase or methenyl-THF synthetase EC:6.3.3.2 catalyses the interchange of 5-formyltetrahydrofolate (5-FTHF) to 5-10-methenyltetrahydrofolate; this requires ATP and Mg2+. 5-FTHF is used in chemotherapy where it is clinically known as Leucovorin. 41838 pfam01813: ATP synthase subunit D. This is a family of subunit D from various ATP synthases including V-type H+ transporting and Na+ dependent. Subunit D is suggested to be an integral part of the catalytic sector of the V-ATPase. 41839 pfam01814: Hemerythrin. 41840 pfam01815: Rop protein. 41841 pfam01816: Leucine rich repeat variant. The function of this repeat is unknown. It has an unusual structure of two helices. One is an alpha helix, the other is the much rarer 3-10 helix. 41842 pfam01817: Chorismate mutase. Chorismate mutase EC:5.4.99.5 catalyses the conversion of chorismate to prephenate in the pathway of tyrosine and phenylalanine biosynthesis. This enzyme is negatively regulated by tyrosine, tryptophan and phenylalanine. 41843 pfam01818: Bacteriophage translational regulator. The translational regulator protein regA is encoded by the T4 bacteriophage and binds to a region of messenger RNA (mRNA) that includes the initiator codon. RegA is unusual in that it represses the translation of about 35 early T4 mRNAs but does not affect nearly 200 other mRNAs. 41844 pfam01819: Levivirus coat protein. The Levivirus coat protein forms the bacteriophage coat that encapsidates the viral RNA. 180 copies of this protein form the virion shell. 
The MS2 bacteriophage coat protein controls two distinct processes: sequence-specific RNA encapsidation and repression of replicase translation, both by binding to an RNA stem-loop structure of 19 nucleotides containing the initiation codon of the replicase gene. The binding of a coat protein dimer to this hairpin shuts off synthesis of the viral replicase, switching the viral replication cycle to virion assembly rather than continued replication. 41845 pfam01820: D-ala D-ala ligase. This family contains D-alanine--D-alanine ligase enzymes EC:6.3.2.4. 41846 pfam01821: Anaphylotoxin-like domain. C3a, C4a and C5a anaphylatoxins are protein fragments generated enzymatically in serum during activation of complement molecules C3, C4, and C5. They induce smooth muscle contraction. These fragments are homologous to a three-fold repeat in fibulins. 41847 pfam01822: WSC domain. This domain may be involved in carbohydrate binding. 41848 pfam01823: MAC/Perforin domain. The membrane-attack complex (MAC) of the complement system forms transmembrane channels. These channels disrupt the phospholipid bilayer of target cells, leading to cell lysis and death. A number of proteins participate in the assembly of the MAC. Freshly activated C5b binds to C6 to form a C5b-6 complex, then to C7 forming the C5b-7 complex. The C5b-7 complex binds to C8, which is composed of three chains (alpha, beta, and gamma), thus forming the C5b-8 complex. C5b-8 subsequently binds to C9 and acts as a catalyst in the polymerisation of C9. Active MAC has a subunit composition of C5b-C6-C7-C8-C9{n}. Perforin is a protein found in cytolytic T-cells and killer cells. In the presence of calcium, perforin polymerises into transmembrane tubules and is capable of lysing, non-specifically, a variety of target cells. There are a number of regions of similarity in the sequences of complement components C6, C7, C8-alpha, C8-beta, C9 and perforin. 41849 pfam01824: MatK/TrnK amino terminal region. 
The function of this region is unknown. 41850 pfam01825: Latrophilin/CL-1-like GPS domain. Domain present in latrophilin/CL-1, sea urchin REJ and polycystin. 41851 pfam01826: Trypsin Inhibitor like cysteine rich domain. This family contains trypsin inhibitors as well as a domain found in many extracellular proteins. The domain typically contains ten cysteine residues that form five disulphide bonds. The cysteine residues that form the disulphide bonds are 1-7, 2-6, 3-5, 4-10 and 8-9. 41852 pfam01827: FTH domain. This presumed domain is likely to be a protein-protein interaction module. It is found in many proteins from C. elegans. The domain is found associated with the F-box pfam00646. This domain is named FTH after FOG-2 homology domain. 41853 pfam01828: Peptidase A4 family. 41854 pfam01829: Peptidase A6 family. 41855 pfam01830: Peptidase C7 family. 41856 pfam01831: Peptidase C16 family. 41857 pfam01832: Mannosyl-glycoprotein endo-beta-N-acetylglucosamidase. This family includes Mannosyl-glycoprotein endo-beta-N-acetylglucosamidase EC:3.2.1.96, as well as the flagellar protein J, which has been shown to hydrolyse peptidoglycan. 41858 pfam01833: IPT/TIG domain. This family consists of a domain that has an immunoglobulin like fold. These domains are found in cell surface receptors such as Met and Ron as well as in intracellular transcription factors where it is involved in DNA binding. CAUTION: This family does not currently recognise a significant number of members. 41859 pfam01834: XRCC1 N terminal domain. 41860 pfam01835: Alpha-2-macroglobulin family N-terminal region. This family includes the N-terminal region of the alpha-2-macroglobulin family. 41861 pfam01837: Domain of unknown function DUF39. This presumed domain is about 320 residues long. It is found in proteins that have two C-terminal pfam00571 domains. The function of this domain is unknown. 
Members may have been misannotated as inosine monophosphate dehydrogenases based on the similarity to the CBS domains. 41862 pfam01838: Domain of unknown function DUF40. This domain is found in a large number of C. elegans proteins. This domain is about 240 amino acids long and is found to be duplicated in one member. The domain contains a cluster of four conserved cysteines. 41863 pfam01839: FG-GAP repeat. This family contains the extracellular repeat that is found in up to seven copies in alpha integrins. This repeat has been predicted to fold into a beta propeller structure. The repeat is called the FG-GAP repeat after two conserved motifs in the repeat. The FG-GAP repeats are found in the N terminus of integrin alpha chains, a region that has been shown to be important for ligand binding. A putative Ca2+ binding motif is found in some of the repeats. 41864 pfam01840: TCL1/MTCP1 family. Two related oncogenes, TCL-1 and MTCP-1, are overexpressed in T cell prolymphocytic leukaemias as a result of chromosomal rearrangements that involve the translocation of one T cell receptor gene to either chromosome 14q32 or Xq28. This family contains two repeated motifs that form a single globular domain. 41865 pfam01841: Transglutaminase-like superfamily. This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterised transglutaminase, the human blood clotting factor XIIIa'. On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease. 41866 pfam01842: ACT domain. 
This family of domains generally have a regulatory role. ACT domains are linked to a wide range of metabolic enzymes that are regulated by amino acid concentration. Pairs of ACT domains bind specifically to a particular amino acid leading to regulation of the linked enzyme. The ACT domain is found in: D-3-phosphoglycerate dehydrogenase EC:1.1.1.95, which is inhibited by serine. Aspartokinase EC:2.7.2.4, which is regulated by lysine. Acetolactate synthase small regulatory subunit, which is inhibited by valine. Phenylalanine-4-hydroxylase EC:1.14.16.1, which is regulated by phenylalanine. Prephenate dehydrogenase EC:4.2.1.51. formyltetrahydrofolate deformylase EC:3.5.1.10, which is activated by methionine and inhibited by glycine. GTP pyrophosphokinase EC:2.7.6.5. 41867 pfam01843: DIL domain. The DIL domain has no known function. 41868 pfam01844: HNH endonuclease. 41869 pfam01845: CcdB protein. 41870 pfam01846: FF domain. This domain has been predicted to be involved in protein-protein interaction. This domain was recently shown to bind the hyperphosphorylated C-terminal repeat domain of RNA polymerase II, confirming its role in protein-protein interactions. 41871 pfam01847: von Hippel-Lindau disease tumour suppressor protein. VHL forms a ternary complex with the elonginB and elonginC proteins. This complex binds Cul2, which then is involved in regulation of vascular endothelial growth factor mRNA. 41872 pfam01848: Hok/gef family. 41873 pfam01849: NAC domain. 41874 pfam01850: PIN domain. This PIN (PilT N terminus) domain is a compact domain of about 100 amino acids. The domain has two nearly invariant aspartates. The function of the PIN domain is unknown but a role in signaling appears likely given the presence of this domain in StbB and DIS3. 41875 pfam01851: Proteasome/cyclosome repeat. 41876 pfam01852: START domain. 41877 pfam01853: MOZ/SAS family. This region of these proteins has been suggested to be homologous to acetyltransferases. 
41878 pfam01855: Pyruvate flavodoxin/ferredoxin oxidoreductase, thiamine diP-binding domain. This family includes the N terminal structural domain of the pyruvate ferredoxin oxidoreductase. This domain binds thiamine diphosphate, and along with domains II and IV, is involved in inter subunit contacts. The family also includes pyruvate flavodoxin oxidoreductase as encoded by the nifJ gene in cyanobacterium which is required for growth on molecular nitrogen when iron is limited. 41879 pfam01856: Outer membrane protein. This family seems confined to Helicobacter pylori. It is predicted to be an outer membrane protein based on its pattern of alternating hydrophobic amino acids similar to porins. 41880 pfam01857: Retinoblastoma-associated protein B domain. The crystal structure of the Rb pocket bound to a nine-residue E7 peptide containing the LxCxE motif, shared by other Rb-binding viral and cellular proteins, shows that the LxCxE peptide binds a highly conserved groove on the B domain. The B domain has a cyclin fold. 41881 pfam01858: Retinoblastoma-associated protein A domain. This domain has the cyclin fold as predicted. 41882 pfam01861: Protein of unknown function DUF43. This family includes archaebacterial proteins of unknown function. All the members are 350-400 amino acids long. 41883 pfam01862: Pyruvoyl-dependent arginine decarboxylase (PvlArgDC). Methanococcus jannaschii contains homologues of most genes required for spermidine polyamine biosynthesis. Yet genomes from neither this organism nor any other euryarchaeon have orthologues of the pyridoxal 5'-phosphate-dependent ornithine or arginine decarboxylase genes, required to produce putrescine. Instead, these organisms have a new class of arginine decarboxylase (PvlArgDC) formed by the self-cleavage of a proenzyme into a 5-kDa subunit and a 12-kDa subunit that contains a reactive pyruvoyl group. 
Although this extremely thermostable enzyme has no significant sequence similarity to previously characterised proteins, conserved active site residues are similar to those of the pyruvoyl-dependent histidine decarboxylase enzyme, and its subunits form a similar (alpha-beta)(3) complex. Homologues of PvlArgDC are found in several bacterial genomes, including those of Chlamydia spp., which have no agmatine ureohydrolase enzyme to convert agmatine (decarboxylated arginine) into putrescine. In these intracellular pathogens, PvlArgDC may function analogously to pyruvoyl-dependent histidine decarboxylase; the cells are proposed to import arginine and export agmatine, increasing the pH and affecting the host cell's metabolism. Phylogenetic analysis of PvlArgDC proteins suggests that this gene has been recruited from the euryarchaeal polyamine biosynthetic pathway to function as a degradative enzyme in bacteria. 41884 pfam01863: Protein of unknown function DUF45. This protein has no known function. Members are found in some archaebacteria, as well as Helicobacter pylori. The proteins are 190-240 amino acids long, with the C terminus being the most conserved region, containing three conserved histidines. This motif is similar to that found in Zinc proteases, suggesting that this family may also be proteases. 41885 pfam01864: Putative integral membrane protein DUF46. This archaebacterial protein has no known function. It contains several predicted transmembrane regions, suggesting it is an integral membrane protein. 41886 pfam01865: Protein of unknown function DUF47. This family includes prokaryotic proteins of unknown function, as well as a protein annotated as the pit accessory protein from Sinorhizobium meliloti. However, the function of this protein is also unknown (Pit stands for Phosphate transport). It is probably distantly related to pfam01895 (personal obs:Yeats C). 41887 pfam01866: Putative diphthamide synthesis protein. 
One member is a candidate tumour suppressor gene. DPH2 from yeast, which confers resistance to diphtheria toxin has been found to be involved in diphthamide synthesis. Diphtheria toxin inhibits eukaryotic protein synthesis by ADP-ribosylating diphthamide, a posttranslationally modified histidine residue present in EF2. The exact function of the members of this family is unknown. 41888 pfam01867: Protein of unknown function DUF48. This family of proteins are found in prokaryotes, they are functionally uncharacterized. 41889 pfam01868: Domain of unknown function UPF0086. This family consists of several archaeal and eukaryotic proteins. The archaeal proteins are found to be expressed within ribosomal operons and several of the sequences are described as ribonuclease P protein subunit p29 proteins. 41890 pfam01869: BadF/BadG/BcrA/BcrD ATPase family. This family includes the BadF and BadG proteins that are two subunits of Benzoyl-CoA reductase, that may be involved in ATP hydrolysis. The family also includes an activase subunit from the enzyme 2-hydroxyglutaryl-CoA dehydratase. One member contains two copies of this region suggesting that the family may structurally dimerise. This family appears to be related to pfam00370. 41891 pfam01870: Archaeal holliday junction resolvase (hjc). This family of archaebacterial proteins are holliday junction resolvases (hjc gene). The Holliday junction is an essential intermediate of homologous recombination. This protein is the archaeal equivalent of RuvC but is not sequence similar. 41892 pfam01871: AMMECR1. This family consists of several AMMECR1 as well as several uncharacterized proteins. The contiguous gene deletion syndrome AMME is characterized by Alport syndrome, midface hypoplasia, mental retardation and elliptocytosis and is caused by a deletion in Xq22.3, comprising several genes including COL4A5, FACL4 and AMMECR1. 
This family contains sequences from several eukaryotic species as well as archaebacteria and it has been suggested that the AMMECR1 protein may have a basic cellular function, potentially in either the transcription, replication, repair or translation machinery. 41893 pfam01872: RibD C-terminal domain. The function of this domain is not known, but it is thought to be involved in riboflavin biosynthesis. This domain is found in the C terminus of RibD/RibG, in combination with pfam00383, as well as in isolation in some archaebacterial proteins. This family appears to be related to pfam00186. 41894 pfam01873: Domain found in IF2B/IF5. This family includes the N terminus of eIF-5, and the C terminus of eIF-2 beta. This region corresponds to the whole of the archaebacterial eIF-2 beta homologue. The region contains a putative zinc binding C4 finger. 41895 pfam01874: ATP:dephospho-CoA triphosphoribosyl transferase. The citG gene is found in a gene cluster with citrate lyase subunits. The function of the CitG protein was elucidated as ATP:dephospho-CoA triphosphoribosyl transferase. 41896 pfam01875: Protein of unknown function DUF52. This family contains members from all branches of life. The function of this protein is unknown. 41897 pfam01876: RNase P subunit p30. This protein is part of the RNase P complex that is involved in tRNA maturation. 41898 pfam01877: Protein of unknown function DUF54. This archaebacterial family has no known function. 41899 pfam01878: Protein of unknown function DUF55. This family of proteins has no known function. 41900 pfam01880: Desulfoferrodoxin. Desulfoferrodoxins contain two types of iron: an Fe-S4 site very similar to that found in desulforedoxin from Desulfovibrio gigas and an octahedrally coordinated high-spin ferrous site most probably with nitrogen/oxygen-containing ligands. Due to this rather unusual combination of active centres, this novel protein is named desulfoferrodoxin. 41901 pfam01881: Protein of unknown function DUF57. 
This archaebacterial protein family has no known function. 41902 pfam01882: Protein of unknown function DUF58. This family of prokaryotic proteins has no known function. A protein of unknown function in the family has been misannotated as alpha-dextrin 6-glucanohydrolase. 41903 pfam01883: Domain of unknown function DUF59. This family includes prokaryotic proteins of unknown function. The family also includes PhaH from Pseudomonas putida. PhaH forms a complex with PhaF, PhaG and PhaI, which hydroxylates phenylacetic acid to 2-hydroxyphenylacetic acid. So members of this family may all be components of ring hydroxylating complexes. 41904 pfam01884: PcrB family. This family contains proteins that are related to PcrB. The function of these proteins is unknown. 41905 pfam01885: RNA 2'-phosphotransferase, Tpt1 / KptA family. Tpt1 catalyses the last step of tRNA splicing in yeast. It transfers the splice junction 2'-phosphate from ligated tRNA to NAD, to produce ADP-ribose 1''-2''-cyclic phosphate. This is presumed to be followed by a transesterification step to release the RNA. The first step of this reaction is similar to that catalysed by some bacterial toxins. E. coli KptA and mouse Tpt1 are likely to use the same reaction mechanism. 41906 pfam01886: Protein of unknown function DUF61. Protein found in Archaebacteria. These proteins have no known function. 41907 pfam01887: Protein of unknown function DUF62. Protein found in Archaebacteria and Bacteria. These proteins have no known function. 41908 pfam01888: CbiD. CbiD is essential for cobalamin biosynthesis in both S. typhimurium and B. megaterium, but no functional role has been ascribed to the protein. The CbiD protein has a putative S-AdoMet binding site. It is possible that CbiD might have the same role as CobF in undertaking the C-1 methylation and deacylation reactions required during the ring contraction process. 41909 pfam01889: Membrane protein of unknown function DUF63. 
Proteins of unknown function found in Archaebacteria. These proteins are probably transmembrane proteins. 41910 pfam01890: CbiG. Members of this family are involved in cobalamin synthesis. One member has been designated cbiH but in fact represents a fusion between cbiH and cbiG. As other multi-functional proteins involved in cobalamin biosynthesis catalyse adjacent steps in the pathway, including CysG, CobL (CbiET), CobIJ and CobA-HemD, it is therefore possible that CbiG catalyses a reaction step adjacent to CbiH. In the anaerobic pathway such a step could be the formation of a gamma lactone, which is thought to help to mediate the anaerobic ring contraction process. 41911 pfam01891: CbiM. This integral membrane protein is involved in cobalamin synthesis. 41912 pfam01892: Protein of unknown function DUF64. Proteins found in Bacteria and Archaebacteria. The function of these proteins is unknown. 41913 pfam01893: Uncharacterized protein family UPF0058. This archaebacterial protein has no known function. 41914 pfam01894: Uncharacterised protein family UPF0047. This family has no known function. The alignment contains a conserved aspartate and histidine that may be functionally important. 41915 pfam01895: PhoU family. This family contains phosphate regulatory proteins including PhoU. The exact nature of this regulation is not well understood. 41916 pfam01896: DNA primase small subunit. DNA primase synthesises the RNA primers for the Okazaki fragments in lagging strand DNA synthesis. DNA primase is a heterodimer of large and small subunits. 41917 pfam01899: Na+/H+ ion antiporter subunit. Subunit of a Na+/H+ Prokaryotic antiporter complex. 41918 pfam01900: Rpp14 family. tRNA processing enzyme ribonuclease P (RNase P) consists of an RNA molecule associated with at least eight protein subunits, hPop1, Rpp14, Rpp20, Rpp25, Rpp29, Rpp30, Rpp38, and Rpp40. 41919 pfam01901: Protein of unknown function DUF70. Archaebacterial proteins of unknown function. 
Members of this family may be transmembrane proteins. 41920 pfam01902: ATP-binding region. This family of proteins probably binds ATP. This domain is about 200 amino acids long with a strongly conserved motif SGGKD at the N terminus. In some members of this family, this domain is associated with pfam01042. 41921 pfam01903: CbiX. The function of CbiX is uncertain; however, it is found in cobalamin biosynthesis operons and so may have a related function. Some CbiX proteins contain a striking histidine-rich region at their C-terminus, which suggests that it might be involved in metal chelation. 41922 pfam01904: Protein of unknown function DUF72. The function of this family is unknown. 41923 pfam01905: Protein of unknown function DUF73. Members of this archaebacterial family have no known function. 41924 pfam01906: Domain of unknown function DUF74. Members of this protein family have no known function. The domain is about 100 amino acids long and found in prokaryotes. 41925 pfam01907: Ribosomal protein L37e. This family includes ribosomal protein L37 from eukaryotes and archaebacteria. The family contains many conserved cysteines and histidines suggesting that this protein may bind to zinc. 41926 pfam01908: Protein of unknown function DUF75. Archaebacterial proteins of unknown function. Members of this family may be transmembrane proteins. Several of the family members are thought to be 3-isopropylmalate dehydratase. 41927 pfam01909: Nucleotidyltransferase domain. Members of this family belong to a large family of nucleotidyltransferases. This family includes kanamycin nucleotidyltransferase (KNTase) which is a plasmid-coded enzyme responsible for some types of bacterial resistance to aminoglycosides. KNTase inactivates antibiotics by catalysing the addition of a nucleotidyl group onto the drug. 41928 pfam01910: Domain of unknown function DUF77. Domain of unknown function. 
The crystal structure of two of these members shows that this domain has a ferredoxin-like fold and is likely to exist as at least homodimers. Sulphate ions are located at the dimer interfaces, which are thought to confer additional stability. Although the function of this domain remains to be identified, its structure suggests a role in protein-protein interactions possibly regulated by the binding of small-molecule ligands. 41929 pfam01911: Ribosomal LX protein. This ribosomal protein appears to be specific to archaebacteria. 41930 pfam01912: eIF-6 family. This family includes eukaryotic translation initiation factor 6 as well as presumed archaebacterial homologues. 41931 pfam01913: Formylmethanofuran--tetrahydromethanopterin formyltransferase, distal lobe. This enzyme EC:2.3.1.101 is involved in archaebacteria in the formation of methane from carbon dioxide. N-terminal distal lobe of alpha+beta ferredoxin-like fold. SCOP reports fold duplication with C-terminal proximal lobe. 41932 pfam01914: MarC family integral membrane protein. Integral membrane protein family that includes the antibiotic resistance protein MarC. These proteins may be transporters. 41933 pfam01915: Glycosyl hydrolase family 3 C terminal domain. This domain is involved in catalysis and may be involved in binding beta-glucan. This domain is found associated with pfam00933. 41934 pfam01916: Deoxyhypusine synthase. Eukaryotic initiation factor 5A (eIF-5A) contains an unusual amino acid, hypusine [N epsilon-(4-aminobutyl-2-hydroxy)lysine]. The first step in the post-translational formation of hypusine is catalysed by the enzyme deoxyhypusine synthase (DS) EC:1.1.1.249. The modified version of eIF-5A, and DS, are required for eukaryotic cell proliferation. 41935 pfam01917: Archaebacterial flagellin. Members of this family are the proteins that form the flagella in archaebacteria. 41936 pfam01918: Protein of unknown function DUF78. This small archaebacterial protein has no known function. 
41937 pfam01919: Protein of unknown function DUF79. The function of this prokaryotic protein family is unknown. 41938 pfam01920: KE2 family protein. The function of members of this family is unknown, although they have been suggested to contain a DNA binding leucine zipper motif. 41939 pfam01921: tRNA synthetases class I (K). This family includes only lysyl tRNA synthetases from prokaryotes. 41940 pfam01922: SRP19 protein. The signal recognition particle (SRP) binds to the signal peptide of proteins as they are being translated. The binding of the SRP halts translation and the complex is then transported to the endoplasmic reticulum's cytoplasmic surface. The SRP then aids translocation of the protein through the ER membrane. The SRP is a ribonucleoprotein that is composed of a small RNA and several proteins. One of these proteins is the SRP19 protein (Sec65 in yeast). 41941 pfam01923: Cobalamin adenosyltransferase. This family contains the gene products of PduO and EutT which are both cobalamin adenosyltransferases. PduO is a protein with ATP:cob(I)alamin adenosyltransferase activity. The main role of this protein is the conversion of inactive cobalamins to AdoCbl for 1,2-propanediol degradation. The EutT enzyme appears to be an adenosyl transferase, converting CNB12 to AdoB12. 41942 pfam01924: Hydrogenase formation hypA family. HypD is involved in hydrogenase formation. It contains many possible metal binding residues, which may bind to nickel. Transposon Tn5 insertions into hypD resulted in R. leguminosarum mutants that lacked any hydrogenase activity in symbiosis with peas. 41943 pfam01925: Domain of unknown function DUF81. This integral membrane protein family has no known function. The alignment appears to contain two duplicated modules of three transmembrane helices. 41944 pfam01926: GTPase of unknown function. 41945 pfam01927: Protein of unknown function DUF82. This prokaryotic protein family has no known function. 
The protein contains four conserved cysteines that may be involved in metal binding or disulphide bridges. 41946 pfam01928: CYTH domain. 41947 pfam01929: Ribosomal protein L14. This family includes the eukaryotic ribosomal protein L14. 41948 pfam01930: Domain of unknown function DUF83. This domain has no known function. The domain contains three conserved cysteines at its C terminus. 41949 pfam01931: Protein of unknown function DUF84. The function of this prokaryotic protein family is unknown. 41950 pfam01933: Uncharacterized protein family UPF0052. 41951 pfam01934: Protein of unknown function DUF86. The function of members of this family is unknown. 41952 pfam01935: Domain of unknown function DUF87. The function of this prokaryotic domain is unknown. It contains several conserved aspartates and histidines that could be metal ligands. 41953 pfam01936: Protein of unknown function DUF88. This highly conserved bacterial protein has no known function. The alignment contains many conserved aspartates, suggesting an enzymatic function such as an endonuclease or glycosyl hydrolase (Bateman A pers. obs). 41954 pfam01937: Protein of unknown function DUF89. This family has no known function. 41955 pfam01938: TRAM domain. This small domain has no known function. However it may perform a nucleic acid binding role. 41956 pfam01939: Protein of unknown function DUF91. The function of this prokaryotic protein is unknown. 41957 pfam01940: Integral membrane protein DUF92. Members of this family have several predicted transmembrane helices. The function of these prokaryotic proteins is unknown. 41958 pfam01941: S-adenosylmethionine synthetase (AdoMet synthetase). This family consists of several archaebacterial S-adenosylmethionine synthetase (AdoMet synthetase or MAT) (EC 2.5.1.6). S-Adenosylmethionine (AdoMet) occupies a central role in the metabolism of all cells. 
The biological roles of AdoMet include acting as the primary methyl group donor, as a precursor to the polyamines, and as a progenitor of a 5'-deoxyadenosyl radical. S-Adenosylmethionine synthetase catalyses the only known route of AdoMet biosynthesis. The synthetic process occurs in a unique reaction in which the complete triphosphate chain is displaced from ATP and a sulfonium ion formed. MATs from various organisms contain ~400-amino acid polypeptide chains. 41959 pfam01943: Polysaccharide biosynthesis protein. Members of this family are integral membrane proteins. Many members of the family are implicated in production of polysaccharide. The family includes RfbX part of the O antigen biosynthesis operon. The family includes SpoVB from Bacillus subtilis, which is involved in spore cortex biosynthesis. 41960 pfam01944: Integral membrane protein DUF95. Members of this family have several predicted transmembrane regions. The function of this family is unknown. 41961 pfam01946: Thi4 family. This family includes a putative thiamine biosynthetic enzyme. 41962 pfam01947: Protein of unknown function DUF98. This prokaryotic family has no known function. 41963 pfam01948: Aspartate carbamoyltransferase regulatory chain, allosteric domain. The regulatory chain is involved in allosteric regulation of aspartate carbamoyltransferase. The N-terminal domain has ferredoxin-like fold, and provides the regulatory chain dimerisation interface. 41964 pfam01949: Protein of unknown function DUF99. The function of this archaebacterial protein family is unknown. 41965 pfam01950: Protein of unknown function DUF100. This prokaryotic family has no known function. 41966 pfam01951: Protein of unknown function DUF101. The members of this family are uncharacterised. The alignment of these proteins contains several conserved polar residues that might be potential catalytic residues. 41967 pfam01953: Protein of unknown function DUF103. This archaebacterial family has no known function. 
41968 pfam01954: Protein of unknown function DUF104. This family includes short archaebacterial proteins of unknown function. Archaeoglobus fulgidus has twelve copies of this protein, with several being clustered together in the genome. 41969 pfam01955: Domain of unknown function DUF105. This prokaryotic protein family has no known function. 41970 pfam01956: Integral membrane protein DUF106. This archaebacterial protein family has no known function. Members are predicted to be integral membrane proteins. 41971 pfam01957: Nodulation efficiency protein D (NfeD). This family contains several proteins which are described as nodulation efficiency protein D (NfeD). The nfe genes (nfeA, nfeB, and nfeD) are involved in the nodulation efficiency and competitiveness of the Sinorhizobium meliloti strain GR4 on alfalfa roots. The specific function of this family is unknown although it is unlikely that NfeD is specifically involved in nodulation as the family contains several different archaeal and bacterial species most of which are not symbionts. 41972 pfam01958: Domain of unknown function DUF108. This family has no known function. It is found to compose the complete protein in archaebacteria and a single domain in a large C. elegans protein. 41973 pfam01959: 3-dehydroquinate synthase (EC 4.6.1.3). 3-Dehydroquinate synthase is an enzyme in the common pathway of aromatic amino acid biosynthesis that catalyses the conversion of 3-deoxy-D-arabino-heptulosonic acid 7-phosphate (DAHP) into 3-dehydroquinic acid. This synthesis of aromatic amino acids is an essential metabolic function for most prokaryotic as well as lower eukaryotic cells, including plants. The pathway is absent in humans; therefore, DHQS represents a potential target for the development of novel and selective antimicrobial agents. 
Owing to the threat posed by the spread of pathogenic bacteria resistant to many currently used antimicrobial drugs, there is clearly a need to develop new anti-infective drugs acting at novel targets. A further potential use for DHQS inhibitors is as herbicides. 41974 pfam01960: ArgJ family. Members of the ArgJ family catalyse the first EC:2.3.1.35 and fifth steps EC:2.3.1.1 in arginine biosynthesis. 41975 pfam01963: TraB family. pAD1 is a hemolysin/bacteriocin plasmid originally identified in Enterococcus faecalis DS16. It encodes a mating response to a peptide sex pheromone, cAD1, secreted by recipient bacteria. Once the plasmid pAD1 is acquired, production of the pheromone ceases--a trait related in part to a determinant designated traB. However a related protein is found in C. elegans, suggesting that members of the TraB family have some more general function. 41976 pfam01964: ThiC family. ThiC is found within the thiamine biosynthesis operon. ThiC is involved in pyrimidine biosynthesis. The precise catalytic function of ThiC is still not known. ThiC participates in the formation of 4-Amino-5-hydroxymethyl-2-methylpyrimidine from AIR, an intermediate in the de novo pyrimidine biosynthesis. 41977 pfam01965: DJ-1/PfpI family. The family includes the protease PfpI. This domain is also found in transcriptional regulators. This family appears to be distantly related to Glutamine amidotransferases pfam00117. 41978 pfam01966: HD domain. HD domains are metal dependent phosphohydrolases. 41979 pfam01967: MoaC family. Members of this family are involved in molybdenum cofactor biosynthesis. However their molecular function is not known. 41980 pfam01968: Hydantoinase/oxoprolinase. This family includes the enzymes hydantoinase and oxoprolinase EC:3.5.2.9. Both reactions involve the hydrolysis of 5-membered rings via hydrolysis of their internal imide bonds. 41981 pfam01969: Protein of unknown function DUF111. This prokaryotic family has no known function. 
41982 pfam01970: Integral membrane protein DUF112. Members of this prokaryotic family have no known function. 41983 pfam01972: Protein of unknown function DUF114. The function of this archaebacterial protein family is unknown. 41984 pfam01973: Protein of unknown function DUF115. This family of archaebacterial proteins has no known function. 41985 pfam01974: tRNA intron endonuclease, catalytic C-terminal domain. Members of this family cleave pre tRNA at the 5' and 3' splice sites to release the intron EC:3.1.27.9. 41986 pfam01975: Survival protein SurE. E. coli cells with the surE gene disrupted are found to survive poorly in stationary phase. It is suggested that SurE may be involved in stress response. Yeast also contains a member of the family; the member from Yarrowia lipolytica can complement a mutation in acid phosphatase, suggesting that members of this family could be phosphatases. 41987 pfam01976: Protein of unknown function DUF116. This archaebacterial protein has no known function. The protein contains seven conserved cysteines and may also be an integral membrane protein. 41988 pfam01977: 3-octaprenyl-4-hydroxybenzoate carboxy-lyase. This family has been characterised as 3-octaprenyl-4-hydroxybenzoate carboxy-lyase enzymes. This enzyme catalyses the third reaction in ubiquinone biosynthesis. For optimal activity the carboxy-lyase was shown to require Mn2+. 41989 pfam01978: Sugar-specific transcriptional regulator TrmB. One member of this family, TrmB, has been shown to be a sugar-specific transcriptional regulator of the trehalose/maltose ABC transporter in Thermococcus litoralis. 41990 pfam01979: Amidohydrolase family. This family of enzymes is a large metal-dependent hydrolase superfamily. The family includes Adenine deaminase EC:3.5.4.2 that hydrolyses adenine to form hypoxanthine and ammonia. The adenine deaminase reaction is important for adenine utilisation as a purine and also as a nitrogen source. 
This family also includes dihydroorotase and N-acetylglucosamine-6-phosphate deacetylases, EC:3.5.1.25. These enzymes catalyse the reaction N-acetyl-D-glucosamine 6-phosphate + H2O <=> D-glucosamine 6-phosphate + acetate. This family includes the catalytic domain of urease alpha subunit. Dihydroorotases (EC:3.5.2.3) are also included. 41991 pfam01980: Uncharacterized protein family UPF0066. 41992 pfam01981: Domain of unknown function UPF0099. This domain has no known function. 41993 pfam01982: Domain of unknown function DUF120. This archaebacterial domain has no known function. In some members this domain is attached to pfam00325, a DNA binding domain, suggesting that this domain might be involved in recognising some regulatory molecule. 41994 pfam01983: Protein of unknown function DUF121. The function of this prokaryotic family is unknown. 41995 pfam01984: Double-stranded DNA-binding domain. This domain is believed to bind double-stranded DNA of 20 bases length. 41996 pfam01985: CRS1 / YhbY domain. E. coli YhbY belongs to a conserved family of proteins represented in eubacteria, archaea, and plants. Three maize proteins harbouring UPF0044-like domains are required for chloroplast group II intron splicing, and bioinformatic data suggest a role for members in translation. The crystal structure of YhbY has been determined. YhbY has a fold similar to that of the C-terminal domain of translation initiation factor 3 (IF3C), which binds to 16S rRNA in the 30S ribosome. Modelling studies indicate that the same surface is highly basic in all members of this family, suggesting a conserved RNA binding surface. 41997 pfam01986: Domain of unknown function DUF123. This archaebacterial domain has no known function. It is attached to an endonuclease domain in one member. The domain contains several conserved cysteines and histidines. This suggests that the domain may be a zinc binding nucleic acid interaction domain (Bateman A unpubl.). 
41998 pfam01987: Protein of unknown function DUF124. This prokaryotic protein family has no known function. 41999 pfam01988: Integral membrane protein DUF125. This family of predicted integral membrane proteins has no known function. However, it does include a protein that may have a role in regulating calcium levels. 42000 pfam01989: Protein of unknown function DUF126. This archaebacterial protein family has no known function. 42001 pfam01990: ATP synthase (F/14-kDa) subunit. This family includes the 14-kDa subunit from vATPases, which is in the peripheral catalytic part of the complex. The family also includes archaebacterial ATP synthase subunit F. 42002 pfam01991: ATP synthase (E/31 kDa) subunit. This family includes the vacuolar ATP synthase E subunit, as well as the archaebacterial ATP synthase E subunit. 42003 pfam01992: ATP synthase (C/AC39) subunit. This family includes the AC39 subunit from vacuolar ATP synthase, and the C subunit from archaebacterial ATP synthase. The family also includes subunit C from the Sodium transporting ATP synthase from Enterococcus hirae. 42004 pfam01993: methylene-5,6,7,8-tetrahydromethanopterin dehydrogenase. This enzyme family is involved in formation of methane from carbon dioxide EC:1.5.99.9. The enzyme requires coenzyme F420. 42005 pfam01994: Protein of unknown function DUF127. This archaebacterial protein family has no known function. 42006 pfam01995: Domain of unknown function DUF128. This archaebacterial protein family has no known function. The domain is found duplicated in one member. 42007 pfam01996: Protein of unknown function DUF129. This prokaryotic protein family has no known function. 42008 pfam01997: Translin family. Members of this family include Translin, which interacts with DNA and forms a ring around the DNA. This family also includes a protein that was found to interact with translin in a yeast two-hybrid screen. 42009 pfam01998: Protein of unknown function DUF131. 
This archaebacterial protein family has no known function. The proteins are predicted to contain two transmembrane helices. 42010 pfam02001: Protein of unknown function DUF134. This family of archaeal proteins has no known function. 42011 pfam02002: TFIIE alpha subunit. The general transcription factor TFIIE has an essential role in eukaryotic transcription initiation together with RNA polymerase II and other general factors. Human TFIIE consists of two subunits, TFIIE-alpha and TFIIE-beta, and joins the preinitiation complex after RNA polymerase II and TFIIF. This family consists of the conserved amino terminal region of eukaryotic TFIIE-alpha and proteins from archaebacteria that are presumed to be TFIIE-alpha subunits too. 42012 pfam02003: Protein of unknown function DUF135. This family of prokaryotic proteins has no known function. 42013 pfam02005: N2,N2-dimethylguanosine tRNA methyltransferase. This enzyme EC:2.1.1.32 uses S-AdoMet to methylate tRNA. The TRM1 gene of Saccharomyces cerevisiae is necessary for the N2,N2-dimethylguanosine modification of both mitochondrial and cytoplasmic tRNAs. The enzyme is found in both eukaryotes and archaebacteria. 42014 pfam02006: Protein of unknown function DUF137. This family of archaeal proteins has no known function. 42015 pfam02007: Tetrahydromethanopterin S-methyltransferase MtrH subunit. The enzyme tetrahydromethanopterin S-methyltransferase EC:2.1.1.86 is composed of eight subunits. The enzyme is a membrane-associated enzyme complex which catalyses an energy-conserving, sodium-ion-translocating step in methanogenesis from hydrogen and carbon dioxide. 42016 pfam02008: CXXC zinc finger. This domain contains eight conserved cysteine residues that bind to zinc. The CXXC domain is found in proteins that methylate cytosine, proteins that bind to methyl cytosine and HRX related proteins. 42017 pfam02009: Rifin/stevor family. 
Several multicopy gene families have been described in Plasmodium falciparum, including the stevor family of subtelomeric open reading frames and the rif interspersed repetitive elements. Both families contain three predicted transmembrane segments. It has been proposed that stevor and rif are members of a larger superfamily that code for variant surface antigens. 42018 pfam02010: REJ domain. The REJ (Receptor for Egg Jelly) domain is found in PKD1, and the sperm receptor for egg jelly. The function of this domain is unknown. The domain is 600 amino acids long so is probably composed of multiple structural domains. There are six completely conserved cysteine residues that may form disulphide bridges. 42019 pfam02011: Glycosyl hydrolase family 48. Members of this family are endoglucanase EC:3.2.1.4 and exoglucanase EC:3.2.1.91 enzymes that cleave cellulose or related substrate. 42020 pfam02012: BNR/Asp-box repeat. Members of this family contain multiple BNR (bacterial neuraminidase repeat) repeats or Asp-boxes. The repeats are short, however the repeats are never found closer than 40 residues together suggesting that the repeat is structurally longer. These repeats are found in many glycosyl hydrolases as well as other extracellular proteins of unknown function. 42021 pfam02013: Cellulose or protein binding domain. This domain is found in two distinct sets of proteins with different functions. Those found in aerobic bacteria bind cellulose (or other carbohydrates); but in anaerobic fungi they are protein binding domains, referred to as dockerin domains or docking domains. They are believed to be responsible for the assembly of a multiprotein cellulase/hemicellulase complex, similar to the cellulosome found in certain anaerobic bacteria. 42022 pfam02014: Reeler domain. 42023 pfam02015: Glycosyl hydrolase family 45. 42024 pfam02016: LD-carboxypeptidase. 
Muramoyl-tetrapeptide carboxypeptidase hydrolyses a peptide bond between a di-basic amino acid and the C-terminal D-alanine in the tetrapeptide moiety in peptidoglycan. This cleaves the bond between an L- and a D-amino acid. The function of this activity is in murein recycling. This family also includes the microcin c7 self-immunity protein. This family corresponds to Merops family U61. 42025 pfam02017: CIDE-N domain. This domain is found in CAD nuclease and in ICAD, the inhibitor of CAD nuclease. The two proteins interact through this domain. 42026 pfam02018: Carbohydrate binding domain. This family includes diverse carbohydrate binding domains. 42027 pfam02019: WIF domain. The WIF domain is found in the RYK tyrosine kinase receptors and WIF, the Wnt-inhibitory factor. The domain is extracellular and contains two conserved cysteines that may form a disulphide bridge. This domain is Wnt binding in WIF, and it has been suggested that RYK may also bind to Wnt. 42028 pfam02020: eIF4-gamma/eIF5/eIF2-epsilon. This domain of unknown function is found at the C-terminus of several translation initiation factors. 42029 pfam02021: Uncharacterised protein family UPF0102. The function of this family is unknown. 42030 pfam02022: Integrase Zinc binding domain. Integrase mediates integration of a DNA copy of the viral genome into the host chromosome. Integrase is composed of three domains. This domain is the amino-terminal zinc binding domain. The central domain is the catalytic domain pfam00665. The carboxyl terminal domain is a DNA binding domain pfam00552. 42031 pfam02023: SCAN domain. The SCAN domain (named after SRE-ZBP, CTfin51, AW-1 and Number 18 cDNA) is found in several pfam00096 proteins. The domain has been shown to be able to mediate homo- and hetero-oligomerisation. 42032 pfam02024: Leptin. 42033 pfam02025: Interleukin 5. 42034 pfam02026: RyR domain. This domain is called RyR for Ryanodine receptor. The domain is found in four copies in the ryanodine receptor. 
The function of this domain is unknown. 42035 pfam02027: RolB/RolC glucosidase family. This family of proteins includes RolB and RolC. RolC releases cytokinins from glucoside conjugates, whereas RolB hydrolyses indole glucosides. 42036 pfam02028: BCCT family transporter. 42037 pfam02029: Caldesmon. 42038 pfam02030: Hypothetical lipoprotein (MG045 family). This family includes hypothetical lipoproteins; the amino-terminal part of these proteins is related to pfam01547, a family of solute binding proteins. This suggests this family also has a solute binding function. 42039 pfam02031: Streptomyces extracellular neutral proteinase (M7) family. 42040 pfam02033: Ribosome-binding factor A. 42041 pfam02035: Coagulin. 42042 pfam02036: SCP-2 sterol transfer family. This domain is involved in binding sterols. It is found in the SCP2 protein, as well as the C terminus of the enzyme estradiol 17 beta-dehydrogenase EC:1.1.1.62. The UNC-24 protein contains an SPFH domain pfam01145. 42043 pfam02037: SAP domain. The SAP (after SAF-A/B, Acinus and PIAS) motif is a putative DNA binding domain found in diverse nuclear proteins. 42044 pfam02038: ATP1G1/PLM/MAT8 family. 42045 pfam02039: Adrenomedullin. 42046 pfam02040: Arsenical pump membrane protein. 42047 pfam02041: Auxin binding protein. 42048 pfam02042: RWP-RK domain. This domain is named RWP-RK after a conserved motif at the C terminus of the presumed domain. The domain is found in algal minus dominance proteins as well as plant proteins involved in nitrogen-controlled development. 42049 pfam02043: Bacteriochlorophyll C binding protein. 42050 pfam02044: Bombesin-like peptide. 42051 pfam02045: CCAAT-binding transcription factor (CBF-B/NF-YA) subunit B. 42052 pfam02046: Cytochrome c oxidase subunit VIa. 42053 pfam02048: Heat-stable enterotoxin. 42054 pfam02049: Flagellar hook-basal body complex protein FliE. 42055 pfam02050: Flagellar FliJ protein. 42056 pfam02051: Fragilysin metallopeptidase (M10C) enterotoxin. 
42057 pfam02052: Gallidermin. 42058 pfam02053: Gene 66 (IR5) protein. 42059 pfam02055: O-Glycosyl hydrolase family 30. 42060 pfam02056: Family 4 glycosyl hydrolase. 42061 pfam02057: Glycosyl hydrolase family 59. 42062 pfam02058: Guanylin precursor. 42063 pfam02059: Interleukin-3. 42064 pfam02060: Slow voltage-gated potassium channel. 42065 pfam02061: Lambda Phage CIII. 42066 pfam02063: MARCKS family. 42067 pfam02064: MAS20 protein import receptor. 42068 pfam02065: Melibiase. 42069 pfam02066: Metallothionein family 11. 42070 pfam02067: Metallothionein family 5. 42071 pfam02068: Plant PEC family metallothionein. 42072 pfam02069: Prokaryotic metallothionein. 42073 pfam02070: Neuromedin U. 42074 pfam02071: Aromatic-di-Alanine (AdAR) repeat. This repeat is found in NSF attachment proteins. Its structure is similar to that found in TPR repeats pfam00515. 42075 pfam02072: Prepro-orexin. 42076 pfam02073: Thermophilic metalloprotease (M29). 42077 pfam02074: Carboxypeptidase Taq (M32) metallopeptidase. 42078 pfam02075: Crossover junction endodeoxyribonuclease RuvC. 42079 pfam02076: Pheromone A receptor. 42080 pfam02077: SURF4 family. 42081 pfam02078: Synapsin, N-terminal domain. 42082 pfam02079: Nuclear transition protein 1. 42083 pfam02080: TrkA-C domain. This domain is often found next to the pfam02254 domain. The exact function of this domain is unknown. It has been suggested that it may bind an unidentified ligand. The domain is predicted to adopt an all beta structure. 42084 pfam02081: Tryptophan RNA-binding attenuator protein. 42085 pfam02082: Transcriptional regulator. This family is related to pfam01022 and other transcription regulation families (personal obs: Yeats C). 42086 pfam02083: Urotensin II. 42087 pfam02084: Bindin. 42088 pfam02085: Class III cytochrome C family. 42089 pfam02086: D12 class N6 adenine-specific DNA methyltransferase. 42090 pfam02087: Nitrophorin. 42091 pfam02088: Ornatin. 42092 pfam02089: Palmitoyl protein thioesterase. 
42093 pfam02090: Salmonella surface presentation of antigen gene type M protein. 42094 pfam02091: Glycyl-tRNA synthetase alpha subunit. 42095 pfam02092: Glycyl-tRNA synthetase beta subunit. 42096 pfam02093: Gag P30 core shell protein. P30 is essential for viral assembly. 42097 pfam02095: Extensin-like protein repeat. 42098 pfam02096: 60Kd inner membrane protein. 42099 pfam02097: Filoviridae VP35. 42100 pfam02098: Tick histamine binding protein. 42101 pfam02099: Josephin. 42102 pfam02100: Ornithine decarboxylase antizyme. 42103 pfam02101: Ocular albinism type 1 protein. 42104 pfam02102: Deuterolysin metalloprotease (M35) family. 42105 pfam02103: 36KDa capillovirus serine protease (S35). 42106 pfam02104: SURF1 family. 42107 pfam02106: Fanconi anaemia group C protein. 42108 pfam02107: Flagellar L-ring protein. 42109 pfam02108: Flagellar assembly protein FliH. 42110 pfam02109: DAD family. Members of this family are thought to be integral membrane proteins. Some members of this family have been shown to cause apoptosis if mutated; these proteins are known as DAD for defender against death. The family also includes the epsilon subunit of the oligosaccharyltransferase that is involved in N-linked glycosylation. 42111 pfam02110: Hydroxyethylthiazole kinase family. 42112 pfam02112: cAMP phosphodiesterases class-II. 42113 pfam02113: D-Ala-D-Ala carboxypeptidase 3 (S13) family. 42114 pfam02114: Phosducin. 42115 pfam02115: RHO protein GDP dissociation inhibitor. 42116 pfam02116: Fungal pheromone mating factor STE2 GPCR. 42117 pfam02117: C.elegans Sra family integral membrane protein. 42118 pfam02118: C.elegans Srg family integral membrane protein. 42119 pfam02119: Flagellar P-ring protein. 42120 pfam02120: Flagellar hook-length control protein. 42121 pfam02121: Phosphatidylinositol transfer protein. 42122 pfam02122: Putative replicase 1 (ORF2). 42123 pfam02123: Luteovirus (ORF3) RNA-directed RNA-polymerase. 42124 pfam02124: Marek's disease glycoprotein A. 
42125 pfam02126: Phosphotriesterase family. 42126 pfam02127: Aminopeptidase I zinc metalloprotease (M18). 42127 pfam02128: Fungalysin metallopeptidase (M36). 42128 pfam02129: X-Pro dipeptidyl-peptidase (S15 family). 42129 pfam02130: Uncharacterized protein family UPF0054. 42130 pfam02132: RecR protein. 42131 pfam02133: Permease for cytosine/purines, uracil, thiamine, allantoin. 42132 pfam02134: Repeat in ubiquitin-activating (UBA) protein. 42133 pfam02135: TAZ zinc finger. The TAZ2 domain of CBP binds to other transcription factors such as the p53 tumour suppressor protein, E1A oncoprotein, MyoD, and GATA-1. 42134 pfam02136: Nuclear transport factor 2 (NTF2) domain. This family includes the NTF2-like Delta-5-3-ketosteroid isomerase proteins. 42135 pfam02137: Adenosine-deaminase (editase) domain. 42136 pfam02138: Beige/BEACH domain. 42137 pfam02140: Galactose binding lectin domain. 42138 pfam02141: DENN (AEX-3) domain. DENN (after differentially expressed in neoplastic vs normal cells) is a domain which occurs in several proteins involved in Rab-mediated processes or regulation of MAPK signalling pathways. 42139 pfam02142: MGS-like domain. This domain composes the whole protein of methylglyoxal synthetase and the domain is also found in Carbamoyl phosphate synthetase (CPS) where it forms a regulatory domain that binds to the allosteric effector ornithine. This family also includes inosicase. The known structures in this family show a common phosphate binding site. 42140 pfam02144: Repair protein Rad1/Rec1/Rad17. 42141 pfam02145: Rap/ran-GAP. 42142 pfam02146: Sir2 family. 42143 pfam02148: Zn-finger in ubiquitin-hydrolases and other protein. 42144 pfam02149: Kinase associated domain 1. 42145 pfam02150: RNA polymerases M/15 Kd subunit. 42146 pfam02151: UvrB/uvrC motif. 42147 pfam02152: Dihydroneopterin aldolase. 
This enzyme EC:4.1.2.25 catalyses the conversion of 7,8-dihydroneopterin to 6-hydroxymethyl-7,8-dihydropterin in the biosynthetic pathway of tetrahydrofolate. 42148 pfam02153: Prephenate dehydrogenase. Members of this family are prephenate dehydrogenases EC:1.3.1.12 involved in tyrosine biosynthesis. 42149 pfam02154: Flagellar motor switch protein FliM. 42150 pfam02155: Glucocorticoid receptor. 42151 pfam02156: Glycosyl hydrolase family 26. 42152 pfam02157: Cation-dependent mannose-6-phosphate receptor. 42153 pfam02158: Neuregulin family. 42154 pfam02159: Oestrogen receptor. 42155 pfam02160: Cauliflower mosaic virus peptidase (A3). 42156 pfam02161: Progesterone receptor. 42157 pfam02162: XYPPX repeat. This repeat is found in a wide variety of proteins and generally consists of the motif XYPPX where X can be any amino acid. The family includes annexin VII and the carboxy tail of certain rhodopsins. This family also includes plaque matrix proteins; however, this motif is embedded in a ten residue repeat in one member. The molecular function of this repeat is unknown. It is also not clear if all the members of this family share a common evolutionary ancestor, due to its short length and biased amino acid composition. 42158 pfam02163: Peptidase family M50. 42159 pfam02165: Wilm's tumour protein. 42160 pfam02166: Androgen receptor. 42161 pfam02167: Cytochrome C1 family. 42162 pfam02169: LPP20 lipoprotein precursor. 42163 pfam02170: PAZ domain. This domain is named PAZ after the proteins Piwi Argonaut and Zwille. This domain is found in two families of proteins that are involved in post-transcriptional gene silencing. These are the Piwi family and the Dicer family, that includes the Carpel factory protein. The function of the domains is unknown but has been suggested to mediate complex formation between proteins of the Piwi and Dicer families by hetero-dimerisation. 42164 pfam02171: Piwi domain. This domain is found in the protein Piwi and its relatives. 
The function of this domain is unknown. 42165 pfam02172: KIX domain. CBP and P300 bind to the CREB via a domain known as KIX. The KIX domain of CBP also binds to transactivation domains of other nuclear factors including Myb and Jun. 42166 pfam02173: pKID domain. CBP and P300 bind to the pKID (phosphorylated kinase-inducible-domain) domain of CREB. 42167 pfam02174: PTB domain (IRS-1 type). 42168 pfam02175: C.elegans integral membrane protein Srb. 42169 pfam02176: TRAF-type zinc finger. 42170 pfam02177: Amyloid A4 extracellular domain. 42171 pfam02178: AT hook motif. AT hooks are DNA binding motifs with a preference for A/T rich regions. 42172 pfam02179: BAG domain. Domain present in Hsp70 regulators. 42173 pfam02180: Bcl-2 homology region 4. 42174 pfam02181: Formin Homology 2 Domain. 42175 pfam02182: YDG/SRA domain. The function of this domain is unknown; it contains a conserved motif YDG after which it has been named. 42176 pfam02183: Homeobox associated leucine zipper. 42177 pfam02184: HAT (Half-A-TPR) repeat. The HAT (Half A TPR) repeat is found in several RNA processing proteins. 42178 pfam02185: Hr1 repeat. 42179 pfam02186: TFIIE beta subunit core domain. General transcription factor TFIIE consists of two subunits, TFIIE alpha pfam02002 and TFIIE beta. TFIIE beta has been found to bind to the region where the promoter starts to open to be single-stranded upon transcription initiation by RNA polymerase II. The structure of the DNA binding core region has been solved and has a winged helix fold. 42180 pfam02187: Growth-Arrest-Specific Protein 2 Domain. 42181 pfam02188: GoLoco motif. 42182 pfam02189: Immunoreceptor tyrosine-based activation motif. 42183 pfam02190: ATP-dependent protease La (LON) domain. 42184 pfam02191: Olfactomedin-like domain. 42185 pfam02192: PI3-kinase family, p85-binding domain. 42186 pfam02194: PXA domain. This domain is associated with PX domains pfam00787. 42187 pfam02195: ParB-like nuclease domain. 
42188 pfam02196: Raf-like Ras-binding domain. 42189 pfam02197: Regulatory subunit of type II PKA R-subunit. 42190 pfam02198: Sterile alpha motif (SAM)/Pointed domain. 42191 pfam02199: Saposin A-type domain. 42192 pfam02200: STE like transcription factor. 42193 pfam02201: SWIB/MDM2 domain. This family includes the SWIB domain and the MDM2 domain. The p53-associated protein (MDM2) is an inhibitor of the p53 tumour suppressor gene binding the transactivation domain and down regulating the ability of p53 to activate transcription. This family contains the p53 binding domain of MDM2. 42194 pfam02202: Tachykinin family. 42195 pfam02203: Tar ligand binding domain homologue. 42196 pfam02204: Vacuolar sorting protein 9 (VPS9) domain. 42197 pfam02205: WH2 motif. WASP and Scar1 (mammalian homologue) have been shown to interact with actin via their WH2 motifs (WH2 stands for Wiskott Aldrich syndrome homology region 2). 42198 pfam02206: Domain of unknown function. 42199 pfam02207: Putative zinc finger in N-recognin. 42200 pfam02208: Sorbin homologous domain. 42201 pfam02209: Villin headpiece domain. 42202 pfam02210: Thrombospondin N-terminal-like domain. 42203 pfam02211: Nitrile hydratase beta subunit. Nitrile hydratases EC:4.2.1.84 are unusual metalloenzymes that catalyse the hydration of nitriles to their corresponding amides. They are used as biocatalysts in acrylamide production, one of the few commercial scale bioprocesses, as well as in environmental remediation for the removal of nitriles from waste streams. Nitrile hydratases are composed of two subunits, alpha and beta, and they contain one iron atom per alpha beta unit. 42204 pfam02212: Dynamin GTPase effector domain. 42205 pfam02213: GYF domain. The GYF domain is named because of the presence of Gly-Tyr-Phe residues. The GYF domain is a proline-binding domain in CD2-binding protein. 42206 pfam02214: K+ channel tetramerisation domain. 
The N-terminal, cytoplasmic tetramerisation domain (T1) of voltage-gated K+ channels encodes molecular determinants for subfamily-specific assembly of alpha-subunits into functional tetrameric channels. It is distantly related to the BTB/POZ domain pfam00651. 42207 pfam02216: B domain. This family contains the B domain of Staphylococcal protein A, which specifically binds to the Fc portion of immunoglobulin G. 42208 pfam02217: Origin of replication binding protein. This domain of large T antigen binds to the SV40 origin of DNA replication. 42209 pfam02218: Repeat in HS1/Cortactin. The function of this repeat is unknown. Seven copies are found in cortactin and four copies are found in HS1. The repeats are always found amino terminal to an SH3 domain pfam00018. 42210 pfam02219: Methylenetetrahydrofolate reductase. This family includes the 5,10-methylenetetrahydrofolate reductase EC:1.7.99.5 from bacteria and methylenetetrahydrofolate reductase EC:1.5.1.20 from eukaryotes. The structure for this domain is known to be a TIM barrel. 42211 pfam02221: ML domain. ML domain - MD-2-related lipid recognition domain. This family consists of proteins from plants, animals and fungi, including dust mite allergen Der P 2. It has been implicated in lipid recognition, particularly in the recognition of pathogen related products. A mutation in Npc2 causes a rare form of Niemann-Pick type C2 disease. 42212 pfam02222: ATP-grasp domain. This family does not contain all known ATP-grasp domain members. This family includes a diverse set of enzymes that possess ATP-dependent carboxylate-amine ligase activity. 42213 pfam02223: Thymidylate kinase. 42214 pfam02224: Cytidylate kinase. Cytidylate kinase EC:2.7.4.14 catalyses the phosphorylation of cytidine 5'-monophosphate (dCMP) to cytidine 5'-diphosphate (dCDP) in the presence of ATP or GTP. 42215 pfam02225: PA domain. The PA (Protease associated) domain is found as an insert domain in diverse proteases. 
The PA domain is also found in a plant vacuolar sorting receptor and members of the RZF family. 42216 pfam02226: Picornavirus coat protein (VP4). VP1, VP2, VP3 and VP4 form the basic unit that forms the icosahedral coat of picornaviruses. Five symmetry-related N termini of coat protein VP4 form a ten-stranded, antiparallel beta barrel around the base of the icosahedral fivefold axis. 42217 pfam02228: Major core protein p19. p19 is a component of the inner protein layer of the viral nucleocapsid. 42218 pfam02229: Transcriptional Coactivator p15 (PC4). p15 has a bipartite structure composed of an amino-terminal regulatory domain and a carboxy-terminal cryptic DNA-binding domain. The DNA-binding activity of the carboxy-terminal is disguised by the amino-terminal p15 domain. Activity is controlled by protein kinases that target the regulatory domain. 42219 pfam02230: Phospholipase/Carboxylesterase. This family consists of both phospholipases and carboxylesterases with broad substrate specificity, and is structurally related to alpha/beta hydrolases pfam00561. 42220 pfam02231: PHP domain N-terminal region. The PHP (Polymerase and Histidinol Phosphatase) domain is a putative phosphoesterase domain. This family is often associated with a C terminal region. 42221 pfam02232: Alpha trans-inducing protein (Alpha-TIF). Alpha-TIF, a virion protein (VP16), is involved in transcriptional activation of viral immediate early (IE) promoters (alpha genes). Specificity of one member for IE genes is conferred by the 400 residue N-terminal region; the 80 residue C-terminal region is responsible for transcriptional activation. 42222 pfam02233: NAD(P) transhydrogenase beta subunit. This family corresponds to the beta subunit of NADP transhydrogenase in prokaryotes, and either the N- or C-terminal part of the protein in eukaryotes. The domain is often found in conjunction with pfam01262. Pyridine nucleotide transhydrogenase catalyses the reduction of NADP+ to NADPH. 
A complete loss of activity occurs upon mutation of Gly314 in E. coli. 42223 pfam02234: Cyclin-dependent kinase inhibitor. Cell cycle progression is negatively controlled by cyclin-dependent kinase inhibitors (CDIs). CDIs are involved in cell cycle arrest at the G1 phase. 42224 pfam02236: Viral DNA-binding protein, all alpha domain. This family represents a domain of the viral DNA-binding protein, a multi functional protein involved in DNA replication and transcription control. 42225 pfam02237: Biotin protein ligase C terminal domain. The function of this structural domain is unknown. It is found to the C terminus of the biotin protein ligase catalytic domain pfam01317. 42226 pfam02238: Cytochrome c oxidase subunit VIIa. Cytochrome c oxidase, a 13 sub-unit complex, is the terminal oxidase in the mitochondrial electron transport chain. This family is composed of the heart and liver isoforms of cytochrome c oxidase subunit VIIa. 42227 pfam02239: Cytochrome D1 heme domain. Cytochrome cd1 (nitrite reductase) catalyses the conversion of nitrite to nitric oxide in the nitrogen cycle. This family represents the d1 heme binding domain of cytochrome cd1, in which His/Tyr side chains ligate the d1 heme iron of the active site in the oxidised state. 42228 pfam02240: Methyl-coenzyme M reductase gamma subunit. Methyl-coenzyme M reductase (MCR) is the enzyme responsible for microbial formation of methane. It is a hexamer composed of 2 alpha (pfam02249), 2 beta (pfam02241), and 2 gamma (this family) subunits with two identical nickel porphinoid active sites. 42229 pfam02241: Methyl-coenzyme M reductase beta subunit, C-terminal domain. Methyl-coenzyme M reductase (MCR) is the enzyme responsible for microbial formation of methane. It is a hexamer composed of 2 alpha (pfam02249), 2 beta (this family), and 2 gamma (pfam02240) subunits with two identical nickel porphinoid active sites. The C-terminal domain of MCR beta has an all-alpha fold with buried central helix. 
42230 pfam02244: Carboxypeptidase activation peptide. Carboxypeptidases are found in abundance in pancreatic secretions. The pro-segment moiety (activation peptide) accounts for up to a quarter of the total length of the peptidase, and is responsible for modulation of folding and activity of the pro-enzyme. 42231 pfam02245: Methylpurine-DNA glycosylase (MPG). Methylpurine-DNA glycosylase is a base excision-repair protein. It is responsible for the hydrolysis of the deoxyribose N-glycosidic bond, excising 3-methyladenine and 3-methylguanine from damaged DNA. 42232 pfam02246: Protein L b1 domain. Protein L is a bacterial protein with immunoglobulin (Ig) light chain-binding properties. It contains a number of homologous b1 repeats towards the N-terminus. These repeats have been found to be responsible for the interaction of protein L with Ig light chains. 42233 pfam02247: Large coat protein. This family contains the large coat protein (LCP) of the comoviridae viral family. 42234 pfam02248: Small coat protein. This family contains the small coat protein (SCP) of the comoviridae viral family. 42235 pfam02249: Methyl-coenzyme M reductase alpha subunit, C-terminal domain. Methyl-coenzyme M reductase (MCR) is the enzyme responsible for microbial formation of methane. It is a hexamer composed of 2 alpha (this family), 2 beta (pfam02241), and 2 gamma (pfam02240) subunits with two identical nickel porphinoid active sites. The C-terminal domain is comprised of an all-alpha multi-helical bundle. 42236 pfam02250: 35kD major secreted virus protein. This family of orthopoxvirus secreted proteins (also known as T1 and A41) interact with members of both the CC and CXC superfamilies of chemokines. It has been suggested that these secreted proteins modulate leukocyte influx into virus-infected tissues. 42237 pfam02251: Proteasome activator pa28 alpha subunit. 
PA28 activator complex (also known as 11s regulator of 20S proteasome) is a ring shaped hexameric structure of alternating alpha and beta subunits. This family represents the alpha subunit. The activator complex binds to the 20S proteasome and stimulates peptidase activity in an ATP-independent manner. 42238 pfam02252: Proteasome activator pa28 beta subunit. PA28 activator complex (also known as 11s regulator of 20S proteasome) is a ring shaped hexameric structure of alternating alpha and beta subunits. This family represents the beta subunit. The activator complex binds to the 20S proteasome and stimulates peptidase activity in an ATP-independent manner. 42239 pfam02253: Phospholipase A1. Phospholipase A1 is a bacterial outer membrane bound acyl hydrolase with a broad substrate specificity EC:3.1.1.32. It has been proposed that a conserved Serine is the active site. 42240 pfam02254: TrkA-N domain. This domain is found in a wide variety of proteins. These proteins include potassium channels, phosphoesterases, and various other transporters. This domain binds to NAD. 42241 pfam02255: PTS system, Lactose/Cellobiose specific IIA subunit. The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. The lactose/cellobiose-specific family is one of four structurally and functionally distinct group IIA PTS system enzymes. This family of proteins normally functions as a homotrimer, stabilised by a centrally located metal ion. Separation into subunits is thought to occur after phosphorylation. 42242 pfam02256: Iron hydrogenase small subunit. This family represents the small subunit of the Fe-only hydrogenases EC:1.18.99.1. The subunit is comprised of alternating random coil and alpha helical structures that encompass the large subunit in a novel protein fold. 42243 pfam02257: RFX DNA-binding domain. 
RFX is a regulatory factor which binds to the X box of MHC class II genes and is essential for their expression. The DNA-binding domain of RFX is the central domain of the protein and binds ssDNA as either a monomer or homodimer. 42244 pfam02258: Shiga-like toxin beta subunit. This family represents the B subunit of shiga-like toxin (SLT or verotoxin) produced by some strains of E. coli associated with hemorrhagic colitis and hemolytic uremic syndrome. SLTs are composed of one enzymatic A subunit and five cell binding B subunits. 42245 pfam02259: FAT domain. The FAT domain is named after FRAP, ATM and TRRAP. 42246 pfam02260: FATC domain. The FATC domain is named after FRAP, ATM, TRRAP C-terminal. 42247 pfam02261: Aspartate decarboxylase. Decarboxylation of aspartate is the major route of beta-alanine production in bacteria, and is catalysed by the enzyme aspartate decarboxylase EC:4.1.1.11 which requires a pyruvoyl group for its activity. It is synthesised initially as a proenzyme which is then proteolytically cleaved to an alpha (C-terminal) and beta (N-terminal) subunit and a pyruvoyl group. This family contains both chains of aspartate decarboxylase. 42248 pfam02262: CBL proto-oncogene N-terminal domain 1. Cbl is an adaptor protein that binds EGF receptors (or other tyrosine kinases) and SH3 domains, functioning as a negative regulator of many signaling pathways. The N-terminal domain is evolutionarily conserved, and is known to bind to phosphorylated tyrosine residues. Cbl_N is comprised of 3 structural domains of which this is the first - a four helix bundle. 42249 pfam02263: Guanylate-binding protein, N-terminal domain. Transcription of the anti-viral guanylate-binding protein (GBP) is induced by interferon-gamma during macrophage induction. This family contains GBP1 and GBP2, both GTPases capable of binding GTP, GDP and GMP. 42250 pfam02264: LamB porin. 
Maltoporin (LamB protein) forms a trimeric structure which facilitates the diffusion of maltodextrins across the outer membrane of Gram-negative bacteria. The membrane channel is formed by an antiparallel beta-barrel. 42251 pfam02265: S1/P1 Nuclease. This family contains both S1 and P1 nucleases (EC:3.1.30.1) which cleave RNA and single stranded DNA with no base specificity. 42252 pfam02267: ADP-ribosyl cyclase. ADP-ribosyl cyclase EC:3.2.2.5 (also known as cyclic ADP-ribose hydrolase or CD38) synthesises cyclic-ADP ribose, a second messenger for glucose-induced insulin secretion. 42253 pfam02268: Transcription initiation factor IIA, gamma subunit, helical domain. Accurate transcription in vivo requires at least six general transcription initiation factors, in addition to RNA polymerase II. Transcription initiation factor IIA (TFIIA) is a multimeric protein which facilitates the binding of TFIID to the TATA box. The N-terminal domain of the gamma subunit is a 4 helix bundle. 42254 pfam02269: Transcription initiation factor IID, 18kD subunit. This family includes the Spt3 yeast transcription factors and the 18kD subunit from human transcription initiation factor IID (TFIID-18). Determination of the crystal structure reveals an atypical histone fold. 42255 pfam02270: Transcription initiation factor IIF, beta subunit. Accurate transcription in vivo requires at least six general transcription initiation factors, in addition to RNA polymerase II. Transcription initiation factor IIF (TFIIF) is a tetramer of two beta subunits associated with two alpha subunits, which interacts directly with RNA polymerase II. The beta subunit of TFIIF is required for recruitment of RNA polymerase II onto the promoter. 42256 pfam02271: Ubiquinol-cytochrome C reductase complex 14kD subunit. The ubiquinol-cytochrome C reductase complex (cytochrome bc1 complex) is a respiratory multienzyme complex. 
This Pfam family represents the 14kD (or VI) subunit of the complex which is not directly involved in electron transfer, but has a role in assembly of the complex. 42257 pfam02272: DHHA1 domain. This domain is often found adjacent to the DHH domain pfam01368 and is called DHHA1 for DHH associated domain. This domain is diagnostic of DHH subfamily 1 members. This domain is also found in alanyl tRNA synthetase, suggesting that this domain may have an RNA binding function. The domain is about 60 residues long and contains a conserved GG motif. 42258 pfam02273: Acyl transferase. This bacterial family of Acyl transferases (or myristoyl-acp-specific thioesterases) catalyse the first step in the bioluminescent fatty acid reductase system. 42259 pfam02274: Amidinotransferase. This family contains glycine (EC:2.1.4.1) and inosamine (EC:2.1.4.2) amidinotransferases, enzymes involved in creatine and streptomycin biosynthesis respectively. This family also includes arginine deiminases, EC:3.5.3.6. These enzymes catalyse the reaction: arginine + H2O <=> citrulline + NH3. Also found in this family is the Streptococcus anti tumour glycoprotein. 42260 pfam02275: Linear amide C-N hydrolases, choloylglycine hydrolase family. This family includes several hydrolases which cleave carbon-nitrogen bonds, other than peptide bonds, in linear amides. These include choloylglycine hydrolase (conjugated bile acid hydrolase, CBAH) EC:3.5.1.24, penicillin acylase EC:3.5.1.11 and acid ceramidase EC:3.5.1.23. 42261 pfam02276: Photosynthetic reaction centre cytochrome C subunit. Photosynthesis in purple bacteria is dependent on light-induced electron transfer in the reaction centre (RC), coupled to the uptake of protons from the cytoplasm. The RC contains a cytochrome molecule which re-reduces the oxidised electron donor. 42262 pfam02277: Phosphoribosyltransferase. 
This family of proteins represents the nicotinate-nucleotide-dimethylbenzimidazole phosphoribosyltransferase (NN:DBI PRT) enzymes involved in dimethylbenzimidazole synthesis. This function is essential to de novo cobalamin (vitamin B12) production in bacteria. Nicotinate mononucleotide (NaMN):5,6-dimethylbenzimidazole (DMB) phosphoribosyltransferase (CobT) from Salmonella enterica plays a central role in the synthesis of alpha-ribazole-5'-phosphate, an intermediate for the lower ligand of cobalamin. 42263 pfam02278: Polysaccharide lyase family 8, super-sandwich domain. This family consists of a group of secreted bacterial lyase enzymes EC:4.2.2.1 capable of acting on hyaluronan and chondroitin in the extracellular matrix of host tissues, contributing to the invasive capacity of the pathogen. 42264 pfam02281: Transposase Tn5 dimerisation domain. Transposons are mobile DNA sequences capable of replication and insertion into the chromosome. Typically transposons code for the transposase enzyme, which catalyses insertion, found between terminal inverted repeats. Tn5 has a unique method of self-regulation in which a truncated version of the transposase enzyme acts as an inhibitor. The catalytic domain of the Tn5 transposon is found in pfam01609. This domain mediates dimerisation in the known structure. 42265 pfam02282: DNA polymerase processivity factor (UL42). The DNA polymerase processivity factor (UL42) of herpes simplex virus forms a heterodimer with UL30 to create the viral DNA polymerase complex. UL42 functions to increase the processivity of polymerisation and makes little contribution to the catalytic activity of the polymerase. 42266 pfam02283: Cobinamide kinase / cobinamide phosphate guanyltransferase. This family is composed of a group of bifunctional cobalamin biosynthesis enzymes which display cobinamide kinase and cobinamide phosphate guanyltransferase activity. 
The crystal structure of the enzyme reveals the molecule to be a trimer with a propeller-like shape. 42267 pfam02284: Cytochrome c oxidase subunit Va. Cytochrome c oxidase, a 13 sub-unit complex, EC:1.9.3.1 is the terminal oxidase in the mitochondrial electron transport chain. This family is composed of cytochrome c oxidase subunit Va. 42268 pfam02285: Cytochrome oxidase c subunit VIII. Cytochrome c oxidase, a 13 sub-unit complex, EC:1.9.3.1 is the terminal oxidase in the mitochondrial electron transport chain. This family is composed of cytochrome c oxidase subunit VIII. 42269 pfam02286: Dehydratase large subunit. This family contains the large subunit of the trimeric diol dehydratases and glycerol dehydratases. These enzymes are produced by some enterobacteria in response to growth substances. 42270 pfam02287: Dehydratase small subunit. This family contains the small subunit of the trimeric diol dehydratases and glycerol dehydratases. These enzymes are produced by some enterobacteria in response to growth substances. 42271 pfam02288: Dehydratase medium subunit. This family contains the medium subunit of the trimeric diol dehydratases and glycerol dehydratases. These enzymes are produced by some enterobacteria in response to growth substances. 42272 pfam02289: Cyclohydrolase (MCH). Methenyl tetrahydromethanopterin cyclohydrolase EC:3.5.4.27 is involved in methanogenesis in bacteria and archaea, producing methane from carbon monoxide or carbon dioxide. 42273 pfam02290: Signal recognition particle 14kD protein. The signal recognition particle (SRP) is a multimeric protein involved in targeting secretory proteins to the rough endoplasmic reticulum membrane. SRP14 and SRP9 form a complex essential for SRP RNA binding. 42274 pfam02291: Transcription initiation factor IID, 31kD subunit. This family represents the N-terminus of the 31kD subunit (42kD in drosophila) of transcription initiation factor IID (TAFII31). 
TAFII31 binds to p53, and is an essential requirement for p53 mediated transcription activation. 42275 pfam02292: APSES domain. This DNA-binding domain is found in several yeast proteins involved in transcriptional regulation. Often these proteins also contain the pfam00023 domain. The resolved structure of this domain reveals a DNA-binding motif characteristic of the CAP family of helix-turn-helix transcription factors. APSES domains share a common fold with the nucleic acid-binding modules of the LAGLIDADG nucleases and the amino-terminal domains of the tRNA endonuclease. 42276 pfam02293: AmiS/UreI family transporter. This family includes UreI and proton gated urea channel as well as putative amide transporters. 42277 pfam02294: 7kD DNA-binding domain. This family contains members of the hyper-thermophilic archaebacterium 7kD DNA-binding/endoribonuclease P2 family. There are five 7kD DNA-binding proteins, 7a-7e, found as monomers in the cell. Protein 7e shows the tightest DNA-binding ability. 42278 pfam02295: Adenosine deaminase z-alpha domain. This family consists of the N-terminus and thus the z-alpha domain of double-stranded RNA-specific adenosine deaminase (ADAR), an RNA-editing enzyme. The z-alpha domain is a Z-DNA binding domain, and binding of this region to B-DNA has been shown to be disfavoured by steric hindrance. 42279 pfam02296: Alpha adaptin AP2, C-terminal domain. Alpha adaptin is a heterotetramer which regulates clathrin-bud formation. The carboxyl-terminal appendage of the alpha subunit regulates translocation of endocytic accessory proteins to the bud site. 42280 pfam02297: Cytochrome oxidase c subunit VIb. Cytochrome c oxidase, a 13 sub-unit complex, EC:1.9.3.1 is the terminal oxidase in the mitochondrial electron transport chain. This family is composed of the potentially heme-binding subunit VIb of the oxidase. 42281 pfam02298: Plastocyanin-like domain. 
This family represents a domain found in flowering plants related to the copper binding protein plastocyanin. Some members of this family may not bind copper due to the lack of key residues. 42282 pfam02300: Fumarate reductase subunit C. Fumarate reductase is a membrane-bound flavoenzyme consisting of four subunits, A-D. A and B comprise the membrane-extrinsic catalytic domain and C and D link the catalytic centres to the electron-transport chain. This family consists of the 15kD hydrophobic subunit C. 42283 pfam02301: HORMA domain. The HORMA (for Hop1p, Rev7p and MAD2) domain has been suggested to recognise chromatin states that result from DNA adducts, double stranded breaks or non-attachment to the spindle and acts as an adaptor that recruits other proteins. MAD2 is a spindle checkpoint protein which prevents progression of the cell cycle upon detection of a defect in mitotic spindle integrity. 42284 pfam02302: PTS system, Lactose/Cellobiose specific IIB subunit. The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. The lactose/cellobiose-specific family is one of four structurally and functionally distinct group IIB PTS system cytoplasmic enzymes. The fold of IIB cellobiose shows similar structure to mammalian tyrosine phosphatases. 42285 pfam02303: Helix-destabilising protein. This family contains the bacteriophage helix-destabilising protein, or single-stranded DNA binding protein, required for DNA synthesis. 42286 pfam02304: Scaffold protein B. This is a family of proteins from single-stranded DNA bacteriophages. Scaffold proteins B and D are required for procapsid formation. Sixty copies of the internal scaffold protein B are found in the procapsid. 42288 pfam02306: Major spike protein (G protein). This is a family of proteins from single-stranded DNA bacteriophages. 
Five G proteins, each a tight beta barrel, form each of the twelve surface spikes. 42289 pfam02307: Telomere-binding protein alpha subunit, N-terminal domain. The telomere-binding protein forms a heterodimer in ciliates consisting of an alpha and a beta subunit. This complex may function as a protective cap for the single-stranded telomeric overhang. Alpha subunit consists of 3 structural domains, all of the same beta-barrel OB fold. 42290 pfam02308: MgtC family. The MgtC protein is found in an operon with the Mg2+ transporter protein MgtB. The function of MgtC and its homologues is not known. 42291 pfam02309: AUX/IAA family. Transcription of the AUX/IAA family of genes is rapidly induced by the plant hormone auxin. Some members of this family are longer and contain an N terminal DNA binding domain. The function of this region is uncertain. 42292 pfam02310: B12 binding domain. This domain binds to B12 (adenosylcobamide); it is found in several enzymes, such as glutamate mutase, methionine synthase and methylmalonyl-CoA mutase. 42293 pfam02311: AraC-like ligand binding domain. This family represents the arabinose-binding and dimerisation domain of the bacterial gene regulatory protein AraC. The domain is found in conjunction with the helix-turn-helix (HTH) DNA-binding motif pfam00165. This domain is distantly related to the Cupin domain pfam00190. 42294 pfam02312: Core binding factor beta subunit. Core binding factor (CBF) is a heterodimeric transcription factor essential for genetic regulation of hematopoiesis and osteogenesis. The beta subunit enhances DNA-binding ability of the alpha subunit in vitro, and has been shown to have a structure related to the OB fold. 42295 pfam02313: Fumarate reductase subunit D. Fumarate reductase is a membrane-bound flavoenzyme consisting of four subunits, A-D. A and B comprise the membrane-extrinsic catalytic domain and C and D link the catalytic centres to the electron-transport chain. This family consists of the 13kD hydrophobic subunit D. 
42296 pfam02315: Methanol dehydrogenase beta subunit. Methanol dehydrogenase (MDH) is a bacterial periplasmic quinoprotein that oxidises methanol to formaldehyde. MDH is a tetramer of two alpha and two beta subunits. This family contains the small beta subunit. 42297 pfam02316: Mu DNA-binding domain. This family consists of MuA-transposase and repressor protein CI. These proteins contain homologous DNA-binding domains at their N-termini which compete for the same DNA site within the Mu bacteriophage genome. 42298 pfam02317: NAD/NADP octopine/nopaline dehydrogenase, alpha-helical domain. This group of enzymes act on the CH-NH substrate bond using NAD(+) or NADP(+) as an acceptor. The Pfam family consists mainly of octopine and nopaline dehydrogenases from Ti plasmids. 42299 pfam02318: Rabphilin-3A effector domain. This is a family of proteins involved in protein transport in synaptic vesicles. Rabphilin-3A has been shown to contact Rab3A, a small G protein important in neurotransmitter release, in two distinct areas. 42300 pfam02319: Transcription factor E2F/dimerisation partner (TDP). This family contains the transcription factor E2F and its dimerisation partners TDP1 and TDP2, which stimulate E2F-dependent transcription. E2F binds to DNA as a homodimer or as a heterodimer in association with TDP1/2, the heterodimer having increased binding efficiency. 42301 pfam02320: Ubiquinol-cytochrome C reductase hinge protein. The ubiquinol-cytochrome C reductase complex (cytochrome bc1 complex) is a respiratory multienzyme complex. This Pfam family represents the 'hinge' protein of the complex which is thought to mediate formation of the cytochrome c1 and cytochrome c complex. 42302 pfam02321: Outer membrane efflux protein. The OEP family (Outer membrane efflux protein) form trimeric channels that allow export of a variety of substrates in Gram negative bacteria. Each member of this family is composed of two repeats. 
The trimeric channel is composed of a 12 stranded all beta sheet barrel that spans the outer membrane, and a long all helical barrel that spans the periplasm. 42303 pfam02322: Cytochrome oxidase subunit II. This family consists of cytochrome bd type terminal oxidases that catalyse quinol-dependent, Na+-independent oxygen uptake. Members of this family are integral membrane proteins and contain a protohaem IX centre B558. One member of the family is implicated in having an important role in microaerobic nitrogen fixation in the enteric bacterium Klebsiella pneumoniae. 42304 pfam02323: Egg-laying hormone precursor. This family consists of egg-laying hormone (ELH) precursor and atrial gland peptides from little and California sea hare. The family also includes ovulation prohormone precursor from great pond snail. This family thus represents a conserved gastropod ovulation and egg production prohormone. Note that many of the proteins present are further cleaved to give individual peptides. Neuropeptidergic bag cells of the marine mollusk Aplysia californica synthesise an egg-laying hormone (ELH) precursor protein which is cleaved to generate several bioactive peptides including ELH, bag cell peptides (BCP) and acidic peptide (AP). 42305 pfam02324: Glycosyl hydrolase family 70. Members of this family belong to glycosyl hydrolase family 70. Glucosyltransferases or sucrose 6-glycosyl transferases (GTF-S) catalyse the transfer of D-glucopyranosyl units from sucrose onto acceptor molecules, EC:2.4.1.5. This family roughly corresponds to the N-terminal catalytic domain of the enzyme. Members of this family also contain the putative cell wall binding domain pfam01473, which corresponds with the C-terminal glucan-binding domain. 42306 pfam02325: YGGT family. This family consists of a repeat found in conserved hypothetical integral membrane proteins. The function of this region and the proteins which possess it is unknown. 
42307 pfam02326: YMF19 hypothetical plant mitochondrial protein. This family consists of hypothetical proteins which are similar to the mitochondrial membrane protein YMF19. One sequence is annotated as ATP Synthase F0 subunit 8 EC:3.6.1.34, although there is no experimental evidence to support this; another sequence may be involved in cytoplasmic male sterility (CMS) in Brassica. 42308 pfam02327: Bacteriochlorophyll A protein. Bacteriochlorophyll A protein is involved in the energy transfer system of green photosynthetic bacteria. The protein forms a homotrimer, with each monomer unit containing seven molecules of bacteriochlorophyll A. 42309 pfam02328: Vanadium chloroperoxidase. This family of enzymes functions by oxidising halides in the presence of hydrogen peroxide to form the corresponding hypohalous acids. 42310 pfam02329: Histidine carboxylase PI chain. Histidine decarboxylase catalyses the formation of histamine from histidine. Cleavage of the proenzyme PI chain yields two subunits, alpha and beta, which arrange as a hexamer (alpha beta)6. 42311 pfam02330: Mitochondrial glycoprotein. This mitochondrial matrix protein family contains members of the MAM33 family which bind to the globular 'heads' of C1Q. It is thought to be involved in mitochondrial oxidative phosphorylation and in nucleus-mitochondrion interactions. 42312 pfam02331: Apoptosis preventing protein. This viral protein functions to block the host apoptotic response caused by infection by the virus. The apoptosis preventing protein (or early 35kD protein, P35) acts by blocking caspase protease activity. 42313 pfam02332: Methane/Phenol/Toluene Hydroxylase. Bacterial phenol hydroxylase is a multicomponent enzyme that catabolises phenol and some of its methylated derivatives. This Pfam family contains both the P1 and P3 polypeptides of phenol hydroxylase and the alpha and beta chain of methane hydroxylase protein A. 42314 pfam02333: Phytase. 
Phytase is a secreted enzyme which hydrolyses phytate to release inorganic phosphate. This family appears to represent a novel enzyme that shows phytase activity and has been shown to have a six-bladed propeller folding architecture. 42315 pfam02334: Replication terminator protein. The bacterial replication terminator protein (RTP) plays a role in the termination of DNA replication by impeding replication fork movement. Two RTP dimers bind to the two inverted repeat regions at the termination site. 42316 pfam02335: Cytochrome c552. Cytochrome c552 (cytochrome c nitrite reductase) is a crucial enzyme in the nitrogen cycle catalysing the reduction of nitrite to ammonia. The crystal structure of cytochrome c552 reveals it to be a dimer, with 10 close-packed type c haem groups. 42317 pfam02336: Capsid protein VP4. Four different translation initiation sites of the densovirus capsid protein mRNA give rise to four viral proteins, VP1 to VP4. This family represents VP4. 42318 pfam02337: Retroviral GAG p10 protein. This family consists of various retroviral GAG (core) polyproteins and encompasses the p10 region producing the p10 protein upon proteolytic cleavage of GAG by retroviral protease. The p10 or matrix protein (MA) is associated with the virus envelope glycoproteins in most mammalian retroviruses and may be involved in virus particle assembly, transport and budding. Some of the GAG polyproteins have alternate cleavage sites leading to the production of alternative and longer cleavage products (e.g. p19). The alignment of this family only covers the approximately 100 amino acid N-terminal (GAG) region of homology to p10. 42319 pfam02338: OTU-like cysteine protease. This family is comprised of a group of predicted cysteine proteases, homologous to the Ovarian Tumour (OTU) gene in Drosophila. Members include proteins from eukaryotes, viruses and pathogenic bacteria. 
The conserved cysteine and histidine, and possibly the aspartate, represent the catalytic residues in this putative group of proteases. 42320 pfam02339: Aggrecan core protein repeat. This short repeat is found in the aggrecan core protein. The function of this repeat is unknown. The consensus of this repeat is SGXXSGXXXX where X can be any amino acid. 42321 pfam02340: PRRSV putative envelope protein. This family consists of a conserved probable envelope protein or ORF2 in porcine reproductive and respiratory syndrome virus (PRRSV). Also in the family is a minor structural protein from lactate dehydrogenase-elevating virus. 42322 pfam02341: RbcX protein. The RbcX protein has been identified as having a possible chaperonin-like function. The rbcX gene is juxtaposed to and cotranscribed with rbcL and rbcS encoding RuBisCO in Anabaena sp. CA. RbcX has been shown to possess a chaperonin-like function assisting correct folding of RuBisCO in E. coli expression studies and is needed for RuBisCO to reach its maximal activity. 42323 pfam02342: Bacterial stress protein. This family consists of tellurite resistance proteins, cAMP binding protein, and chemical-damaging agent resistance proteins and general stress proteins. The biochemical function of tellurite resistance proteins is not known, other than that their presence as an active protein is required for growth on the otherwise highly toxic potassium tellurite medium. The family also contains a novel cyclic AMP binding protein subunit CABP1 from slime mold. 42324 pfam02343: Domain of unknown function DUF130. This family has no known function; it consists of C. elegans proteins and is present as a repeat in some members. The aligned region has 4 conserved cysteine residues and is a maximum of 175 residues long. 42325 pfam02344: Myc leucine zipper domain. This family consists of the leucine zipper dimerisation domain found in both cellular c-Myc proto-oncogenes and viral v-Myc oncogenes. 
Dimerisation via the leucine zipper motif with other basic helix-loop-helix-leucine zipper (b/HLH/lz) proteins such as Max is required for efficient DNA binding. The Myc-Max dimer is a transactivating complex activating expression of growth related genes promoting cell proliferation. The dimerisation is facilitated via interdigitating leucine residues at every 7th position of the alpha helix. Like charge repulsion of adjacent residues in this region perturbs the formation of homodimers, with heterodimers being promoted by opposing charge attractions. 42326 pfam02345: TILa domain. This cysteine rich domain occurs alongside the TIL pfam01826 domain and is likely to be a distant relative. This domain is found five to twenty-five times in zonadhesins. The TILa domain is also found twice in an IgG Fc binding protein. 42327 pfam02346: Chordopoxvirus fusion protein. This is a family of viral fusion proteins from the chordopoxviruses. A 14-kDa Vaccinia Virus protein has been demonstrated to function as a viral fusion protein mediating cell fusion at endosomal (low) pH. 42328 pfam02347: Glycine cleavage system P-protein. This family consists of Glycine cleavage system P-proteins EC:1.4.4.2 from bacterial, mammalian and plant sources. The P protein is part of the glycine decarboxylase multienzyme complex EC:2.1.2.10 (GDC) also annotated as glycine cleavage system or glycine synthase. GDC consists of four proteins P, H, L and T. The reaction catalysed by this protein is:- Glycine + lipoylprotein <=> S-aminomethyldihydrolipoylprotein + CO2. 42329 pfam02348: Cytidylyltransferase. This family consists of two main Cytidylyltransferase activities: 1) 3-deoxy-manno-octulosonate cytidylyltransferase, EC:2.7.7.38 catalysing the reaction:- CTP + 3-deoxy-D-manno-octulosonate <=> diphosphate + CMP-3-deoxy-D-manno-octulosonate, 2) acylneuraminate cytidylyltransferase EC:2.7.7.43, catalysing the reaction:- CTP + N-acylneuraminate <=> diphosphate + CMP-N-acylneuraminate. 
NeuAc cytidylyltransferase of Mannheimia haemolytica has been characterised, describing kinetics and regulation by substrate charge, energetic charge and amino-sugar demand. 42330 pfam02349: Major surface glycoprotein. This is a novel repeat in Pneumocystis carinii Major surface glycoprotein (MSG). Some members of the alignment have up to nine repeats of this family, the repeats containing several conserved cysteines. The MSG of P. carinii is an important protein in host-pathogen interactions. Surface glycoprotein A from Pneumocystis carinii is a main target for the host immune system; this protein is implicated in the attachment of Pneumocystis carinii to the host alveolar epithelial cells, alveolar macrophages and host surfactant, and possibly accounts in part for the hypoxia seen in Pneumocystis carinii pneumonia (PCP). 42331 pfam02350: UDP-N-acetylglucosamine 2-epimerase. This family consists of UDP-N-acetylglucosamine 2-epimerases EC:5.1.3.14; this enzyme catalyses the production of UDP-ManNAc from UDP-GlcNAc. Note that some of the enzymes in this family are bifunctional; in this instance Pfam matches only the N-terminal half of the protein, suggesting that the additional C-terminal part (when compared to mono-functional members of this family) is responsible for the UDP-N-acetylmannosamine kinase activity of these enzymes. This hypothesis is further supported by the assumption that the C-terminal part of respective members is the kinase domain. 42332 pfam02351: GDNF receptor family. This family consists of receptors for glial-cell-line-derived neurotrophic factor (GDNF) and neurturin (NTN), which are potent survival factors for sympathetic, sensory and central nervous system neurons. GDNF and neurturin promote neuronal survival by signaling through similar multicomponent receptors that consist of a common receptor tyrosine kinase and a member of a GPI-linked family of receptors that determines ligand specificity. 42333 pfam02352: Decorin binding protein. 
This family consists of decorin binding proteins from Borrelia. The decorin binding protein of Borrelia burgdorferi, the Lyme disease spirochete, adheres to the proteoglycan decorin found on collagen fibres. 42334 pfam02353: Cyclopropane-fatty-acyl-phospholipid synthase. This family consists of cyclopropane-fatty-acyl-phospholipid synthase or CFA synthase EC:2.1.1.79; this enzyme catalyses the reaction: S-adenosyl-L-methionine + phospholipid olefinic fatty acid <=> S-adenosyl-L-homocysteine + phospholipid cyclopropane fatty acid. 42335 pfam02354: Latrophilin Cytoplasmic C-terminal region. This family consists of the cytoplasmic C-terminal region in latrophilin. Latrophilin is a synaptic Ca2+ independent alpha-latrotoxin (LTX) receptor and is a novel member of the secretin family of G-protein coupled receptors that are involved in secretion. Latrophilin mRNA is present only in neuronal tissue. Latrophilin interacts with G-alpha O. 42336 pfam02355: Protein export membrane protein. This family consists of various prokaryotic SecD and SecF protein export membrane proteins. The SecD and SecF proteins are part of the multimeric protein export complex comprising SecA, D, E, F, G, Y, and YajC. SecD and SecF are required to maintain a proton motive force. 42337 pfam02357: Transcription termination factor nusG. 42338 pfam02358: Trehalose-phosphatase. This family consists of trehalose-phosphatases EC:3.1.3.12; these enzymes catalyse the dephosphorylation of trehalose-6-phosphate to trehalose and orthophosphate. The aligned region is present in trehalose-phosphatases and comprises the entire length of the protein; it is also found in the C-terminus of trehalose-6-phosphate synthase EC:2.4.1.15 adjacent to the trehalose-6-phosphate synthase domain - pfam00982. It would appear that the two equivalent genes in the E. coli otsBA operon, otsA the trehalose-6-phosphate synthase, and otsB trehalose-phosphatase (this family), have undergone gene fusion in most eukaryotes. 
Trehalose is a common disaccharide of bacteria, fungi and invertebrates that appears to play a major role in desiccation tolerance. 42339 pfam02359: Cell division protein 48 (CDC48), N-terminal domain. This domain has a double psi-beta barrel fold and includes VCP-like ATPase and N-ethylmaleimide sensitive fusion protein N-terminal domains. Both the VAT and NSF N-terminal functional domains consist of two structural domains of which this is at the N-terminus. The VAT-N domain found in AAA ATPases pfam00004 is a 185-residue substrate recognition domain. 42340 pfam02361: Cobalt transport protein. This family consists of various cobalt transport proteins, most of which are found in cobalamin (vitamin B12) biosynthesis operons. In Salmonella the cbiN, cbiQ (product CbiQ in this family) and cbiO are likely to form an active cobalt transport system. 42341 pfam02362: B3 DNA binding domain. This is a family of plant transcription factors with various roles in development; the aligned region corresponds to the B3 DNA binding domain, which is found in VP1/ABI3 transcription factors. Some proteins, such as RAV1, also have a second AP2 DNA binding domain pfam00847. 42342 pfam02363: Cysteine rich repeat. This cysteine repeat C-X3-C-X3-C is repeated in sequences of this family, up to 34 times. The function of these repeats is unknown, as is the function of the proteins in which they occur. Most of the sequences in this family are from C. elegans. 42343 pfam02364: 1,3-beta-glucan synthase component. This family consists of various 1,3-beta-glucan synthase components including Gls1, Gls2 and Gls3 from yeast. 1,3-beta-glucan synthase EC:2.4.1.34, also known as callose synthase, catalyses the formation of a beta-1,3-glucan polymer that is a major component of the fungal cell wall. The reaction catalysed is:- UDP-glucose + {(1,3)-beta-D-glucosyl}(N) <=> UDP + {(1,3)-beta-D-glucosyl}(N+1). 42344 pfam02365: No apical meristem (NAM) protein. 
This is a family of no apical meristem (NAM) proteins; these are plant development proteins. Mutations in NAM result in the failure to develop a shoot apical meristem in petunia embryos. NAM is indicated as having a role in determining positions of meristems and primordia. One member of this family, NAP (NAC-like, activated by AP3/PI), is encoded by the target genes of the AP3/PI transcriptional activators and functions in the transition between growth by cell division and cell expansion in stamens and petals. 42345 pfam02366: Dolichyl-phosphate-mannose-protein mannosyltransferase. This is a family of Dolichyl-phosphate-mannose-protein mannosyltransferase proteins EC:2.4.1.109. These proteins are responsible for O-linked glycosylation of proteins; they catalyse the reaction:- Dolichyl phosphate D-mannose + protein <=> dolichyl phosphate + O-D-mannosyl-protein. Also in this family is Drosophila rotated abdomen protein which is a putative mannosyltransferase. This family appears to be distantly related to pfam02516 (A Bateman pers. obs.). 42346 pfam02367: Uncharacterised P-loop hydrolase UPF0079. This uncharacterised family contains a P-loop. 42347 pfam02368: Bacterial Ig-like domain (group 2). This family consists of bacterial domains with an Ig-like fold. Members of this family are found in bacterial and phage surface proteins such as intimins. 42348 pfam02369: Bacterial Ig-like domain (group 1). This family consists of bacterial domains with an Ig-like fold. Members of this family are found in bacterial surface proteins such as intimins and invasins involved in pathogenicity. 42349 pfam02370: M protein repeat. This short repeat is found in multiple copies in bacterial M proteins. The M proteins bind to IgA and are closely associated with virulence. The M protein has been postulated to be a major group A Streptococcal (GAS) virulence factor because of its contribution to the bacterial resistance to opsonophagocytosis. 
42350 pfam02371: Transposase IS116/IS110/IS902 family. Transposases are needed for efficient transposition of the insertion sequence or transposon DNA. This family includes transposases for IS116, IS110 and IS902. This region is often found with pfam01548. The exact function of this region is uncertain. 42351 pfam02372: Interleukin 15. Interleukin-15 (IL-15) is a cytokine that possesses a variety of biological functions, including stimulation and maintenance of cellular immune responses. 42352 pfam02373: jmjC domain. The jmjC domain is thought to be involved in chromatin organisation by modulating heterochromatisation. 42353 pfam02374: Anion-transporting ATPase. This Pfam family represents a conserved domain, which is sometimes repeated, in an anion-transporting ATPase. The ATPase is involved in the removal of arsenite, antimonite, and arsenate from the cell. 42354 pfam02375: jmjN domain. 42355 pfam02376: CUT domain. The CUT domain is a DNA-binding motif which can bind independently or in cooperation with the homeodomain, often found downstream of the CUT domain. Multiple copies of the CUT domain can exist in one protein. 42356 pfam02377: Dishevelled specific domain. This domain is specific to the signaling protein dishevelled. The domain is found adjacent to the PDZ domain pfam00595, often in conjunction with DEP (pfam00610) and DIX (pfam00778). 42357 pfam02378: Phosphotransferase system, EIIC. The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. The sugar-specific permease of the PTS consists of three domains (IIA, IIB and IIC). The IIC domain catalyses the transfer of a phosphoryl group from IIB to the sugar substrate. 42358 pfam02379: PTS system, Fructose specific IIB subunit. 
The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. The sugar-specific permease of the PTS consists of three domains (IIA, IIB and IIC). IIB is phosphorylated by phospho-IIA, before the phosphoryl group is transferred to the sugar substrate. 42359 pfam02380: T-antigen specific domain. This domain represents a conserved region in papovavirus small and middle T-antigens. It is found as the N-terminal domain in the small T-antigen, and is centrally located in the middle T-antigen. 42360 pfam02381: Domain of unknown function UPF0040 family. This small 70 amino acid domain is found duplicated in a family of bacterial proteins (A Bateman pers. observation). These proteins are probably enzymes with each domain containing a conserved DXXXR motif that probably forms part of the active site. 42361 pfam02382: RTX N-terminal domain. The RTX family of bacterial toxins are a group of cytolysins and cytotoxins. This Pfam family represents the N-terminal domain which is found in association with a glycine-rich repeat domain and hemolysinCabind pfam00353. 42362 pfam02383: SacI homology domain. This Pfam family represents a protein domain which shows homology to the yeast protein SacI. The SacI homology domain is most notably found at the amino terminus of the inositol 5'-phosphatase synaptojanin. 42363 pfam02384: N-6 DNA Methylase. This family consists of N-6 adenine-specific DNA methylase EC:2.1.1.72 from Type I and Type IC restriction systems. These enzymes are responsible for the methylation of specific DNA sequences in order to prevent the host from digesting its own genome via its restriction enzymes. These methylases have the same sequence specificity as their corresponding restriction enzymes. 42364 pfam02386: Cation transport protein. 
This family consists of various cation transport proteins (Trk) and V-type sodium ATP synthase subunit J or translocating ATPase J EC:3.6.1.34. These proteins are involved in active sodium uptake utilising ATP in the process. TrkH, a member of the family from E. coli, is a hydrophobic membrane protein and determines the specificity and kinetics of cation transport by the Trk system in E. coli. 42365 pfam02387: IncFII RepA protein family. This protein is plasmid encoded and found to be essential for plasmid replication. 42366 pfam02388: FemAB family. The femAB operon codes for two nearly identical approximately 50-kDa proteins involved in the formation of the Staphylococcal pentaglycine interpeptide bridge in peptidoglycan. These proteins are also considered as a factor influencing the level of methicillin resistance. 42367 pfam02389: Cornifin (SPRR) family. SPRR genes (formerly SPR) encode a novel class of polypeptides (small proline rich proteins) that are strongly induced during differentiation of human epidermal keratinocytes in vitro and in vivo. The most characteristic feature of the SPRR gene family resides in the structure of the central segments of the encoded polypeptides that are built up from tandemly repeated units of either eight (SPRR1 and SPRR3) or nine (SPRR2) amino acids with the general consensus XKXPEPXX where X is any amino acid. 42368 pfam02390: Putative methyltransferase. This is a family of putative methyltransferases. The aligned region contains the GXGXG S-AdoMet binding site suggesting a putative methyltransferase activity. 42369 pfam02391: MoaE protein. This family contains the MoaE protein that is involved in biosynthesis of molybdopterin. Molybdopterin, the universal component of the pterin molybdenum cofactors, contains a dithiolene group serving to bind Mo. Addition of the dithiolene sulfurs to a molybdopterin precursor requires the activity of the converting factor. Converting factor contains the MoaE and MoaD proteins. 
42370 pfam02392: Ycf4. This family consists of hypothetical Ycf4 proteins from various chloroplast genomes. It has been suggested that Ycf4 is involved in the assembly and/or stability of the photosystem I complex in chloroplasts. 42371 pfam02393: US22 like. This is the US22 protein family of hypothetical proteins from herpes virus. The namesake of this family, US22, is an early nuclear protein that is secreted from cells. The US22 family may have a role in virus replication and pathogenesis. 42372 pfam02394: Interleukin-1 propeptide. The Interleukin-1 cytokines are translated as precursor proteins. The N terminal approx. 115 amino acids form a propeptide that is cleaved off to release the active interleukin-1. 42373 pfam02395: Immunoglobulin A1 protease. This family consists of immunoglobulin A1 protease proteins. The immunoglobulin A1 protease cleaves immunoglobulin IgA and is found in pathogenic bacteria such as Neisseria gonorrhoeae. Not all of the members of this family are IgA proteases; a sequence from E. coli O157:H7 cleaves human coagulation factor V and another one is a hemoglobin protease from E. coli EB1. 42374 pfam02396: Acetaldehyde dehydrogenase. This family of bacterial enzymes catalyses the formation of acetyl-CoA from acetaldehyde in the 3-hydroxyphenylpropionate degradation pathway. 42375 pfam02397: Bacterial sugar transferase. This Pfam family represents a conserved region from a number of different bacterial sugar transferases, involved in diverse biosynthesis pathways. 42376 pfam02398: Coronavirus protein 7. This is a family of proteins from coronavirus which may function in viral assembly. 42377 pfam02399: Origin of replication binding protein. This Pfam family represents the herpesvirus origin of replication binding protein, probably involved in DNA replication. 42378 pfam02400: Glycoprotein GG/GX. Glycoprotein G (gG) is one of the seven external glycoproteins of HSV1 and HSV2. 
This family also contains the glycoprotein GX (gX), initially identified in Pseudorabies virus. 42379 pfam02401: LytB protein. The mevalonate-independent 2-C-methyl-D-erythritol 4-phosphate (MEP) pathway for isoprenoid biosynthesis is essential in many eubacteria, plants, and the malaria parasite. The LytB gene is involved in the trunk line of the MEP pathway. 42380 pfam02402: Lysis protein. These small bacterial proteins are required for colicin release and partial cell lysis. This family contains lysis proteins for several different forms of colicin. 42381 pfam02403: Seryl-tRNA synthetase N-terminal domain. This domain is found associated with the Pfam tRNA synthetase class II domain (pfam00587) and represents the N-terminal domain of seryl-tRNA synthetase. 42382 pfam02404: Stem cell factor. Stem cell factor (SCF) is a homodimer involved in hematopoiesis. SCF binds to and activates the SCF receptor (SCFR), a receptor tyrosine kinase. The crystal structure of human SCF has been resolved and a potential receptor-binding site identified. 42383 pfam02405: Domain of unknown function DUF140. This domain has no known function nor do any of the proteins that possess it. The aligned region is approximately 150 amino acids long. 42384 pfam02406: MmoB/DmpM family. This family consists of monooxygenase components such as MmoB methane monooxygenase (EC:1.14.13.25) regulatory protein B. When MmoB is present at low concentration it converts methane monooxygenase from an oxidase to a hydroxylase and stabilises intermediates required for the activation of dioxygen. Also found in this family is DmpM or Phenol hydroxylase (EC:1.14.13.7) protein component P2; this protein lacks redox co-factors and is required for optimal turnover of Phenol hydroxylase. 42385 pfam02407: Putative viral replication protein. This is a family of viral ORFs from various plant and animal ssDNA circoviruses. 
Published evidence to support the annotated function "viral replication associated protein" has not been found. 42386 pfam02408: Domain of unknown function DUF141. This is a family of hypothetical C. elegans proteins. The aligned region has no known function nor do any of the proteins which possess it. The aligned region is approximately 130 amino acids long and contains two conserved cysteine residues. 42387 pfam02409: O-methyltransferase N-terminus. This domain is found at the N-terminus of polyketide synthesis O-methyltransferase proteins. 42388 pfam02410: Domain of unknown function DUF143. This domain has no known function nor do any of the proteins that possess it. The aligned region is approximately 100 amino acids long. 42389 pfam02411: MerT mercuric transport protein. MerT is a mercuric transport integral membrane protein and is responsible for transport of the Hg2+ ion from periplasmic MerP (also part of the transport system) to mercuric reductase (MerA). 42390 pfam02412: Thrombospondin type 3 repeat. This repeat probably binds to calcium. 42391 pfam02413: Caudovirales tail fibre assembly protein. This family contains bacterial and phage tail fibre assembly proteins. E. coli contains several members of this family although the function of these proteins is uncertain. 42392 pfam02414: Borrelia ORF-A. This protein is encoded by an open reading frame in plasmid borne DNA repeats of Borrelia species. This protein is known as ORF-A. The function of this putative protein is unknown. 42393 pfam02415: Chlamydia polymorphic membrane protein (Chlamydia_PMP). This family contains several Chlamydia polymorphic membrane proteins. Chlamydia pneumoniae is an obligate intracellular bacterium and a common human pathogen causing infection of the upper and lower respiratory tract. Common for the Pmps are the tetrapeptide GGA(I/V/L) motif repeated several times in the N-terminal part. 
The C-terminal half is characterised by conserved tryptophans and a carboxy-terminal phenylalanine. A signal peptide leader sequence is predicted in 20 C. pneumoniae Pmps, which indicates an outer membrane localisation. Pmp10 and Pmp11 contain a signal peptidase II cleavage site suggesting lipid modification. The C. pneumoniae pmp genes represent 17.5% of the chlamydia-specific coding capacity and they are all transcribed during chlamydial growth but the function of Pmps remains unknown. 42394 pfam02416: mttA/Hcf106 family. Members of this protein family are involved in a sec independent translocation mechanism. This pathway has been called the DeltapH pathway in chloroplasts. Members of this family in E. coli are involved in export of redox proteins with a "twin arginine" leader motif. 42395 pfam02417: Chromate transporter. Members of this family probably act as chromate transporters. Members of this family are found in both bacteria and archaebacteria. The proteins are composed of one or two copies of this region. The alignment contains two conserved motifs, FGG and PGP. 42396 pfam02419: PsbL protein. This family consists of the photosystem II reaction centre protein PsbL from plants and Cyanobacteria. The function of this small protein is unknown. Interestingly the mRNA for this protein requires a post-transcriptional modification of an ACG triplet to form an AUG initiator codon. 42397 pfam02420: Insect antifreeze protein. This family of extracellular proteins is involved in stopping the formation of ice crystals at low temperatures. The proteins are composed of a 12 residue repeat that forms a structural repeat. The structure of the repeats is a beta helix. Each repeat contains two cys residues that form a disulphide bridge. 42398 pfam02421: Ferrous iron transport protein B. Escherichia coli has an iron(II) transport system (feo) which may make an important contribution to the iron supply of the cell under anaerobic conditions. 
FeoB has been identified as part of this transport system. FeoB is a large 700-800 amino acid integral membrane protein. The N terminus contains a P-loop motif suggesting that iron transport may be ATP dependent. 42399 pfam02422: Keratin. This family represents avian keratin proteins, found in feathers, scale and claw. 42400 pfam02423: Ornithine cyclodeaminase/mu-crystallin family. This family contains the bacterial Ornithine cyclodeaminase enzyme EC:4.3.1.12, which catalyses the deamination of ornithine to proline. This family also contains mu-Crystallin, the major component of the eye lens in several Australian marsupials; mRNA for this protein has also been found in human retina. 42401 pfam02424: ApbE family. This prokaryotic family of lipoproteins is related to ApbE from Salmonella typhimurium. ApbE is involved in thiamine synthesis. More specifically, it may be involved in the conversion of aminoimidazole ribotide (AIR) to 4-amino-5-hydroxymethyl-2-methyl pyrimidine (HMP). 42402 pfam02425: Paralytic/GBP/PSP peptide. This family includes insect peptides that are short (23 amino acids) and contain 1 disulphide bridge. The family includes growth-blocking peptide (GBP) of Pseudaletia separata and the paralytic peptides from Manduca sexta, Heliothis virescens, and Spodoptera exigua as well as plasmatocyte-spreading peptide (PSP1). These peptides function to halt metamorphosis from larvae to pupae. 42403 pfam02426: Muconolactone delta-isomerase. This small enzyme forms a homodecameric complex that catalyses the third step in the catabolism of catechol to succinate- and acetyl-CoA in the beta-ketoadipate pathway EC:5.3.3.4. The protein has a ferredoxin-like fold according to SCOP. 42404 pfam02427: Photosystem I reaction centre subunit IV / PsaE. PsaE is a 69 amino acid polypeptide from photosystem I present on the stromal side of the thylakoid membrane. The structure is comprised of a well-defined five-stranded beta-sheet similar to SH3 domains. 
42405 pfam02428: Potato type II proteinase inhibitor family. Members of this family are proteinase inhibitors that contain eight cysteines that form four disulphide bridges. The structure of the proteinase-inhibitor complex is known. 42406 pfam02429: Peridinin-chlorophyll A binding protein. Peridinin-chlorophyll-protein, a water-soluble light-harvesting complex that has a blue-green absorbing carotenoid as its main pigment, is present in most photosynthetic dinoflagellates. These proteins are composed of two similar repeated domains. These domains constitute a scaffold with pseudo-twofold symmetry surrounding a hydrophobic cavity filled by two lipid, eight peridinin, and two chlorophyll a molecules. 42407 pfam02430: Apical membrane antigen 1. Apical membrane antigen 1 (AMA-1) is a Plasmodium asexual blood-stage antigen. It has been suggested that positive selection operates on the AMA-1 gene in regions coding for antigenic sites. 42408 pfam02431: Chalcone-flavanone isomerase. Chalcone-flavanone isomerase is a plant enzyme responsible for the isomerisation of chalcone to naringenin, a key step in the biosynthesis of flavonoids. 42409 pfam02432: Fimbrial, major and minor subunit. Fimbriae (also known as pili) are polar filaments found on the bacterial surface, allowing colonisation of the host. This family consists of the minor and major fimbrial subunits. 42410 pfam02433: Cytochrome C oxidase, mono-heme subunit/FixO. The bacterial oxidase complex, fixNOPQ or cytochrome cbb3, is thought to be required for respiration in endosymbiosis. FixO is a membrane bound mono-heme constituent of the fixNOPQ complex. 42411 pfam02434: Fringe-like. The Drosophila protein fringe (FNG) is a glucosaminyltransferase that controls the response of the Notch receptor to specific ligands. FNG is localised to the Golgi apparatus (not secreted as previously thought). Modification of Notch occurs through glycosylation by FNG. 
The Xenopus homologue, lunatic fringe, has been implicated in a variety of functions. 42412 pfam02435: Levansucrase/Invertase. This Pfam family consists of the glycosyl hydrolase 68 family, including several bacterial levansucrase enzymes, and invertase from Zymomonas. 42413 pfam02436: Conserved carboxylase domain. This domain represents a conserved region in pyruvate carboxylase (PYC), oxaloacetate decarboxylase alpha chain (OADA), and transcarboxylase 5s subunit. The domain is found adjacent to the HMGL-like domain (pfam00682) and often close to the biotin_lipoyl domain (pfam00364) of biotin requiring enzymes. 42414 pfam02437: SKI/SNO/DAC family. This family contains a presumed domain that is about 100 amino acids long. All members of this family contain a conserved CLPQ motif. The c-ski proto-oncogene has been shown to influence proliferation, morphological transformation and myogenic differentiation. Sno, a Ski proto-oncogene homologue, is expressed in two isoforms and plays a role in the response to proliferation stimuli. Dachshund also contains this domain. It is involved in various aspects of development. 42415 pfam02438: Late 100kD protein. The late 100kD protein is a non-structural viral protein involved in the transport of hexon from the cytoplasm to the nucleus. 42416 pfam02439: Adenovirus E3 region protein CR2. Early region 3 (E3) of human adenoviruses (Ads) codes for proteins that appear to control viral interactions with the host. This region called CR2 (conserved region 2) is found in Adenovirus type 19 (a subgroup D virus) 49 Kd protein in the E3 region. CR2 is also found in the 20.1 Kd protein of subgroup B adenoviruses. The function of this 50 amino acid region is unknown. 42417 pfam02440: Adenovirus E3 region protein CR1. Early region 3 (E3) of human adenoviruses (Ads) codes for proteins that appear to control viral interactions with the host. 
This region called CR1 (conserved region 1) is found three times in Adenovirus type 19 (a subgroup D virus) 49 Kd protein in the E3 region. CR1 is also found in the 20.1 Kd protein of subgroup B adenoviruses. The function of this 80 amino acid region is unknown. This region is probably a divergent immunoglobulin domain (A. Bateman pers. observation). 42418 pfam02441: Flavoprotein. This family contains diverse flavoprotein enzymes. This family includes epidermin biosynthesis protein, EpiD, which has been shown to be a flavoprotein that binds FMN. This enzyme catalyses the removal of two reducing equivalents from the cysteine residue of the C-terminal meso-lanthionine of epidermin to form a --C==C-- double bond. This family also includes the B chain of dipicolinate synthase. Its product, dipicolinic acid, is a small polar molecule that accumulates to high concentrations in bacterial endospores, and is thought to play a role in spore heat resistance, or the maintenance of heat resistance. Dipicolinate synthase catalyses the formation of dipicolinic acid from dihydroxydipicolinic acid. This family also includes phenylacrylic acid decarboxylase (EC:4.1.1.-). 42419 pfam02442: L1L/F9/C19 pox virus orf family. The function of this protein family is unknown. 42420 pfam02443: Circovirus ORF-2 protein. Circoviruses are small circular single stranded viruses. This family is the ORF-2 protein from viruses such as porcine circovirus and beak and feather disease virus. These proteins are about 220 amino acids long and of unknown function. 42421 pfam02444: Hepatitis E virus ORF-2 (Putative capsid protein). The Hepatitis E virus (HEV) genome is a single-stranded, positive-sense RNA molecule of approximately 7.5 kb. Three open reading frames (ORF) were identified within the HEV genome: ORF1 encodes non-structural proteins, ORF2 encodes the putative structural protein(s), and ORF3 encodes a protein of unknown function. 
ORF2 contains a consensus signal peptide sequence at its amino terminus and a capsid-like region with a high content of basic amino acids similar to that seen with other virus capsid proteins. 42422 pfam02445: Quinolinate synthetase A protein. Quinolinate synthetase catalyses the second step of the de novo biosynthetic pathway of pyridine nucleotide formation. In particular, quinolinate synthetase is involved in the condensation of dihydroxyacetone phosphate and iminoaspartate to form quinolinic acid. This synthesis requires two enzymes, a FAD-containing "B protein" and an "A protein". 42423 pfam02446: 4-alpha-glucanotransferase. These enzymes EC:2.4.1.25 transfer a segment of a (1,4)-alpha-D-glucan to a new 4-position in an acceptor, which may be glucose or (1,4)-alpha-D-glucan. 42424 pfam02447: GntP family permease. This is a family of integral membrane permeases that are involved in gluconate uptake. E. coli contains several members of this family including GntU, a low affinity transporter and GntT, a high affinity transporter. 42425 pfam02448: L71 family. The insect proteins in this family are each about 100 amino acids long and have 6 conserved cysteine residues. They all have a predicted signal peptide and are probably excreted. The function of the proteins is unknown. 42426 pfam02449: Beta-galactosidase. This group of beta-galactosidase enzymes belongs to the glycosyl hydrolase 42 family. The enzyme catalyses the hydrolysis of terminal, non-reducing beta-D-galactose residues. 42427 pfam02450: Lecithin:cholesterol acyltransferase. Lecithin:cholesterol acyltransferase (LCAT) is involved in extracellular metabolism of plasma lipoproteins, including cholesterol. 42428 pfam02451: Nodulin. Nodulin is a plant protein of unknown function. It is induced during nodulation in legume roots after rhizobium infection. 42429 pfam02452: PemK-like protein. PemK is a growth inhibitor in E. 
coli known to bind to the promoter region of the Pem operon, auto-regulating synthesis. This Pfam family consists of the PemK protein in addition to ChpA, ChpB and other PemK-like proteins. 42430 pfam02453: Reticulon. Reticulon, also known as neuroendocrine-specific protein (NSP), is a protein of unknown function which associates with the endoplasmic reticulum. This family represents the C-terminal domain of the three reticulon isoforms and their homologues. 42431 pfam02454: Sigma 1s protein. The reoviral gene S1 encodes haemagglutinin (sigma 1 protein), an outer capsid protein and a major factor in determining virus-host cell interactions. Sigma 1s is one of two translation products of the S1 gene. 42432 pfam02455: Hexon-associated protein (IIIa). The major capsid protein of the adenovirus strain is also known as a hexon. This is a family of hexon-associated proteins (protein IIIa). 42433 pfam02456: Adenovirus IVa2 protein. IVa2 protein can interact with the adenoviral packaging signal, and this interaction involves DNA sequences that have previously been demonstrated to be required for packaging. During the course of lytic infection, the adenovirus major late promoter (MLP) is induced to high levels after replication of viral DNA has started. IVa2 is a transcriptional activator of the major late promoter. 42434 pfam02457: Domain of unknown function DUF147. This domain is about 120 amino acids long. The function of this domain is unknown; however, the distribution of conserved histidines and aspartates suggests that this may be a metal dependent phosphoesterase. This may be a nuclear domain as some members also contain a Helix-hairpin-helix pfam00633 motif that is characteristic of DNA binding proteins. 42435 pfam02458: Transferase family. This family includes a number of transferase enzymes. These include anthranilate N-hydroxycinnamoyl/benzoyltransferase that catalyses the first committed reaction of phytoalexin biosynthesis. 
Deacetylvindoline 4-O-acetyltransferase EC:2.3.1.107, which catalyses the last step in vindoline biosynthesis, is also a member of this family. The motif HXXXD is probably part of the active site. The family also includes trichothecene 3-O-acetyltransferase. 42436 pfam02459: Adenoviral DNA terminal protein. This protein is covalently attached to the termini of replicating DNA in vivo. 42437 pfam02460: Patched family. The transmembrane protein Patched is a receptor for the morphogen Sonic Hedgehog. This protein associates with the smoothened protein to transduce hedgehog signals. 42438 pfam02461: Ammonia monooxygenase. Ammonia monooxygenase plays a key role in the nitrogen cycle and degrades a wide range of hydrocarbons and halogenated hydrocarbons. 42439 pfam02462: Opacity family porin protein. Pathogenic Neisseria spp. possess a repertoire of phase-variable Opacity proteins that mediate various pathogen--host cell interactions. These proteins are integral membrane proteins related to other porins. 42440 pfam02463: RecF/RecN/SMC N terminal domain. This domain is found at the N terminus of SMC proteins. The SMC (structural maintenance of chromosomes) superfamily proteins have ATP-binding domains at the N- and C-termini, and two extended coiled-coil domains separated by a hinge in the middle. The eukaryotic SMC proteins form two kinds of heterodimers: the SMC1/SMC3 and the SMC2/SMC4 types. These heterodimers constitute an essential part of higher order complexes, which are involved in chromatin and DNA dynamics. This family also includes the RecF and RecN proteins that are involved in DNA metabolism and recombination. 42441 pfam02464: Competence-damaged protein. CinA is the first gene in the competence-inducible (cin) operon, and is thought to be specifically required at some stage in the process of transformation. This Pfam family consists of putative competence-damaged proteins from the cin operon. 42442 pfam02465: Flagellar hook-associated protein 2 C-terminus. 
The flagellar hook-associated protein 2 (HAP2 or FliD) forms the distal end of the flagella, and plays a role in mucin specific adhesion of the bacteria. This alignment covers the N-terminal region of this family of proteins. 42443 pfam02466: Tim17/Tim22/Tim23 family. The pre-protein translocase of the mitochondrial outer membrane (Tom) allows the import of pre-proteins from the cytoplasm. Tom forms a complex with a number of proteins, including Tim17. Tim17 and Tim23 are thought to form the translocation channel of the inner membrane. This family includes Tim17, Tim22 and Tim23. 42444 pfam02467: Transcription factor WhiB. WhiB is a putative transcription factor in Actinobacteria, required for differentiation and sporulation. 42445 pfam02468: Photosystem II reaction centre N protein (psbN). This is a family of small proteins encoded on the chloroplast genome. psbN is involved in photosystem II during photosynthesis, but its exact role is unknown. 42446 pfam02469: Fasciclin domain. This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria. 42447 pfam02470: mce related protein. This family of proteins contains the mce (mycobacterial cell entry) proteins from Mycobacterium tuberculosis. The archetype (Rv0169) was isolated as being necessary for colonisation of, and survival within, the macrophage. This family contains proteins of unknown function from other bacteria. 42448 pfam02471: Borrelia outer surface protein E. This is a family of outer surface proteins (Osp) from the Borrelia spirochete. The family includes OspE, and OspEF-related proteins (Erp). These proteins are coded for on different circular plasmids in the Borrelia genome. 42449 pfam02472: Biopolymer transport protein ExbD/TolR. The proteins in this group are membrane bound transport proteins essential for ferric ion uptake in bacteria. 
The Pfam family consists of ExbD and TolR, which are involved in TonB-dependent transport of various receptor bound substrates including colicins. 42450 pfam02474: Nodulation protein A (NodA). Rhizobia nodulation (nod) genes control the biosynthesis of Nod factors required for infection and nodulation of their legume hosts. Nodulation protein A (NodA) is an N-acetyltransferase involved in production of Nod factors that stimulate mitosis in various plant protoplasts. 42451 pfam02475: Met-10+ like-protein. The methionine-10 mutant allele of N. crassa codes for a protein of unknown function. However, homologous proteins have been found in yeast suggesting this protein may be involved in methionine biosynthesis, transport and/or utilisation. 42452 pfam02476: US2 family. This is a family of unique short (US) region proteins from the herpesvirus strain. The US2 family has no known function. 42453 pfam02477: Nucleocapsid N protein. The nucleoprotein of the ssRNA negative-strand Nairovirus is an internal part of the virus particle. 42454 pfam02478: Pneumovirus phosphoprotein. This family represents the phosphoprotein of Paramyxoviridae, a putative RNA polymerase alpha subunit that may function in template binding. 42455 pfam02479: Herpesvirus immediate early protein. This regulatory protein is expressed from an immediate early gene in the cell cycle of herpesvirus. The protein is known by various names including IE-68, US1, ICP22 and IR4. 42456 pfam02480: Alphaherpesvirus glycoprotein E. Glycoprotein E (gE) of Alphaherpesvirus forms a complex with glycoprotein I (gI) (pfam01688), functioning as an immunoglobulin G (IgG) Fc binding protein. gE is involved in virus spread but is not essential for propagation. 42457 pfam02481: SMF family. The SMF family (DNA processing chain A, dprA) is a group of bacterial proteins. In H. pylori, dprA is required for natural chromosomal and plasmid transformation. 42458 pfam02482: Sigma 54 modulation protein / S30EA ribosomal protein. 
This Pfam family contains the sigma-54 modulation protein family and the S30AE family of ribosomal proteins which includes the light-repressed protein (lrtA). 42459 pfam02483: SMC family, C-terminal domain. This Pfam family represents a conserved domain towards the C-terminus of the SMC family proteins. A second conserved domain is found at the N-terminus (pfam02463). The SMC (structural maintenance of chromosomes) superfamily proteins have ATP-binding domains at the N- and C-termini, and two extended coiled-coil domains separated by a hinge in the middle. Some members of the family are known to play a role in mitotic chromosome condensation and segregation. It has been suggested that two classes of SMC complexes are involved in different aspects of mitotic chromosome organisation in human cells. 42460 pfam02484: Rhabdovirus Non-virion protein. Infectious hematopoietic necrosis virus (IHNV) is a member of the family Rhabdoviridae. The non-virion protein (NV) is coded for by one of the six genes of the IHNV genome, but is absent in vesiculovirus-like rhabdoviruses. 42461 pfam02485: Core-2/I-Branching enzyme. This is a family of two different beta-1,6-N-acetylglucosaminyltransferase enzymes, I-branching enzyme and core-2 branching enzyme. I-branching enzyme is responsible for the production of the blood group I-antigen during embryonic development. Core-2 branching enzyme forms crucial side-chain branches in O-glycans. 42462 pfam02486: Replication initiation factor. Plasmid replication is initiated by the replication initiation factor (REP). This family represents a probable topoisomerase that makes a sequence-specific single-stranded nick in the plasmid DNA at the origin of replication. Human proteins also belong to this family, including myelin transcription factor 2 and cerebrin-50. 42463 pfam02487: CLN3 protein. This is a family of proteins from the CLN3 gene. 
A missense mutation of glutamic acid (E) to lysine (K) at position 295 in the human protein has been implicated in Juvenile neuronal ceroid lipofuscinosis (Batten disease). 42464 pfam02488: Merozoite Antigen. This family represents the immunodominant surface antigen of Theileria parasites including equi merozoite antigen-1 (EMA-1) and equi merozoite antigen-2 (EMA-2). The protein shows variation at a putative glycosylation site, a potential mechanism for host immune response evasion. 42465 pfam02489: Herpesvirus glycoprotein H. Herpesvirus glycoprotein H (gH) is a virion associated envelope glycoprotein. Complex formation between gH and gL has been demonstrated in both virions and infected cells. 42466 pfam02490: Aminolevulinic acid synthase domain. This Pfam domain is specific to 5-aminolevulinic acid (ALA) synthase which is involved in heme biosynthesis. The Aminotransferases class-II domain (pfam00222) is found to the C-terminus of this domain. 42467 pfam02491: Cell division protein FtsA. FtsA is essential for bacterial cell division, and co-localises to the septal ring with FtsZ. It has been suggested that the interaction of FtsA-FtsZ has arisen through coevolution in different bacterial strains. The FtsA protein contains two structurally related actin-like ATPase domains which are also structurally related to the ATPase domains of HSP70 (see PF00012). 42468 pfam02492: Cobalamin synthesis protein/P47K. This family of proteins contains P47K, a Pseudomonas chlororaphis protein needed for nitrile hydratase expression, and the cobW gene product, which may be involved in cobalamin biosynthesis in Pseudomonas denitrificans. 42469 pfam02493: MORN repeat. The MORN (Membrane Occupation and Recognition Nexus) repeat is found in multiple copies in several proteins including junctophilins (See Takeshima et al. Mol. Cell 2000;6:11-22). The function of this motif is unknown. 42470 pfam02494: HYR domain. 
This domain is known as the HYR (Hyalin Repeat) domain, after the protein hyalin that is composed exclusively of this repeat. This domain probably corresponds to a new superfamily in the immunoglobulin fold. The function of this domain is uncertain; it may be involved in cell adhesion (Ref Callebaut et al. Prot. Sci. 9:1382-1390). 42471 pfam02495: 7kD viral coat protein. This family consists of a 7kD coat protein from carlavirus and potexvirus. 42472 pfam02496: ABA/WDS induced protein. This is a family of plant proteins induced by water deficit stress (WDS), or abscisic acid (ABA) stress and ripening. 42473 pfam02497: Arterivirus glycoprotein. This is a family of structural glycoproteins from arterivirus that corresponds to open reading frame 4 (ORF4) of the virus. 42474 pfam02498: BRO family, N-terminal domain. This family includes the N-terminus of baculovirus BRO and ALI motif proteins. The function of BRO proteins is unknown. It has been suggested that BRO-A and BRO-C are DNA binding proteins that influence host DNA replication and/or transcription. This Pfam domain does not include the characteristic invariant alanine, leucine, isoleucine motif of the ALI proteins. 42475 pfam02499: Probable DNA packing protein, C-terminus. This family includes proteins that are probably involved in DNA packing in herpesvirus. This domain is found at the C-terminus of the protein. 42476 pfam02500: Probable DNA packing protein, N-terminus. This family includes proteins that are probably involved in DNA packing in herpesvirus. This domain is normally found at the N-terminus of the protein. 42477 pfam02501: Bacterial type II secretion system protein I/J. The bacterial general secretion pathway (GSP) is involved in the export of proteins (also called the type II pathway). This Pfam family includes GSPI and GSPJ, which contain the pre-pilin signal sequence. 42478 pfam02502: Ribose/Galactose Isomerase. 
This family of proteins contains the sugar isomerase enzymes ribose 5-phosphate isomerase B (rpiB), galactose isomerase subunit A (LacA) and galactose isomerase subunit B (LacB). 42479 pfam02503: Polyphosphate kinase. Polyphosphate kinase (Ppk) catalyses the formation of polyphosphate from ATP, with chain lengths of up to a thousand or more orthophosphate molecules. 42480 pfam02504: Fatty acid synthesis protein. The plsX gene is part of the bacterial fab gene cluster which encodes several key fatty acid biosynthetic enzymes. The exact function of the plsX protein in fatty acid synthesis is unknown. 42481 pfam02505: Methyl-coenzyme M reductase operon protein D. Methyl coenzyme M reductase (MCR) catalyses the final step in methanogenesis. MCR is composed of three subunits, alpha (pfam02249), beta (pfam02241) and gamma (pfam02240). Genes encoding the beta (mcrB) and gamma (mcrG) subunits are separated by two open reading frames coding for two proteins C and D. The function of proteins C and D (this family) is unknown. 42482 pfam02506: Type I restriction modification system, M protein. Restriction-modification (R-M) systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The R-M system is a complex containing three polypeptides: M (this family), S (pfam01420), and R. 42483 pfam02507: Photosystem I reaction centre subunit III. Photosystem I (PSI) is an integral membrane protein complex that uses light energy to mediate electron transfer from plastocyanin to ferredoxin. Subunit III (or PSI-F) is one of at least 14 different subunits that compose the PSI complex. 42484 pfam02508: Rnf-Nqr subunit, membrane protein. This is a family of integral membrane proteins including Rhodobacter-specific nitrogen fixation (rnf) proteins RnfA and RnfE and Na+-translocating NADH:ubiquinone oxidoreductase (Na+-NQR) subunits NqrD and NqrE. 42485 pfam02509: Rotavirus non-structural protein 35. 
Rotavirus non-structural protein 35 (NS35) is a basic protein which possesses RNA-binding activity and is essential for genome replication. 42486 pfam02510: Surface presentation of antigens protein. Surface presentation of antigens protein (SPAN), also known as invasion protein invJ, is a Salmonella secretory pathway protein involved in presentation of determinants required for mammalian host cell invasion. 42487 pfam02511: Thymidylate synthase complementing protein. Thymidylate synthase complementing protein (Thy1) complements the thymidine growth requirement of the organisms in which it is found, but shows no homology to thymidylate synthase. 42488 pfam02512: Virulence determinant. The UK protein is an African swine fever virus (ASFV) protein that is highly conserved amongst strains, and is an important viral virulence determinant for domestic pigs. 42489 pfam02513: Spin/Ssty Family. Spindlin (Spin) is a novel maternal transcript present in the unfertilised egg and early embryo. The Y-linked spermiogenesis-specific transcript (Ssty) is also expressed during gametogenesis and forms part of this Pfam family. Members of this family contain three copies of this 50 residue repeat. The repeat is predicted to contain four beta strands. 42490 pfam02514: CobN/Magnesium Chelatase. This family contains a domain common to the cobN protein and to magnesium protoporphyrin chelatase. CobN is implicated in the conversion of precorrin-2 to cobyrinic acid. Magnesium protoporphyrin chelatase is involved in chlorophyll biosynthesis. 42491 pfam02515: CoA-transferase family III. CoA-transferases are found in organisms from all lines of descent. Most of these enzymes belong to two well-known enzyme families, but recent work on unusual biochemical pathways of anaerobic bacteria has revealed the existence of a third family of CoA-transferases. The members of this enzyme family differ in sequence and reaction mechanism from CoA-transferases of the other families. 
Currently known enzymes of the new family are a formyl-CoA: oxalate CoA-transferase, a succinyl-CoA: (R)-benzylsuccinate CoA-transferase, an (E)-cinnamoyl-CoA: (R)-phenyllactate CoA-transferase, and a butyrobetainyl-CoA: (R)-carnitine CoA-transferase. In addition, a large number of proteins of unknown or differently annotated function from Bacteria, Archaea and Eukarya apparently belong to this enzyme family. Properties and reaction mechanisms of the CoA-transferases of family III are described and compared to those of the previously known CoA-transferases. 42492 pfam02516: Oligosaccharyl transferase STT3 subunit. This family consists of the oligosaccharyl transferase STT3 subunit and related proteins. The STT3 subunit is part of the oligosaccharyl transferase (OTase) complex of proteins and is required for its activity. OTase transfers a lipid-linked core-oligosaccharide to selected asparagine residues in the ER. 42493 pfam02517: CAAX amino terminal protease family. Members of this family are probably proteases; the family contains CAAX prenyl protease. The proteins contain a highly conserved Glu-Glu motif at the amino end of the alignment. The alignment also contains two histidine residues that may be involved in zinc binding. 42494 pfam02518: Histidine kinase-, DNA gyrase B-, and HSP90-like ATPase. This family represents the structurally related ATPase domains of histidine kinase, DNA gyrase B and HSP90. 42495 pfam02519: Auxin responsive protein. This family consists of the protein products of the ARG7 auxin-responsive gene family, none of which has any identified functional role. 42496 pfam02520: Domain of unknown function DUF148. This domain has no known function, nor do any of the proteins that possess it. In one member of this family the aligned region is repeated twice. 42497 pfam02521: Putative outer membrane protein. This family consists of putative outer membrane proteins from Helicobacter pylori (Campylobacter pylori). 
42498 pfam02522: Aminoglycoside 3-N-acetyltransferase. This family consists of bacterial aminoglycoside 3-N-acetyltransferases, EC:2.3.1.81, which catalyse the reaction: Acetyl-CoA + a 2-deoxystreptamine antibiotic <=> CoA + N3'-acetyl-2-deoxystreptamine antibiotic. The enzyme can use a range of antibiotics with 2-deoxystreptamine rings as acceptor for its acetyltransferase activity; this inactivates the antibiotic and confers resistance to gentamicin, kanamycin, tobramycin, neomycin and apramycin amongst others. 42499 pfam02523: InvE invasion protein. This family consists of the InvE invasion protein from Salmonella. This protein is involved in host-parasite interactions, and mutations in the InvE gene render Salmonella typhimurium noninvasive. InvE S. typhimurium mutants fail to elicit a rapid Ca2+ increase in cultured cells, an important event in the infection procedure and internalisation of S. typhimurium into epithelial cells. Members of this family are also found in other bacterial species such as Chlamydia muridarum and Yersinia enterocolitica. 42500 pfam02524: KID repeat. This family contains the KID repeat as found in Borrelia spirochete RepA / Rep+ proteins. The function of these proteins is unknown. RepA and related Borrelia proteins have been suggested to play an important genus-wide role in the biology of Borrelia. 42501 pfam02525: Flavodoxin-like fold. This family consists of a domain with a flavodoxin-like fold. The family includes bacterial and eukaryotic NAD(P)H dehydrogenase (quinone) EC:1.6.99.2. These enzymes catalyse the NAD(P)H-dependent two-electron reductions of quinones and protect cells against damage by free radicals and reactive oxygen species. This enzyme uses a FAD co-factor. The equation for this reaction is: NAD(P)H + acceptor <=> NAD(P)(+) + reduced acceptor. This enzyme is also involved in the bioactivation of prodrugs used in chemotherapy. The family also includes acyl carrier protein phosphodiesterase EC:3.1.4.14. 
This enzyme converts holo-ACP to apo-ACP by hydrolytic cleavage of the phosphopantetheine residue from ACP. This family is related to pfam03358 and pfam00258. 42502 pfam02526: Glycophorin-binding protein. This family contains glycophorin-binding proteins from P. falciparum, the malarial parasite. Glycophorin is a cell surface protein of erythrocytes. The glycophorin-binding protein contains a tandem 38 residue repeat. In one member the repeat occurs 11 times. 42503 pfam02527: Glucose inhibited division protein. This is a family of bacterial glucose-inhibited division proteins, which are probably involved in the regulation of cell division. 42504 pfam02529: Cytochrome B6-F complex subunit 5. This family consists of cytochrome B6-F complex subunit 5 (PetG). The cytochrome bf complex, found in green plants, eukaryotic algae and cyanobacteria, connects photosystem I to photosystem II in the electron transport chain, functioning as a plastoquinol:plastocyanin/cytochrome c6 oxidoreductase. PetG, or subunit 5, is associated with the bf complex, and the absence of PetG affects either the assembly or stability of the cytochrome bf complex in Chlamydomonas reinhardtii. 42505 pfam02530: Porin subfamily. This family consists of porins from the alpha subdivision of Proteobacteria; the members of this family are related to pfam00267. The porins form large aqueous channels in the cell membrane, allowing the selective entry of hydrophilic compounds; this so-called 'molecular sieve' is found in the cell walls of Gram-negative bacteria. 42506 pfam02531: PsaD. This family consists of PsaD from plants and cyanobacteria. PsaD is an extrinsic polypeptide of photosystem I (PSI) and is required for native assembly of PSI reaction clusters and is implicated in the electrostatic binding of ferredoxin within the reaction centre. PsaD forms a dimer in solution which is bound by PsaE; however, PsaD is monomeric in its native complexed PSI environment. 
42507 pfam02532: Photosystem II reaction centre I protein (PSII 4.8 kDa protein). This family consists of various Photosystem II (PSII) reaction centre I proteins or PSII 4.8 kDa proteins, PsbI, from the chloroplast genome of many plants and Cyanobacteria. PsbI is a small, integral membrane component of PSII, the role of which is not clear. Synechocystis mutants lacking PsbI have 20-30% loss of PSII activity; however, the PSII complex is not destabilised. 42508 pfam02533: Photosystem II 4 kDa reaction centre component. This family consists of various photosystem II 4 kDa reaction centre components (PsbK) from plants and Cyanobacteria. The photosystem II reaction centre is responsible for catalysing the core photosynthesis reaction, the light-induced splitting of water and the consequential release of dioxygen. In C. reinhardtii the psbK product is required for the stable assembly and/or stability of the photosystem II complex. 42509 pfam02534: TraG/TraD family. The TraG/TraD family are bacterial conjugation proteins. These proteins aid the transfer of DNA from the plasmid into the host bacterial chromosome, although the exact mechanism of action is unknown. These proteins contain a P-loop and Walker B site for nucleotide binding. 42511 pfam02536: mTERF. This family contains one sequence of known function, human mitochondrial transcription termination factor (mTERF); the rest of the family consists of hypothetical proteins, none of which has any functional information. mTERF is a multizipper protein possessing three putative leucine zippers, one of which is bipartite. The protein binds DNA as a monomer. The leucine zippers are not implicated in a dimerisation role as in other leucine zippers. 42512 pfam02537: CrcB-like protein. CrcB is a putative integral membrane protein possibly involved in chromosome condensation. Overexpression in E. coli also leads to camphor resistance. 42513 pfam02538: Hydantoinase B/oxoprolinase. 
This family includes N-methylhydantoinase B, which converts hydantoin to N-carbamyl-amino acids, and 5-oxoprolinase, EC:3.5.2.9, which catalyses the formation of L-glutamate from 5-oxo-L-proline. These enzymes are part of the oxoprolinase family and are related to pfam01968. 42514 pfam02540: NAD synthase. NAD synthase (EC:6.3.5.1) is involved in the de novo synthesis of NAD and is induced by stress factors such as heat shock and glucose limitation. 42515 pfam02541: Ppx/GppA phosphatase family. This family consists of the N-terminal region of exopolyphosphatase (Ppx) EC:3.6.1.11 and guanosine pentaphosphate phospho-hydrolase (GppA) EC:3.6.1.40. 42516 pfam02542: YgbB family. The ygbB protein is a putative enzyme of the deoxy-xylulose pathway (terpenoid biosynthesis). 42517 pfam02543: Carbamoyltransferase. This family consists of NodU from Rhizobium and CmcH from Nocardia lactamdurans. NodU, a Rhizobium nodulation protein involved in the synthesis of nodulation factors, has 6-O-carbamoyltransferase-like activity. CmcH is involved in cephamycin (antibiotic) biosynthesis and has 3-hydroxymethylcephem carbamoyltransferase activity, EC:2.1.3.7, catalysing the reaction: Carbamoyl phosphate + 3-hydroxymethylceph-3-em-4-carboxylate <=> phosphate + 3-carbamoyloxymethylcephem. 42518 pfam02544: 3-oxo-5-alpha-steroid 4-dehydrogenase. This family consists of 3-oxo-5-alpha-steroid 4-dehydrogenases, EC:1.3.99.5, also known as steroid 5-alpha-reductases. The reaction catalysed by this enzyme is: 3-oxo-5-alpha-steroid + acceptor <=> 3-oxo-delta(4)-steroid + reduced acceptor. The steroid 5-alpha-reductase enzyme is responsible for the formation of dihydrotestosterone; this hormone promotes the differentiation of male external genitalia and the prostate during fetal development. In humans, mutations in this enzyme can cause a form of male pseudohermaphroditism in which the external genitalia and prostate fail to develop normally. 
A related enzyme is also found in plants: DET2, a steroid reductase from Arabidopsis. Mutations in this enzyme cause defects in light-regulated development. 42519 pfam02545: Maf-like protein. Maf is a putative inhibitor of septum formation in eukaryotes, bacteria, and archaea. 42520 pfam02547: Queuosine biosynthesis protein. Queuosine (Q) biosynthesis protein, or S-adenosylmethionine:tRNA ribosyltransferase-isomerase, is required for the synthesis of the queuosine precursor (oQ). Q is a hypermodified nucleoside usually found at the first position of the anticodon of asparagine, aspartate, histidine, and tyrosine tRNAs. 42521 pfam02548: Ketopantoate hydroxymethyltransferase. Ketopantoate hydroxymethyltransferase (EC:2.1.2.11) is the first enzyme in the pantothenate biosynthesis pathway. 42522 pfam02550: Acetyl-CoA hydrolase/transferase. This family contains several enzymes which take part in pathways involving acetyl-CoA. Acetyl-CoA hydrolase EC:3.1.2.1 catalyses the formation of acetate from acetyl-CoA, CoA transferase (CAT1) EC:2.8.3.- produces succinyl-CoA, and acetate-CoA transferase EC:2.8.3.8 utilises acyl-CoA and acetate to form acetyl-CoA. 42523 pfam02551: Acyl-CoA thioesterase. This family represents the thioesterase II domain. Two copies of this domain are found in a number of acyl-CoA thioesterases. 42524 pfam02552: CO dehydrogenase beta subunit/acetyl-CoA synthase epsilon subunit. This family consists of carbon monoxide dehydrogenase I/II beta subunit EC:1.2.99.2 and acetyl-CoA synthase epsilon subunit. The carbon monoxide dehydrogenase beta subunit catalyses the reaction: CO + H2O + acceptor <=> CO2 + reduced acceptor. 42525 pfam02553: Cobalt transport protein component CbiN. CbiN is part of the active cobalt transport system involved in uptake of cobalt into the cell for cobalamin (vitamin B12) biosynthesis. It has been suggested that CbiN may function as the periplasmic binding protein component of the active cobalt transport system. 
42526 pfam02554: Carbon starvation protein CstA. This family consists of carbon starvation protein CstA, a predicted membrane protein. It has been suggested that CstA is involved in peptide utilisation. 42527 pfam02556: Preprotein translocase subunit SecB. This family consists of preprotein translocase subunit SecB. SecB is required for the normal export of envelope proteins out of the cell cytoplasm. 42528 pfam02557: D-alanyl-D-alanine carboxypeptidase. 42529 pfam02558: Ketopantoate reductase PanE/ApbA. This is a family of 2-dehydropantoate 2-reductases, also known as ketopantoate reductases, EC:1.1.1.169. The reaction catalysed by this enzyme is: (R)-pantoate + NADP(+) <=> 2-dehydropantoate + NADPH. ApbA catalyses the NADPH reduction of ketopantoic acid to pantoic acid in the alternative pyrimidine biosynthetic (APB) pathway. ApbA and PanE are allelic. ApbA, the ketopantoate reductase enzyme, is required for the synthesis of thiamine via the APB biosynthetic pathway. 42530 pfam02559: CarD-like/TRCF domain. CarD is a Myxococcus xanthus protein required for the activation of light- and starvation-inducible genes. This family includes the presumed N-terminal domain. This family also includes a domain to the N-terminal side of the DEAD helicase of TRCF proteins. TRCF displaces RNA polymerase stalled at a lesion, binds to the damage recognition protein UvrA, and increases the template strand repair rate during transcription. This domain is involved in binding to the stalled RNA polymerase. 42531 pfam02560: Cyanate lyase C-terminal domain. Cyanate lyase (also known as cyanase) EC:4.3.99.1 is responsible for the hydrolysis of cyanate, allowing organisms that possess the enzyme to overcome the toxicity of environmental cyanate. This enzyme is composed of two domains, an N-terminal helix-turn-helix and this structurally unique C-terminal domain. 42532 pfam02561: Flagellar protein FliS. 
FliS is coded for by the FliD operon and is transcribed in conjunction with FliD and FliT; however, this protein has no known function. 42533 pfam02562: PhoH-like protein. PhoH is a cytoplasmic protein and predicted ATPase that is induced by phosphate starvation. 42534 pfam02563: Polysaccharide biosynthesis/export protein. This is a family of periplasmic proteins involved in polysaccharide biosynthesis and/or export. 42535 pfam02565: Recombination protein O. Recombination protein O (RecO) is involved in DNA repair and pfam00470 pathway recombination. 42536 pfam02566: OsmC-like protein. Osmotically inducible protein C (OsmC) is a stress-induced protein found in E. coli. This family also contains an organic hydroperoxide detoxification protein that has a novel pattern of oxidative stress regulation. 42537 pfam02567: Phenazine biosynthesis-like protein. PhzC/PhzF is involved in dimerisation of two 2,3-dihydro-3-oxo-anthranilic acid molecules to create PCA by P. fluorescens. This family appears to be distantly related to pfam01678, including containing a weak internal duplication. However, members of this family do not contain the conserved cysteines that are hypothesised to be active site residues (Bateman A pers obs). 42538 pfam02568: Thiamine biosynthesis protein (ThiI). ThiI is required for thiazole synthesis, which is required for thiamine biosynthesis. 42539 pfam02569: Pantoate-beta-alanine ligase. Pantoate-beta-alanine ligase, also known as pantothenate synthase (EC:6.3.2.1), catalyses the formation of pantothenate from pantoate and beta-alanine. 42540 pfam02570: Precorrin-8X methylmutase. This is a family of precorrin-8X methylmutases, also known as precorrin isomerase, CbiC/CobH, EC:5.4.1.2. This enzyme catalyses the reaction: Precorrin-8X <=> hydrogenobyrinate. This enzyme is part of the cobalamin (vitamin B12) biosynthetic pathway and catalyses a methyl rearrangement. 42541 pfam02571: Precorrin-6x reductase CbiJ/CobK. This family consists of precorrin-6x reductase EC:1.3.1.54. 
This enzyme catalyses the reaction: precorrin-6Y + NADP(+) <=> precorrin-6X + NADPH. CbiJ and CobK both catalyse the reduction of the macrocycle in the cobalamin biosynthesis pathway. 42542 pfam02572: ATP:corrinoid adenosyltransferase BtuR/CobO/CobP. This family consists of the BtuR, CobO and CobP proteins, all of which are cob(I)alamin adenosyltransferases, EC:2.5.1.17, involved in cobalamin (vitamin B12) biosynthesis. These enzymes catalyse the adenosylation reaction: ATP + cob(I)alamin + H2O <=> phosphate + diphosphate + adenosylcobalamin. 42543 pfam02573: N-terminal HTH domain of molybdenum-binding protein. This is the N-terminal domain of molybdenum-binding proteins ModE, ModA, MopA and MopB. These proteins are involved in molybdenum transport. ModE is a molybdenum-dependent transcriptional repressor preventing the transcription of the ModABCD operon in E. coli in the presence of high concentrations of molybdenum. This region contains a helix-turn-helix motif, see pfam00126. 42544 pfam02574: Homocysteine S-methyltransferase. This is a family of related homocysteine S-methyltransferase enzymes: 5-methyltetrahydrofolate--homocysteine S-methyltransferase, EC:2.1.1.13; betaine--homocysteine S-methyltransferase (vitamin B12 dependent), EC:2.1.1.5; and homocysteine S-methyltransferase, EC:2.1.1.10. 42545 pfam02575: Uncharacterised BCR, YbaB family COG0718. 42546 pfam02576: Uncharacterised BCR, YhbC family COG0779. 42547 pfam02577: Uncharacterised ACR, COG1259. 42548 pfam02578: Uncharacterised ACR, YfiH family COG1496. 42549 pfam02579: Dinitrogenase iron-molybdenum cofactor. This family contains several NIF (B, Y and X) proteins which are iron-molybdenum cofactors (FeMo-co) in the dinitrogenase enzyme, which catalyses the reduction of dinitrogen to ammonium. Dinitrogenase is a heterotetrameric (alpha(2)beta(2)) enzyme which contains the iron-molybdenum cofactor (FeMo-co) at its active site. 42550 pfam02580: D-Tyr-tRNA(Tyr) deacylase. 
This family comprises several D-Tyr-tRNA(Tyr) deacylase proteins. Cell growth inhibition by several D-amino acids can be explained by an in vivo production of D-aminoacyl-tRNA molecules. Escherichia coli and yeast cells express an enzyme, D-Tyr-tRNA(Tyr) deacylase, capable of recycling such D-aminoacyl-tRNA molecules into free tRNA and D-amino acid. Accordingly, upon inactivation of the genes of the above deacylases, the toxicity of D-amino acids increases. Orthologues of the deacylase are found in many cells. 42551 pfam02581: Thiamine monophosphate synthase/TENI. Thiamine monophosphate synthase (TMP) (EC:2.5.1.3) catalyses the substitution of the pyrophosphate of 2-methyl-4-amino-5-hydroxymethylpyrimidine pyrophosphate by 4-methyl-5-(beta-hydroxyethyl)thiazole phosphate to yield thiamine phosphate. This Pfam family also includes the regulatory protein TENI. 42552 pfam02582: Uncharacterised ACR, YagE family COG1723. 42553 pfam02583: Uncharacterised BCR, COG1937. 42554 pfam02585: Uncharacterised LmbE-like protein, COG2120. 42555 pfam02586: Uncharacterised ACR, COG2135. 42556 pfam02588: Uncharacterized BCR, YitT family COG1284. This is probably a bacterial ABC transporter permease (personal obs: Yeats C). 42557 pfam02589: Uncharacterized ACR, YkgG family COG1556. 42558 pfam02590: Uncharacterized ACR, COG1576. 42559 pfam02591: Uncharacterized ACR, COG1579. 42560 pfam02592: Uncharacterized ACR, YhhQ family COG1738. 42561 pfam02593: Uncharacterized ArCR, COG1810. 42562 pfam02594: Uncharacterized ACR, YggU family COG1872. 42563 pfam02595: Glycerate kinase family. This is a family of glycerate kinases. 42564 pfam02596: Uncharacterized ArCR, COG2043. 42565 pfam02597: ThiS family. ThiS (thiaminS) is a 66 aa protein involved in sulphur transfer. ThiS is coded in the thiCEFSGH operon in E. coli. This family of proteins has two conserved glycines at the COOH terminus. Thiocarboxylate is formed at the last G in the activation process. 
Sulphur is transferred from ThiI to ThiS in a reaction catalysed by IscS. MoaD, a protein involved in sulphur transfer in molybdopterin synthesis, is about the same length and shows limited sequence similarity to ThiS. Both have the conserved GG at the COOH end. 42566 pfam02598: Uncharacterized ACR, COG2106. 42567 pfam02599: Global regulator protein family. This is a family of global regulator proteins. This protein is an RNA-binding protein and a global regulator of carbohydrate metabolism genes, facilitating mRNA decay. In E. coli CsrA binds the CsrB RNA molecule to form the Csr regulatory system, which has a strong negative regulatory effect on glycogen biosynthesis, gluconeogenesis and glycogen catabolism and a positive regulatory effect on glycolysis. In other bacteria such as Erwinia carotovora, RsmA has been shown to regulate the production of virulence determinants, such as extracellular enzymes. RsmA binds to RsmB regulatory RNA. 42568 pfam02600: Disulfide bond formation protein DsbB. This family consists of disulfide bond formation protein DsbB from bacteria. The DsbB protein oxidises the periplasmic protein DsbA, which in turn oxidises cysteines in other periplasmic proteins in order to make disulfide bonds. DsbB acts as a redox potential transducer across the cytoplasmic membrane and is an integral membrane protein. DsbB possesses six cysteines, four of which are necessary for its proper function in vivo. 42569 pfam02601: Exonuclease VII, large subunit. This family consists of exonuclease VII, large subunit, EC:3.1.11.6. This enzyme catalyses exonucleolytic cleavage in either 5'->3' or 3'->5' direction to yield 5'-phosphomononucleotides. This exonuclease VII enzyme is composed of one large subunit and 4 small ones. 42570 pfam02602: Uroporphyrinogen-III synthase HemD. This family consists of uroporphyrinogen-III synthase HemD, EC:4.2.1.75, also known as hydroxymethylbilane hydro-lyase (cyclizing), from eukaryotes, bacteria and archaea. 
This enzyme catalyses the reaction: Hydroxymethylbilane <=> uroporphyrinogen-III + H(2)O. Some members of this family are multi-functional proteins possessing other enzyme activities related to porphyrin biosynthesis; however, the aligned region corresponds with the uroporphyrinogen-III synthase EC:4.2.1.75 activity only. Uroporphyrinogen-III synthase is the fourth enzyme in the heme pathway. Mutant forms of the uroporphyrinogen-III synthase gene cause congenital erythropoietic porphyria in humans, a recessive inborn error of metabolism also known as Gunther disease. 42571 pfam02603: HPr Serine kinase. This family consists of HPr serine/threonine kinase PtsK. This kinase is the sensor in a multicomponent phosphorelay system in control of carbon catabolic repression in bacteria. This kinase is unusual in that it recognises the tertiary structure of its target and is a member of a novel family unrelated to any previously described protein phosphorylating enzymes. X-ray analysis of the full-length crystalline enzyme from Staphylococcus xylosus at a resolution of 1.95 Å shows the enzyme to consist of two clearly separated domains that are assembled in a hexameric structure resembling a three-bladed propeller. 42572 pfam02604: Uncharacterized ACR, COG2161. 42573 pfam02605: Photosystem I reaction centre subunit XI. This family consists of the photosystem I reaction centre subunit XI, PsaL, from plants and bacteria. PsaL is one of the smaller subunits in photosystem I, with only two transmembrane alpha helices, and interacts closely with PsaI. 42574 pfam02606: Tetraacyldisaccharide-1-P 4'-kinase. This family consists of tetraacyldisaccharide-1-P 4'-kinase, also known as Lipid-A 4'-kinase or Lipid A biosynthesis protein LpxK, EC:2.7.1.130. 
This enzyme catalyses the reaction: ATP + 2,3-bis(3-hydroxytetradecanoyl)-D-glucosaminyl-(beta-D-1,6)-2,3-bis(3-hydroxytetradecanoyl)-D-glucosaminyl beta-phosphate <=> ADP + 2,3,2',3'-tetrakis(3-hydroxytetradecanoyl)-D-glucosaminyl-1,6-beta-D-glucosamine 1,4'-bisphosphate. This enzyme is involved in the synthesis of the lipid A portion of the bacterial lipopolysaccharide layer (LPS). The family contains a P-loop motif at the N terminus. 42575 pfam02607: B12 binding domain. This B12 binding domain is found in methionine synthase EC:2.1.1.13, and other shorter proteins that bind to B12. This domain is always found N-terminal to pfam02310. The structure of this domain is known; it is a 4-helix bundle. Many of the conserved residues in this domain are involved in B12 binding, such as those in the MXXVG motif. 42576 pfam02608: Basic membrane protein. This is a family of basic membrane lipoproteins from Borrelia and various putative lipoproteins from other bacteria. All of these proteins are outer membrane proteins and are thus antigenic in nature when possessed by the pathogenic members of the family. One protein is a transcriptional activator. 42577 pfam02609: Exonuclease VII small subunit. This family consists of exonuclease VII, small subunit, EC:3.1.11.6. This enzyme catalyses exonucleolytic cleavage in either 5'->3' or 3'->5' direction to yield 5'-phosphomononucleotides. This exonuclease VII enzyme is composed of one large subunit and 4 small ones. 42578 pfam02610: L-arabinose isomerase. This is a family of L-arabinose isomerases, AraA, EC:5.3.1.4. These enzymes catalyse the reaction: L-arabinose <=> L-ribulose. This reaction is the first step in the pathway of L-arabinose utilisation as a carbon source: after entering the cell, L-arabinose is converted into L-ribulose by the L-arabinose isomerase enzyme. 42579 pfam02611: CDP-diacylglycerol pyrophosphatase. This is a family of CDP-diacylglycerol pyrophosphatases, EC:3.6.1.26. 
This enzyme catalyses the reaction: CDP-diacylglycerol + H2O <=> CMP + phosphatidate. 42580 pfam02613: Nitrate reductase delta subunit. This family is the delta subunit of the nitrate reductase enzyme. The delta subunit is not part of the nitrate reductase enzyme but is most likely needed for assembly of the multi-subunit enzyme complex. In the absence of the delta subunit the core alpha beta enzyme complex is unstable. The delta subunit is essential for enzyme activity in vivo and in vitro. The nitrate reductase enzyme, EC:1.7.99.4, catalyses the conversion of nitrate to nitrite via the oxidation of a reduced acceptor. The nitrate reductase enzyme is composed of three subunits. Nitrate is the most widely used alternative electron acceptor after oxygen. 42581 pfam02614: Glucuronate isomerase. This is a family of glucuronate isomerases, also known as D-glucuronate isomerase, uronic isomerase, uronate isomerase, or uronic acid isomerase, EC:5.3.1.12. This enzyme catalyses the reactions: D-glucuronate <=> D-fructuronate and D-galacturonate <=> D-tagaturonate. It is not, however, clear where the experimental evidence for this functional assignment came from, and thus this family has no literature reference. 42582 pfam02615: Malate/L-lactate dehydrogenase. This family consists of bacterial and archaeal malate/L-lactate dehydrogenases. L-lactate dehydrogenase, EC:1.1.1.27, catalyses the reaction: (S)-lactate + NAD(+) <=> pyruvate + NADH. Malate dehydrogenase, EC:1.1.1.37 and EC:1.1.1.82, catalyses the reactions: (S)-malate + NAD(+) <=> oxaloacetate + NADH, and (S)-malate + NADP(+) <=> oxaloacetate + NADPH, respectively. 42583 pfam02616: Uncharacterised ACR, COG1354. 42584 pfam02617: ATP-dependent Clp protease adaptor protein ClpS. In the bacterial cytosol, ATP-dependent protein degradation is performed by several different chaperone-protease pairs, including ClpAP. ClpS directly influences the ClpAP machine by binding to the N-terminal domain of the chaperone ClpA. 
The degradation of ClpAP substrates, both SsrA-tagged proteins and ClpA itself, is specifically inhibited by ClpS. ClpS modifies ClpA substrate specificity, potentially redirecting degradation by ClpAP toward aggregated proteins. 42585 pfam02618: Aminodeoxychorismate lyase. This family contains several aminodeoxychorismate lyase proteins. Aminodeoxychorismate lyase is a pyridoxal 5'-phosphate-dependent enzyme that converts 4-aminodeoxychorismate to pyruvate and p-aminobenzoate, a precursor of folic acid in bacteria. 42586 pfam02620: Uncharacterized ACR, COG1399. 42587 pfam02621: Uncharacterized ACR, COG1427. 42588 pfam02622: Uncharacterized ACR, COG1678. 42589 pfam02623: Uncharacterized BCR, COG1699. 42590 pfam02624: Uncharacterized ACR, COG1944. 42591 pfam02625: XdhC and CoxI family. This domain is often found in association with an NAD-binding region, related to TrkA-N (pfam02254; personal obs: C. Yeats). XdhC is believed to be involved in the attachment of molybdenum to xanthine dehydrogenase. 42592 pfam02626: Allophanate hydrolase subunit 2. This domain forms the second subunit of allophanate hydrolase. In yeast urea amidolyase this domain is found between pfam00289 and pfam00364. 42593 pfam02627: Carboxymuconolactone decarboxylase family. Carboxymuconolactone decarboxylase (CMD) EC:4.1.1.44 is involved in protocatechuate catabolism. In some bacteria a gene fusion event leads to expression of CMD with a hydrolase involved in the same pathway. In these bifunctional proteins CMD represents the C-terminal domain, and pfam00561 represents the N-terminal domain. 42594 pfam02628: Cytochrome oxidase assembly protein. This is a family of integral membrane proteins. CtaA is required for cytochrome aa3 oxidase assembly in Bacillus subtilis. COX15 is required for cytochrome c oxidase assembly in yeast. 42595 pfam02629: CoA binding domain. This domain has a Rossmann fold and is found in a number of proteins including succinyl CoA synthetases, malate and ATP-citrate ligases. 
42596 pfam02630: SCO1/SenC. This family is involved in biogenesis of respiratory and photosynthetic systems. SCO1 is required for a post-translational step in the accumulation of subunits COXI and COXII of cytochrome c oxidase. SenC is required for optimal cytochrome c oxidase activity and maximal induction of genes encoding the light-harvesting and reaction centre complexes of R. capsulatus. 42597 pfam02631: RecX family. RecX is a putative bacterial regulatory protein. The gene encoding RecX is found downstream of recA, and RecX is thought to interact with the RecA protein. 42598 pfam02632: BioY family. A number of bacterial genes are involved in bioconversion of pimelate into dethiobiotin. BioY is involved in this process; however, the exact function of the protein is unknown. 42599 pfam02633: Creatinine amidohydrolase. Creatinine amidohydrolase (EC:3.5.2.10), or creatininase, catalyses the hydrolysis of creatinine to creatine. 42600 pfam02634: FdhD/NarQ family. Nitrate assimilation protein, NarQ, and FdhD are required for formate dehydrogenase activity. 42601 pfam02635: DsrE/DsrF-like family. DsrE is a small soluble protein involved in intracellular sulfur reduction. This family also includes DsrF. 42602 pfam02636: Uncharacterized ACR, COG1565. This family contains several uncharacterized proteins. One member has been described as an ATP synthase beta subunit transcription termination factor rho protein. 42603 pfam02637: GatB/Yqey domain. This domain is found in GatB and proteins related to bacterial Yqey. It is about 140 amino acid residues long. This domain is found at the C terminus of GatB, which transamidates Glu-tRNA to Gln-tRNA. The function of this domain is uncertain. It does, however, suggest that Yqey and its relatives have a role in tRNA metabolism. 42604 pfam02638: Uncharacterized BCR, COG1649. 42605 pfam02639: Uncharacterized BCR, YaiI/YqxD family COG1671. 42606 pfam02641: Uncharacterized ACR, COG1993. 42607 pfam02642: Uncharacterized ACR, COG2107. 
42608 pfam02643: Uncharacterized ACR, COG1430. 42609 pfam02645: Uncharacterized protein, DegV family COG1307. 42610 pfam02646: RmuC family. This family contains several bacterial RmuC DNA recombination proteins. The function of the RmuC protein is unknown, but it is suspected that it is either a structural protein that protects DNA against nuclease action, or is itself involved in DNA cleavage at the regions of DNA secondary structures. 42611 pfam02647: Uncharacterized ACR, COG1343. 42612 pfam02649: Uncharacterized ACR, COG1469. 42613 pfam02650: Uncharacterized BCR, COG1481. 42614 pfam02652: L-lactate permease. L-lactate permease is an integral membrane protein probably involved in L-lactate transport. 42615 pfam02653: Branched-chain amino acid transport system / permease component. This is a large family mainly comprising high-affinity branched-chain amino acid transporter proteins such as E. coli LivH and LivM, both of which form part of the LIV-I transport system. Also found within this family are proteins from the galactose transport system permease, and a ribose transport system. 42616 pfam02654: Cobalamin-5-phosphate synthase. This is a family of cobalamin-5-phosphate synthases, CobS, from bacteria. The CobS enzyme catalyses the synthesis of AdoCbl-5'-P from AdoCbi-GDP and alpha-ribazole-5'-P. This enzyme is involved in the cobalamin (vitamin B12) biosynthesis pathway, in particular the nucleotide loop assembly stage, in conjunction with CobC, CobU and CobT. 42617 pfam02655: Domain of unknown function DUF201. This family consists of hypothetical proteins, some of which are putative membrane proteins. No functional information or experimental verification of function is known. This domain is around 300 amino acids long. This family appears to be distantly similar to a D-Ala-D-Ala ligase enzyme (Pers. obs. A Bateman). 42618 pfam02656: Domain of unknown function DUF. This family consists of hypothetical proteins, some of which are putative membrane proteins. 
No functional information or experimental verification of function is known. This domain is around 100 amino acids long. 42619 pfam02657: Fe-S metabolism associated domain. This family consists of the SufE-related proteins. These have been implicated in Fe-S metabolism and export. 42620 pfam02659: Domain of unknown function DUF. This family consists of hypothetical transmembrane proteins, none of which have any known function; the aligned region is 180 amino acids long. 42621 pfam02660: Domain of unknown function DUF. This family consists of hypothetical transmembrane proteins, none of which have any known function; the aligned region is around 200 amino acids long. 42622 pfam02661: Fic protein family. This family consists of the Fic (filamentation induced by cAMP) protein and its relatives. The Fic protein is involved in cell division and is suggested to be involved in the synthesis of PAB or folate, indicating that the Fic protein and cAMP are involved in a regulatory mechanism of cell division via folate metabolism. This family contains a central conserved motif HPFXXGNG in most members. The exact molecular function of these proteins is uncertain. 42623 pfam02662: Methyl-viologen-reducing hydrogenase, delta subunit. This family consists of methyl-viologen-reducing hydrogenase, delta subunit / heterodisulphide reductase. No specific functions have been assigned to this subunit. The aligned region corresponds to almost the entire delta chain sequence and contains 4 conserved cysteine residues. However, in two Archaeoglobus sequences this region corresponds to only the C-terminus of these proteins. 42624 pfam02663: Tungsten formylmethanofuran dehydrogenase, subunit E, FwdE. This family consists of tungsten formylmethanofuran dehydrogenase, subunit E, FwdE and FmdE. The subunit E protein is co-expressed with the enzyme but fails to co-purify, and thus its function is unknown. 42625 pfam02664: LuxS protein. 
This family consists of the LuxS protein involved in autoinducer AI2 synthesis and its hypothetical relatives. The LuxS protein is involved in quorum sensing and is an autoinducer-production protein. 42626 pfam02665: Nitrate reductase gamma subunit. This family is the gamma subunit of the nitrate reductase enzyme. The gamma subunit is a b-type cytochrome that receives electrons from the quinone pool. It then transfers these via the iron-sulfur clusters of the beta subunit to the molybdenum cofactor found in the alpha subunit. The nitrate reductase enzyme, EC:1.7.99.4, catalyses the reduction of nitrate to nitrite. The nitrate reductase enzyme is composed of three subunits. Nitrate is the most widely used alternative electron acceptor after oxygen. 42627 pfam02666: Phosphatidylserine decarboxylase. This is a family of phosphatidylserine decarboxylases, EC:4.1.1.65. These enzymes catalyse the reaction: Phosphatidyl-L-serine <=> phosphatidylethanolamine + CO2. Phosphatidylserine decarboxylase plays a central role in the biosynthesis of aminophospholipids by converting phosphatidylserine to phosphatidylethanolamine. 42628 pfam02667: Short chain fatty acid transporter. This family consists of two sequences annotated as short chain fatty acid transporters; however, there are no references giving details of experimental characterisation of this function. 42629 pfam02668: Taurine catabolism dioxygenase TauD, TfdA family. This family consists of taurine catabolism dioxygenases of the TauD, TfdA family. TauD from E. coli is an alpha-ketoglutarate-dependent taurine dioxygenase. This enzyme catalyses the oxygenolytic release of sulfite from taurine. TfdA from Burkholderia sp. is a 2,4-dichlorophenoxyacetic acid/alpha-ketoglutarate dioxygenase. TfdA from Alcaligenes eutrophus JMP134 is a 2,4-dichlorophenoxyacetate monooxygenase. 42630 pfam02669: K+-transporting ATPase, c chain. This family consists of K+-transporting ATPase, c chain, KdpC. 
KdpC forms strong interactions with the KdpA subunit, serving to assemble and stabilise the Kdp complex. It has been suggested that KdpC could be one of the connecting links between the energy providing subunit KdpB and the K+-transporting subunit KdpA. The K+ transport system actively transports K+ ions via ATP hydrolysis. 42631 pfam02670: 1-deoxy-D-xylulose 5-phosphate reductoisomerase. This is a family of 1-deoxy-D-xylulose 5-phosphate reductoisomerases. This enzyme catalyses the formation of 2-C-methyl-D-erythritol 4-phosphate from 1-deoxy-D-xylulose-5-phosphate in the presence of NADPH. This reaction is part of the terpenoid biosynthesis pathway. 42632 pfam02671: Paired amphipathic helix repeat. This family contains the paired amphipathic helix repeat. The family contains the yeast SIN3 gene (also known as SDI1) that is a negative regulator of the yeast HO gene. This repeat may be distantly related to the helix-loop-helix motif, which mediates protein-protein interactions. 42633 pfam02672: CP12 domain. The function of this domain is unknown; it does contain three conserved cysteines and a histidine, which suggests that this may be a zinc binding domain (Bateman A pers. observation). This domain is found associated with CBS domains (pfam00571) in some proteins. 42634 pfam02673: Bacitracin resistance protein BacA. Bacitracin resistance protein (BacA) is a putative undecaprenol kinase. BacA confers resistance to bacitracin, probably by phosphorylation of undecaprenol. 42635 pfam02674: Colicin V production protein. Colicin V production protein is required in E. coli for colicin V production from plasmid pColV-K30. This protein is coded for in the purF operon. 42636 pfam02675: S-adenosylmethionine decarboxylase. This family contains several S-adenosylmethionine decarboxylase proteins from bacterial and archaebacterial species. 
S-adenosylmethionine decarboxylase (AdoMetDC), a key enzyme in the biosynthesis of spermidine and spermine, is first synthesised as a proenzyme, which is cleaved post-translationally to form alpha and beta subunits. The alpha subunit contains a covalently bound pyruvoyl group derived from serine that is essential for activity. 42637 pfam02676: Uncharacterized ACR, COG1590. 42638 pfam02677: Uncharacterized BCR, COG1636. 42639 pfam02678: Pirin. This family consists of Pirin proteins from both eukaryotes and prokaryotes. The function of Pirin is unknown, but the gene coding for this protein is known to be expressed in all tissues in the human body, although it is expressed most strongly in the liver and heart. Pirin is known to be a nuclear protein, exclusively localised within the nucleoplasm and predominantly concentrated within dot-like subnuclear structures. A tomato homologue of human Pirin has been found to be induced during programmed cell death. Human Pirin interacts with Bcl-3 and NFI and hence is probably involved in the regulation of DNA transcription and replication. It appears to be an Fe(II)-containing member of the Cupin superfamily. 42640 pfam02679: (2R)-phospho-3-sulfolactate synthase (ComA). In methanobacteria (2R)-phospho-3-sulfolactate synthase (ComA) catalyses the first step of the biosynthesis of coenzyme M from phosphoenolpyruvate (P-enolpyruvate). This novel enzyme catalyses the stereospecific Michael addition of sulfite to P-enolpyruvate, forming L-2-phospho-3-sulfolactate (PSL). It is suggested that the ComA-catalysed reaction is analogous to those reactions catalysed by beta-elimination enzymes that proceed through an enolate intermediate. 42641 pfam02680: Uncharacterized ArCR, COG1888. 42642 pfam02681: Divergent PAP2 family. This family is related to the pfam01569 family (personal obs: C Yeats). 42643 pfam02682: Allophanate hydrolase subunit 1. This family is the first subunit of allophanate hydrolase. 
42644 pfam02683: Cytochrome C biogenesis protein transmembrane region. This family consists of the transmembrane (i.e. non-catalytic) region of Cytochrome C biogenesis proteins, also known as disulphide interchange proteins. These proteins possess a protein disulphide isomerase like domain that is not found within the aligned region of this family. 42645 pfam02684: Lipid-A-disaccharide synthetase. This is a family of lipid-A-disaccharide synthetases, EC:2.4.1.182. These enzymes catalyse the reaction: UDP-2,3-bis(3-hydroxytetradecanoyl) glucosamine + 2,3-bis(3-hydroxytetradecanoyl)-beta-D-glucosaminyl 1-phosphate <=> UDP + 2,3-bis(3-hydroxytetradecanoyl)-D-glucosaminyl-1,6-beta-D-2,3-bis(3-hydroxytetradecanoyl)-beta-D-glucosaminyl 1-phosphate. These enzymes catalyse the first disaccharide step in the synthesis of lipid-A-disaccharide. 42646 pfam02685: Glucokinase. This is a family of glucokinases or glucose kinases EC:2.7.1.2. These enzymes phosphorylate glucose using ATP as a donor to give glucose-6-phosphate and ADP. 42647 pfam02686: Glu-tRNAGln amidotransferase C subunit. This is a family of Glu-tRNAGln amidotransferase C subunits. The Glu-tRNAGln amidotransferase enzyme itself is an important translational fidelity mechanism, replacing incorrectly charged Glu-tRNAGln with the correct Gln-tRNAGln via transamidation of the misacylated Glu-tRNAGln. This activity supplements the lack of glutaminyl-tRNA synthetase activity in gram-positive eubacteria, cyanobacteria, Archaea, and organelles. 42648 pfam02687: Predicted permease. This is a family of predicted permeases and hypothetical transmembrane proteins. One member has been shown to transport lipids targeted to the outer membrane across the inner membrane. Two members have been shown to require ATP. 42649 pfam02688: Domain of unknown function DUF215. This is a large family of C. elegans proteins, none of which have any known function. 
The aligned region is a maximum of 245 amino acids long and has two conserved cysteine residues. 42650 pfam02689: Helicase. This family consists of Helicases from the Herpes viruses. Helicases are responsible for the unwinding of DNA and are essential for replication and completion of the viral life cycle. 42651 pfam02690: Na+/Pi-cotransporter. This is a family of mainly mammalian type II renal Na+/Pi-cotransporters with other related sequences from lower eukaryotes and bacteria, some of which are also Na+/Pi-cotransporters. In the kidney the type II renal Na+/Pi-cotransporter protein allows reabsorption of filtered Pi in the proximal tubule. 42652 pfam02691: Vacuolating cytotoxin. This family consists of Vacuolating cytotoxin proteins from Proteobacteria. These proteins are an important virulence determinant in H. pylori and induce cytoplasmic vacuolation in a variety of mammalian cell lines. 42653 pfam02692: Interphotoreceptor retinoid-binding protein. Interphotoreceptor retinoid-binding protein (IRBP) mediates retinoid trafficking between the photoreceptors and pigmented epithelium. This Pfam family represents a repeat domain found four times in the mammalian protein but only twice in zebrafish IRBP. 42654 pfam02694: Uncharacterized BCR, YnfA/UPF0060 family. 42655 pfam02695: Domain of unknown function DUF. This is a family of hypothetical worm (C. elegans) proteins, none of which have any known function. The aligned region is repeated two or three times in many of the sequences in this family. 42656 pfam02696: Uncharacterized ACR, YdiU/UPF0061 family. 42657 pfam02697: Uncharacterized ACR, COG1753. 42658 pfam02698: Uncharacterized ACR, COG1434. 42659 pfam02699: Preprotein translocase subunit. See. 42660 pfam02700: Phosphoribosylformylglycinamidine (FGAM) synthase. This family forms a component of the de novo purine biosynthesis pathway. 42661 pfam02701: Dof domain, zinc finger. 
The Dof domain is a zinc finger DNA-binding domain that shows resemblance to the Cys2 zinc finger. 42662 pfam02702: Osmosensitive K+ channel His kinase sensor domain. This is a family of KdpD sensor kinase proteins that regulate the kdpFABC operon responsible for potassium transport. The aligned region corresponds to the N-terminal cytoplasmic part of the protein which may be the sensor domain responsible for sensing turgor pressure. 42663 pfam02703: Early E1A protein. This is a family of adenovirus early E1A proteins. The E1A protein is 32 kDa; it can, however, be cleaved to yield the 28 kDa protein. The E1A protein is responsible for the transcriptional activation of the early genes within the viral genome at the start of the infection process as well as some cellular genes. 42664 pfam02704: Gibberellin regulated protein. This is the GASA gibberellin regulated cysteine rich protein family. The expression of these proteins is up-regulated by the plant hormone gibberellin; most of these proteins have some role in plant development. There are 12 cysteine residues conserved within the alignment, giving the potential for these proteins to possess 6 disulphide bonds. 42665 pfam02705: K+ potassium transporter. This is a family of K+ potassium transporters that are conserved across phyla, having bacterial (KUP), yeast (HAK), and plant (AtKT) sequences as members. 42666 pfam02706: Chain length determinant protein. This family includes proteins involved in lipopolysaccharide (lps) biosynthesis. This family comprises the whole length of chain length determinant protein (or wzz protein) that confers a modal distribution of chain length on the O-antigen component of lps. This region is also found as part of bacterial tyrosine kinases. 42667 pfam02707: Major Outer Sheath Protein N-terminal region. This is a family of spirochete major outer sheath protein N-terminal regions. These proteins are present on the bacterial cell surface. In T. 
denticola the major outer sheath protein (Msp) binds immobilised laminin and fibronectin, supporting the hypothesis that Msp mediates the extracellular matrix binding activity of T. denticola. 42668 pfam02709: Galactosyltransferase. This is a family of galactosyltransferases from a wide range of Metazoa with three related galactosyltransferase activities, all three of which are possessed by one sequence in some cases: EC:2.4.1.90, N-acetyllactosamine synthase; EC:2.4.1.38, Beta-N-acetylglucosaminyl-glycopeptide beta-1,4-galactosyltransferase; and EC:2.4.1.22, Lactose synthase. Note that N-acetyllactosamine synthase is a component of Lactose synthase along with alpha-lactalbumin; in the absence of alpha-lactalbumin, EC:2.4.1.90 is the catalysed reaction. 42669 pfam02710: Hemagglutinin domain of haemagglutinin-esterase-fusion glycoprotein. 42670 pfam02711: E4 protein. This is a family of Papillomavirus proteins, E4, coded for by ORF4. A splice variant, E1--E4, exists but neither the function of E4 nor that of E1--E4 is known. 42671 pfam02713: Domain of unknown function DUF220. This family consists of a region in several Arabidopsis thaliana hypothetical proteins, none of which have any known function. The aligned region contains two cysteine residues. 42672 pfam02714: Domain of unknown function DUF221. This family consists of hypothetical transmembrane proteins, none of which have any known function; the aligned region is 538 residues at maximum length. 42673 pfam02716: Isoflavone reductase. This is a family of isoflavone reductases from plants. Isoflavone reductase enzymes EC:1.3.1.45 catalyse the penultimate step in the synthesis of the phytoalexin medicarpin. 42674 pfam02717: B15-like protein. This is a family of poxvirus proteins including B15, C6, and T3A. Members of this family are approximately 150 residues long and have no known function. 42675 pfam02718: Herpesvirus UL31-like protein. 
This is a family of Herpesvirus proteins including UL31, UL53, and the product of ORF 69 in some strains. The proteins in this family have no known function. 42676 pfam02719: Polysaccharide biosynthesis protein. This is a family of diverse bacterial polysaccharide biosynthesis proteins including the CapD protein, WalL protein, mannosyl-transferase, and several putative epimerases (e.g. WbiI). 42677 pfam02720: Domain of unknown function DUF222. This is a family of hypothetical proteins which includes a putative transposase. 42678 pfam02721: Domain of unknown function DUF223. 42679 pfam02722: Major Outer Sheath Protein C-terminal region. This is a family of spirochete major outer sheath protein C-terminal regions. These proteins are present on the bacterial cell surface. In T. denticola the major outer sheath protein (Msp) binds immobilised laminin and fibronectin, supporting the hypothesis that Msp mediates the extracellular matrix binding activity of T. denticola. 42680 pfam02723: Non-structural protein NS3/Small envelope protein E. This is a family of small non-structural proteins, well conserved among Coronavirus strains. This protein is also found in murine hepatitis virus as small envelope protein E. 42681 pfam02724: CDC45-like protein. CDC45 is an essential gene required for initiation of DNA replication in S. cerevisiae, forming a complex with MCM5/CDC46. Homologues of CDC45 have been identified in human, mouse and smut fungus, amongst others. 42682 pfam02725: Non-structural protein C. This family consists of the polymerase accessory protein C from members of the paramyxoviridae. 42683 pfam02727: Copper amine oxidase, N2 domain. This domain is the first or second structural domain in copper amine oxidases; it is known as the N2 domain. Its function is uncertain. The catalytic domain can be found in pfam01179. 
Copper amine oxidases are a ubiquitous and novel group of quinoenzymes that catalyse the oxidative deamination of primary amines to the corresponding aldehydes, with concomitant reduction of molecular oxygen to hydrogen peroxide. The enzymes are dimers of identical 70-90 kDa subunits, each of which contains a single copper ion and a covalently bound cofactor formed by the post-translational modification of a tyrosine side chain to 2,4,5-trihydroxyphenylalanine quinone (TPQ). 42684 pfam02728: Copper amine oxidase, N3 domain. This domain is the second or third structural domain in copper amine oxidases; it is known as the N3 domain. Its function is uncertain. The catalytic domain can be found in pfam01179. Copper amine oxidases are a ubiquitous and novel group of quinoenzymes that catalyse the oxidative deamination of primary amines to the corresponding aldehydes, with concomitant reduction of molecular oxygen to hydrogen peroxide. The enzymes are dimers of identical 70-90 kDa subunits, each of which contains a single copper ion and a covalently bound cofactor formed by the post-translational modification of a tyrosine side chain to 2,4,5-trihydroxyphenylalanine quinone (TPQ). 42685 pfam02729: Aspartate/ornithine carbamoyltransferase, carbamoyl-P binding domain. 42686 pfam02730: Aldehyde ferredoxin oxidoreductase, N-terminal domain. Aldehyde ferredoxin oxidoreductase (AOR) catalyses the reversible oxidation of aldehydes to their corresponding carboxylic acids with the accompanying reduction of the redox protein ferredoxin. This domain interacts with the tungsten cofactor. 42687 pfam02731: SKIP/SNW domain. This domain is found in chromatin proteins. 42688 pfam02732: ERCC4 domain. This domain is predicted to be a nuclease domain. 42689 pfam02733: Dak1 domain. This is the kinase domain of the dihydroxyacetone kinase family EC:2.7.1.29. 42690 pfam02734: DAK2 domain. This domain is the predicted phosphatase domain of the dihydroxyacetone kinase family. 
42691 pfam02735: Ku70/Ku80 beta-barrel domain. The Ku heterodimer (composed of Ku70 and Ku80) contributes to genomic integrity through its ability to bind DNA double-strand breaks and facilitate repair by the non-homologous end-joining pathway. This is the central DNA-binding beta-barrel domain. This domain is found in both the Ku70 and Ku80 proteins that form a DNA binding heterodimer. 42692 pfam02736: Myosin N-terminal SH3-like domain. This domain has an SH3-like fold. It is found at the N-terminus of many but not all myosins. The function of this domain is unknown. 42693 pfam02737: 3-hydroxyacyl-CoA dehydrogenase, NAD binding domain. This family also includes lambda crystallin. 42694 pfam02738: Aldehyde oxidase and xanthine dehydrogenase, molybdopterin binding domain. 42695 pfam02739: 5'-3' exonuclease, N-terminal resolvase-like domain. 42696 pfam02740: Colipase, C-terminal domain. SCOP reports duplication of common fold with Colipase N-terminal domain. 42697 pfam02741: FTR, proximal lobe. The FTR (Formylmethanofuran--tetrahydromethanopterin formyltransferase) enzyme EC:2.3.1.101 is involved in archaebacteria in the formation of methane from carbon dioxide. C-terminal proximal lobe of alpha+beta ferredoxin-like fold. SCOP reports fold duplication with N-terminal distal lobe. 42698 pfam02742: Iron dependent repressor, metal binding and dimerisation domain. This family includes the Diphtheria toxin repressor. 42699 pfam02743: Cache domain. 42700 pfam02744: Galactose-1-phosphate uridyl transferase, C-terminal domain. SCOP reports fold duplication with N-terminal domain. Both involved in Zn and Fe binding. 42701 pfam02745: Methyl-coenzyme M reductase alpha subunit, N-terminal domain. Methyl-coenzyme M reductase (MCR) is the enzyme responsible for microbial formation of methane. It is a hexamer composed of 2 alpha (this family), 2 beta (pfam02241), and 2 gamma (pfam02240) subunits with two identical nickel porphinoid active sites. 
The N-terminal domain has a ferredoxin-like fold. 42702 pfam02746: Mandelate racemase / muconate lactonizing enzyme, N-terminal domain. SCOP reports fold similarity with enolase N-terminal domain. 42703 pfam02747: Proliferating cell nuclear antigen, C-terminal domain. N-terminal and C-terminal domains of PCNA are topologically identical. Three PCNA molecules are tightly associated to form a closed ring encircling duplex DNA. 42704 pfam02748: Aspartate carbamoyltransferase regulatory chain, metal binding domain. The regulatory chain is involved in allosteric regulation of aspartate carbamoyltransferase. The C-terminal metal binding domain has a rubredoxin-like fold and provides the interface with the catalytic chain. 42705 pfam02749: Quinolinate phosphoribosyl transferase, N-terminal domain. Quinolinate phosphoribosyl transferase (QPRTase) or nicotinate-nucleotide pyrophosphorylase EC:2.4.2.19 is involved in the de novo synthesis of NAD in both prokaryotes and eukaryotes. It catalyses the reaction of quinolinic acid with 5-phosphoribosyl-1-pyrophosphate (PRPP) in the presence of Mg2+ to give rise to nicotinic acid mononucleotide (NaMN), pyrophosphate and carbon dioxide. The QA substrate is bound between the C-terminal domain of one subunit, and the N-terminal domain of the other. The N-terminal domain has an alpha/beta hammerhead fold. 42706 pfam02750: Synapsin, ATP binding domain. Ca dependent ATP binding in this ATP grasp fold. Function unknown. 42707 pfam02751: Transcription initiation factor IIA, gamma subunit, beta-barrel domain. Accurate transcription in vivo requires at least six general transcription initiation factors, in addition to RNA polymerase II. Transcription initiation factor IIA (TFIIA) is a multimeric protein which facilitates the binding of TFIID to the TATA box. The C-terminal domain of the gamma subunit is a 12 stranded beta-barrel. 42708 pfam02752: Arrestin (or S-antigen), C-terminal domain. Ig-like beta-sandwich fold. 
SCOP reports duplication with N-terminal domain. 42709 pfam02753: Gram-negative pili assembly chaperone, C-terminal domain. Ig-like beta-sandwich fold. 42710 pfam02754: Cysteine-rich domain. This domain is usually found in two copies per protein. It contains up to four conserved cysteines. The family includes a subunit from heterodisulphide reductase, a subunit from glycolate oxidase, and glycerol-3-phosphate dehydrogenase. 42711 pfam02755: RPEL repeat. The RPEL repeat is named after four conserved amino acids it contains. The function of the RPEL repeat is unknown; however, it might be a DNA binding repeat based on the observation that one member contains a pfam02037 domain that is also implicated in DNA binding. 42712 pfam02756: GYR motif. The GYR motif is found in several Drosophila proteins. Its function is unknown; however, the presence of completely conserved tyrosine residues may suggest it could be a substrate for tyrosine kinases. 42713 pfam02757: YLP motif. The YLP motif is found in several Drosophila proteins. Its function is unknown; however, the presence of completely conserved tyrosine residues and its presence in human erbB-4 may suggest it could be a substrate for tyrosine kinases. 42714 pfam02758: PAAD/DAPIN/Pyrin domain. This domain is predicted to contain 6 alpha helices and to have the same fold as the pfam00531 domain. This similarity may mean that this is a protein-protein interaction domain. 42715 pfam02759: RUN domain. This domain is present in several proteins that are linked to the functions of GTPases in the Rap and Rab families. They could hence play important roles in multiple Ras-like GTPase signaling pathways. 42716 pfam02760: HIN-200/IF120x domain. This domain has no known function. It is found in one or two copies per protein, and is found associated with the PAAD/DAPIN domain pfam02758. 42717 pfam02761: CBL proto-oncogene N-terminus, EF hand-like domain. 
Cbl is an adaptor protein that binds EGF receptors (or other tyrosine kinases) and SH3 domains, functioning as a negative regulator of many signaling pathways. The N-terminal domain is evolutionarily conserved, and is known to bind to phosphorylated tyrosine residues. The so called N-terminal domain is actually 3 structural domains, of which this is the central EF hand domain. 42718 pfam02762: CBL proto-oncogene N-terminus, SH2-like domain. Cbl is an adaptor protein that binds EGF receptors (or other tyrosine kinases) and SH3 domains, functioning as a negative regulator of many signaling pathways. The N-terminal domain is evolutionarily conserved, and is known to bind to phosphorylated tyrosine residues. The so called N-terminal domain is actually 3 structural domains, of which this is the C-terminal SH2 domain. 42719 pfam02763: Diphtheria toxin, C domain. N-terminal catalytic (C) domain - blocks protein synthesis by transfer of ADP-ribose from NAD to a diphthamide residue of EF-2. 42720 pfam02764: Diphtheria toxin, T domain. Central domain of diphtheria toxin is the translocation (T) domain. pH induced conformational change in this domain triggers insertion into the endosomal membrane and facilitates the transfer of the catalytic domain into the cytoplasm. 42721 pfam02765: Telomere-binding protein alpha subunit, central domain. The telomere-binding protein forms a heterodimer in ciliates consisting of an alpha and a beta subunit. This complex may function as a protective cap for the single-stranded telomeric overhang. Alpha subunit consists of 3 structural domains, all with the same beta-barrel OB fold. 42722 pfam02766: Telomere-binding protein alpha subunit, C-terminal domain. The telomere-binding protein forms a heterodimer in ciliates consisting of an alpha and a beta subunit. This complex may function as a protective cap for the single-stranded telomeric overhang. Alpha subunit consists of 3 structural domains, all of the same beta-barrel OB fold. 
42723 pfam02767: DNA polymerase III beta subunit, central domain. A dimer of the beta subunit of DNA polymerase III forms a ring which encircles duplex DNA. Each monomer contains three domains of identical topology and DNA clamp fold. 42724 pfam02768: DNA polymerase III beta subunit, C-terminal domain. A dimer of the beta subunit of DNA polymerase III forms a ring which encircles duplex DNA. Each monomer contains three domains of identical topology and DNA clamp fold. 42725 pfam02769: AIR synthase related protein, C-terminal domain. This family includes Hydrogen expression/formation protein HypE, AIR synthases EC:6.3.3.1, FGAM synthase EC:6.3.5.3 and selenide, water dikinase EC:2.7.9.3. The function of the C-terminal domain of AIR synthase is unclear, but the cleft formed between N and C domains is postulated as a sulphate binding site. 42726 pfam02770: Acyl-CoA dehydrogenase, middle domain. Central domain of Acyl-CoA dehydrogenase has a beta-barrel fold. 42727 pfam02771: Acyl-CoA dehydrogenase, N-terminal domain. The N-terminal domain of Acyl-CoA dehydrogenase is an all-alpha domain. 42728 pfam02772: S-adenosylmethionine synthetase, central domain. The three domains of S-adenosylmethionine synthetase have the same alpha+beta fold. 42729 pfam02773: S-adenosylmethionine synthetase, C-terminal domain. The three domains of S-adenosylmethionine synthetase have the same alpha+beta fold. 42730 pfam02774: Semialdehyde dehydrogenase, dimerisation domain. This Pfam entry contains the following members: N-acetyl-glutamine semialdehyde dehydrogenase (AgrC) and Aspartate-semialdehyde dehydrogenase. 42731 pfam02775: Thiamine pyrophosphate enzyme, C-terminal TPP binding domain. 42732 pfam02776: Thiamine pyrophosphate enzyme, N-terminal TPP binding domain. 42733 pfam02777: Iron/manganese superoxide dismutases, C-terminal domain. Superoxide dismutases (SODs) catalyse the conversion of superoxide radicals to hydrogen peroxide and molecular oxygen. 
Three evolutionarily distinct families of SODs are known, of which the Mn/Fe-binding family is one. In humans, there is a cytoplasmic Cu/Zn SOD, and a mitochondrial Mn/Fe SOD. C-terminal domain is a mixed alpha/beta fold. 42734 pfam02778: tRNA intron endonuclease, N-terminal domain. Members of this family cleave pre tRNA at the 5' and 3' splice sites to release the intron EC:3.1.27.9. 42735 pfam02779: Transketolase, pyrimidine binding domain. This family includes transketolase enzymes, pyruvate dehydrogenases, and branched chain alpha-keto acid decarboxylases. 42736 pfam02780: Transketolase, C-terminal domain. The C-terminal domain of transketolase has been proposed as a regulatory molecule binding site. 42737 pfam02781: Glucose-6-phosphate dehydrogenase, C-terminal domain. 42738 pfam02782: FGGY family of carbohydrate kinases, C-terminal domain. This domain adopts a ribonuclease H-like fold and is structurally related to the N-terminal domain. 42739 pfam02783: Methyl-coenzyme M reductase beta subunit, N-terminal domain. Methyl-coenzyme M reductase (MCR) is the enzyme responsible for microbial formation of methane. It is a hexamer composed of 2 alpha (pfam02249), 2 beta (this family), and 2 gamma (pfam02240) subunits with two identical nickel porphinoid active sites. The N-terminal domain has an alpha/beta ferredoxin-like fold. 42740 pfam02784: Pyridoxal-dependent decarboxylase, pyridoxal binding domain. These pyridoxal-dependent decarboxylases act on ornithine, lysine, arginine and related substrates. This domain has a TIM barrel fold. 42741 pfam02785: Biotin carboxylase C-terminal domain. Biotin carboxylase is a component of the acetyl-CoA carboxylase multi-component enzyme which catalyses the first committed step in fatty acid synthesis in animals, plants and bacteria. Most of the active site residues reported are in this C-terminal domain. 42742 pfam02786: Carbamoyl-phosphate synthase L chain, ATP binding domain. 
Carbamoyl-phosphate synthase catalyses the ATP-dependent synthesis of carbamyl-phosphate from glutamine or ammonia and bicarbonate. This important enzyme initiates both the urea cycle and the biosynthesis of arginine and/or pyrimidines. The carbamoyl-phosphate synthase (CPS) enzyme in prokaryotes is a heterodimer of a small and large chain. The small chain promotes the hydrolysis of glutamine to ammonia, which is used by the large chain to synthesise carbamoyl phosphate. See pfam00988. The small chain has a GATase domain in the carboxyl terminus. See pfam00117. The ATP binding domain (this one) has an ATP-grasp fold. 42743 pfam02787: Carbamoyl-phosphate synthetase large chain, oligomerisation domain. Carbamoyl-phosphate synthase catalyses the ATP-dependent synthesis of carbamyl-phosphate from glutamine or ammonia and bicarbonate. The carbamoyl-phosphate synthase (CPS) enzyme in prokaryotes is a heterodimer of a small and large chain. 42744 pfam02788: Ribulose bisphosphate carboxylase large chain, N-terminal domain. The N-terminal domain of RuBisCO large chain adopts a ferredoxin-like fold. 42745 pfam02789: Cytosol aminopeptidase family, N-terminal domain. 42746 pfam02790: Cytochrome C oxidase subunit II, transmembrane domain. The N-terminal domain of cytochrome C oxidase contains two transmembrane alpha-helices. 42747 pfam02791: DDT domain. This domain is predicted to be a DNA binding domain. The DDT domain is named after (DNA binding homeobox and Different Transcription factors). 42748 pfam02792: Mago nashi protein. This family was originally identified in Drosophila and called mago nashi; it is a strict maternal effect, grandchildless-like gene. The human homologue has been shown to interact with an RNA binding protein. An RNAi knockout of the C. elegans homologue causes masculinization of the germ line (Mog phenotype) in hermaphrodites, suggesting it is involved in hermaphrodite germ-line sex determination. 
Mago nashi has been found to be part of the exon-exon junction complex that binds 20 nucleotides upstream of exon-exon junctions. 42749 pfam02793: Hormone receptor domain. This extracellular domain contains four conserved cysteines that probably form disulphide bridges. The domain is found in a variety of hormone receptors. It may be a ligand binding domain. 42750 pfam02794: RTX toxin acyltransferase family. Members of this family are enzymes (EC:2.3.1.-) involved in fatty acylation of the protoxins (HlyA) at lysine residues, thereby converting them to the active toxin. Acyl-acyl carrier protein (ACP) is the essential acyl donor. This family shows a number of conserved residues that are possible candidates for participation in acyl transfer. Site-directed mutagenesis of the single conserved histidine residue in one member resulted in complete inactivation of the enzyme. 42751 pfam02795: Domain of unknown function (DUF225). This domain is found in several worm proteins. It contains 4 conserved cysteines. This domain is presumably extracellular and these cysteines form disulphide bridges. 42752 pfam02796: Helix-turn-helix domain of resolvase. 42753 pfam02797: Chalcone and stilbene synthases, C-terminal domain. This domain of chalcone synthase is reported to be structurally similar to domains in thiolase and beta-ketoacyl synthase. The differences in activity are accounted for by differences in the N-terminal domain. 42754 pfam02798: Glutathione S-transferase, N-terminal domain. Function: conjugation of reduced glutathione to a variety of targets. Also included in the alignment, but are not GSTs: * S-crystallins from squid. Similarity to GST previously noted. * Eukaryotic elongation factors 1-gamma. Not known to have GST activity; similarity not previously recognised. * HSP26 family of stress-related proteins, including auxin-regulated proteins in plants and stringent starvation proteins in E. coli. Not known to have GST activity. Similarity not previously recognised. 
The glutathione molecule binds in a cleft between N and C-terminal domains - the catalytically important residues are proposed to reside in the N-terminal domain. 42755 pfam02799: Myristoyl-CoA:protein N-myristoyltransferase, C-terminal domain. The N and C-terminal domains of NMT are structurally similar, each adopting an acyl-CoA N-acyltransferase-like fold. 42756 pfam02800: Glyceraldehyde 3-phosphate dehydrogenase, C-terminal domain. GAPDH is a tetrameric NAD-binding enzyme involved in glycolysis and gluconeogenesis. The C-terminal domain has a mixed alpha/antiparallel beta fold. 42757 pfam02801: Beta-ketoacyl synthase, C-terminal domain. The structure of beta-ketoacyl synthase is similar to that of the thiolase family (pfam00108) and also chalcone synthase. The active site of beta-ketoacyl synthase is located between the N and C-terminal domains. 42758 pfam02803: Thiolase, C-terminal domain. Thiolase is reported to be structurally related to beta-ketoacyl synthase (pfam00109), and also chalcone synthase. 42759 pfam02804: Ribonuclease U2. This enzyme hydrolyses 28S rRNA. 42760 pfam02805: Metal binding domain of Ada. The Escherichia coli Ada protein repairs O6-methylguanine residues and methyl phosphotriesters in DNA by direct transfer of the methyl group to a cysteine residue. This domain contains four conserved cysteines that form a zinc binding site. One of these cysteines is a methyl group acceptor. The methylated domain can then specifically bind to the ada box on a DNA duplex. 42761 pfam02806: Alpha amylase, C-terminal all-beta domain. Alpha amylase is classified as family 13 of the glycosyl hydrolases. The structure is an 8 stranded alpha/beta barrel containing the active site, interrupted by a ~70 a.a. calcium-binding domain protruding between beta strand 3 and alpha helix 3, and a carboxyl-terminal Greek key beta-barrel domain. 42762 pfam02807: ATP:guanido phosphotransferase, N-terminal domain. The N-terminal domain has an all-alpha fold. 
42763 pfam02809: Ubiquitin interaction motif. This motif is called the ubiquitin interaction motif. One of the proteins containing this motif is a receptor for poly-ubiquitination chains for the proteasome. This motif has a pattern of conservation characteristic of an alpha helix. 42764 pfam02810: SEC-C motif. The SEC-C motif is found in the C-terminus of the SecA protein, in the middle of some SWI2 ATPases and also solo in several proteins. The motif is predicted to chelate zinc with the CXC and C[HC] pairs that constitute the most conserved feature of the motif. It is predicted to be a potential nucleic acid binding domain. 42765 pfam02811: PHP domain C-terminal region. The PHP (Polymerase and Histidinol Phosphatase) domain is a putative phosphoesterase domain. This family is often associated with an N-terminal region pfam02231. 42766 pfam02812: Glu/Leu/Phe/Val dehydrogenase, dimerisation domain. 42767 pfam02813: Retroviral M domain. Retroviruses contain a small protein, MA (matrix), which forms a protein lining immediately beneath the phospholipid membrane of the mature virus particle. MA is located in the N-terminal region of the Gag precursor polyprotein. The N-terminal segment of MA proteins directs the Gag protein to the plasma membrane where budding takes place, and has been called the M domain. This domain forms an alpha helical bundle structure. 42768 pfam02814: UreE urease accessory protein, N-terminal domain. UreE is a urease accessory protein. Urease pfam00449 hydrolyses urea into ammonia and carbamic acid. 42769 pfam02815: MIR domain. The MIR (protein mannosyltransferase, IP3R and RyR) domain is a small domain that may have a ligand transferase function. 42770 pfam02816: Alpha-kinase family. This family is a novel family of eukaryotic protein kinase catalytic domains, which have no detectable similarity to conventional kinases. The family contains myosin heavy chain kinases and Elongation Factor-2 kinase and a bifunctional ion channel. 
This family is known as the alpha-kinase family. The structure of the kinase domain revealed unexpected similarity to eukaryotic protein kinases in the catalytic core as well as to metabolic enzymes with ATP-grasp domains. 42771 pfam02817: e3 binding domain. This family represents a small domain of the E2 subunit of 2-oxo-acid dehydrogenases responsible for the binding of the E3 subunit. 42772 pfam02818: PPAK motif. This protein motif is found in the titin protein. These motifs are found in the PEVK region of titin. 42773 pfam02819: Spider toxin. This family of spider neurotoxins is thought to consist of calcium ion channel inhibitors. 42774 pfam02820: mbt repeat. The function of this repeat is unknown, but it is found in a number of nuclear proteins such as Drosophila sex comb on midleg protein. The repeat is found in up to four copies. The repeat contains a completely conserved glutamate at its amino terminus that may be important for function. 42775 pfam02821: Staphylokinase/Streptokinase family. 42776 pfam02822: Antistasin family. Members of this family are inhibitors of trypsin family proteases. This domain is highly disulphide bonded. The domain is also found in some large extracellular proteins in multiple copies. 42777 pfam02823: ATP synthase, Delta/Epsilon chain, beta-sandwich domain. Part of the ATP synthase CF(1). These subunits are part of the head unit of the ATP synthase. The subunit is called epsilon in bacteria and delta in mitochondria. In bacteria the delta (D) subunit is equivalent to the mitochondrial Oligomycin sensitive subunit, OSCP (pfam00213). 42778 pfam02824: TGS domain. The TGS domain is named after ThrRS, GTPase, and SpoT. Interestingly, the TGS domain was also detected at the amino terminus of the uridine kinase from the spirochaete Treponema pallidum (but not any other organism, including the related spirochaete Borrelia burgdorferi). 
TGS is a small domain that consists of ~50 amino acid residues and is predicted to possess a predominantly beta-sheet structure. There is no direct information on the functions of the TGS domain, but its presence in two types of regulatory proteins (the GTPases and guanosine polyphosphate phosphohydrolases/synthetases) suggests a ligand (most likely nucleotide)-binding, regulatory role. 42779 pfam02825: WWE domain. The WWE domain is named after three of its conserved residues and is predicted to mediate specific protein-protein interactions in ubiquitin and ADP ribose conjugation systems. 42780 pfam02826: D-isomer specific 2-hydroxyacid dehydrogenase, NAD binding domain. This domain is inserted into the catalytic domain, the large dehydrogenase and D-lactate dehydrogenase families in SCOP, the N-terminal portion of which is represented by family pfam00389. 42781 pfam02827: cAMP-dependent protein kinase inhibitor. Members of this family are extremely potent competitive inhibitors of cAMP-dependent protein kinase activity. These proteins interact with the catalytic subunit of the enzyme after the cAMP-induced dissociation of its regulatory chains. 42782 pfam02828: L27 domain. The L27 domain is found in receptor targeting proteins Lin-2 and Lin-7. 42783 pfam02829: 3H domain. This domain is predicted to be a small molecule binding domain, based on its occurrence with other domains. The domain is named after its three conserved histidine residues. 42784 pfam02830: V4R domain. The V4R (vinyl 4 reductase) domain is a predicted small molecular binding domain, that may bind to hydrocarbons. 42785 pfam02831: gpW. gpW is a 68 residue protein known to be present in phage particles. Extracts of phage-infected cells lacking gpW contain DNA-filled heads, and active tails, but no infectious virions. gpW is required for the addition of gpFII to the head, which is, in turn, required for the attachment of tails. 
Since gpFII and tails are known to be attached at the connector, gpW is also likely to assemble at this site. The addition of gpW to filled heads increases the DNase resistance of the packaged DNA, suggesting that gpW either forms a plug at the connector to prevent ejection of the DNA, or binds directly to the DNA. The large number of positively charged residues in gpW (its calculated pI is 10.8) is consistent with a role in DNA interaction. 42786 pfam02832: Flavivirus glycoprotein, immunoglobulin-like domain. 42787 pfam02833: DHHA2 domain. This domain is often found adjacent to the DHH domain pfam01368 and is called DHHA2 for DHH associated domain. This domain is diagnostic of DHH subfamily 2 members. The domain is about 120 residues long and contains a conserved DXK motif at its amino terminus. 42788 pfam02834: 2',5' RNA ligase family. Members of this family are bacterial and archaeal RNA ligases that are able to ligate tRNA half molecules containing 2',3'-cyclic phosphate and 5' hydroxyl termini to products containing the 2',5' phosphodiester linkage. Each member of this family contains an internal duplication, each of which contains an HXTX motif that defines the family. The structure of a related protein is known. 42789 pfam02836: Glycosyl hydrolases family 2, TIM barrel domain. This family contains beta-galactosidase, beta-mannosidase and beta-glucuronidase activities. 42790 pfam02837: Glycosyl hydrolases family 2, sugar binding domain. This family contains beta-galactosidase, beta-mannosidase and beta-glucuronidase activities and has a jelly-roll fold. 42791 pfam02838: Glycosyl hydrolase family 20, domain 2. This domain has a zincin-like fold. 42792 pfam02839: Carbohydrate binding domain. This short domain is found in many different glycosyl hydrolase enzymes and is presumed to have a carbohydrate binding function. The domain has six aromatic groups that may be important for binding. 42793 pfam02840: Prp18 domain. 
The splicing factor Prp18 is required for the second step of pre-mRNA splicing. The structure of a large fragment of the Saccharomyces cerevisiae Prp18 is known. This fragment is fully active in yeast splicing in vitro and includes the sequences of Prp18 that have been evolutionarily conserved. The core structure consists of five alpha-helices that adopt a novel fold. The most highly conserved region of Prp18, a nearly invariant stretch of 19 aa, forms part of a loop between two alpha-helices and may interact with the U5 small nuclear ribonucleoprotein particles. 42794 pfam02841: Guanylate-binding protein, C-terminal domain. Transcription of the anti-viral guanylate-binding protein (GBP) is induced by interferon-gamma during macrophage induction. This family contains GBP1 and GBP2, both GTPases capable of binding GTP, GDP and GMP. 42795 pfam02842: Phosphoribosylglycinamide synthetase, B domain. Phosphoribosylglycinamide synthetase catalyses the second step in the de novo biosynthesis of purine. The reaction catalysed by Phosphoribosylglycinamide synthetase is the ATP-dependent addition of 5-phosphoribosylamine to glycine to form 5'-phosphoribosylglycinamide. This family is related to biotin carboxylase/carbamoyl phosphate synthetase (see pfam02786). 42796 pfam02843: Phosphoribosylglycinamide synthetase, C domain. Phosphoribosylglycinamide synthetase catalyses the second step in the de novo biosynthesis of purine. The reaction catalysed by Phosphoribosylglycinamide synthetase is the ATP-dependent addition of 5-phosphoribosylamine to glycine to form 5'-phosphoribosylglycinamide. This domain is related to the C-terminal domain of biotin carboxylase/carbamoyl phosphate synthetase (see pfam02787). 42797 pfam02844: Phosphoribosylglycinamide synthetase, N domain. Phosphoribosylglycinamide synthetase catalyses the second step in the de novo biosynthesis of purine. 
The reaction catalysed by Phosphoribosylglycinamide synthetase is the ATP-dependent addition of 5-phosphoribosylamine to glycine to form 5'-phosphoribosylglycinamide. This domain is related to the N-terminal domain of biotin carboxylase/carbamoyl phosphate synthetase (see pfam00289). 42798 pfam02845: CUE domain. CUE domains have been shown to bind ubiquitin. It has been suggested that CUE domains are related to pfam00627. CUE domains also occur in two proteins of the IL-1 signal transduction pathway, tollip and TAB2. 42799 pfam02847: MA3 domain. Domain in DAP-5, eIF4G, MA-3 and other proteins. Highly alpha-helical. May contain repeats and/or regions similar to MIF4G domains. 42800 pfam02852: Pyridine nucleotide-disulphide oxidoreductase, dimerisation domain. This family includes both class I and class II oxidoreductases and also NADH oxidases and peroxidases. 42801 pfam02854: MIF4G domain. MIF4G is named after Middle domain of eukaryotic initiation factor 4G (eIF4G). Also occurs in NMD2p and CBP80. The domain is rich in alpha-helices and may contain multiple alpha-helical repeats. In eIF4G, this domain binds eIF4A, eIF3, RNA and DNA. 42802 pfam02861: Clp amino terminal domain. This short domain is found in one or two copies at the amino terminus of ClpA and ClpB proteins from bacteria and eukaryotes. The function of these domains is uncertain but they may form a protein binding site. 42803 pfam02862: DDHD domain. The DDHD domain is 180 residues long and contains four conserved residues that may form a metal binding site. The domain is named after these four residues. This pattern of conservation of metal binding residues is often seen in phosphoesterase domains. This domain is found in retinal degeneration B proteins, as well as a family of probable phospholipases. It has been shown that this domain is found in a longer C terminal region that binds to PYK2 tyrosine kinase. These proteins have been called N-terminal domain-interacting receptor (Nir1, Nir2 and Nir3). 
This suggests that this region is involved in functionally important interactions in other members of this family. 42804 pfam02863: Arginine repressor, C-terminal domain. 42805 pfam02864: STAT protein, DNA binding domain. STAT proteins (Signal Transducers and Activators of Transcription) are a family of transcription factors that are specifically activated to regulate gene transcription when cells encounter cytokines and growth factors. This family represents the DNA binding domain of STAT, which has an ig-like fold. STAT proteins also include an SH2 domain pfam00017. 42806 pfam02865: STAT protein, protein interaction domain. STAT proteins (Signal Transducers and Activators of Transcription) are a family of transcription factors that are specifically activated to regulate gene transcription when cells encounter cytokines and growth factors. STAT proteins also include an SH2 domain pfam00017. 42807 pfam02866: lactate/malate dehydrogenase, alpha/beta C-terminal domain. L-lactate dehydrogenases are metabolic enzymes which catalyse the conversion of L-lactate to pyruvate, the last step in anaerobic glycolysis. L-2-hydroxyisocaproate dehydrogenases are also members of the family. Malate dehydrogenases catalyse the interconversion of malate to oxaloacetate. The enzyme participates in the citric acid cycle. L-lactate dehydrogenase is also found as a lens crystallin in bird and crocodile eyes. 42808 pfam02867: Ribonucleotide reductase, barrel domain. 42809 pfam02868: Thermolysin metallopeptidase, alpha-helical domain. 42810 pfam02870: 6-O-methylguanine DNA methyltransferase, ribonuclease-like domain. 42811 pfam02871: NAD/NADP octopine/nopaline dehydrogenase, NAD binding domain. This group of enzymes acts on the CH-NH substrate bond using NAD(+) or NADP(+) as an acceptor. The Pfam family consists mainly of octopine and nopaline dehydrogenases from Ti plasmids. This domain is an NAD(P) binding Rossmann fold. 42812 pfam02872: 5'-nucleotidase, C-terminal domain. 
42813 pfam02873: UDP-N-acetylenolpyruvoylglucosamine reductase, C-terminal domain. Members of this family are UDP-N-acetylenolpyruvoylglucosamine reductase enzymes EC:1.1.1.158. This enzyme is involved in the biosynthesis of peptidoglycan. 42814 pfam02874: ATP synthase alpha/beta family, beta-barrel domain. This family includes the ATP synthase alpha and beta subunits and the ATP synthase associated with flagella. 42815 pfam02875: Mur ligase family, glutamate ligase domain. This family contains a number of related ligase enzymes which have EC numbers 6.3.2.*. This family includes: MurC, MurD, MurE, MurF, Mpl and FolC. MurC, MurD, MurE and MurF catalyse consecutive steps in the synthesis of peptidoglycan. Peptidoglycan consists of a sheet of two sugar derivatives, with one of these, N-acetylmuramic acid, attaching to a small pentapeptide. The pentapeptide is made of L-alanine, D-glutamic acid, meso-diaminopimelic acid and D-alanyl alanine. The peptide moiety is synthesised by successively adding these amino acids to UDP-N-acetylmuramic acid. MurC transfers the L-alanine, MurD transfers the D-glutamate, MurE transfers the diaminopimelic acid, and MurF transfers the D-alanyl alanine. This family also includes Folylpolyglutamate synthase that transfers glutamate to folylpolyglutamate. 42816 pfam02876: Staphylococcal/Streptococcal toxin, beta-grasp domain. 42817 pfam02877: Poly(ADP-ribose) polymerase, regulatory domain. Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active. 42818 pfam02878: Phosphoglucomutase/phosphomannomutase, alpha/beta/alpha domain I. 
42819 pfam02879: Phosphoglucomutase/phosphomannomutase, alpha/beta/alpha domain II. 42820 pfam02880: Phosphoglucomutase/phosphomannomutase, alpha/beta/alpha domain III. 42821 pfam02881: SRP54-type protein, helical bundle domain. 42822 pfam02882: Tetrahydrofolate dehydrogenase/cyclohydrolase, NAD(P)-binding domain. 42823 pfam02883: Adaptin C-terminal domain. Alpha adaptin is a heterotetramer which regulates clathrin-bud formation. The carboxyl-terminal appendage of the alpha subunit regulates translocation of endocytic accessory proteins to the bud site. This ig-fold domain is found in alpha, beta and gamma adaptins. 42824 pfam02884: Polysaccharide lyase family 8, C-terminal beta-sandwich domain. This family consists of a group of secreted bacterial lyase enzymes EC:4.2.2.1 capable of acting on hyaluronan and chondroitin in the extracellular matrix of host tissues, contributing to the invasive capacity of the pathogen. 42825 pfam02885: Glycosyl transferase family, helical bundle domain. This family includes anthranilate phosphoribosyltransferase (TrpD) and thymidine phosphorylase. All these proteins can transfer a phosphorylated ribose substrate. 42826 pfam02886: LBP / BPI / CETP family, C-terminal domain. The N and C terminal domains of the LBP/BPI/CETP family are structurally similar. 42827 pfam02887: Pyruvate kinase, alpha/beta domain. 42828 pfam02888: Calmodulin binding domain. Small-conductance Ca2+-activated K+ channels (SK channels) are independent of voltage and gated solely by intracellular Ca2+. These membrane channels are heteromeric complexes that comprise pore-forming alpha-subunits and the Ca2+-binding protein calmodulin (CaM). CaM binds to the SK channel through the CaM-binding domain (CaMBD), which is located in an intracellular region of the alpha-subunit immediately carboxy-terminal to the pore. Channel opening is triggered when Ca2+ binds the EF hands in the N-lobe of CaM. The structure of this domain complexed with CaM is known. 
This domain forms an elongated dimer with a CaM molecule bound at each end; each CaM wraps around three alpha-helices, two from one CaMBD subunit and one from the other. 42829 pfam02889: Sec63 domain. 42830 pfam02890: Borrelia family of unknown function DUF226. This family of proteins is found in Borrelia. The proteins are about 190 amino acids long and have no known function. 42831 pfam02891: MIZ zinc finger. 42832 pfam02892: BED zinc finger. 42833 pfam02893: GRAM domain. The GRAM domain is found in glucosyltransferases, myotubularins and other putative membrane-associated proteins. 42834 pfam02894: Oxidoreductase family, C-terminal alpha/beta domain. This family of enzymes utilise NADP or NAD. This family is called the GFO/IDH/MOCA family. 42835 pfam02895: Signal transducing histidine kinase, homodimeric domain. This helical bundle domain is the homodimer interface of the signal transducing histidine kinase family. 42836 pfam02896: PEP-utilising enzyme, TIM barrel domain. 42837 pfam02897: Prolyl oligopeptidase, N-terminal beta-propeller domain. This unusual 7-stranded beta-propeller domain protects the catalytic triad of prolyl oligopeptidase (see pfam00326), excluding larger peptides and proteins from proteolysis in the cytosol. 42838 pfam02898: Nitric oxide synthase, oxygenase domain. 42839 pfam02899: Phage integrase, N-terminal SAM-like domain. 42840 pfam02900: Catalytic LigB subunit of aromatic ring-opening dioxygenase. 42841 pfam02901: Pyruvate formate lyase. 42842 pfam02902: Ulp1 protease family, C-terminal catalytic domain. This domain contains the catalytic triad Cys-His-Asn. 42843 pfam02903: Alpha amylase, N-terminal ig-like domain. 42844 pfam02905: Epstein Barr virus nuclear antigen-1, DNA-binding domain. This domain has a ferredoxin-like fold. 42845 pfam02906: Iron only hydrogenase large subunit, C-terminal domain. 42846 pfam02907: Hepatitis C virus NS3 protease. Hepatitis C virus NS3 protein is a serine protease which has a trypsin-like fold. 
The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. NS2-3 proteinase, a zinc-dependent enzyme, performs a single proteolytic cut to release the N-terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4A. 42847 pfam02909: Tetracycline repressor, C-terminal all-alpha domain. 42848 pfam02910: Fumarate reductase/succinate dehydrogenase flavoprotein C-terminal domain. This family contains fumarate reductases, succinate dehydrogenases and L-aspartate oxidases. 42849 pfam02911: Formyl transferase, C-terminal domain. 42850 pfam02912: Aminoacyl tRNA synthetase class II, N-terminal domain. 42851 pfam02913: FAD linked oxidases, C-terminal domain. This domain has a ferredoxin-like fold. 42852 pfam02914: Bacteriophage Mu transposase. 42853 pfam02915: Rubrerythrin. This domain has a ferritin-like fold. 42854 pfam02916: DNA polymerase processivity factor. 42855 pfam02917: Pertussis toxin, subunit 1. 42856 pfam02918: Pertussis toxin, subunit 2 and 3, C-terminal domain. 42857 pfam02919: Eukaryotic DNA topoisomerase I, DNA binding fragment. Topoisomerase I promotes the relaxation of DNA superhelical tension by introducing a transient single-stranded break in duplex DNA and is vital for the processes of replication, transcription, and recombination. This family may be more than one structural domain. 42858 pfam02920: DNA binding domain of Tn916 integrase. 42859 pfam02921: Ubiquinol cytochrome reductase transmembrane region. Each subunit of the cytochrome bc1 complex provides a single helix (this family) to make up the transmembrane region of the complex. 42860 pfam02922: Isoamylase N-terminal domain. 
This domain is found in a range of enzymes that act on branched substrates - isoamylase, pullulanase and branching enzyme. This family also contains the beta subunit of 5' AMP activated kinase. 42861 pfam02923: Restriction endonuclease BamHI. 42862 pfam02924: Bacteriophage lambda head decoration protein D. 42863 pfam02925: Bacteriophage scaffolding protein D. 42864 pfam02926: THUMP domain. The THUMP domain is named after thiouridine synthases, methylases and PSUSs. The THUMP domain consists of about 110 amino acid residues. It is predicted that this domain is an RNA-binding domain that adopts an alpha/beta fold similar to that found in the C-terminal domain of translation initiation factor 3 and ribosomal protein S8. The THUMP domain probably functions by delivering a variety of RNA modification enzymes to their targets. 42865 pfam02927: N-terminal ig-like domain of cellulase. 42866 pfam02928: C5HC2 zinc finger. Predicted zinc finger with eight potential zinc ligand binding residues. This domain is found in Jumonji. This domain may have a DNA binding function. 42867 pfam02929: Beta galactosidase small chain, N terminal domain. This domain is found in the amino-terminal portion of the small chain of dimeric beta-galactosidases EC:3.2.1.23. This domain is also found in single chain beta-galactosidase. 42868 pfam02930: Beta galactosidase small chain, C terminal domain. This domain is found in the carboxy-terminal portion of the small chain of dimeric beta-galactosidases EC:3.2.1.23. This domain is also found in single chain beta-galactosidase. 42869 pfam02931: Neurotransmitter-gated ion-channel ligand binding domain. This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. 42870 pfam02932: Neurotransmitter-gated ion-channel transmembrane region. This family includes the four transmembrane helices that form the ion channel. 
42871 pfam02933: Cell division protein 48 (CDC48), domain 2. This domain has a double psi-beta barrel fold and includes VCP-like ATPase and N-ethylmaleimide sensitive fusion protein N-terminal domains. Both the VAT and NSF N-terminal functional domains consist of two structural domains of which this is at the C-terminus. The VAT-N domain found in AAA ATPases pfam00004 is a 185-residue substrate recognition domain. 42872 pfam02934: PET112 family, N terminal region. 42873 pfam02935: Cytochrome c oxidase subunit VIIc. Cytochrome c oxidase, a 13 sub-unit complex, EC:1.9.3.1 is the terminal oxidase in the mitochondrial electron transport chain. This family is composed of cytochrome c oxidase subunit VIIc. The yeast member of this family is called COX VIII. 42874 pfam02936: Cytochrome c oxidase subunit IV. Cytochrome c oxidase, a 13 sub-unit complex, EC:1.9.3.1 is the terminal oxidase in the mitochondrial electron transport chain. This family is composed of cytochrome c oxidase subunit IV. The Dictyostelium member of this family is called COX VI. 42875 pfam02937: Cytochrome c oxidase subunit VIc. Cytochrome c oxidase, a 13 sub-unit complex, EC:1.9.3.1 is the terminal oxidase in the mitochondrial electron transport chain. This family is composed of cytochrome c oxidase subunit VIc. 42876 pfam02938: GAD domain. This domain is found in some members of the GatB and aspartyl tRNA synthetases. 42877 pfam02939: UcrQ family. The ubiquinol-cytochrome C reductase complex (cytochrome bc1 complex) is a respiratory multienzyme complex. This family represents the 9.5 kDa subunit of the complex. 42878 pfam02940: mRNA capping enzyme, beta chain. The beta chain of mRNA capping enzyme has triphosphatase activity. The function of the capping enzyme also depends on the guanylyltransferase activity conferred by the alpha chain (see pfam01331). 42879 pfam02941: Ferredoxin thioredoxin reductase variable alpha chain. 42880 pfam02942: Influenza B non-structural protein (NS1). 
A specific region of the influenza B virus NS1 protein, which includes part of its effector domain, blocks the covalent linkage of ISG15 to its target proteins both in vitro and in infected cells. Of the several hundred proteins induced by interferon (IFN) alpha/beta, the ubiquitin-like ISG15 protein is one of the most predominant. Influenza A virus employs a different strategy: its NS1 protein does not bind the ISG15 protein, but little or no ISG15 protein is produced during infection. 42881 pfam02943: Ferredoxin thioredoxin reductase catalytic beta chain. 42882 pfam02944: BESS motif. The BESS motif is named after the proteins in which it is found (BEAF, Suvar(3)7 and Stonewall). The motif is 40 amino acid residues long and is composed of two predicted alpha helices. Based on the proteins in which it is found and the presence of conserved positively charged residues it is predicted to be a DNA binding domain. This domain appears to be specific to Drosophila. 42883 pfam02945: Recombination endonuclease VII. 42884 pfam02946: GTF2I-like repeat. This region of sequence similarity is found up to six times in a variety of proteins including GTF2I. It has been suggested that this may be a DNA binding domain. 42885 pfam02947: flt3 ligand. The flt3 ligand is a short chain cytokine with a 4 helical bundle fold. 42886 pfam02948: Amelogenin. Amelogenins play a role in biomineralisation. They seem to regulate the formation of crystallites during the secretory stage of tooth enamel development and are thought to play a major role in the structural organisation and mineralisation of developing enamel. They are found in the extracellular matrix. Mutations in X-chromosomal amelogenin can cause Amelogenesis imperfecta. 42887 pfam02949: 7tm Odorant receptor. This family is composed of 7 transmembrane receptors, that are probably Drosophila odorant receptors. 42888 pfam02950: Conotoxin. Conotoxins are small snail toxins that block ion channels. 
42889 pfam02951: Prokaryotic glutathione synthetase, N-terminal domain. 42890 pfam02952: L-fucose isomerase, C-terminal domain. 42891 pfam02953: Tim10/DDP family zinc finger. Putative zinc binding domain with four conserved cysteine residues. This domain is found in the human disease protein. Members of this family such as Tim9 and Tim10 are involved in mitochondrial protein import. Members of this family seem to be localised to the mitochondrial intermembrane space. 42892 pfam02954: Bacterial regulatory protein, Fis family. 42893 pfam02955: Prokaryotic glutathione synthetase, ATP-binding domain. 42894 pfam02956: TT viral orf 1. TT virus (TTV), isolated initially from a Japanese patient with hepatitis of unknown aetiology, has since been found to infect both healthy and diseased individuals and numerous prevalence studies have raised questions about its role in unexplained hepatitis. ORF1 is a large 750 residue protein. 42895 pfam02957: TT viral ORF2. TT virus (TTV), isolated initially from a Japanese patient with hepatitis of unknown aetiology, has since been found to infect both healthy and diseased individuals and numerous prevalence studies have raised questions about its role in unexplained hepatitis. ORF2 is a 150 residue protein. 42896 pfam02958: Domain of unknown function (DUF227). This family includes a large number of Drosophila proteins of unknown function. The family also includes several C. elegans proteins. The alignment contains many histidines and aspartates that are conserved, suggesting a metal binding and possibly a phosphoesterase function (A Bateman pers. obs.). 42897 pfam02959: HTLV Tax. Human T-cell leukaemia virus type I (HTLV-I) is the etiological agent for adult T-cell leukaemia (ATL), as well as for tropical spastic paraparesis (TSP) and HTLV-I associated myelopathy (HAM). A biological understanding of the involvement of HTLV-I in ATL has focused significantly on the workings of the virally-encoded 40 kDa phospho-oncoprotein, Tax. 
Tax is a transcriptional activator. Its ability to modulate the expression and function of many cellular genes has been reasoned to be a major contributory mechanism explaining HTLV-I-mediated transformation of cells. In activating cellular gene expression, Tax impinges upon several cellular signal-transduction pathways, including those for CREB/ATF and NF-kappaB. 42898 pfam02960: K1 glycoprotein. 42899 pfam02961: Barrier to autointegration factor. The BAF protein has a SAM-domain-like bundle of orthogonally packed alpha-hairpins - one classic and one pseudo helix-hairpin-helix motif. The protein is involved in the prevention of retroviral DNA integration. 42900 pfam02962: 5-carboxymethyl-2-hydroxymuconate isomerase. 42901 pfam02963: Restriction endonuclease EcoRI. 42902 pfam02964: Methane monooxygenase, hydrolase gamma chain. 42903 pfam02965: Vitamin B12 dependent methionine synthase, activation domain. 42904 pfam02966: Mitosis protein DIM1. 42905 pfam02967: Fumarate reductase respiratory complex, transmembrane subunit. 42906 pfam02969: TATA box binding protein associated factor (TAF). TAF proteins adopt a histone-like fold. 42907 pfam02970: Tubulin binding cofactor A. 42908 pfam02971: Formiminotransferase domain. 42909 pfam02972: Phycoerythrin, alpha/beta chain. This family represents the non-globular alpha and beta chain components of phycoerythrin. The structure is a long beta-hairpin and a single alpha-helix. 42910 pfam02973: Sialidase, N-terminal domain. 42911 pfam02974: Protease inhibitor Inh. The Inh inhibitor is secreted into the periplasm where its presumed physiological function is to protect periplasmic proteins against the action of secreted proteases. A range of proteases including A, B and C from E. chrysanthemi, alkaline protease from Pseudomonas aeruginosa and the 50 kDa protease from Serratia marcescens are inhibited. 42912 pfam02975: Methylamine dehydrogenase, L chain. 42913 pfam02976: DNA mismatch repair enzyme MutH. 
42914 pfam02977: Carboxypeptidase A inhibitor. 42915 pfam02978: Signal peptide binding domain. 42916 pfam02979: Nitrile hydratase, alpha chain. 42917 pfam02980: Restriction endonuclease FokI, catalytic domain. 42918 pfam02981: Restriction endonuclease FokI, recognition domain. 42919 pfam02982: Scytalone dehydratase. Scytalone dehydratases are structurally related to the NTF2 family (see pfam02136). 42920 pfam02983: Alpha-lytic protease prodomain. 42921 pfam02984: Cyclin, C-terminal domain. Cyclins regulate cyclin dependent kinases (CDKs). One member is a Uracil-DNA glycosylase that is related to other cyclins. Cyclins contain two domains of similar all-alpha fold, of which this family corresponds to the C-terminal domain. 42922 pfam02985: HEAT repeat. The HEAT repeat family is related to armadillo/beta-catenin-like repeats (see pfam00514). CAUTION: This family does not contain all known HEAT repeats. 42923 pfam02986: Fibronectin binding repeat. The ability of bacteria to bind fibronectin is thought to enable the colonisation of wound tissue and blood clots. The fibronectin binding repeat is found in bacterial fibronectin binding proteins and serum opacity factor. 42924 pfam02987: Late embryogenesis abundant protein. Different types of LEA proteins are expressed at different stages of late embryogenesis in higher plant seed embryos and under conditions of dehydration stress. The function of these proteins is unknown. 42925 pfam02988: Phospholipase A2 inhibitor. 42926 pfam02989: Lyme disease proteins of unknown function. 42927 pfam02990: Endomembrane protein 70. 42928 pfam02991: Microtubule associated protein 1A/1B, light chain 3. Light chain 3 is proposed to function primarily as a subunit of microtubule associated proteins 1A and 1B, and its expression may regulate microtubule binding activity. 42929 pfam02992: Transposase family tnp2. 42930 pfam02993: Minor capsid protein VI. 
This minor capsid protein may act as a link between the external capsid and the internal DNA-protein core. The C-terminal 11 residues may function as a protease cofactor leading to enzyme activation. 42931 pfam02994: L1 transposable element. 42932 pfam02995: Protein of unknown function (DUF229). Members of this family are uncharacterised. They are 500-1200 amino acids in length and share a long region of conservation that probably corresponds to several domains. 42933 pfam02996: Prefoldin subunit. This family comprises several prefoldin subunits. The biogenesis of the cytoskeletal proteins actin and tubulin involves interaction of nascent chains of each of the two proteins with the oligomeric protein prefoldin (PFD) and their subsequent transfer to the cytosolic chaperonin CCT (chaperonin containing TCP-1). Electron microscopy shows that eukaryotic PFD, which has a similar structure to its archaeal counterpart, interacts with unfolded actin along the tips of its projecting arms. In its PFD-bound state, actin seems to acquire a conformation similar to that adopted when it is bound to CCT. 42934 pfam02998: Lentiviral Tat protein. This family contains retroviral transactivating (Tat) proteins from a variety of Lentiviruses. 42935 pfam02999: Borrelia orf-D family. Borrelia burgdorferi supercoiled plasmids encode multicopy tandem open reading frames called Orf-A, Orf-B, Orf-C and Orf-D. This family corresponds to Orf-D. The putative product of this gene has no known function. 42936 pfam03000: NPH3 family. Phototropism of Arabidopsis thaliana seedlings in response to a blue light source is initiated by nonphototropic hypocotyl 1 (NPH1), a light-activated serine-threonine protein kinase. Mutations in NPH3 disrupt early signaling occurring downstream of the NPH1 photoreceptor. The NPH3 gene encodes an NPH1-interacting protein. 
NPH3 is a member of a large protein family, apparently specific to higher plants, and may function as an adapter or scaffold protein to bring together the enzymatic components of an NPH1-activated phosphorelay. 42937 pfam03002: Somatostatin/Cortistatin family. Members of this family are hormones. Somatostatin inhibits the release of somatotropin. Cortistatin is a peptide related to the somatostatins that depresses neuronal electrical activity but, unlike somatostatin, induces low-frequency waves in the cerebral cortex and antagonises the effects of acetylcholine on hippocampal and cortical measures of excitability. 42938 pfam03003: Poxvirus proteins of unknown function. 42939 pfam03004: Plant transposase (Ptta/En/Spm family). Transposase proteins are necessary for efficient DNA transposition. This family includes various plant transposases from the Ptta and En/Spm families. 42940 pfam03005: Arabidopsis proteins of unknown function. 42941 pfam03006: Uncharacterised protein family (Hly-III / UPF0073). Members of this family are integral membrane proteins. This family includes a protein with hemolytic activity from Bacillus cereus. It is not clear if all the members of this family are hemolysins. It has been proposed that YOL002c encodes a Saccharomyces cerevisiae protein that plays a key role in metabolic pathways that regulate lipid and phosphate metabolism. 42942 pfam03007: Uncharacterised protein family (UPF0089). This family of uncharacterised proteins is greatly expanded in Mycobacterium tuberculosis. The most conserved region of the proteins contains conserved histidine and aspartate residues, pointing to a possible metal binding site suggestive of protease activity (Bateman A. pers. obs.). 42943 pfam03008: Archaea bacterial proteins of unknown function. 42944 pfam03009: Glycerophosphoryl diester phosphodiesterase family. E. 
coli has two sequence related isozymes of glycerophosphoryl diester phosphodiesterase (GDPD) - periplasmic and cytosolic. This family also includes agrocinopine synthase, whose similarity to GDPD has been noted. This family appears to have weak but not significant matches to mammalian phospholipase C (pfam00388), which suggests that this family may adopt a TIM barrel fold. 42945 pfam03010: GP4. GP4 is a minor membrane-associated glycoprotein. This family contains envelope protein GP4 from equine arteritis virus. 42946 pfam03011: Plasmodium falciparum erythrocyte membrane protein (PFEMP). PfEMP1 has been identified as the rosetting ligand of the malaria parasite P. falciparum. Rosetting is the adhesion of infected erythrocytes with uninfected erythrocytes in the vasculature of the infected organ, and is associated with severe malaria. PfEMP1 interacts with Complement Receptor One on uninfected erythrocytes to form rosettes. 42947 pfam03012: Phosphoprotein. This family includes the M1 phosphoprotein non-structural RNA polymerase alpha subunit, which is thought to be a component of the active polymerase, and may be involved in template binding. 42948 pfam03013: Pyrimidine dimer DNA glycosylase. Pyrimidine dimer DNA glycosylases excise pyrimidine dimers by hydrolysis of the glycosylic bond of the 5' pyrimidine, followed by the intra-pyrimidine phosphodiester bond. Pyrimidine dimers are the major UV-lesions of DNA. 42949 pfam03014: Structural protein 2. This family represents structural protein 2 of the hepatitis E virus. The high basic amino acid content of this protein has led to the suggestion of a role in viral genomic RNA encapsidation. 42950 pfam03015: Male sterility protein. This family represents the C-terminal region of the male sterility protein in a number of Arabidopsis and Drosophila proteins. A sequence-related jojoba acyl CoA reductase is also included. 42951 pfam03016: Exostosin family. The EXT family is a family of tumour suppressor genes. 
Mutations of EXT1 on 8q24.1, EXT2 on 11p11-13, and EXT3 on 19p have been associated with the autosomal dominant disorder known as hereditary multiple exostoses (HME). This is the most common known skeletal dysplasia. The chromosomal locations of other EXT genes suggest association with other forms of neoplasia. EXT1 and EXT2 have both been shown to encode a heparan sulphate polymerase with both D-glucuronyl (GlcA) and N-acetyl-D-glucosaminoglycan (GlcNAc) transferase activities. The nature of the defect in heparan sulphate biosynthesis in HME is unclear. 42952 pfam03017: TNP1/EN/SPM transposase. 42953 pfam03018: Dirigent-like protein. This family contains a number of proteins which are induced during disease response in plants. Members of this family are involved in lignification. 42954 pfam03019: Furovirus P26. 42955 pfam03020: LEM domain. The LEM domain is 50 residues long and is composed of two parallel alpha helices. This domain is found in inner nuclear membrane proteins. It is called the LEM domain after LAP2, Emerin and Man1. 42956 pfam03021: Influenza C virus M2 protein. Influenza C virus M1 protein is encoded by a spliced mRNA. The unspliced mRNA is also found in small quantities and can encode the protein represented by this family. 42957 pfam03022: Major royal jelly protein. Royal jelly is the food of queen bee larvae, and is responsible for the high reproductive ability of the queen. Major royal jelly proteins make up around 90% of larval jelly proteins. This family also includes the sequence-related yellow protein of Drosophila, which controls pigmentation of the adult cuticle and larval mouth parts. 42958 pfam03023: MviN-like protein. Deletion of the mviN virulence gene in Salmonella enterica serovar Typhimurium greatly reduces virulence in a mouse model of typhoid-like disease. Open reading frames encoding homologues of MviN have since been identified in a variety of bacteria, including pathogens, non-pathogens and plant symbionts. 
In the nitrogen-fixing symbiont Rhizobium tropici, mviN is required for motility. The MviN protein is predicted to be membrane-associated. 42959 pfam03024: Folate receptor family. This family includes the folate receptor which binds to folate and reduced folic acid derivatives and mediates delivery of 5-methyltetrahydrofolate to the interior of cells. These proteins are attached to the membrane by a GPI-anchor. The proteins contain 16 conserved cysteines that form eight disulphide bridges. 42960 pfam03025: Papillomavirus E5. The E5 protein from papillomaviruses is about 80 amino acids long. The proteins contain three regions that are predicted to be transmembrane alpha helices. The function of this protein is unknown. 42961 pfam03026: Influenza C virus M1 protein. This family represents the matrix 1 protein of influenza C virus. The protein is the product of a spliced mRNA. Small quantities of the unspliced mRNA are also found in the cell, encoding the M2 protein (see pfam03021). 42962 pfam03027: Odorant binding protein. This family contains the juvenile hormone binding protein of the tobacco hawkmoth as well as a number of Drosophila proteins of unknown function. Based on the similarity to the hormone binding protein we suggest that the members of this family are odorant binding proteins. 42963 pfam03028: Dynein heavy chain. This family represents the C-terminal region of dynein heavy chain. The chain also contains ATPase activity and microtubule binding ability and acts as a motor for the movement of organelles and vesicles along microtubules. Dynein is also involved in cilia and flagella movement. The dynein subunit consists of at least two heavy chains and a number of intermediate and light chains (see pfam01221). 42964 pfam03029: Conserved hypothetical ATP binding protein. Members of this family are found in a range of archaea and eukaryotes and have hypothesised ATP binding activity. 42965 pfam03030: Inorganic H+ pyrophosphatase. 
The H+ pyrophosphatase is a transmembrane proton pump involved in establishing the H+ electrochemical potential difference between the vacuole lumen and the cell cytosol. Vacuolar-type H(+)-translocating inorganic pyrophosphatases have long been considered to be restricted to plants and to a few species of phototrophic bacteria. However, in recent investigations, these pyrophosphatases have been found in organisms as disparate as thermophilic Archaea and parasitic protists. 42966 pfam03031: NLI interacting factor-like phosphatase. This family contains a number of NLI interacting factor isoforms and also an N-terminal region of RNA polymerase II CTD phosphatase and FCP1 serine phosphatase. This region has been identified as the minimal phosphatase domain. 42967 pfam03032: Brevenin/esculentin/gaegurin/rugosin family. This family contains a number of defence peptides secreted from the skin of amphibians, including the opiate-like dermorphins and deltorphins, and the antimicrobial dermaseptins and temporins. 42968 pfam03033: Glycosyltransferase family 28 N-terminal domain. The glycosyltransferase family 28 includes monogalactosyldiacylglycerol synthase (EC 2.4.1.46) and UDP-N-acetylglucosamine transferase (EC 2.4.1.-). This N-terminal domain contains the acceptor binding site and likely membrane association site. This family also contains a large number of proteins that probably have quite distinct activities. 42969 pfam03034: Phosphatidyl serine synthase. Phosphatidyl serine synthase is also known as serine exchange enzyme. This family represents eukaryotic PSS I and II, which are membrane bound proteins that catalyse the replacement of the head group of a phospholipid (phosphatidylcholine or phosphatidylethanolamine) by L-serine. 42970 pfam03035: Calicivirus putative RNA polymerase/capsid protein. 42971 pfam03036: Perilipin family. 
The perilipin family includes lipid droplet-associated protein (perilipin) and adipose differentiation-related protein (adipophilin). 42972 pfam03037: Kinetoplastid membrane protein 11. Kinetoplastid membrane protein 11 is a major cell surface glycoprotein of the parasite Leishmania donovani. 42973 pfam03038: UL95 family. Members of this family, found in several herpesviruses, include EBV BGLF3 and other UL95 proteins (e.g. HCMV UL95, HVS-1 34, HSV6 U67). Their function is unknown. 42974 pfam03039: Interleukin-12 alpha subunit. Interleukin 12 (IL-12) is a disulphide-bonded heterodimer consisting of a 35kDa alpha subunit and a 40kDa beta subunit. It is involved in the stimulation and maintenance of Th1 cellular immune responses, including the normal host defence against various intracellular pathogens, such as Leishmania, Toxoplasma, measles virus and HIV. IL-12 also has an important role in pathological Th1 responses, such as in inflammatory bowel disease and multiple sclerosis. Suppression of IL-12 activity in such diseases may have therapeutic benefit. On the other hand, administration of recombinant IL-12 may have therapeutic benefit in conditions associated with pathological Th2 responses. 42975 pfam03040: CemA family. Members of this family are probable integral membrane proteins. Their molecular function is unknown. CemA proteins are found in the inner envelope membrane of chloroplasts but not in the thylakoid membrane. A cyanobacterial member of this family has been implicated in CO2 transport, but is probably not a CO2 transporter itself. They are predicted to be haem-binding; however, this has not been proven experimentally. 42976 pfam03041: lef-2. The lef-2 gene (for late expression factor 2) from baculovirus is required for expression of late genes. This gene has been shown to be specifically required for expression from the vp39 and polh promoters. 42977 pfam03042: Birnavirus VP5 protein. Birnaviruses are dsRNA viruses. 
Non-structural protein VP5 is found in RNA segment A. The function of this small viral protein is unknown. The proteins are about 150 amino acids long and contain several conserved histidines and cysteines that might form a zinc binding site (Bateman A pers. obs.). 42978 pfam03043: Herpesvirus UL87 family. Members of this family are functionally uncharacterised. This family groups together EBV BcRF1, HSV-6 U58, HVS-1 24 and HCMV UL87. The proteins range from 575 to 950 amino acids in length. 42979 pfam03044: Herpesvirus UL16/UL94 family. This family groups together HSV-1 UL16, HSV-6 ORF11R, EHV-1 46, HCMV UL94, EBV BDLF2 and VZV 44. UL16 protein may play a role in capsid maturation including DNA packaging/cleavage. In immunofluorescence studies, UL16 was localised to the nucleus of infected cells in areas containing high concentrations of HSV capsid proteins. These nuclear compartments have been described previously as viral assemblons and are distinct from compartments containing replicating DNA. Localisation within assemblons argues for a role of UL16 encoded protein in capsid assembly or maturation. 42980 pfam03045: DAN domain. This domain contains 9 conserved cysteines and is extracellular. Therefore the cysteines may form disulphide bridges. This family of proteins has been termed the DAN family after the first member to be reported. This family includes DAN, Cerberus and Gremlin. The gremlin protein is an antagonist of bone morphogenetic protein signaling. It is postulated that all members of this family antagonise different TGF beta (pfam00019) ligands. Recent work shows that the DAN protein is not an efficient antagonist of BMP-2/4 class signals; however, DAN was able to interact with GDF-5 in a frog embryo assay, suggesting that DAN may regulate signaling by the GDF-5/6/7 class of BMPs in vivo. 42981 pfam03047: COMC family. 
This family consists exclusively of streptococcal competence stimulating peptide precursors, which are generally up to 50 amino acid residues long. In all the members of this family, the leader sequence is cleaved after two conserved glycine residues; thus the leader sequence is of the double-glycine type. Competence stimulating peptides (CSP) are small (less than 25 amino acid residues) cationic peptides. The N-terminal amino acid residue is negatively charged, either glutamate or aspartate. The C-terminal end is positively charged. The third residue is also positively charged: a highly conserved arginine. A few COMC proteins and their precursors (not included in this family) do not fully follow the above description. In particular: the leader sequence in the CSP precursor from Streptococcus sanguis NCTC 7863 is not of the double-glycine type; the CSP from Streptococcus gordonii NCTC 3165 does not have a negatively charged N-terminal residue and has a lysine instead of arginine at the third position. Functionally, CSP act as pheromones, stimulating competence for genetic transformation in streptococci. In streptococci, the (CSP mediated) competence response requires exponential cell growth at a critical density, a relatively simple requirement when compared to the stationary-phase requirement of Haemophilus, or the late-logarithmic-phase requirement of Bacillus. All bacteria induced to competence by a particular CSP are said to belong to the same pherotype, because each CSP is recognised by a specific receptor (the signalling domain of a histidine kinase ComD). Pherotypes are not necessarily species-specific. In addition, an organism may change pherotype. There are two possible mechanisms for pherotype switching: horizontal gene transfer, and accumulation of point mutations. The biological significance of pherotypes and pherotype switching is not definitively determined. 
Pherotype switching occurs frequently enough in naturally competent streptococci to suggest that it may be an important contributor to genetic exchange between different bacterial species. 42982 pfam03048: UL92 family. Members of this family, found in several herpesviruses, include EBV BDLF4, HCMV UL92, HHV8 31, HSV6 U63. Their function is unknown. The N terminus of this protein contains 6 conserved cysteines and histidines that might form a zinc binding domain. 42983 pfam03049: UL79 family. Members of this family are functionally uncharacterised proteins from herpesviruses. This family groups together HSV-6 U52, HVS-1 18 and HCMV UL79. 42984 pfam03050: Transposase IS66 family. Transposase proteins are necessary for efficient DNA transposition. This family includes IS66 from Agrobacterium tumefaciens. 42985 pfam03051: Peptidase C1-like family. This family is closely related to the Peptidase_C1 family pfam00112, containing several prokaryotic and eukaryotic aminopeptidases and bleomycin hydrolases. 42986 pfam03052: Adenoviral protein 52K. The adenoviral protein 52K (named after the earliest known 52kDa members) is a DNA-binding protein. 42987 pfam03053: ORF3b coronavirus protein. Members of this family are non-structural proteins, approximately 250 amino acid residues long. They are found in transmissible gastroenteritis coronavirus (TGEV) and porcine respiratory coronavirus (PRCV) isolates. These proteins are found on the same mRNA as another product, designated ORF3a. While ORF3a/b has been implicated in TGEV and PRCV pathogenesis, its precise role remains unclear. 42988 pfam03054: tRNA methyl transferase. This family represents tRNA(5-methylaminomethyl-2-thiouridine)-methyltransferase which is involved in the biosynthesis of the modified nucleoside 5-methylaminomethyl-2-thiouridine present in the wobble position of some tRNAs. 42989 pfam03055: Retinal pigment epithelial membrane protein. 
This family represents a retinal pigment epithelial membrane receptor which is abundantly expressed in retinal pigment epithelium, and binds plasma retinol binding protein. The family also includes the sequence related neoxanthin cleavage enzyme in plants and lignostilbene-alpha,beta-dioxygenase in bacteria. 42990 pfam03056: Env gp36 protein (HERV/MMTV type). This family includes the GP36 protein from retroviruses such as mouse mammary tumour virus (MMTV) and human endogenous retroviruses (HERVs). The GP36 protein is an envelope protein that has a predicted transmembrane helix at its amino terminus. 42991 pfam03057: Protein of unknown function. This family represents the C-terminal region of a number of C. elegans proteins of unknown function. 42992 pfam03058: Sar8.2 family. Members of this family are found in Solanaceae plants, a taxonomic group (family) that includes pepper and tobacco plant species. Synthesis of these proteins is induced by tobacco mosaic virus (TMV) and salicylic acid; indeed they are thought to be involved in the development of systemic acquired resistance (SAR) after an initial hypersensitive response to microbial infection. SAR is characterized by long-lasting resistance to infection by a wide range of pathogens, extending to plant tissues distant from the initial infection site. 42993 pfam03059: Nicotianamine synthase protein. Nicotianamine synthase EC:2.5.1.43 catalyses the trimerisation of S-adenosylmethionine to yield one molecule of nicotianamine. Nicotianamine has an important role in plant iron uptake mechanisms. Plants adopt two strategies (termed I and II) of iron acquisition. Strategy I is adopted by all higher plants except graminaceous plants, which adopt strategy II. 
In strategy I plants, the role of nicotianamine is not fully determined: possible roles include the formation of more stable complexes with ferrous than with ferric ion, which might serve as a sensor of the physiological status of iron within a plant, or which might be involved in the transport of iron. In strategy II (graminaceous) plants, nicotianamine is the key intermediate (and nicotianamine synthase the key enzyme) in the synthesis of the mugineic family (the only known family in plants) of phytosiderophores. Phytosiderophores are iron chelators whose secretion by the roots is greatly increased in instances of iron deficiency. 42994 pfam03060: 2-nitropropane dioxygenase. Members of this family catalyse the denitrification of a number of nitroalkanes using either FAD or FMN as a cofactor. 42995 pfam03061: Thioesterase superfamily. This family contains a wide variety of enzymes, principally thioesterases. This family includes 4HBT (EC 3.1.2.23) which catalyses the final step in the biosynthesis of 4-hydroxybenzoate from 4-chlorobenzoate in the soil dwelling microbe Pseudomonas CBS-3. This family includes various cytosolic long-chain acyl-CoA thioester hydrolases. Long-chain acyl-CoA hydrolases hydrolyse palmitoyl-CoA to CoA and palmitate; they also catalyse the hydrolysis of other long chain fatty acyl-CoA thioesters. 42996 pfam03062: MBOAT family. The MBOAT (membrane bound O-acyl transferase) family of membrane proteins contains a variety of acyltransferase enzymes. A conserved histidine has been suggested to be the active site residue. 42997 pfam03063: Prismane/CO dehydrogenase family. This family includes both hybrid-cluster proteins and the beta chain of carbon monoxide dehydrogenase. The hybrid-cluster proteins contain two Fe/S centres - a [4Fe-4S] cubane cluster, and a hybrid [4Fe-2S-2O] cluster. The physiological role of this protein is as yet unknown, although a role in nitrate/nitrite respiration has been suggested. 
The prismane protein from Escherichia coli was shown to contain hydroxylamine reductase activity (NH2OH + 2e + 2 H+ -> NH3 + H2O). This activity is rather low. Hydroxylamine reductase activity was also found in CO-dehydrogenase in which the active site Ni was replaced by Fe. The CO dehydrogenase contains a Ni-3Fe-2S-3O centre. 42998 pfam03064: HSV U79 / HCMV P34. This family represents herpes virus protein U79 and cytomegalovirus early phosphoprotein P34 (UL112). 42999 pfam03065: Glycosyl hydrolase family 57. This family includes alpha-amylase (EC:3.2.1.1), 4-alpha-glucanotransferase (EC:2.4.1.-) and amylopullulanase enzymes. 43000 pfam03066: Nucleoplasmin. Nucleoplasmins are also known as chromatin decondensation proteins. They bind to core histones and transfer DNA to them in a reaction that requires ATP. This is thought to play a role in the assembly of regular nucleosomal arrays. 43001 pfam03067: Chitin binding domain. This domain is found associated with a wide variety of cellulose binding domains. This domain, however, is a chitin binding domain. It is found in isolation in baculoviral spheroidins and spindolins, proteins of unknown function. 43002 pfam03068: Protein-arginine deiminase (PAD). Members of this family are found in mammals. In the presence of calcium ions, PAD enzymes EC:3.5.3.15 catalyse the post-translational modification reaction responsible for the formation of citrulline residues: Protein L-arginine + H2O <=> Protein L-citrulline + NH3. Several types are recognised (and included in the family) on the basis of molecular mass, substrate specificity, and tissue localisation. The expression of type I PAD is known to be under the control of oestrogen. 43003 pfam03069: Acetamidase/Formamidase family. This family includes amidohydrolases of formamide EC:3.5.1.49 and acetamide. One member forms a homotrimer, suggesting that all members of this family may also do so. 43004 pfam03070: TENA/THI-4 family. 
Members of this family are found in all the three major phyla of life: archaebacteria, eubacteria, and eukaryotes. In Bacillus subtilis, TENA is one of a number of proteins that enhance the expression of extracellular enzymes, such as alkaline protease, neutral protease and levansucrase. The THI-4 protein, which is involved in thiamine biosynthesis, is also a member of this family. The C-terminal part of these proteins consistently shows significant sequence similarity to TENA proteins. This similarity was first noted with the Neurospora crassa THI-4. The exact molecular function of members of this family is uncertain. 43005 pfam03071: GNT-I family. Alpha-1,3-mannosyl-glycoprotein beta-1,2-N-acetylglucosaminyltransferase (GNT-I, GLCNAC-T I) EC:2.4.1.101 transfers N-acetyl-D-glucosamine from UDP to high-mannose glycoprotein N-oligosaccharide. This is an essential step in the synthesis of complex or hybrid-type N-linked oligosaccharides. The enzyme is an integral membrane protein localised to the Golgi apparatus, and is probably distributed in all tissues. The catalytic domain is located at the C-terminus. 43006 pfam03072: MG032/MG096/MG288 family 1. This family consists entirely of mycoplasmal proteins. Their function is unknown. Another related family, pfam03086, also consists entirely of mycoplasmal proteins of the MG032/MG096/MG288 family. Some proteins are included in both families, but differ in the aligned residues. 43007 pfam03073: TspO/MBR family. Tryptophan-rich sensory protein (TspO) is an integral membrane protein that acts as a negative regulator of the expression of specific photosynthesis genes in response to oxygen/light. It is involved in the efflux of porphyrin intermediates from the cell. This reduces the activity of coproporphyrinogen III oxidase, which is thought to lead to the accumulation of a putative repressor molecule that inhibits the expression of specific photosynthesis genes. 
Several conserved aromatic residues are necessary for TspO function: they are thought to be involved in binding porphyrin intermediates. The rat mitochondrial peripheral benzodiazepine receptor (MBR) was shown to not only retain its structure within a bacterial outer membrane, but also to be able to functionally substitute for TspO in TspO- mutants, and to act in a similar manner to TspO in its in situ location: the outer mitochondrial membrane. The biological significance of MBR remains unclear, however. It is thought to be involved in a variety of cellular functions, including cholesterol transport in steroidogenic tissues. 43008 pfam03074: Glutamate-cysteine ligase. This family represents the catalytic subunit of glutamate-cysteine ligase (E.C. 6.3.2.2), also known as gamma-glutamylcysteine synthetase (GCS). This enzyme catalyses the rate limiting step in the biosynthesis of glutathione. The eukaryotic enzyme is a dimer of a heavy chain and a light chain with all the catalytic activity exhibited by the heavy chain (this family). 43009 pfam03076: Equine arteritis virus GP3. This protein is encoded by ORF3 of equine arteritis virus. The function is unknown. 43010 pfam03077: Putative vacuolating cytotoxin. This family contains a number of Helicobacter outer membrane proteins with multiple copies of this small conserved region. 43011 pfam03078: ATHILA ORF-1 family. ATHILA is a group of Arabidopsis thaliana retrotransposons belonging to the Ty3/gypsy family of the long terminal repeat (LTR) class of eukaryotic retrotransposons. The central region of ATHILA retrotransposons contains two or three open reading frames (ORFs). This family represents the ORF1 product. The function of ORF1 is unknown. 43012 pfam03079: ARD/ARD' family. The two acireductone dioxygenase enzymes (ARD and ARD', previously known as E-2 and E-2') from Klebsiella pneumoniae share the same amino acid sequence, but bind different metal ions: ARD binds Ni2+, ARD' binds Fe2+. 
ARD and ARD' can be experimentally interconverted by removal of the bound metal ion and reconstitution with the appropriate metal ion. The two enzymes share the same substrate, 1,2-dihydroxy-3-keto-5-(methylthio)pentene, but yield different products. ARD' yields the alpha-keto precursor of methionine (and formate), thus forming part of the ubiquitous methionine salvage pathway that converts 5'-methylthioadenosine (MTA) to methionine. This pathway is responsible for the tight control of the concentration of MTA, which is a powerful inhibitor of polyamine biosynthesis and transmethylation reactions. ARD yields methylthiopropanoate, carbon monoxide and formate, and thus prevents the conversion of MTA to methionine. The role of the ARD catalysed reaction is unclear: methylthiopropanoate is cytotoxic, and carbon monoxide can activate guanylyl cyclase, leading to increased intracellular cGMP levels. This family also contains other members, whose functions are not well characterised. 43013 pfam03080: Arabidopsis proteins of unknown function. This family contains a number of Arabidopsis proteins, a small number of which are putative peptidases. 43014 pfam03081: Exo70 exocyst complex subunit. The Exo70 protein forms one subunit of the exocyst complex. First discovered in S. cerevisiae, Exo70 and other exocyst proteins have been observed in several other eukaryotes, including humans. In S. cerevisiae, the exocyst complex is involved in the late stages of exocytosis, and is localised at the tip of the bud, the major site of exocytosis in yeast. Exo70 interacts with the Rho3 GTPase. This interaction mediates one of the three known functions of Rho3 in cell polarity: vesicle docking and fusion with the plasma membrane (the other two functions are regulation of actin polarity and transport of exocytic vesicles from the mother cell to the bud). 
In humans, the functions of Exo70 and the exocyst complex are less well characterised: Exo70 is expressed in several tissues and is thought to also be involved in exocytosis. 43015 pfam03082: Male accessory gland secretory protein. The accessory gland of male insects is a genital tissue that secretes many components of the ejaculatory fluid, some of which affect the female's receptivity to courtship and her rate of oviposition. This protein is expressed exclusively in the male accessory glands of adult Drosophila melanogaster. The proteins are transferred to the female fly during copulation and are rapidly altered in the female genital tract. 43016 pfam03083: MtN3/saliva family. This family includes proteins such as Drosophila saliva, MtN3 involved in root nodule development and a protein involved in activation and expression of recombination activation genes (RAGs). Although the molecular function of these proteins is unknown, they are almost certainly transmembrane proteins. This family contains a region of two transmembrane helices that is found in two copies in most members of the family. 43017 pfam03084: Reoviral Sigma1/Sigma2 family. Reoviruses are double-stranded RNA viruses. They lack a membrane envelope and their capsid is organised in two concentric icosahedral layers: an inner core and an outer capsid layer. The sigma1 protein is found in the outer capsid, and the sigma2 protein is found in the core. There are four other kinds of protein (besides sigma2) in the core, termed lambda 1-3 and mu2. Interactions between sigma2 and lambda 1 and lambda 3 are thought to initiate core formation, followed by mu2 and lambda2. Sigma1 is a trimeric protein, and is positioned at the 12 vertices of the icosahedral outer capsid layer. Its N-terminal fibrous tail, arranged as a triple coiled coil, anchors it in the virion, and a C-terminal globular head interacts with the cellular receptor. These two parts form by separate trimerisation events. 
The N-terminal fibrous tail forms on the polysome, without the involvement of ATP or chaperones. The post-translational assembly of the C-terminal globular head involves the chaperone activity of Hsp90, which is associated with phosphorylation of Hsp90 during the process. Sigma1 protein acts as a cell attachment protein, and determines viral virulence, pathways of spread, and tropism. Junctional adhesion molecule has been identified as a receptor for sigma1. In type 3 reoviruses, a small region, predicted to form a beta sheet, in the N-terminal tail was found to bind target cell surface sialic acid (i.e. sialic acid acts as a co-receptor) and promote apoptosis. The sigma1 protein also binds to the lambda2 core protein. 43018 pfam03085: Rhoptry-associated protein 1 (RAP-1). Members of this family are found in Babesia species. Though not in this Pfam family, rhoptry-associated proteins are also found in Plasmodium falciparum. Indeed, animal infection with Babesia may produce a pattern similar to human malaria. Rhoptry organelles form part of the apical complex in apicomplexan parasites. Rhoptry-associated proteins are antigenic, and generate partially protective immune responses in infected mammals. Thus RAPs are among the targeted vaccine antigens for babesial (and malarial) parasites. However, RAP-1 proteins are encoded by a multigene family; thus RAP-1 proteins are polymorphic, with B and T cell epitopes that are conserved among strains, but not across species. Antibodies to Babesia RAP-1 may also be helpful in the serological detection of Babesia infections. 43019 pfam03086: MG032/MG096/MG288 family 2. This family consists entirely of mycoplasmal proteins. Their function is unknown. Another related family, pfam03072, also consists entirely of mycoplasmal proteins of the MG032/MG096/MG288 family. Some proteins are included in both families, but of course differ in the aligned residues. 43020 pfam03087: Arabidopsis protein of unknown function. 
This family represents a number of Arabidopsis proteins. Their functions are unknown. 43021 pfam03088: Strictosidine synthase. Strictosidine synthase (E.C. 4.3.3.2) is a key enzyme in alkaloid biosynthesis. It catalyses the condensation of tryptamine with secologanin to form strictosidine. 43022 pfam03089: Recombination activating protein 2. V-D-J recombination is the combinatorial process by which the huge range of immunoglobulin and T cell binding specificity is generated from a limited amount of genetic material. This process is synergistically activated by RAG1 and RAG2 in developing lymphocytes. Defects in RAG2 in humans are a cause of severe combined immunodeficiency B cell negative and Omenn syndrome. 43023 pfam03090: Replicase family. This is a family of bacterial plasmid DNA replication initiator proteins. Pfam: PF01651 is a similar family. These RepA proteins exist as monomers and dimers in equilibrium: monomers bind directly to repeated DNA sequences and thus activate replication; dimers repress repA transcription by binding an inversely repeated DNA operator. Dimer dissociation can occur spontaneously or may be mediated by Hsp70 chaperones. 43024 pfam03091: CutA1 divalent ion tolerance protein. Several gene loci with a possible involvement in cellular tolerance to copper have been identified. One such locus in eubacteria and archaebacteria, cutA, is thought to be involved in cellular tolerance to a wide variety of divalent cations other than copper. The cutA locus consists of two operons, of one and two genes. The CutA1 protein is a cytoplasmic protein, encoded by the single-gene operon and has been linked to divalent cation tolerance. It has no recognised structural motifs. This family also contains putative proteins from eukaryotes (human and Drosophila). 43025 pfam03092: BT1 family. Members of this family are transmembrane proteins. Several are Leishmania putative proteins that are thought to be pteridine transporters. 
One such protein, previously termed (and still annotated as) ORFG, was shown using null mutants to encode a biopterin transport protein, and was subsequently renamed BT1. The significant similarity of ORFG/BT1 to Trypanosoma brucei ESAG10 (a putative transmembrane protein and another member of this family) was previously noted. This family also contains five putative Arabidopsis thaliana proteins of unknown function. In addition, it also contains two predicted prokaryotic proteins (from the cyanobacteria Synechocystis and Synechococcus). 43026 pfam03093: Nucleoporin FG repeat family. FG repeats are found in diverse nucleoporins. It has been suggested that these repeats mediate interactions with substrates to be transported through the nuclear pore. These repeats also appear to be present in a family of helicases. Due to the shortness of this repeat it is not clear if the similarity to the helicases is biologically meaningful. 43027 pfam03094: Mlo family. A family of plant integral membrane proteins, first discovered in barley. Mutants lacking wild-type Mlo proteins show broad spectrum resistance to the powdery mildew fungus, and dysregulated cell death control, with spontaneous cell death in response to developmental or abiotic stimuli. Thus wild-type Mlo proteins are thought to be inhibitors of cell death whose deficiency lowers the threshold required to trigger the cascade of events that result in plant cell death. Mlo proteins are localised in the plasma membrane and possess seven transmembrane regions; thus the Mlo family is the only major higher plant family to possess 7 transmembrane domains. It has been suggested that Mlo proteins function as G-protein coupled receptors in plants; however the molecular and biological functions of Mlo proteins remain to be fully determined. 43028 pfam03095: Phosphotyrosyl phosphate activator (PTPA) protein. 
Phosphotyrosyl phosphatase activator (PTPA) proteins stimulate the phosphotyrosyl phosphatase (PTPase) activity of the dimeric form of protein phosphatase 2A (PP2A). PTPase activity in PP2A (in vitro) is relatively low when compared to the better recognised phosphoserine/threonine protein phosphatase activity. The specific biological role of PTPA is unknown. Basal expression of PTPA depends on the activity of a ubiquitous transcription factor, Yin Yang 1 (YY1). The tumour suppressor protein p53 can inhibit PTPA expression through an unknown mechanism that negatively controls YY1. 43029 pfam03096: Ndr family. This family consists of proteins from different gene families: Ndr1/RTP/Drg1, Ndr2, and Ndr3. Their similarity was previously noted. The precise molecular and cellular function of members of this family is still unknown. Yet, they are known to be involved in cellular differentiation events. The Ndr1 group was the first to be discovered. Their expression is repressed by the proto-oncogenes N-myc and c-myc, and in line with this observation, Ndr1 protein expression is down-regulated in neoplastic cells, and is reactivated when differentiation is induced by chemicals such as retinoic acid. Ndr2 and Ndr3 expression is not under the control of N-myc or c-myc. Ndr1 expression is also activated by several chemicals: tunicamycin and homocysteine induce Ndr1 in human umbilical endothelial cells; nickel induces Ndr1 in several cell types. Members of this family are found in a wide variety of multicellular eukaryotes, including an Ndr1 type protein in Helianthus annuus (sunflower), known as Sf21. Interestingly, the highest scoring matches in the noise are all alpha/beta hydrolases pfam00561, suggesting that this family may have an enzymatic function (Bateman A pers. obs.). 43030 pfam03097: BRO1-like domain. This functionally uncharacterized domain is found in a number of signal transduction proteins, including Rhophilin and BRO1. 43031 pfam03098: Animal haem peroxidase. 
43032 pfam03099: Biotin/lipoate A/B protein ligase family. This family includes biotin protein ligase, lipoate-protein ligase A and B. Biotin is covalently attached at the active site of certain enzymes that transfer carbon dioxide from bicarbonate to organic acids to form cellular metabolites. Biotin protein ligase (BPL) is the enzyme responsible for attaching biotin to a specific lysine at the active site of biotin enzymes. Each organism probably has only one BPL. Biotin attachment is a two step reaction that results in the formation of an amide linkage between the carboxyl group of biotin and the epsilon-amino group of the modified lysine. Lipoate-protein ligase A (LPLA) catalyses the formation of an amide linkage between lipoic acid and a specific lysine residue in lipoate dependent enzymes. 43033 pfam03100: CcmE. CcmE is the product of one of a cluster of Ccm genes that are necessary for cytochrome c biosynthesis in eubacteria. Expression of these proteins is induced when the organisms are grown under anaerobic conditions with nitrate or nitrite as the final electron acceptor. 43034 pfam03101: FAR1 family. This family seems to be plant specific. It has been shown to possibly have a role in phytochrome signaling. 43035 pfam03102: NeuB family. NeuB is the prokaryotic N-acetylneuraminic acid (Neu5Ac) synthase. It catalyses the direct formation of Neu5Ac (the most common sialic acid) by condensation of phosphoenolpyruvate (PEP) and N-acetylmannosamine (ManNAc). This reaction has only been observed in prokaryotes; eukaryotes synthesise the 9-phosphate form, Neu5Ac-9-P, and utilise ManNAc-6-P instead of ManNAc. Such eukaryotic enzymes are not present in this family. This family also contains SpsE spore coat polysaccharide biosynthesis proteins. 43036 pfam03103: Domain of unknown function (DUF243). This family of uncharacterised proteins is only found in fly proteins. It is found associated with YLP motifs pfam02757 in some proteins. 43038 pfam03105: SPX domain. 
We have named this region the SPX domain after SYG1, Pho81 and XPR1. This 180-residue domain is found at the amino terminus of a variety of proteins. In the yeast protein SYG1, the N-terminus directly binds to the G-protein beta subunit and inhibits transduction of the mating pheromone signal. This finding suggests that all the members of this family are involved in G-protein associated signal transduction. The N-termini of several proteins involved in the regulation of phosphate transport, including the putative phosphate level sensors PHO81 from Saccharomyces cerevisiae and NUC-2 from Neurospora crassa, are also members of this family. NUC-2 contains several ankyrin repeats pfam00023. Several members of this family are annotated as XPR1 proteins: the xenotropic and polytropic retrovirus receptor confers susceptibility to infection with murine leukaemia viruses (MLV). The similarity between SYG1, phosphate regulators and XPR1 sequences has been previously noted, as has the additional similarity to several predicted proteins, of unknown function, from Drosophila melanogaster, Arabidopsis thaliana, Caenorhabditis elegans, Schizosaccharomyces pombe, and Saccharomyces cerevisiae. In addition, given the similarities between XPR1 and SYG1 and phosphate regulatory proteins, it has been proposed that XPR1 might be involved in G-protein associated signal transduction and may itself function as a phosphate sensor. 43039 pfam03106: WRKY DNA-binding domain. 43040 pfam03107: DC1 domain. This short domain is rich in cysteines and histidines. The pattern of conservation is similar to that found in pfam00130, therefore we have termed this domain DC1 for divergent C1 domain. This domain probably also binds to two zinc ions. The function of proteins with this domain is uncertain, however this domain may bind to molecules such as diacylglycerol (A Bateman pers. obs.). This family is found in plant proteins. 43041 pfam03108: MuDR family transposase. 
This region is found in plant proteins that are presumed to be the transposases for Mutator transposable elements. These transposons contain two ORFs. The molecular function of this region is unknown. 43042 pfam03109: ABC1 family. This family includes ABC1 from yeast and AarF from E. coli. These proteins have a nuclear or mitochondrial subcellular location in eukaryotes. The exact molecular functions of these proteins are not clear; however, yeast ABC1 suppresses a cytochrome b mRNA translation defect and is essential for the electron transfer in the bc1 complex, and E. coli AarF is required for ubiquinone production. It has been suggested that members of the ABC1 family are novel chaperonins. These proteins are unrelated to the ABC transporter proteins. 43043 pfam03110: SBP domain. SBP domains (for SQUAMOSA-pROMOTER BINDING PROTEIN) are found in plant proteins. It is a sequence specific DNA-binding domain. Members of this family probably function as transcription factors involved in the control of early flower development. The domain contains 10 conserved cysteine and histidine residues that probably are zinc ligands. 43044 pfam03111: E5R poxvirus protein family. This family of poxvirus proteins is found in cytoplasmic sites of viral DNA replication. However, its function is unknown. 43045 pfam03112: Uncharacterized protein family (ORF7) DUF. Several members of this family are Borrelia burgdorferi plasmid proteins of uncharacterized function. 43046 pfam03113: Respiratory syncytial virus non-structural protein NS2. The molecular structure and function of the NS2 protein are not known. However, mutants lacking NS2 grow at slower rates when compared to the wild-type. Nevertheless, NS2 is not essential for viral replication. 43047 pfam03114: BAR domain. The BAR domain is found in amphiphysin and clathrin binding protein. However the function of this domain is unknown. 43048 pfam03115: Astrovirus capsid protein precursor. 
This product is encoded by astrovirus ORF2, one of the three astrovirus ORFs (1a, 1b, 2). The 87kD precursor protein undergoes an intracellular cleavage to form a 79kD protein. Subsequently, extracellular trypsin cleavage yields the three proteins forming the infectious virion. 43049 pfam03116: NQR2, RnfD, RnfE family. This family of bacterial proteins includes a sodium-translocating NADH-ubiquinone oxidoreductase (i.e. a respiration linked sodium pump). In Vibrio cholerae, it negatively regulates the expression of virulence factors through inhibiting (by an unknown mechanism) the transcription of the transcriptional activator ToxT. The family also includes proteins involved in nitrogen fixation, RnfD and RnfE. The similarity of these proteins to NADH-ubiquinone oxidoreductases was previously noted. 43050 pfam03117: UL49 family. Members of this family, found in several herpesviruses, include EBV BFRF2 and other UL49 proteins (e.g. HCMVA UL49, HSV6 U33). There are eight conserved cysteine residues in this alignment, all lying towards the C-terminus. Their function is unknown. 43051 pfam03118: Bacterial RNA polymerase, alpha chain C terminal domain. The alpha subunit of RNA polymerase consists of two independently folded domains, referred to as amino-terminal and carboxyl-terminal domains. The amino-terminal domain is involved in the interaction with the other subunits of the RNA polymerase. The carboxyl-terminal domain interacts with the DNA and activators. The amino acid sequence of the alpha subunit is conserved in prokaryotic and chloroplast RNA polymerases. There are three regions of particularly strong conservation, two in the amino-terminal and one in the carboxyl-terminal. 43052 pfam03119: NAD-dependent DNA ligase C4 zinc finger domain. DNA ligases catalyse the crucial step of joining the breaks in duplex DNA during DNA replication, repair and recombination, utilising either ATP or NAD(+) as a cofactor. 
This family is a small zinc binding motif that is presumably DNA binding. It is found only in NAD dependent DNA ligases. 43053 pfam03120: NAD-dependent DNA ligase OB-fold domain. DNA ligases catalyse the crucial step of joining the breaks in duplex DNA during DNA replication, repair and recombination, utilising either ATP or NAD(+) as a cofactor. This family is a small domain found after the adenylation domain pfam01653 in NAD dependent ligases. OB-fold domains generally are involved in nucleic acid binding. 43054 pfam03121: Herpesviridae UL52/UL70 DNA primase. Herpes simplex virus type 1 DNA replication in host cells is known to be mediated by seven viral-encoded proteins, three of which form a heterotrimeric DNA helicase-primase complex. This complex consists of UL5, UL8, and UL52 subunits. Heterodimers consisting of UL5 and UL52 have been shown to retain both helicase and primase activities. Nevertheless, UL8 is still essential for replication: though it lacks any DNA binding or catalytic activities, it is involved in the transport of UL5-UL52 and it also interacts with other replication proteins. The molecular mechanisms of the UL5-UL52 catalytic activities are not known. While UL5 is associated with DNA helicase activity and UL52 with DNA primase activity, the helicase activity requires the interaction of UL5 and UL52. It is not known if the primase activity can be maintained by UL52 alone. The region encompassed by residues 610-636 of HSV1 UL52 is thought to contain a divalent metal cation binding motif. Indeed, this region contains several aspartate and glutamate residues that might be involved in divalent cation binding. The biological significance of UL52-UL8 interaction is not known. Yeast two-hybrid analysis together with immunoprecipitation experiments have shown that the HSV1 UL52 region between residues 366-914 is essential for this interaction, while the first 349 N-terminal residues are dispensable. 
This family also includes protein UL70 from cytomegalovirus (CMV, a subgroup of the Herpesviridae) strains, which, by analogy with UL52, is thought to have DNA primase activity. Indeed, CMV strains also possess a DNA helicase-primase complex, the other subunits being protein UL105 (with known similarity to HSV1 UL5) and protein UL102. 43055 pfam03122: Herpes virus major capsid protein. This family represents the major capsid protein (MCP) of herpes viruses. The capsid shell consists of 150 MCP hexamers and 12 MCP pentamers. One pentamer is found at each of the 12 apices of the icosahedral shell, and the hexamers form the edges and 20 faces. 43056 pfam03123: CAT RNA binding domain. This RNA binding domain is found at the amino terminus of a family of transcriptional antiterminator proteins. This domain has been called the CAT (Co-AntiTerminator) domain. This domain forms a dimer in the known structure. Transcriptional antiterminators of the BglG/SacY family are regulatory proteins that mediate the induction of sugar metabolising operons in Gram-positive and Gram-negative bacteria. Upon activation, these proteins bind to specific targets in nascent mRNAs, thereby preventing abortive dissociation of the RNA polymerase from the DNA template. 43057 pfam03124: EXS family. We have named this region the EXS family after ERD1, XPR1, and SYG1. This family includes C-terminal portions from the SYG1 G-protein associated signal transduction protein from Saccharomyces cerevisiae, and sequences that are thought to be murine leukaemia virus (MLV) receptors (XPR1). N-terminal portions from these proteins are aligned in the SPX pfam03105 family. The previously noted similarity between SYG1 and MLV receptors over their whole sequences is thus borne out in pfam03105 and this family. While the N-termini aligned in pfam03105 are thought to be involved in signal transduction, the role of the C-terminal sequences aligned in this family is not known. 
This region of similarity contains several predicted transmembrane helices. This family also includes the ERD1 (ERD: ER retention defective) yeast proteins. ERD1 proteins are involved in the localisation of endogenous endoplasmic reticulum (ER) proteins. erd1 null mutants secrete such proteins even though they possess the C-terminal HDEL ER lumen localisation label sequence. In addition, null mutants also exhibit defects in the Golgi-dependent processing of several glycoproteins, which led to the suggestion that the sorting of luminal ER proteins actually occurs in the Golgi, with subsequent return of these proteins to the ER via `salvage' vesicles. 43058 pfam03125: C. elegans Sre G protein-coupled chemoreceptor. Caenorhabditis elegans Sre proteins are candidate chemosensory receptors. There are four main recognised groups of such receptors: Odr-10, Sra, Sro, and Srg. Sre (this family), Sra pfam02117 and Srb pfam02175 comprise the Sra group. All of the above receptors are thought to be G protein-coupled seven transmembrane domain proteins. The existence of several different chemosensory receptors underlies the fact that in spite of having only 20-30 chemosensory neurones, C. elegans detects hundreds of different chemicals, with the ability to discern individual chemicals among combinations. 43059 pfam03126: Plus-3 domain. This domain is about 90 residues in length and is often found associated with the pfam02213 domain. The function of this domain is uncertain. It is possible that this domain is involved in DNA binding as it has three conserved positively charged residues, hence this domain has been named the plus-3 domain. It is found in yeast Rtf1 which may be a transcription elongation factor. 43060 pfam03127: GAT domain. The GAT domain is responsible for binding of GGA proteins to several members of the ARF family including ARF1 and ARF3. The GAT domain stabilises membrane bound ARF1 in its GTP bound state, by interfering with GAP proteins. 
43061 pfam03128: CXCXC repeat. This repeat contains the conserved pattern CXCXC where X can be any amino acid. The repeat is found in up to five copies in Vascular endothelial growth factor C. In the salivary glands of the dipteran Chironomus tentans, a specific messenger ribonucleoprotein (mRNP) particle, the Balbiani ring (BR) granule, can be visualised during its assembly on the gene and during its nucleocytoplasmic transport. This repeat is found in over 70 copies in the Balbiani ring protein 3. It is also found in some silk proteins. 43062 pfam03129: Anticodon binding domain. This domain is found in histidyl, glycyl, threonyl and prolyl tRNA synthetases; it is probably the anticodon binding domain. 43063 pfam03130: PBS lyase HEAT-like repeat. This family contains a short bi-helical repeat that is related to pfam02985. Cyanobacteria and red algae harvest light energy using macromolecular complexes known as phycobilisomes (PBS), peripherally attached to the photosynthetic membrane. The major components of PBS are the phycobiliproteins. These heterodimeric proteins are covalently attached to phycobilins: open-chain tetrapyrrole chromophores, which function as the photosynthetic light-harvesting pigments. Phycobiliproteins differ in sequence and in the nature and number of attached phycobilins to each of their subunits. This family includes the lyase enzymes that specifically attach particular phycobilins to apophycobiliprotein subunits. The most comprehensively studied of these is the CpcE/F lyase, which attaches phycocyanobilin (PCB) to the alpha subunit of apophycocyanin. Similarly, MpeU/V attaches phycoerythrobilin to phycoerythrin II, while CpeY/Z is thought to be involved in phycoerythrobilin (PEB) attachment to phycoerythrin (PE) I (PEs I and II differ in sequence and in the number of attached molecules of PEB: PE I has five, PE II has six). All the reactions of the above lyases involve an apoprotein cysteine SH addition to a terminal delta 3,3'-double bond. 
Such a reaction is not possible in the case of phycoviolobilin (PVB), the phycobilin of alpha-phycoerythrocyanin (alpha-PEC). It is thought that in this case, PCB, not PVB, is first added to apo-alpha-PEC, and is then isomerised to PVB. The addition reaction has been shown to occur in the presence of either of the components of alpha-PEC-PVB lyase PecE or PecF (or both). The isomerisation reaction occurs only when both PecE and PecF components are present, i.e. the PecE/F phycobiliprotein lyase is also a phycobilin isomerase. Another member of this family is the NblB protein, whose similarity to the phycobiliprotein lyases was previously noted. This constitutively expressed protein is not known to have any lyase activity. It is thought to be involved in the coordination of PBS degradation with environmental nutrient limitation. It has been suggested that the similarity of NblB to the phycobiliprotein lyases is due to the ability to bind tetrapyrrole phycobilins via the common repeated motif. 43064 pfam03131: bZIP Maf transcription factor. Maf transcription factors contain a conserved basic region leucine zipper (bZIP) domain, which mediates their dimerisation and DNA binding property. Thus, this family is probably related to pfam00170. 43065 pfam03133: Tubulin-tyrosine ligase family. Tubulins and microtubules are subjected to several post-translational modifications of which the reversible detyrosination/tyrosination of the carboxy-terminal end of most alpha-tubulins has been extensively analysed. This modification cycle involves a specific carboxypeptidase and the activity of the tubulin-tyrosine ligase (TTL). The true physiological function of TTL has so far not been established. Tubulin-tyrosine ligase (TTL) catalyses the ATP-dependent post-translational addition of a tyrosine to the carboxy terminal end of detyrosinated alpha-tubulin. In normally cycling cells, the tyrosinated form of tubulin predominates. 
However, in breast cancer cells, the detyrosinated form frequently predominates, with a correlation to tumour aggressiveness. On the other hand, 3-nitrotyrosine has been shown to be incorporated, by TTL, into the carboxy terminal end of detyrosinated alpha-tubulin. This reaction is not reversible by the carboxypeptidase enzyme. Cells cultured in 3-nitrotyrosine rich medium showed evidence of altered microtubule structure and function, including altered cell morphology, epithelial barrier dysfunction, and apoptosis. 43066 pfam03134: TB2/DP1, HVA22 family. This family includes members from a wide variety of eukaryotes. It includes the TB2/DP1 (deleted in polyposis) protein, which in humans is deleted in severe forms of familial adenomatous polyposis, an autosomal dominant oncological inherited disease. The family also includes the plant protein of known similarity to TB2/DP1, the HVA22 abscisic acid-induced protein, which is thought to be a regulatory protein. 43067 pfam03135: CagE, TrbE, VirB family, component of type IV transporter system. This family includes the Helicobacter pylori protein CagE, which together with other proteins from the cag pathogenicity island (PAI), encodes a type IV transporter secretion system. The precise role of CagE is not known, but studies in animal models have shown that it is essential for pathogenesis in Helicobacter pylori induced gastritis and peptic ulceration. Indeed, the expression of the cag PAI has been shown to be essential for stimulating human gastric epithelial cell apoptosis in vitro. Similar type IV transport systems are also found in other bacteria. This family includes the TrbE and VirB proteins from the respective trb and Vir conjugal transfer systems in Agrobacterium tumefaciens. Homologues of VirB proteins from other species are also members of this family, e.g. VirB from Brucella suis. 43068 pfam03136: Putative proteasome component. 
This family of Actinomycetales proteins includes members that may be involved in the 20S proteasome complex of these bacteria. Their precise function is unknown. 43069 pfam03137: Organic Anion Transporter Polypeptide (OATP) family. This family consists of several eukaryotic Organic-Anion-Transporting Polypeptides (OATPs). Several have been identified mostly in human and rat. Different OATPs vary in tissue distribution and substrate specificity. Since the numbering of different OATPs in particular species was based originally on the order of discovery, similarly numbered OATPs in humans and rats did not necessarily correspond in function, tissue distribution and substrate specificity (in spite of the name, some OATPs also transport organic cations and neutral molecules). Thus, Tamai et al. initiated the current scheme of using digits for rat OATPs and letters for human ones. Prostaglandin transporter (PGT) proteins are also considered to be OATP family members. In addition, the methotrexate transporter OATK is closely related to OATPs. This family also includes several predicted proteins from Caenorhabditis elegans and Drosophila melanogaster. This similarity was not previously noted. Note: Members of this family are described as belonging to the SLC21 family of transporters. 43070 pfam03138: Plant protein family. The function of this family of plant proteins is not known. However, annotations for several sequences in this family refer to a possible involvement in auxin-independent growth regulation, on the basis of publications which have been retracted. 43071 pfam03139: Vanadium/alternative nitrogenase delta subunit. The nitrogenase complex EC:1.18.6.1 catalyses the conversion of molecular nitrogen to ammonia (nitrogen fixation) as follows: 8 reduced ferredoxin + 8 H(+) + N(2) + 16 ATP <=> 8 oxidised ferredoxin + 2 NH(3) + 16 ADP + 16 phosphate. The complex is hexameric, consisting of 2 alpha, 2 beta, and 2 delta subunits. 
This family represents the delta subunit of a group of nitrogenases that do not utilise molybdenum (Mo) as a cofactor, but instead use either vanadium (V nitrogenases), or iron (alternative nitrogenases). V nitrogenases are encoded by vnf operons, and alternative nitrogenases by anf operons. The delta subunits are VnfG and AnfG, respectively. 43072 pfam03140: Plant protein of unknown function. The function of the plant proteins constituting this family is unknown. 43073 pfam03141: Putative methyltransferase. Members of this family of hypothetical plant proteins are probably methyltransferases: several of the aligned sequences either match a methyltransferase profile or contain a SAM-binding motif; one member contains both. Several family members are described as ankyrin-like. 43074 pfam03142: Chitin synthase. Members of this family are fungal chitin synthase EC:2.4.1.16 enzymes. They catalyse chitin synthesis as follows: UDP-N-acetyl-D-glucosamine + {(1,4)-(N-acetyl-beta-D-glucosaminyl)}(N) <=> UDP + {(1,4)-(N-acetyl-beta-D-glucosaminyl)}(N+1). 43075 pfam03143: Elongation factor Tu C-terminal domain. Elongation factor Tu consists of three structural domains; this is the third domain. This domain adopts a beta barrel structure. The third domain is involved in binding to both charged tRNA and EF-Ts pfam00889. 43076 pfam03144: Elongation factor Tu domain 2. Elongation factor Tu consists of three structural domains; this is the second domain. This domain adopts a beta barrel structure. The second domain is involved in binding to charged tRNA. This domain is also found in other proteins such as elongation factor G and translation initiation factor IF-2. This domain is structurally related to pfam03143, and in fact has weak sequence matches to this domain. 43077 pfam03145: Seven in absentia protein family. The seven in absentia (sina) gene was first identified in Drosophila. 
The Drosophila Sina protein is essential for the determination of the R7 pathway in photoreceptor cell development: the loss of functional Sina results in the transformation of the R7 precursor cell to a non-neuronal cell type. The Sina protein contains an N-terminal RING finger domain pfam00097. Through this domain, Sina binds E2 ubiquitin-conjugating enzymes (UbcD1). Sina also interacts with Tramtrack (TTK88) via PHYL. Tramtrack is a transcriptional repressor that blocks photoreceptor determination, while PHYL down-regulates the activity of TTK88. In turn, the activity of PHYL requires the activation of the Sevenless receptor tyrosine kinase, a process essential for R7 determination. It is thus thought that Sina targets TTK88 for degradation, thereby promoting the R7 pathway. Murine and human homologues of Sina have also been identified. The human homologue Siah-1 also binds E2 enzymes (UbcH5) and, through a series of physical interactions, targets beta-catenin for ubiquitin-mediated degradation. Siah-1 expression is enhanced by p53, itself promoted by DNA damage. Thus this pathway links DNA damage to beta-catenin degradation. Sina proteins, therefore, physically interact with a variety of proteins. The N-terminal RING finger domain that binds ubiquitin conjugating enzymes is described in pfam00097, and does not form part of the alignment for this family. The remaining C-terminal part is involved in interactions with other proteins, and is included in this alignment. In addition to the Drosophila protein and mammalian homologues, whose similarity was noted previously, this family also includes putative homologues from Caenorhabditis elegans and Arabidopsis thaliana. 43078 pfam03146: Agrin NtA domain. Agrin is a multidomain heparan sulphate proteoglycan that is a key organiser for the induction of postsynaptic specialisations at the neuromuscular junction. Binding of agrin to basement membranes requires the amino terminal (NtA) domain. 
This region mediates high affinity interaction with the coiled-coil domain of laminins. The binding of agrin to laminins via the NtA domain is subject to tissue-specific regulation. The NtA domain-containing form of agrin is expressed in non-neuronal cells or in neurons that project to non-neuronal cells, such as motor neurons. The structure of this domain is an OB-fold. 43079 pfam03147: Ferredoxin-fold anticodon binding domain. This is the anticodon binding domain found in some phenylalanyl tRNA synthetases. The domain has a ferredoxin fold. 43080 pfam03148: Tektin family. Tektins are cytoskeletal proteins. They have been demonstrated in such cellular sites as centrioles, basal bodies, and along ciliary and flagellar doublet microtubules. Tektins form unique protofilaments, organised as longitudinal polymers of tektin heterodimers with axial periodicity matching tubulin. Tektin polypeptides consist of several alpha-helical regions that are predicted to form coiled coils. Indeed, tektins share considerable structural similarities with intermediate filament proteins. Possible functional roles for tektins are: stabilisation of tubulin protofilaments; attachment of A and B-tubules in ciliary/flagellar microtubule doublets and C-tubules in centrioles; binding of axonemal components. 43081 pfam03149: Flotillin family. Flotillins are integral membrane proteins that have been shown to be present in several subcellular components, including caveolae (invaginated plasma membrane microdomains), lipid rafts (sphingolipid and cholesterol-rich, detergent-resistant plasma membrane microdomains), and the Golgi apparatus. The molecular function of flotillins remains uncertain. They are probably involved in organising the structure of caveolae and lipid rafts, and other detergent resistant membrane domains. They may also be involved in signal transduction. Flotillins have been shown to accumulate in brain cells with the development of Alzheimer's pathology. 
Also included in this family are Reggie proteins, which are expressed in non-caveolar neuronal plasma membrane domains. 43082 pfam03150: Di-haem cytochrome c peroxidase. This is a family of distinct cytochrome c peroxidases (CCPs) that contain two haem groups. Similar to other cytochrome c peroxidases, they reduce hydrogen peroxide to water using c-type haem as an oxidisable substrate. However, since they possess two, instead of one, haem prosthetic groups, bacterial CCPs reduce hydrogen peroxide without the need to generate semi-stable free radicals. The two haem groups have significantly different redox potentials. The high potential (+320 mV) haem feeds electrons from electron shuttle proteins to the low potential (-330 mV) haem, where peroxide is reduced (indeed, the low potential site is known as the peroxidatic site). The CCP protein itself is structured into two domains, each containing one c-type haem group, with a calcium-binding site at the domain interface. This family also includes MauG proteins, whose similarity to di-haem CCP was previously recognised. 43083 pfam03151: Domain of unknown function, DUF250. This family consists entirely of aligned regions from Drosophila melanogaster proteins. One member contains three repeats of this region. In other proteins, the aligned region is located towards the C-terminus. The function of the aligned region is unknown. 43084 pfam03152: Ubiquitin fusion degradation protein UFD1. Post-translational ubiquitin-protein conjugates are recognised for degradation by the ubiquitin fusion degradation (UFD) pathway. Several proteins involved in this pathway have been identified. This family includes UFD1, a 40kD protein that is essential for vegetative cell viability. The human UFD1 gene is expressed at high levels during embryogenesis, especially in the eyes and in the inner ear primordia and is thought to be important in the determination of ectoderm-derived structures, including neural crest cells. 
In addition, this gene is deleted in the CATCH-22 (cardiac defects, abnormal facies, thymic hypoplasia, cleft palate and hypocalcaemia with deletions on chromosome 22) syndrome. This clinical syndrome is associated with a variety of developmental defects, all characterised by microdeletions on 22q11.2. Two such developmental defects are the DiGeorge syndrome OMIM:188400, and the velo-cardio-facial syndrome OMIM:145410. Several of the abnormalities associated with these conditions are thought to be due to defective neural crest cell differentiation. 43085 pfam03153: Transcription factor IIA, alpha/beta subunit. Transcription initiation factor IIA (TFIIA) is a heterotrimer, the three subunits being known as alpha, beta, and gamma, in order of molecular weight. The N and C-terminal domains of the gamma subunit are represented in pfam02268 and pfam02751, respectively. This family represents the precursor that yields both the alpha and beta subunits. The TFIIA heterotrimer is an essential general transcription initiation factor for the expression of genes transcribed by RNA polymerase II. Together with TFIID, TFIIA binds to the promoter region; this is the first step in the formation of a pre-initiation complex (PIC). Binding of the rest of the transcription machinery follows this step. After initiation, the PIC does not completely dissociate from the promoter. Some components, including TFIIA, remain attached and re-initiate a subsequent round of transcription. 43086 pfam03154: Atrophin-1 family. Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. 
The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity. 43087 pfam03155: ALG6, ALG8 glycosyltransferase family. N-linked (asparagine-linked) glycosylation of proteins is mediated by a highly conserved pathway in eukaryotes, in which a lipid (dolichol phosphate)-linked oligosaccharide is assembled at the endoplasmic reticulum membrane prior to the transfer of the oligosaccharide moiety to the target asparagine residues. This oligosaccharide is composed of Glc(3)Man(9)GlcNAc(2). The addition of the three glucose residues is the final series of steps in the synthesis of the oligosaccharide precursor. Alg6 transfers the first glucose residue, and Alg8 transfers the second one. In the human alg6 gene, a C->T transition, which causes Ala333 to be replaced with Val, has been identified as the cause of a congenital disorder of glycosylation, designated as type Ic OMIM:603147. 43088 pfam03157: High molecular weight glutenin subunit. Members of this family include high molecular weight subunits of glutenin. This group of gluten proteins is thought to be largely responsible for the elastic properties of gluten, and hence, doughs. Indeed, glutenin high molecular weight subunits are classified as elastomeric proteins, because the glutenin network can withstand significant deformations without breaking, and return to the original conformation when the stress is removed. 
Elastomeric proteins differ considerably in amino acid sequence, but they are all polymers whose subunits consist of elastomeric domains, composed of repeated motifs, and non-elastic domains that mediate cross-linking between the subunits. The elastomeric domain motifs are all rich in glycine residues in addition to other hydrophobic residues. High molecular weight glutenin subunits have an extensive central elastomeric domain, flanked by two terminal non-elastic domains that form disulphide cross-links. The central elastomeric domain is characterized by the following three repeated motifs: PGQGQQ, GYYPTS[P/L]QQ, GQQ. It possesses overlapping beta-turns within and between the repeated motifs, and assumes a regular helical secondary structure with a diameter of approx. 1.9 nm and a pitch of approx. 1.5 nm. 43089 pfam03158: Multigene family 530 protein. Members of this family are multigene family 530 proteins from African swine fever viruses. These proteins may be involved in promoting survival of infected macrophages. 43090 pfam03159: XRN 5'-3' exonuclease N-terminus. This family aligns residues towards the N-terminus of several proteins with multiple functions. The members of this family all appear to possess 5'-3' exonuclease activity EC:3.1.11.-. Thus, the aligned region may be necessary for 5'->3' exonuclease function. The family also contains several Xrn1 and Xrn2 proteins. The 5'-3' exoribonucleases Xrn1p and Xrn2p/Rat1p function in the degradation and processing of several classes of RNA in Saccharomyces cerevisiae. Xrn1p is the main enzyme catalysing cytoplasmic mRNA degradation in multiple decay pathways, whereas Xrn2p/Rat1p functions in the processing of rRNAs and small nucleolar RNAs (snoRNAs) in the nucleus. 43091 pfam03160: Calx-beta domain. 43092 pfam03161: LAGLIDADG DNA endonuclease family. This is a family of site-specific DNA endonucleases encoded by DNA mobile elements. 
Similar to pfam00961, the members of this family are also LAGLIDADG endonucleases. 43093 pfam03162: Tyrosine phosphatase family. This family is closely related to the pfam00102 and pfam00782 families. 43094 pfam03164: SAND. Members of this family are uncharacterised proteins that have been called SAND proteins. These proteins do not contain a SAND domain. The members of this family are 500-600 amino acids long and contain several conserved motifs. This family has recently been implicated in protein-vacuolar targeting. See also Cottage A, Edwards YJK, Elgar G (2001) SAND, a new protein family: from nucleic acid to protein structure and function prediction. Comparative and Functional Genomics 2(4):226-235. 43095 pfam03165: MH1 domain. This is the MH1 (MAD homology 1) domain found at the amino terminus of MAD related proteins. This domain can bind to DNA. This domain is separated from the MH2 domain by a non-conserved linker region. The crystal structure of the MH1 domain shows that a highly conserved 11 residue beta hairpin is used to bind DNA in the major groove. Not all examples of MH1 can bind to DNA, however. Smad2 cannot bind DNA and has a large insertion within the hairpin that presumably abolishes DNA binding. 43096 pfam03166: MH2 domain. This is the MH2 (MAD homology 2) domain found at the carboxy terminus of MAD related proteins. This domain is separated from the MH1 domain by a non-conserved linker region. The MH2 domain mediates interaction with a wide variety of proteins. 43097 pfam03167: Uracil DNA glycosylase superfamily. 43098 pfam03168: Late embryogenesis abundant protein. Different types of LEA proteins are expressed at different stages of late embryogenesis in higher plant seed embryos and under conditions of dehydration stress. The function of these proteins is unknown. This family represents a group of LEA proteins that appear to be distinct from those in pfam02987. 43099 pfam03169: OPT oligopeptide transporter protein. 
The OPT family of oligopeptide transporters is distinct from the ABC pfam00005 and PTR pfam00854 transporter families. OPT transporters were first recognised in fungi (Candida albicans and Schizosaccharomyces pombe), but this alignment also includes orthologues from Arabidopsis thaliana. OPT transporters are thought to have 12-14 transmembrane domains and contain the following motif: SPYxEVRxxVxxxDDP. 43100 pfam03170: Bacterial cellulose synthase subunit. This family includes bacterial proteins involved in cellulose synthesis. Cellulose synthesis has been identified in several bacteria. In Agrobacterium tumefaciens, for instance, cellulose has a pathogenic role: it allows the bacteria to bind tightly to their host plant cells. While several enzymatic steps are involved in cellulose synthesis, potentially the only step unique to this pathway is that catalysed by cellulose synthase. This enzyme is a multi-subunit complex. This family encodes a subunit that is thought to bind the positive effector cyclic di-GMP. This subunit is found in several different bacterial cellulose synthase enzymes. The first recognised sequence for this subunit is BcsB. In the AcsII cellulose synthase, this subunit and the subunit corresponding to BcsA are found in the same protein. Indeed, this alignment only includes the C-terminal half of the AcsAII synthase, which corresponds to BcsB. 43101 pfam03171: 2OG-Fe(II) oxygenase superfamily. This family contains members of the 2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily. This family includes the C-terminal of prolyl 4-hydroxylase alpha subunit. The holoenzyme has the activity EC:1.14.11.2 catalysing the reaction: Procollagen L-proline + 2-oxoglutarate + O2 <=> procollagen trans-4-hydroxy-L-proline + succinate + CO2. The full enzyme consists of an alpha2 beta2 complex with the alpha subunit contributing most of the parts of the active site. The family also includes lysyl hydroxylases, isopenicillin synthases and AlkB. 
43102 pfam03172: Sp100 domain. The function of this domain is unknown. It is about 105 amino acid residues in length and is predicted to be predominantly alpha helical. This domain is usually found at the amino terminus of proteins that contain a SAND domain pfam01342. 43103 pfam03173: Putative carbohydrate binding domain. This domain represents the N terminal domain in chitobiases and beta-hexosaminidases EC:3.2.1.52. It is composed of a beta sandwich structure that is similar in structure to the cellulose binding domain of cellulase from Cellulomonas fimi. This suggests that this may be a carbohydrate binding domain. 43104 pfam03174: Chitobiase/beta-hexosaminidase C-terminal domain. This short domain represents the C terminal domain in chitobiases and beta-hexosaminidases EC:3.2.1.52. It is composed of a beta sandwich structure. The function of this domain is unknown. 43105 pfam03175: DNA polymerase type B, organellar and viral. Like pfam00136, members of this family are also DNA polymerase type B proteins. Those included here are found in plant and fungal mitochondria, and in viruses. 43106 pfam03176: MMPL family. Members of this family are putative integral membrane proteins from bacteria. Several of the members are mycobacterial proteins. Many of the proteins contain two copies of this aligned region. The function of these proteins is not known, although it has been suggested that they may be involved in lipid transport. 43107 pfam03177: Non-repetitive/WGA-negative nucleoporin. This is a family of nucleoporin proteins. Nucleoporins are the main components of the nuclear pore complex in eukaryotic cells, and mediate bidirectional nucleocytoplasmic transport, especially of mRNA and proteins. Two nucleoporin classes are known: one is characterised by the FG repeat pfam03093; the other is represented by this family, and lacks any repeats. 43108 pfam03178: CPSF A subunit region. 
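The motif quoted for the OPT oligopeptide transporter family (pfam03169) above, SPYxEVRxxVxxxDDP, can be expressed as a regular expression by treating each x as any single residue. The Python sketch below is illustrative only: the function name opt_motif_positions and the toy sequence are hypothetical, and a literal pattern match is a much cruder test than a search against the family's profile.

```python
import re

# SPYxEVRxxVxxxDDP, reading each "x" as any single residue (an assumption
# about the notation used in the family description).
OPT_PATTERN = re.compile(r"SPY.EVR..V...DDP")


def opt_motif_positions(sequence):
    """Return 0-based start positions of non-overlapping motif matches."""
    return [m.start() for m in OPT_PATTERN.finditer(sequence.upper())]


# Synthetic example sequence (not a real protein).
toy_sequence = "MKT" + "SPYAEVRGGVLLLDDP" + "QQ"
print(opt_motif_positions(toy_sequence))
```

Real family members may deviate from the consensus at individual positions, so an exact match of this kind should be read as a quick screen rather than a family assignment.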
This family includes a region that lies towards the C-terminus of the cleavage and polyadenylation specificity factor (CPSF) A (160 kDa) subunit. CPSF is involved in mRNA polyadenylation and binds the AAUAAA conserved sequence in pre-mRNA. CPSF has also been found to be necessary for splicing of single-intron pre-mRNAs. The function of the aligned region is unknown but may be involved in RNA/DNA binding. 43109 pfam03179: Vacuolar (H+)-ATPase G subunit. This family represents the eukaryotic vacuolar (H+)-ATPase (V-ATPase) G subunit. V-ATPases generate an acidic environment in several intracellular compartments. Correspondingly, they are found as membrane-attached proteins in several organelles. They are also found in the plasma membranes of some specialised cells. V-ATPases consist of peripheral (V1) and membrane integral (V0) heteromultimeric complexes. The G subunit is part of the V1 subunit, but is also thought to be strongly attached to the V0 complex. It may be involved in the coupling of ATP degradation to H+ translocation. 43110 pfam03180: NLPA lipoprotein. This family of bacterial lipoproteins contains several antigenic members, that may be involved in bacterial virulence. Their precise function is unknown. However they are probably distantly related to pfam00497 which are solute binding proteins. 43111 pfam03181: BURP domain. The BURP domain is found at the C-terminus of several different plant proteins. It was named after the proteins in which it was first identified: the BNM2 clone-derived protein from Brassica napus; USPs and USP-like proteins; RD22 from Arabidopsis thaliana; and PG1beta from Lycopersicon esculentum. This domain is around 230 amino acid residues long. It possesses the following conserved features: two phenylalanine residues at its N-terminus; two cysteine residues; and four repeated cysteine-histidine motifs, arranged as: CH-X(10)-CH-X(25-27)-CH-X(25-26)-CH, where X can be any amino acid. The function of this domain is unknown. 
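The cysteine-histidine spacing quoted for the BURP domain (pfam03181) above, CH-X(10)-CH-X(25-27)-CH-X(25-26)-CH, maps directly onto a regular expression with bounded repeats. The Python sketch below is illustrative only: the function name find_burp_ch_motif and the toy sequence are hypothetical, and matching this spacing alone does not establish that a protein contains a BURP domain.

```python
import re

# CH-X(10)-CH-X(25-27)-CH-X(25-26)-CH, where X can be any amino acid.
BURP_CH_PATTERN = re.compile(r"CH.{10}CH.{25,27}CH.{25,26}CH")


def find_burp_ch_motif(sequence):
    """Return (start, end) of the first match (0-based, end-exclusive), or None."""
    match = BURP_CH_PATTERN.search(sequence.upper())
    return (match.start(), match.end()) if match else None


# Synthetic sequence built to satisfy the spacing (not a real protein).
toy_sequence = (
    "MA" + "CH" + "A" * 10 + "CH" + "G" * 26 + "CH" + "S" * 25 + "CH" + "KK"
)
print(find_burp_ch_motif(toy_sequence))
```

The bounded repeats {25,27} and {25,26} encode the variable spacer lengths given in the description; the other conserved features listed there (the two N-terminal phenylalanines and the two additional cysteines) are not checked by this pattern.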
43112 pfam03183: Borrelia repeat protein. 43113 pfam03184: DDE superfamily endonuclease. This family of proteins is related to pfam00665, and its members are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, whose members contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction. Interestingly, this family also includes the CENP-B protein. This domain in that protein appears to have lost the metal binding residues and is unlikely to have endonuclease activity. Centromere Protein B (CENP-B) is a DNA-binding protein localised to the centromere. 43114 pfam03185: Calcium-activated potassium channel, beta subunit. 43115 pfam03186: CobD/Cbib protein. This family includes CobD proteins from a number of bacteria; in Salmonella this protein is called Cbib. Salmonella CobD is a different protein. This protein is involved in cobalamin biosynthesis and is probably an enzyme responsible for the conversion of adenosylcobyric acid to adenosylcobinamide or adenosylcobinamide phosphate. 43116 pfam03187: Corona nucleocapsid I protein. 43117 pfam03188: Cytochrome b561. 43118 pfam03189: Protein of unknown function, DUF270. 43119 pfam03190: Protein of unknown function, DUF255. 43120 pfam03191: Protein of unknown function, DUF256. 43121 pfam03192: Pyrococcus protein of unknown function, DUF257. 43122 pfam03193: Protein of unknown function, DUF258. 43123 pfam03194: LUC7 N_terminus. This family contains the N terminal region of several LUC7 protein homologues and only contains eukaryotic proteins. LUC7 has been shown to be a U1 snRNA associated protein with a role in splice site recognition. 
The family also contains human and mouse LUC7 like (LUC7L) proteins and human cisplatin resistance-associated overexpressed protein (CROP). 43124 pfam03195: Protein of unknown function DUF260. 43125 pfam03196: Protein of unknown function, DUF261. 43126 pfam03197: Bacteriophage FRD2 protein. 43127 pfam03198: Glycolipid anchored surface protein (GAS1). 43128 pfam03199: Eukaryotic glutathione synthase. 43129 pfam03200: Mannosyl oligosaccharide glucosidase. This is a family of eukaryotic enzymes belonging to glycosyl hydrolase family 63. They catalyse the specific cleavage of the non-reducing terminal glucose residue from Glc(3)Man(9)GlcNAc(2). Mannosyl oligosaccharide glucosidase EC:3.2.1.106 is the first enzyme in the N-linked oligosaccharide processing pathway. 43130 pfam03201: H2-forming N5,N10-methylenetetrahydromethanopterin dehydrogenase. 43131 pfam03202: Putative mycoplasma lipoprotein, C-terminal region. 43132 pfam03203: MerC mercury resistance protein. 43133 pfam03204: Mo25 protein family. 43134 pfam03205: Molybdopterin guanine dinucleotide synthesis protein B. This protein contains a P-loop. 43135 pfam03206: Nitrogen fixation protein NifW. Nitrogenase is a complex metalloenzyme composed of two proteins designated the Fe-protein and the MoFe-protein. Apart from these two proteins, a number of accessory proteins are essential for the maturation and assembly of nitrogenase. Even though experimental evidence suggests that these accessory proteins are required for nitrogenase activity, the exact roles played by many of these proteins in the functions of nitrogenase are unclear. Using yeast two-hybrid screening it has been shown that NifW can interact with itself as well as NifZ. 43136 pfam03207: Borrelia outer surface protein D (OspD). 43137 pfam03208: PRA1 family protein. This family includes the PRA1 (Prenylated rab acceptor) protein. This family also includes the glutamate transporter EAAC1 interacting protein GTRAP3-18. 43138 pfam03209: PUCC protein. 
This protein is required for high-level transcription of the PUC operon. 43139 pfam03210: Paramyxovirus P/V phosphoprotein. Paramyxoviral P genes are able to generate more than one product, using alternative reading frames and RNA editing. The P gene encodes the structural phosphoprotein P. In addition, it encodes several non-structural proteins present in the infected cell but not in the virus particle. This family includes phosphoprotein P and the non-structural phosphoprotein V from different paramyxoviruses. Phosphoprotein P is essential for the activity of the RNA polymerase complex which it forms with another subunit, L pfam00946. Although all the catalytic activities of the polymerase are associated with the L subunit, its function requires specific interactions with phosphoprotein P. The P and V phosphoproteins are amino co-terminal, but diverge at their C-termini. This difference is generated by an RNA-editing mechanism in which one or two non-templated G residues are inserted into P-gene-derived mRNA. In measles virus and Sendai virus, one G residue is inserted and the edited transcript encodes the V protein. In mumps, simian virus type 5 and Newcastle disease virus, two G residues are inserted, and the edited transcript codes for the P protein. Being phosphoproteins, both P and V are rich in serine and threonine residues over their whole lengths. In addition, the V proteins are rich in cysteine residues at the C-termini. 43140 pfam03211: Pectate lyase. 43141 pfam03212: Pertactin. 43142 pfam03213: Poxvirus P35 protein. 43143 pfam03214: Reversibly glycosylated polypeptide. 43144 pfam03215: Rad17 cell cycle checkpoint protein. 43145 pfam03216: Rhabdovirus nucleoprotein. 43146 pfam03217: Bacterial surface layer protein. 43147 pfam03219: TLC ATP/ADP transporter. 43148 pfam03220: Tombusvirus P19 core protein. 43149 pfam03221: Tc5 transposase. 43150 pfam03222: Tryptophan/tyrosine permease family. 43151 pfam03223: V-ATPase subunit C. 
43152 pfam03224: V-ATPase subunit H. 43153 pfam03225: Viral heat shock protein Hsp90 homologue. 43154 pfam03226: Yippee putative zinc-binding protein. 43155 pfam03227: Gamma interferon inducible lysosomal thiol reductase (GILT). This family includes the two characterised human gamma-interferon-inducible lysosomal thiol reductase (GILT) sequences. It also contains several other eukaryotic putative proteins with similarity to GILT. The aligned region contains three conserved cysteine residues. In addition, the two GILT sequences possess a C-X(2)-C motif that is shared by some of the other sequences in the family. This motif is thought to be associated with disulphide bond reduction. 43156 pfam03228: Adenoviral core protein VII. The function of this protein is unknown. It has a conserved amino terminus of 50 residues followed by a positively charged tail, suggesting it may interact with nucleic acid. The major core protein of the adenovirus, protein VII, was found to be associated with viral DNA throughout infection. The precursor to protein VII were shown to be in vivo and in vitro acceptors of ADP-ribose. The ADP-ribosylated core proteins were assembled into mature virus particles. ADP-ribosylation of adenovirus core proteins may have a role in virus decapsidation. 43157 pfam03229: Alphavirus glycoprotein J. 43158 pfam03230: Antirestriction protein. This family includes various protein that are involved in antirestriction. The ArdB protein efficiently inhibits restriction by members of the three known families of type I systems of E. coli. 43159 pfam03231: Bunyavirus non-structural protein NS-S. This family represents the Bunyavirus NS-S family. Bunyavirus has three genomic segments: small (S), middle-sized (M), and large (L). The S segment encodes the nucleocapsid and a non-structural protein. The M segment codes for two glycoproteins, G1 and G2, and another non-structural protein (NSm). The L segment codes for an RNA polymerase. 
43160 pfam03232: Ubiquinone biosynthesis protein COQ7. Members of this family contain two repeats of about 90 amino acids, that contains two conserved motifs. One of these DXEXXH may be part of an enzyme active site. 43161 pfam03233: Aphid transmission protein. This protein is found in various caulimoviruses. It codes for an 18 kDa protein (PII), which is dispensable for infection but which is required for aphid transmission of the virus. This protein interacts with the PIII protein. 43162 pfam03234: Cdc37 family. In the budding yeast Saccharomyces cerevisiae, Cdc37 is required for the productive formation of Cdc28-cyclin complexes. Cdc37 may be a kinase targeting subunit of Hsp90. 43163 pfam03235: Protein of unknown function DUF262. 43164 pfam03236: Domain of unknown function DUF263. 43165 pfam03237: Protein of unknown function DUF264. This family represents a group of plasmid encodes proteins specifically found in Borrelia and that currently do not show any similarity to any other proteins outside the Borrelia genus. Proteins within this family are about 450 residues long and are found to be expanded in Borrelia burgdorferi. The function of this protein is unknown. 43166 pfam03238: ESAG protein. Expression-site-associated gene (ESAG) proteins are thought to be involved in VSG activation. This family includes ESAG 117A as well as ESAG IM. 43167 pfam03239: Iron permease FTR1 family. 43168 pfam03240: FlgA family. This protein is involved in the assembly of the flageller P-ring. 43169 pfam03241: 4-hydroxyphenylacetate 3-hydroxylase family. HpaB encodes part of the 4-hydroxyphenylacetate 3-hydroxylase from Escherichia coli. HpaB is part of a heterodimeric enzyme that also requires HpaC. The enzyme is NADH-dependent and uses FAD as the redox chromophore. This family also includes PvcC, which may play a role in one of the proposed hydroxylation steps of pyoverdine chromophore biosynthesis. 43170 pfam03242: Late embryogenesis abundant protein. 
Members of this family are similar to late embryogenesis abundant proteins. Members of the family have been isolated in a number of different screens. However, the molecular function of these proteins remains obscure. 43171 pfam03243: Alkylmercury lyase. Alkylmercury lyase (EC:4.99.1.2) cleaves the carbon-mercury bond of organomercurials such as phenylmercuric acetate. 43172 pfam03244: Photosystem I reaction centre subunit VI. Photosystem I (PSI) is an integral membrane protein complex that uses light energy to mediate electron transfer from plastocyanin to ferredoxin. 43173 pfam03245: Bacteriophage lysis protein. This protein is involved in host lysis. This family is not considered to be a peptidase according to the MEROPs database. 43174 pfam03246: Pneumovirus nucleocapsid protein. 43175 pfam03247: Prothymosin/parathymosin family. Prothymosin alpha and parathymosin are two ubiquitous small acidic nuclear proteins that are thought to be involved in cell cycle progression, proliferation, and cell differentiation. 43176 pfam03248: Rer1 family. RER1 family protein are involved in involved in the retrieval of some endoplasmic reticulum membrane proteins from the early golgi compartment. The C terminus of yeast Rer1p interacts with a coatomer complex. 43177 pfam03249: Type specific antigen. There are several antigenic variants in Rickettsia tsutsugamushi, and a type-specific antigen (TSA) of 56-kilodaltons located on the rickettsial surface is responsible for the variation. TSA proteins are probably integral membrane proteins. 43178 pfam03250: Tropomodulin. Tropomodulin is a novel tropomyosin regulatory protein that binds to the end of erythrocyte tropomyosin and blocks head-to-tail association of tropomyosin along actin filaments. Limited proteolysis shows this protein is composed of two domains. The amino terminal domain contains the tropomyosin binding function. 43179 pfam03251: Tymovirus 45/70Kd protein. Tymoviruses are single stranded RNA viruses. 
This family includes a protein of unknown function that has been named based on its molecular weight. Tymoviruses such as the ononis yellow mosaic tymovirus encode only three proteins. Of these, two overlap: this protein overlaps a larger ORF that is thought to be the polymerase. 43180 pfam03252: Herpesvirus UL21. The UL21 protein appears to be a dispensable component in herpesviruses. 43181 pfam03253: Urea transporter. Members of this family transport urea across membranes. The family includes a bacterial homologue. 43182 pfam03254: Xyloglucan fucosyltransferase. Plant cell walls are crucial for development, signal transduction, and disease resistance in plants. Cell walls are made of cellulose, hemicelluloses, and pectins. Xyloglucan (XG), the principal load-bearing hemicellulose of dicotyledonous plants, has a terminal fucosyl residue. This fucosyltransferase adds this residue. 43183 pfam03255: Acetyl co-enzyme A carboxylase carboxyltransferase alpha subunit. Acetyl co-enzyme A carboxylase carboxyltransferase is composed of an alpha and a beta subunit. 43184 pfam03256: Anaphase-promoting complex, subunit 10 (APC10). 43185 pfam03257: Mycoplasma adhesin P1. This family corresponds to a short 100 residue region found in adhesins from Mycoplasmas. 43186 pfam03258: Baculovirus FP protein. The FP protein is missing in baculovirus (Few Polyhedra) mutants. 43187 pfam03259: Roadblock/LC7 domain. This family includes proteins that are about 100 amino acids long and have been shown to be related. Members of this family of proteins are associated with both flagellar outer arm dynein and Drosophila and rat brain cytoplasmic dynein. It is proposed that roadblock/LC7 family members may modulate specific dynein functions. This family also includes Golgi-associated MP1 adapter protein and MglB from Myxococcus xanthus, a protein involved in gliding motility. 
However, the family also includes members from non-motile bacteria such as Streptomyces coelicolor, suggesting that the protein may play a structural or regulatory role. 43188 pfam03260: Lepidopteran low molecular weight (30 kD) lipoprotein. 43189 pfam03261: Cyclin-dependent kinase 5 activator protein. 43190 pfam03262: Coronavirus 6B/7B protein. 43191 pfam03263: Cucumovirus protein 2B. This protein may be a viral movement protein. 43192 pfam03264: NapC/NirT cytochrome c family, N-terminal region. Within the NapC/NirT family of cytochrome c proteins, some members, such as NapC and NirT, bind four haem groups, while others, such as TorC, bind five haems. This family aligns the common N-terminal region that contains four haem-binding C-X(2)-CH motifs. 43193 pfam03265: Deoxyribonuclease II. 43194 pfam03266: Protein of unknown function, DUF265. 43195 pfam03267: Arabidopsis protein of unknown function, DUF266. 43196 pfam03268: Caenorhabditis protein of unknown function, DUF267. 43197 pfam03269: Caenorhabditis protein of unknown function, DUF268. 43198 pfam03270: Protein of unknown function, DUF269. Members of this family may be involved in nitrogen fixation, since they are found within nitrogen fixation operons. 43199 pfam03271: EB1-like C-terminal motif. This motif is found at the C-terminus of proteins that are related to the EB1 protein. The EB1 proteins contain an N-terminal CH domain (pfam00307). The human EB1 protein was originally discovered as a protein interacting with the C-terminus of the APC protein. This interaction is often disrupted in colon cancer, due to deletions affecting the APC C-terminus. Several EB1 orthologues are also included in this family. The interaction between EB1 and APC has been shown to have a potent synergistic effect on microtubule polymerisation. Neither EB1 nor APC alone has this effect. It is thought that EB1 targets APC to the + ends of microtubules, where APC promotes microtubule polymerisation. 
This process is regulated by APC phosphorylation by Cdc2, which disrupts APC-EB1 binding. Human EB1 protein can functionally substitute for the yeast EB1 homologue Mal3. In addition, Mal3 can substitute for human EB1 in promoting microtubule polymerisation with APC. 43200 pfam03272: Viral enhancin protein. 43201 pfam03273: Baculovirus gp64 envelope glycoprotein family. This family includes the gp64 glycoprotein from baculovirus as well as other viruses. 43202 pfam03274: Foamy virus BEL 1/2 protein. 43203 pfam03275: UDP-galactopyranose mutase. 43204 pfam03276: Spumavirus gag protein. 43205 pfam03277: Herpesvirus UL4 family. 43206 pfam03278: IpaB/EvcA family. This family includes IpaB, which is an invasion plasmid antigen from Shigella, as well as EvcA from E. coli. Members of this family seem to be involved in pathogenicity of some enterobacteria. However, the exact function of this component is not clear. 43207 pfam03279: Bacterial lipid A biosynthesis acyltransferase. 43208 pfam03280: Proteobacterial lipase chaperone protein. 43209 pfam03281: Mab-21 protein. 43210 pfam03283: Pectinacetylesterase. 43211 pfam03284: Phenazine biosynthesis protein A/B. 43212 pfam03285: Paralemmin. 43213 pfam03286: Pox virus Ag35 surface protein. 43214 pfam03287: Poxvirus C7/F8A protein. 43215 pfam03288: Poxvirus D5 protein-like. This family includes D5 from poxviruses, which is necessary for viral DNA replication and is a nucleic acid-independent nucleoside triphosphatase. Members of this family are also found outside of poxviruses. 43216 pfam03289: Poxvirus protein I1. 43217 pfam03290: Vaccinia virus I7 processing peptidase. 43218 pfam03291: mRNA capping enzyme. This family of enzymes is related to pfam03919. 43219 pfam03292: Poxvirus P4B major core protein. 43220 pfam03293: Poxvirus DNA-directed RNA polymerase, 18 kD subunit. 43221 pfam03294: RNA polymerase-associated transcription specificity factor, Rap94. 43222 pfam03295: Poxvirus trans-activator protein A1. 
43223 pfam03296: Poxvirus poly(A) polymerase catalytic subunit. 43224 pfam03297: S25 ribosomal protein. 43225 pfam03298: Stanniocalcin family. 43226 pfam03299: Transcription factor AP-2. 43227 pfam03300: Tenuivirus non-structural protein NS4. 43228 pfam03301: Tryptophan 2,3-dioxygenase. 43229 pfam03302: Giardia variant-specific surface protein. 43230 pfam03303: WTF protein. This is a family of hypothetical Schizosaccharomyces pombe proteins. Their function is unknown. 43231 pfam03304: Mlp lipoprotein family. The Mlp (for Multicopy Lipoprotein) family of lipoproteins is found in Borrelia species. This family was previously known as the 2.9 lipoprotein genes. These surface-expressed genes may represent new candidate vaccinogens for Lyme disease. Members of this family generally are downstream of four ORFs called A, B, C and D that are involved in hemolytic activity. 43232 pfam03305: Mycoplasma MG185/MG260 protein. Most of the aligned regions in this family are found towards the middle of the member proteins. 43233 pfam03306: Alpha-acetolactate decarboxylase. 43234 pfam03307: Adenovirus 15.3kD protein in E3 region. 43235 pfam03308: ArgK protein. The ArgK protein acts as an ATPase enzyme and as a kinase, and phosphorylates periplasmic binding proteins involved in the LAO (lysine, arginine, ornithine)/AO transport systems. 43236 pfam03309: Bordetella pertussis Bvg accessory factor family. 43237 pfam03310: Caulimovirus DNA-binding protein. 43238 pfam03311: Cornichon protein. 43239 pfam03312: Protein of unknown function, DUF272. 43240 pfam03313: Serine dehydratase alpha chain. L-serine dehydratase (EC:4.2.1.13) is found as a heterodimer of alpha and beta chains or as a fusion of the two chains in a single protein. This enzyme catalyses the deamination of serine to form pyruvate. This enzyme is part of the gluconeogenesis pathway. 43241 pfam03314: Protein of unknown function, DUF273. 43242 pfam03315: Serine dehydratase beta chain. 
L-serine dehydratase (EC:4.2.1.13) is found as a heterodimer of alpha and beta chains or as a fusion of the two chains in a single protein. This enzyme catalyses the deamination of serine to form pyruvate. This enzyme is part of the gluconeogenesis pathway. 43243 pfam03316: Actinomycetales protein of unknown function, DUF275. 43244 pfam03317: ELF protein. This is a family of hypothetical proteins from cereal crops. 43245 pfam03318: Clostridium epsilon toxin ETX/Bacillus mosquitocidal toxin MTX2. 43246 pfam03319: Ethanolamine utilisation protein EutN/carboxysome structural protein Ccml. 43247 pfam03320: Bacterial fructose-1,6-bisphosphatase, glpX-encoded. 43248 pfam03321: GH3 auxin-responsive promoter. 43249 pfam03322: Gamma-butyrobetaine hydroxylase. Members of this family are gamma-butyrobetaine hydroxylase enzymes (EC:1.14.11.1). 43250 pfam03323: Bacillus/Clostridium GerA spore germination protein. 43251 pfam03324: Herpesvirus DNA helicase/primase complex associated protein. This family includes HSV UL8, EHV-1 54, VZV 52 and HCMV 102. 43252 pfam03325: Herpesvirus polymerase accessory protein. The same proteins are also known as polymerase processivity factors. 43253 pfam03326: Herpesvirus transcription activation factor (transactivator). This family includes EBV BRLF1 and similar ORF 50 proteins from other herpesviruses. 43254 pfam03327: Herpesvirus capsid shell protein VP19C. 43255 pfam03328: HpcH/HpaI aldolase family. This family includes 2,4-dihydroxyhept-2-ene-1,7-dioic acid aldolase and 4-hydroxy-2-oxovalerate aldolase. 43256 pfam03330: Rare lipoprotein A (RlpA)-like double-psi beta-barrel. Rare lipoprotein A (RlpA) contains a conserved region that has the double-psi beta-barrel (DPBB) fold. The function of RlpA is not well understood, but it has been shown to act as a prc mutant suppressor in Escherichia coli. The DPBB fold is often an enzymatic domain. 
The members of this family are quite diverse and, if catalytic, this family may contain several different functions. 43257 pfam03331: UDP-3-O-acyl N-acetylglycosamine deacetylase. The enzymes in this family catalyse the second step in the biosynthetic pathway for lipid A. 43258 pfam03332: Eukaryotic phosphomannomutase. This enzyme (EC:5.4.2.8) is involved in the synthesis of the GDP-mannose and dolichol-phosphate-mannose required for a number of critical mannosyl transfer reactions. 43259 pfam03333: Adhesin biosynthesis transcription regulatory protein. This family includes PapB, DaaA, FanA, FanB, and AfaA. 43260 pfam03334: Na+/H+ antiporter subunit. This family includes PhaG from Rhizobium meliloti, MnhG from Staphylococcus aureus, and YufB from Bacillus subtilis. 43261 pfam03335: Phage tail fibre repeat. 43262 pfam03336: Poxvirus C4/C10 protein. 43263 pfam03337: Poxvirus F12L protein. 43264 pfam03338: Poxvirus J1 protein. 43265 pfam03339: Poxvirus L3/FP4 protein. 43266 pfam03340: Poxvirus rifampicin resistance protein. 43267 pfam03341: Poxvirus mRNA capping enzyme, small subunit. 43268 pfam03342: Rhabdovirus M1 matrix protein (M1 polymerase-associated protein). 43269 pfam03343: SART-1 family. This family of proteins appears to contain a leucine zipper and may therefore be a family of transcription factors. 43270 pfam03344: Daxx Family. The Daxx protein (also known as the Fas-binding protein) is thought to play a role in apoptosis, but the precise role played by Daxx remains to be determined. 43271 pfam03345: Dolichyl-diphosphooligosaccharide-protein glycosyltransferase 48kD subunit. Members of this family are involved in asparagine-linked protein glycosylation. In particular, dolichyl-diphosphooligosaccharide-protein glycosyltransferase (DDOST), also known as oligosaccharyltransferase (EC:2.4.1.119), transfers the high-mannose sugar GlcNAc(2)-Man(9)-Glc(3) from a dolichol-linked donor to an asparagine acceptor in a consensus Asn-X-Ser/Thr motif. 
In most eukaryotes, the DDOST complex is composed of three subunits, which in humans are described as a 48kD subunit, ribophorin I, and ribophorin II. However, the yeast DDOST appears to consist of six subunits (alpha, beta, gamma, delta, epsilon, zeta). The yeast beta subunit is a 45kD polypeptide, previously discovered as the Wbp1 protein, with known sequence similarity to the human 48kD subunit and the other orthologues. This family includes the 48kD-like subunits from several eukaryotes; it also includes the yeast DDOST beta subunit Wbp1. 43272 pfam03346: Actinobacillus constitutively-expressed outer membrane lipoprotein A. 43273 pfam03347: Vibrio thermostable direct hemolysin. 43274 pfam03348: TMS membrane protein/tumour differentially expressed protein (TDE). 43275 pfam03349: Outer membrane protein transport protein (OMPP1/FadL/TodX). This family includes TodX from Pseudomonas putida F1 and TbuX from Ralstonia pickettii PKO1. These are membrane proteins of uncertain function that are involved in toluene catabolism. Related proteins involved in the degradation of similar aromatic hydrocarbons are also in this family, such as CymD. This family also includes FadL involved in translocation of long-chain fatty acids across the outer membrane. It is also a receptor for the bacteriophage T2. 43276 pfam03350: Uncharacterized protein family, UPF0114. 43277 pfam03351: DOMON domain. The DOMON (named after dopamine beta-monooxygenase N-terminal) domain is 110-125 residues long. It is predicted to form an all beta fold with 7-8 strands. The domain may mediate extracellular adhesive interactions. 43278 pfam03352: Methyladenine glycosylase. The DNA-3-methyladenine glycosylase I is constitutively expressed and is specific for the alkylated 3-methyladenine DNA. 43279 pfam03353: DUF278. This is a family of C. elegans proteins. 43280 pfam03354: Phage Terminase. 
The majority of the members of this family are bacteriophage proteins, several of which are thought to be terminase large subunit proteins. There are also a number of bacterial proteins of unknown function. 43281 pfam03355: Viral Trans-Activator Protein. These proteins function as trans-activators of viral late genes. 43282 pfam03356: Viral late protein H2. All members of this family show similarity to the vaccinia virus late protein H2. This protein is often referred to by its gene name H2R. Members from this family all belong to the viral taxon Poxviridae. 43283 pfam03357: SNF7. This family consists of a group of SNF-7 homologues involved in protein sorting and transport from the endosome to the vacuole/lysosome in eukaryotic cells. Vacuoles/lysosomes play an important role in the degradation of both lipids and cellular proteins. In order to perform this degradative function, vacuoles/lysosomes contain numerous hydrolases which have been transported in the form of inactive precursors via the biosynthetic pathway and are proteolytically activated upon delivery to the vacuole/lysosome. The delivery of transmembrane proteins, such as activated cell surface receptors to the lumen of the vacuole/lysosome, either for degradation/downregulation, or in the case of hydrolases, for proper localisation, requires the formation of multivesicular bodies (MVBs). These late endosomal structures are formed by invaginating and budding of the limiting membrane into the lumen of the compartment. During this process, a subset of the endosomal membrane proteins is sorted into the forming vesicles. Mature MVBs fuse with the vacuole/lysosome, thereby releasing cargo-containing vesicles into its hydrolytic lumen for degradation. 
Endosomal proteins that are not sorted into the intralumenal MVB vesicles are either recycled back to the plasma membrane or Golgi complex, or remain in the limiting membrane of the MVB and are thereby transported to the limiting membrane of the vacuole/lysosome as a consequence of fusion. Therefore, the MVB sorting pathway plays a critical role in the decision between recycling and degradation of membrane proteins. A few archaeal sequences are also present within this family. 43284 pfam03358: NADPH-dependent FMN reductase. 43285 pfam03359: Guanylate-kinase-associated protein (GKAP) protein. 43286 pfam03360: Glycosyltransferase family 43. 43287 pfam03361: Herpes virus intermediate/early protein 2/3. 43288 pfam03362: Herpesvirus UL47 protein. 43289 pfam03363: Herpesvirus leader protein. 43290 pfam03364: Streptomyces cyclase/dehydrase. Members of this family of enzymes from Streptomyces spp. are involved in polyketide synthesis. 43291 pfam03365: Stromal antigen (SA/STAG) protein. 43292 pfam03366: YEATS family. We have named this family the YEATS family, after `YNK7', `ENL', `AF-9', and `TFIIF small subunit'. This family also contains the GAS41 protein. All these proteins are thought to have a transcription stimulatory activity. 43293 pfam03367: ZPR1 zinc-finger domain. The zinc-finger protein ZPR1 is ubiquitous among eukaryotes. It is indeed known to be an essential protein in yeast. In quiescent cells, ZPR1 is localised to the cytoplasm. But in proliferating cells treated with EGF or with other mitogens, ZPR1 accumulates in the nucleolus. ZPR1 interacts with the cytoplasmic domain of the inactive EGF receptor (EGFR) and is thought to inhibit the basal protein tyrosine kinase activity of EGFR. This interaction is disrupted when cells are treated with EGF, though by themselves, inactive EGFRs are not sufficient to sequester ZPR1 to the cytoplasm. 
Upon stimulation by EGF, ZPR1 directly binds the eukaryotic translation elongation factor-1alpha (eEF-1alpha) to form ZPR1/eEF-1alpha complexes. These move into the nucleus, localising particularly at the nucleolus. Indeed, the interaction between ZPR1 and eEF-1alpha has been shown to be essential for normal cellular proliferation, and ZPR1 is thought to be involved in pre-ribosomal RNA expression. The alignment for this family shows a domain of which there are two copies in ZPR1 proteins. This family also includes several hypothetical archaeal proteins (from both Crenarchaeota and Euryarchaeota), which only contain one copy of the aligned region. This similarity between ZPR1 and archaeal proteins was not previously noted. 43294 pfam03368: Domain of unknown function. This putative domain is found in members of the Dicer protein family. This protein is a dsRNA nuclease that is involved in RNAi and related processes. This domain of about 100 amino acids has no known function, but does contain 3 possible zinc ligands. 43295 pfam03369: Herpesvirus UL3 protein. 43296 pfam03370: Putative phosphatase regulatory subunit. This family consists of several eukaryotic proteins that are thought to be involved in the regulation of glycogen metabolism. For instance, the mouse PTG protein has been shown to interact with glycogen synthase, phosphorylase kinase, phosphorylase a: these three enzymes have key roles in the regulation of glycogen metabolism. PTG also binds the catalytic subunit of protein phosphatase 1 (PP1C) and localises it to glycogen. Subsets of similar interactions have been observed with several other members of this family, such as the yeast PIG1, PIG2, GAC1 and GIP2 proteins. While the precise function of these proteins is not known, they may serve a scaffold function, bringing together the key enzymes in glycogen metabolism. This family is a carbohydrate binding domain. 43297 pfam03371: PRP38 family. 
Members of this family are related to the pre-mRNA splicing factor PRP38 from yeast. Therefore all the members of this family could be involved in splicing. This conserved region could be involved in RNA binding. The putative domain is about 180 amino acids in length. PRP38 is a unique component of the U4/U6.U5 tri-small nuclear ribonucleoprotein (snRNP) particle and is necessary for an essential step late in spliceosome maturation. 43298 pfam03372: Endonuclease/Exonuclease/phosphatase family. This large family of proteins includes magnesium dependent endonucleases and a large number of phosphatases involved in intracellular signalling. This family includes: AP endonuclease proteins EC:4.2.99.18, DNase I proteins EC:3.1.21.1, synaptojanin (an inositol-1,4,5-trisphosphate phosphatase) EC:3.1.3.56, sphingomyelinase EC:3.1.4.12 and Nocturnin. 43299 pfam03373: Octapeptide repeat. This octapeptide repeat is found in several bacterial proteins. The function of this repeat is unknown. 43300 pfam03374: Phage antirepressor protein. 43301 pfam03375: Caenorhabditis protein of unknown function, DUF280. 43302 pfam03376: Adenovirus E3B protein. 43303 pfam03377: Xanthomonas avirulence protein, Avr/PthA. 43304 pfam03378: CAS/CSE protein, C-terminus. Mammalian cellular apoptosis susceptibility (CAS) proteins are homologous to the yeast chromosome-segregation protein, CSE1. This family aligns the C-terminal halves (approximately). CAS is involved in both cellular apoptosis and proliferation. Apoptosis is inhibited in CAS-depleted cells, while the expression of CAS correlates to the degree of cellular proliferation. Like CSE1, it is essential for the mitotic checkpoint in the cell cycle (CAS depletion blocks the cell in the G2 phase), and has been shown to be associated with the microtubule network and the mitotic spindle, as is the protein MEK, which is thought to regulate the intracellular localisation (predominantly nuclear vs. predominantly cytosolic) of CAS. 
In the nucleus, CAS acts as a nuclear transport factor in the importin pathway. The importin pathway mediates the nuclear transport of several proteins that are necessary for mitosis and further progression. CAS is therefore thought to affect the cell cycle through its effect on the nuclear transport of these proteins. Since apoptosis also requires the nuclear import of several proteins (such as P53 and transcription factors), it has been suggested that CAS also enables apoptosis by facilitating the nuclear import of at least a subset of these essential proteins. 43305 pfam03379: CcmB protein. CcmB is the product of one of a cluster of Ccm genes that are necessary for cytochrome c biosynthesis in eubacteria. Expression of these proteins is induced when the organisms are grown under anaerobic conditions with nitrate or nitrite as the final electron acceptor. CcmB is required for the export of haem to the periplasm. 43306 pfam03380: Caenorhabditis protein of unknown function, DUF282. 43307 pfam03381: LEM3 (ligand-effect modulator 3) family / CDC50 family. Members of this family have been predicted to contain transmembrane helices. The family member LEM3 is a ligand-effect modulator, mutation of which increases glucocorticoid receptor activity in response to dexamethasone and also confers increased activity on other intracellular receptors including the progesterone, oestrogen and mineralocorticoid receptors. LEM3 is thought to affect a downstream step in the glucocorticoid receptor pathway. Factors that modulate ligand responsiveness are likely to contribute to the context-specific actions of the glucocorticoid receptor in mammalian cells. The products of genes YNR048w, YNL323w and YCR094w (CDC50) show redundancy of function and are involved in regulation of transcription via CDC39. CDC39 (also known as NOT1) is normally a negative regulator of transcription either by affecting the general RNA polymerase II machinery or by altering chromatin structure. 
One function of CDC39 is to block activation of the mating response pathway in the absence of pheromone, and mutation causes arrest in G1 by activation of the pathway. The cold-sensitive arrest in G1 noticed in CDC50 mutants may be due to inactivation of CDC39. The effects of LEM3 on glucocorticoid receptor activity may also be due to effects on transcription via CDC39. 43308 pfam03382: Mycoplasma protein of unknown function, DUF285. 43309 pfam03383: Caenorhabditis protein of unknown function, DUF286. 43310 pfam03384: Drosophila protein of unknown function, DUF287. 43311 pfam03385: Protein of unknown function, DUF288. 43312 pfam03386: Early nodulin 93 ENOD93 protein. 43313 pfam03387: Herpesvirus UL46 protein. 43314 pfam03388: Legume-like lectin family. Lectins are structurally diverse proteins that bind to specific carbohydrates. This family includes the VIP36 and ERGIC-53 lectins. These two proteins were the first recognised members of a family of animal lectins similar (19-24%) to the leguminous plant lectins. The alignment for this family aligns residues lying towards the N-terminus, where the similarity of VIP36 and ERGIC-53 is greatest. However, while Fiedler and Simons identified these proteins as a new family of animal lectins, our alignment also includes yeast sequences. ERGIC-53 is a 53kD protein, localised to the intermediate region between the endoplasmic reticulum and the Golgi apparatus (ER-Golgi-Intermediate Compartment, ERGIC). It was identified as a calcium-dependent, mannose-specific lectin. Its dysfunction has been associated with combined factors V and VIII deficiency OMIM:227300 OMIM:601567, suggesting an important and substrate-specific role for ERGIC-53 in the glycoprotein-secreting pathway. 43315 pfam03389: MobA/MobL family. This family includes the MobA protein from the E. coli plasmid RSF1010, and the MobL protein from the Thiobacillus ferrooxidans plasmid PTF1. 
These sequences are mobilisation proteins, which are essential for specific plasmid transfer. 43316 pfam03390: Bacterial sodium:citrate symporter. The sodium:citrate symporter is found on the boundary membrane, and allows the uptake of citrate for its utilisation as a source of carbon and energy. 43317 pfam03391: Nepovirus coat protein, central domain. The members of this family are derived from nepoviruses. Together with comoviruses and picornaviruses, nepoviruses are classified in the picornavirus superfamily of plus strand single-stranded RNA viruses. This family aligns several nepovirus coat protein sequences. In several cases, this is found at the C-terminus of the RNA2-encoded viral polyprotein. The coat protein consists of three trapezoid-shaped beta-barrel domains, and forms a pseudo T = 3 icosahedral capsid structure. 43318 pfam03392: Insect pheromone-binding family, A10/OS-D. 43319 pfam03393: Pneumovirus matrix protein. 43320 pfam03394: Poxvirus E8 protein. 43321 pfam03395: Poxvirus P4A protein. 43322 pfam03396: Poxvirus DNA-directed RNA polymerase, 35 kD subunit. 43323 pfam03397: Rhabdovirus matrix protein. 43324 pfam03398: Eukaryotic protein of unknown function, DUF292. 43325 pfam03399: SAC3/GANP family. This family of eukaryotic proteins brings together the yeast nuclear export factor Sac3, and mammalian GANP/MCM3-associated proteins, which facilitate the nuclear localisation of MCM3, a protein that associates with chromatin in the G1 phase of the cell-cycle. 43326 pfam03400: IS1 transposase. Transposase proteins are necessary for efficient DNA transposition. This family represents bacterial IS1 transposases. 43327 pfam03401: Bordetella uptake gene (bug) product. These probable extra-cytoplasmic solute receptors are strongly overrepresented in several beta-proteobacteria. 43328 pfam03402: Vomeronasal organ pheromone receptor family, V1R. This family represents one of two known vomeronasal organ receptor families, the V1R family (after). 
43329 pfam03403: Platelet-activating factor acetylhydrolase, plasma/intracellular isoform II. Platelet-activating factor acetylhydrolase (PAF-AH) is a subfamily of phospholipases A2, responsible for inactivation of platelet-activating factor through cleavage of an acetyl group. Three known PAF-AHs are the brain heterotrimeric PAF-AH Ib, whose catalytic beta and gamma subunits are aligned in pfam02266, the extracellular, plasma PAF-AH (pPAF-AH), and the intracellular PAF-AH isoform II (PAF-AH II). This family aligns pPAF-AH and PAF-AH II, whose similarity was previously noted. 43330 pfam03404: Mo-co oxidoreductase dimerisation domain. This domain is found in molybdopterin cofactor (Mo-co) oxidoreductases. It is involved in dimer formation, and has an Ig-fold structure. 43331 pfam03405: Fatty acid desaturase. 43332 pfam03406: Phage tail fibre repeat. This repeat is found in the tail fibres of phage, for example in protein K. The repeats are about 40 residues long. 43333 pfam03407: Protein of unknown function (DUF271). This family of worm proteins has no known function. 43334 pfam03408: Foamy virus envelope protein. Expression of the envelope (Env) glycoprotein is essential for viral particle egress. This feature is unique to the Spumavirinae, a subclass of the Retroviridae. 43335 pfam03409: Protein of unknown function (DUF274). This family of worm proteins has no known function. 43336 pfam03410: Protein G1. Protein G1, named after the vaccinia virus protein, is a glycoprotein expressed by many Poxviridae. 43337 pfam03411: Penicillin-insensitive murein endopeptidase. 43338 pfam03412: Peptidase C39 family. Lantibiotic and non-lantibiotic bacteriocins are synthesised as precursor peptides containing N-terminal extensions (leader peptides) which are cleaved off during maturation. Most non-lantibiotics and also some lantibiotics have leader peptides of the so-called double-glycine type. 
These leader peptides share consensus sequences and also a common processing site with two conserved glycine residues in positions -1 and -2. The double-glycine-type leader peptides are unrelated to the N-terminal signal sequences which direct proteins across the cytoplasmic membrane via the sec pathway. Their processing sites are also different from typical signal peptidase cleavage sites, suggesting that a different processing enzyme is involved. Peptide bacteriocins are exported across the cytoplasmic membrane by a dedicated ATP-binding cassette (ABC) transporter. The ABC transporter is the maturation protease and its proteolytic domain resides in the N-terminal part of the protein. This peptidase domain is found in a wide range of ABC transporters; however, the presumed catalytic cysteine and histidine are not conserved in all members of this family. 43339 pfam03413: Thermolysin metallopeptidase, propeptide. 43340 pfam03414: Glycosyltransferase family 6. 43341 pfam03415: Clostripain family. 43342 pfam03416: Peptidase family C54. 43343 pfam03417: Acyl-coenzyme A:6-aminopenicillanic acid acyl-transferase. 43344 pfam03418: Germination protease. 43345 pfam03419: Sporulation factor SpoIIGA. 43346 pfam03420: Prohead core protein protease. 43347 pfam03421: Peptidase C55 family. 43348 pfam03422: Carbohydrate binding module (family 6). 43349 pfam03423: Carbohydrate binding domain (family 25). 43350 pfam03424: Carbohydrate binding domain (family 17/28). 43351 pfam03425: Carbohydrate binding domain (family 11). 43352 pfam03426: Carbohydrate binding domain (family 15). 43353 pfam03427: Carbohydrate binding domain (family 19). 43354 pfam03428: Replication protein C. Replication protein C is involved in the early stages of viral DNA replication. 43355 pfam03429: Major surface protein 1B. The major surface protein (MSP1) of the cattle pathogen Anaplasma is a heterodimer comprised of MSP1a and MSP1b. This family is the MSP1b chain. 
These MSP1 proteins are putative adhesins for bovine erythrocytes. 43356 pfam03430: Trans-activating transcriptional regulator. This family of trans-activating transcriptional regulators (TATR), also known as intermediate early protein 1, is common to the Nucleopolyhedroviruses. 43357 pfam03431: RNA replicase, beta-chain. This family is of Leviviridae RNA replicases. The replicase is also known as RNA dependent RNA polymerase. 43358 pfam03432: Relaxase/Mobilisation nuclease domain. Relaxases/mobilisation proteins are required for the horizontal transfer of genetic information contained on plasmids that occurs during bacterial conjugation. The relaxase, in conjunction with several auxiliary proteins, forms the relaxation complex or relaxosome. Relaxases nick duplex DNA in a specific manner by catalysing trans-esterification. 43359 pfam03433: EspA-like secreted protein. EspA is the prototypical member of this family. EspA, together with EspB, EspD and Tir, is exported by a type III secretion system. These proteins are essential for attaching and effacing lesion formation. EspA is a structural protein and a major component of a large, transiently expressed, filamentous surface organelle which forms a direct link between the bacterium and the host cell. 43360 pfam03434: DUF276. This family is specific to Borrelia burgdorferi. The protein is encoded on extra-chromosomal DNA. This domain has no known function. 43361 pfam03435: Saccharopine dehydrogenase. This family is comprised of three structural domains that cannot be separated in the linear sequence. In some organisms this enzyme is found as a bifunctional polypeptide with lysine ketoglutarate reductase. The saccharopine dehydrogenase can also function as a saccharopine reductase. 43362 pfam03436: Domain of unknown function (DUF281). This family of worm domains has no known function. The boundaries of the presumed domain are rather uncertain. 43363 pfam03437: BtpA family. 
The BtpA protein is tightly associated with the thylakoid membranes, where it stabilises the reaction centre proteins of photosystem I. 43364 pfam03438: Pneumovirus NS1 protein. This non-structural protein is one of two found in pneumoviruses. The protein is about 140 amino acids in length. The NS1 protein appears to be important for efficient replication but not essential. The NS1 protein has been shown by yeast two-hybrid to interact with the viral P protein. This protein is also known as the 1C protein. It has also been shown that NS1 can potently inhibit transcription and RNA replication. 43365 pfam03439: Supt5 repeat. This short region of similarity is found in two tandem copies in Supt5 proteins that are involved in chromatin regulation. The function of this region is unknown. 43366 pfam03440: Aerolysin/Pertussis toxin (APT) domain. This family represents the N-terminal domain of aerolysin and pertussis toxin and has a C-type lectin-like fold. 43367 pfam03441: FAD binding domain of DNA photolyase. 43368 pfam03442: Domain of unknown function. The structure of this module is known and consists of an Ig-like fold. The function of this domain is unknown, but it might be involved in mediating interaction with carbohydrate. 43369 pfam03443: Glycosyl hydrolase family 61. 43370 pfam03444: Domain of unknown function. This domain is always found with a pair of CBS domains pfam00571. This region may be distantly related to the HrcA proteins of prokaryotes (Bateman A pers. obs.). 43371 pfam03445: Putative nucleotidyltransferase DUF294. This domain is found associated with pfam00571. This region is uncharacterised, however it seems to be similar to pfam01909, conserving the DXD motif. This strongly suggests that members of this family are also nucleotidyltransferases (Bateman A pers. obs.). 43372 pfam03446: NAD binding domain of 6-phosphogluconate dehydrogenase. The NAD binding domain of 6-phosphogluconate dehydrogenase adopts a Rossmann fold. 
43373 pfam03447: Homoserine dehydrogenase, NAD binding domain. This domain adopts a Rossmann NAD binding fold. The C-terminal domain of homoserine dehydrogenase contributes a single helix to this structural domain, which is not included in the Pfam model. 43374 pfam03448: MgtE intracellular domain. This domain is found in eubacterial magnesium transporters of the MgtE family pfam01769. This region of similarity is presumed to be an intracellular domain that may be involved in magnesium binding. 43375 pfam03449: Prokaryotic transcription elongation factor, GreA/GreB, N-terminal domain. This domain adopts a long alpha-hairpin structure. 43376 pfam03450: CO dehydrogenase flavoprotein C-terminal domain. 43377 pfam03451: HELP motif. The HELP (Hydrophobic ELP) motif is found in EMAP and EMAP-like proteins (ELPs). The HELP motif contains a predicted transmembrane helix so probably does not form a globular domain. It is also not clear if these proteins localise to membranes. A preliminary study has shown that the N terminus of sea urchin EMAP containing HELP is sufficient for microtubule binding in vitro (Eichenmuller et al., in press). 43378 pfam03452: Anp1. The members of this family (Anp1, Van1 and Mnn9) are membrane proteins required for proper Golgi function. These proteins co-localise within the cis Golgi, and they are physically associated in two distinct complexes. 43379 pfam03453: MoeA N-terminal region (domain I and II). This family contains two structural domains. One of these contains the conserved DGXA motif. This region is found in proteins involved in biosynthesis of molybdopterin cofactor; however, the exact molecular function of this region is uncertain. 43380 pfam03454: MoeA C-terminal region (domain IV). This domain is found in proteins involved in biosynthesis of molybdopterin cofactor; however, the exact molecular function of this domain is uncertain. The structure of this domain is known and forms an incomplete beta barrel. 
43381 pfam03455: dDENN domain. This region is always found associated with pfam02141. It is predicted to form a globular domain. This domain is predicted to be completely alpha helical. Although not statistically supported it has been suggested that this domain may be similar to members of the Rho/Rac/Cdc42 GEF family. 43382 pfam03456: uDENN domain. This region is always found associated with pfam02141. It is predicted to form an all beta domain. 43383 pfam03457: Helicase associated domain. This short domain is found in multiple copies in bacterial helicase proteins. The domain is predicted to contain 3 alpha helices. The function of this domain may be to bind nucleic acid. 43384 pfam03458: UPF0126 domain. Domain always found as a pair in bacterial membrane proteins of unknown function. This domain contains three transmembrane helices. The conserved glycines are suggestive of an ion channel (C. Yeats unpublished obs.). 43385 pfam03459: TOBE domain. The TOBE domain (Transport-associated OB) always occurs as a dimer as the C-terminal strand of each domain is supplied by the partner. Probably involved in the recognition of small ligands such as molybdenum and sulfate. Found in ABC transporters immediately after the ATPase domain. 43386 pfam03460: Nitrite/Sulfite reductase ferredoxin-like half domain. Sulfite and Nitrite reductases are key to both biosynthetic assimilation of sulfur and nitrogen and dissimilation of oxidised anions for energy transduction. Two copies of this repeat are found in Nitrite and Sulfite reductases and form a single structural domain. 43387 pfam03461: TRCF domain. 43388 pfam03462: PCRF domain. This domain is found in peptide chain release factors. 43389 pfam03463: eRF1 domain 1. The release factor eRF1 terminates protein biosynthesis by recognising stop codons at the A site of the ribosome and stimulating peptidyl-tRNA bond hydrolysis at the peptidyl transferase centre. The crystal structure of human eRF1 is known. 
The overall shape and dimensions of eRF1 resemble a tRNA molecule with domains 1, 2, and 3 of eRF1 corresponding to the anticodon loop, aminoacyl acceptor stem, and T stem of a tRNA molecule, respectively. The position of the essential GGQ motif at an exposed tip of domain 2 suggests that the Gln residue coordinates a water molecule to mediate the hydrolytic activity at the peptidyl transferase centre. A conserved groove on domain 1, 80 A from the GGQ motif, is proposed to form the codon recognition site. This family also includes other proteins for which the precise molecular function is unknown. Many of them are from Archaebacteria. These proteins may also be involved in translation termination but this awaits experimental verification. 43390 pfam03464: eRF1 domain 2. The release factor eRF1 terminates protein biosynthesis by recognising stop codons at the A site of the ribosome and stimulating peptidyl-tRNA bond hydrolysis at the peptidyl transferase centre. The crystal structure of human eRF1 is known. The overall shape and dimensions of eRF1 resemble a tRNA molecule with domains 1, 2, and 3 of eRF1 corresponding to the anticodon loop, aminoacyl acceptor stem, and T stem of a tRNA molecule, respectively. The position of the essential GGQ motif at an exposed tip of domain 2 suggests that the Gln residue coordinates a water molecule to mediate the hydrolytic activity at the peptidyl transferase centre. A conserved groove on domain 1, 80 A from the GGQ motif, is proposed to form the codon recognition site. This family also includes other proteins for which the precise molecular function is unknown. Many of them are from Archaebacteria. These proteins may also be involved in translation termination but this awaits experimental verification. 43391 pfam03465: eRF1 domain 3. The release factor eRF1 terminates protein biosynthesis by recognising stop codons at the A site of the ribosome and stimulating peptidyl-tRNA bond hydrolysis at the peptidyl transferase centre. 
The crystal structure of human eRF1 is known. The overall shape and dimensions of eRF1 resemble a tRNA molecule with domains 1, 2, and 3 of eRF1 corresponding to the anticodon loop, aminoacyl acceptor stem, and T stem of a tRNA molecule, respectively. The position of the essential GGQ motif at an exposed tip of domain 2 suggests that the Gln residue coordinates a water molecule to mediate the hydrolytic activity at the peptidyl transferase centre. A conserved groove on domain 1, 80 A from the GGQ motif, is proposed to form the codon recognition site. This family also includes other proteins for which the precise molecular function is unknown. Many of them are from Archaebacteria. These proteins may also be involved in translation termination but this awaits experimental verification. 43392 pfam03466: LysR substrate binding domain. The structure of this domain is known and is similar to the periplasmic binding proteins. 43393 pfam03467: Smg-4/UPF3 family. This family contains proteins that are involved in nonsense mediated mRNA decay, a process that is triggered by premature stop codons in mRNA. The family includes Smg-4 and UPF3. 43394 pfam03468: XS domain. The XS (rice gene X and SGS3) domain is found in a family of plant proteins including gene X and SGS3. SGS3 is thought to be involved in post-transcriptional gene silencing (PTGS). This domain contains a conserved aspartate residue that may be functionally important. 43395 pfam03469: XH domain. The XH (rice gene X Homology) domain is found in a family of plant proteins including gene X. The molecular function of these proteins is unknown. However these proteins usually contain an XS domain that is also found in the PTGS protein SGS3. This domain contains a conserved glutamate residue that may be functionally important. 43396 pfam03470: XS zinc finger domain. This domain is a putative nucleic acid binding zinc finger found in proteins that also contain an XS domain. 43397 pfam03471: Transporter associated domain. 
This small domain is found at the C-terminus of a family of proteins that also contain the pfam01595 domain and two CBS domains; it is also found at the C-terminus of some Na+/H+ antiporters. This domain is also found in CorC, which is involved in magnesium and cobalt efflux. The function of this domain is uncertain, but it might be involved in modulating transport of ion substrates. 43398 pfam03472: Autoinducer binding domain. This domain is found in a large family of transcriptional regulators. This domain specifically binds to autoinducer molecules. 43399 pfam03473: MOSC domain. The MOSC (MOCO sulfurase C-terminal) domain is a superfamily of beta-strand-rich domains identified in the molybdenum cofactor sulfurase and several other proteins from both prokaryotes and eukaryotes. These MOSC domains contain an absolutely conserved cysteine and occur either as stand-alone forms, or fused to other domains such as NifS-like catalytic domain in Molybdenum cofactor sulfurase. The MOSC domain is predicted to be a sulfur-carrier domain that receives sulfur abstracted by the pyridoxal phosphate-dependent NifS-like enzymes, on its conserved cysteine, and delivers it for the formation of diverse sulfur-metal clusters. 43400 pfam03474: DMRTA motif. This region is found to the C-terminus of the pfam00751. DM-domain proteins with this motif are known as DMRTA proteins. The function of this region is unknown. 43401 pfam03475: 3-alpha domain. This small triple helical domain has been predicted to assume a topology similar to helix-turn-helix domains. These domains are found at the C-terminus of proteins related to E. coli yiiM. 43402 pfam03476: MOSC N-terminal beta barrel domain. This domain is found to the N-terminus of pfam03473. The function of this domain is unknown, however it is predicted to adopt a beta barrel fold. 43403 pfam03477: ATP cone domain. 43404 pfam03478: Protein of unknown function (DUF295). This family of proteins is found in plants. 
The function of the proteins is unknown. 43405 pfam03479: Domain of unknown function (DUF296). This putative domain is found in proteins that contain AT-hook motifs pfam02178, which strongly suggests a DNA-binding function for the proteins as a whole, however the function of this domain is unknown. 43406 pfam03480: Bacterial extracellular solute-binding protein, family 7. This family of proteins are involved in binding extracellular solutes for transport across the bacterial cytoplasmic membrane. This family includes a C4-dicarboxylate-binding protein. 43407 pfam03481: SUA5 domain. The function of this domain is unknown, it is found in yeast SUA5 and its relatives. It is found to the C-terminus of pfam01300. 43408 pfam03482: sic protein. Serotype M1 group A Streptococcus strains cause epidemic waves of human infections. This family includes the sic protein an extracellular protein (streptococcal inhibitor of complement) that inhibits human complement. 43409 pfam03483: B3/4 domain. This domain is found in tRNA synthetase beta subunits as well as in some non tRNA synthetase proteins. 43410 pfam03484: tRNA synthetase B5 domain. This domain is found in phenylalanine-tRNA synthetase beta subunits. 43411 pfam03485: Arginyl tRNA synthetase N terminal domain. This domain is found at the amino terminus of Arginyl tRNA synthetase, also called additional domain 1 (Add-1). It is about 140 residues long and it has been suggested that this domain will be involved in tRNA recognition. 43412 pfam03486: HI0933-like protein. 43413 pfam03487: Interleukin-13. 43414 pfam03488: Nematode insulin-related peptide beta type. 43415 pfam03489: Saposin-like type B, region 2. 43416 pfam03490: Variant-surface-glycoprotein phospholipase C. 43417 pfam03491: Serotonin (5-HT) neurotransmitter transporter, N-terminus. 43418 pfam03492: SAM dependent carboxyl methyltransferase. 
This family of plant methyltransferases contains enzymes that act on a variety of substrates including salicylic acid, jasmonic acid and 7-methylxanthine. Caffeine is synthesised through sequential three-step methylation of xanthine derivatives at positions 7-N, 3-N, and 1-N. The protein 7-methylxanthine methyltransferase (designated as CaMXMT) catalyses the second step to produce theobromine. 43419 pfam03493: Calcium-activated BK potassium channel alpha subunit. 43420 pfam03494: Beta-amyloid peptide (beta-APP). 43421 pfam03495: Clostridial binary toxin B/anthrax toxin PA. 43422 pfam03496: Clostridial binary toxin A. 43423 pfam03497: Anthrax toxin LF subunit. 43424 pfam03498: Cytolethal distending toxin A. 43425 pfam03499: Cytolethal distending toxin C. 43426 pfam03500: Cellulose synthase subunit D. 43427 pfam03501: Plectin/S10 domain. This presumed domain is found at the N-terminus of some isoforms of the cytoskeletal muscle protein plectin as well as the ribosomal S10 protein. This domain may be involved in RNA binding. 43428 pfam03502: Nucleoside-specific channel-forming protein, Tsx. 43429 pfam03503: Chlamydia cysteine-rich outer membrane protein 3. 43430 pfam03504: Chlamydia cysteine-rich outer membrane protein 6. 43431 pfam03505: Clostridium enterotoxin. 43432 pfam03506: Influenza C non-structural protein (NS1). The influenza C virus genome consists of seven single-stranded RNA segments. The shortest RNA segment encodes a 286 amino acid non-structural protein NS1. This protein contains 6 conserved cysteines that may be functionally important, perhaps binding to a metal ion. 43433 pfam03507: CagA exotoxin. 43434 pfam03508: Gap junction alpha-1 protein (Cx43). 43435 pfam03509: Gap junction alpha-8 protein (Cx50). 43436 pfam03510: 2C endopeptidase (C24) cysteine protease family. 43437 pfam03511: Fanconi anaemia group A protein. 43438 pfam03512: Glycosyl hydrolase family 52. 43439 pfam03513: Cloacin immunity protein. 
43440 pfam03514: GRAS family transcription factor. Sequence analysis of the products of the GRAS (GAI, RGA, SCR) gene family indicates that they share a variable amino-terminus and a highly conserved carboxyl-terminus that contains five recognisable motifs. Proteins in the GRAS family are transcription factors that seem to be involved in development and other processes. Mutation of the SCARECROW (SCR) gene results in a radial pattern defect, loss of a ground tissue layer, in the root. The PAT1 protein is involved in phytochrome A signal transduction. 43441 pfam03515: Cloacin. 43442 pfam03516: Filaggrin. 43443 pfam03517: Nucleotide-sensitive chloride conductance regulator (ICln). 43444 pfam03518: Salmonella/Shigella invasin protein B. 43445 pfam03519: Invasion protein B family. 43446 pfam03520: KCNQ voltage-gated potassium channel. This family matches to the C-terminal tail of KCNQ type potassium channels. 43447 pfam03521: Kv2 voltage-gated K+ channel. 43448 pfam03522: K-Cl Co-transporter type 1 (KCC1). 43449 pfam03523: Macrophage scavenger receptor. 43450 pfam03524: Conjugal transfer protein. This family includes type IV secretion system CagX conjugation protein. Other members of this family are involved in conjugal transfer to plant cells of T-DNA. 43451 pfam03525: Meiotic recombination protein rec114. 43452 pfam03526: Colicin E1 (microcin) immunity protein. 43453 pfam03527: RHS protein. 43454 pfam03528: Rabaptin. 43455 pfam03529: Otx1 transcription factor. 43456 pfam03530: Calcium-activated SK potassium channel. 43457 pfam03531: Structure-specific recognition protein. 43458 pfam03532: OMS28 porin. 43459 pfam03533: SPO11 homologue. 43460 pfam03534: Salmonella virulence plasmid 65kDa B protein. 43461 pfam03535: Paxillin family. 43462 pfam03536: Salmonella virulence-associated 28kDa protein. 43463 pfam03537: TM1410 hypothetical-related protein. 43464 pfam03538: Salmonella virulence plasmid 28.1kDa A protein. 43465 pfam03539: Spumavirus aspartic protease (A9). 
43466 pfam03540: Transcription initiation factor TFIID 23-30kDa subunit. 43467 pfam03541: Huntingtin. 43468 pfam03542: Tuberin. Tuberous sclerosis complex (TSC) is an autosomal dominant disorder and is characterized by the presence of hamartomas in many organs, such as brain, skin, heart, lung, and kidney. It is caused by mutation of either the TSC1 or TSC2 tumour suppressor gene. The TSC2 gene codes for tuberin, which interacts with hamartin (pfam04388); hamartin contains two coiled-coil regions, which have been shown to mediate binding to tuberin. These two proteins function within the same pathway(s) regulating cell cycle, cell growth, adhesion, and vesicular trafficking. 43469 pfam03543: Yersinia/Haemophilus virulence surface antigen. 43470 pfam03544: Gram-negative bacterial tonB protein. 43471 pfam03545: Yersinia virulence determinant (YopE). 43472 pfam03546: Treacher Collins syndrome protein Treacle. 43473 pfam03547: Auxin Efflux Carrier. This family of transporters is found in all domains of life. 43474 pfam03548: Outer membrane lipoprotein carrier protein LolA. 43475 pfam03549: Translocated intimin receptor (Tir). 43476 pfam03550: Outer membrane lipoprotein LolB. 43477 pfam03551: Transcriptional regulator PadR-like family. Members of this family are transcriptional regulators that appear to be related to the pfam01047 family. This family includes PadR, a protein that is involved in negative regulation of phenolic acid metabolism. 43478 pfam03552: Cellulose synthase. Cellulose, an aggregate of unbranched polymers of beta-1,4-linked glucose residues, is the major component of wood and thus paper, and is synthesised by plants, most algae, some bacteria and fungi, and even some animals. The genes that synthesise cellulose in higher plants differ greatly from the well-characterized genes found in Acetobacter and Agrobacterium sp. 
More correctly designated as 'cellulose synthase catalytic subunits', plant cellulose synthase (CesA) proteins are integral membrane proteins, approximately 1,000 amino acids in length. There are a number of highly conserved residues, including several motifs shown to be necessary for processive glycosyltransferase activity. 43479 pfam03553: Na+/H+ antiporter family. This family includes integral membrane proteins, some of which are Na+/H+ antiporters. 43480 pfam03554: UL73 viral envelope glycoprotein. This family groups together the viral proteins BLRF1, U46, 53, and UL73. The UL73-like envelope glycoproteins, which associate in a high molecular mass complex with their counterpart, gM, induce neutralising antibody responses in the host. These glycoproteins are highly polymorphic, particularly in the N-terminal region. 43481 pfam03555: Influenza C non-structural protein (NS2). The influenza C virus genome consists of seven single-stranded RNA segments. The shortest RNA segment encodes a 286 amino acid non-structural protein NS1 pfam03506 as well as the NS2 protein. The NS2 protein is only about 60 amino acids in length and of unknown function. 43482 pfam03556: Domain of unknown function (DUF298). Members of this family contain a basic helix-loop-helix leucine zipper motif. 43483 pfam03557: Bunyavirus glycoprotein G1. Bunyavirus has three genomic segments: small (S), middle-sized (M), and large (L). The S segment encodes the nucleocapsid and a non-structural protein. The M segment codes for two glycoproteins, G1 and G2, and another non-structural protein (NSm). The L segment codes for an RNA polymerase. This family contains the G1 glycoprotein which is the viral attachment protein. 43484 pfam03558: TBSV core protein P21/P22. This protein is required for cell-to-cell movement in plants. Furthermore, the membrane-associated protein is dispensable for both replication and transcription. 43485 pfam03559: NDP-hexose 2,3-dehydratase. 
This family includes a range of proteins from antibiotic production pathways. The family includes the gra-ORF27 product that probably functions at an early step, most likely as a dTDP-4-keto-6-deoxyglucose-2,3-dehydratase. Its homologues include dnmT from the daunorubicin biosynthetic gene cluster in S. peucetius, a similar gene from the daunomycin biosynthetic cluster in Streptomyces sp. strain C5, eryBVI from the erythromycin cluster in S. erythraea and snoH from the nogalamycin cluster in S. nogalater. The proteins in this family are composed of two copies of a 200 amino acid long unit that may be a structural domain. 43486 pfam03561: Allantoicase repeat. This family is found in pairs in Allantoicases, forming the majority of the protein. These proteins allow the use of purines as secondary nitrogen sources in nitrogen-limiting conditions through the reaction: allantoate + H(2)O = (-)-ureidoglycolate + urea. 43487 pfam03562: MltA N-terminal domain. This presumed domain is found in MltA, a murein-degrading transglycosylase enzyme. It is not clear if this or pfam06725 is the catalytic portion of the protein. 43488 pfam03563: Bunyavirus glycoprotein G2. Bunyavirus has three genomic segments: small (S), middle-sized (M), and large (L). The S segment encodes the nucleocapsid and a non-structural protein. The M segment codes for two glycoproteins, G1 and G2, and another non-structural protein (NSm). The L segment codes for an RNA polymerase. This family contains the G2 glycoprotein which interacts with the pfam03557 G1 glycoprotein. 43489 pfam03564: Peptidase family A16. 43490 pfam03565: Glutaminyl cyclase. Glutaminyl cyclase catalyses the formation of the pyroglutamyl residue present at the amino terminus of numerous secretory peptides and proteins. Glutaminyl cyclase possesses a zinc aminopeptidase domain in which the four functionally important histidines form the active site. 
It is hypothesised that mammalian glutaminyl cyclases may have structural and catalytic similarities to bacterial zinc aminopeptidases. 43491 pfam03566: Peptidase family A21. 43492 pfam03567: Sulfotransferase. Chondroitin 6-sulfotransferase catalyses the transfer of sulfate to position 6 of the N-acetylgalactosamine residue of chondroitin. 43493 pfam03568: Peptidase family C50. 43494 pfam03569: Peptidase family C8. 43495 pfam03571: Peptidase family M49. 43496 pfam03572: Peptidase family S41B. 43497 pfam03573: outer membrane porin, OprD family. This family includes outer membrane proteins related to OprD. OprD has been described as a serine type peptidase. However the proposed catalytic residues are not conserved suggesting that many of these proteins are not peptidases. 43498 pfam03574: Peptidase family S48. 43499 pfam03575: Peptidase family S51. 43500 pfam03576: Peptidase family T4. 43501 pfam03577: Peptidase family U34. 43502 pfam03578: HGWP repeat. This short (30 amino acids) repeat is found in a number of plant proteins. It contains a conserved HGWP motif, hence its name. The function of these proteins is unknown. 43503 pfam03579: Small hydrophobic protein. The small hydrophobic integral membrane protein, SH (previously designated 1A) is found to have a variety of glycosylated forms. This protein is a component of the mature virion. 43504 pfam03580: Herpesvirus UL14-like protein. This is a family of Herpesvirus proteins including UL14. UL14 protein is a minor component of the virion tegument and is expressed late in infection. UL14 protein can influence the intracellular localisation patterns of a number of proteins belonging to the capsid or the DNA encapsidation machinery. 43505 pfam03581: Herpesvirus UL33-like protein. This is a family of Herpesvirus proteins including UL33 and UL51. The proteins in this family are involved in packaging viral DNA. 43506 pfam03582: Herpesvirus UL34-like protein. This family includes the UL34 protein from herpesviruses. 
The UL34 gene product is a membrane protein exclusively phosphorylated by the U(S)3 protein kinase. This protein forms a complex with pfam02718. 43507 pfam03583: Secretory lipase. These lipases are expressed and secreted during the infection cycle of these pathogens. In particular, C. albicans has a large number of different lipases, possibly reflecting broad lipolytic activity, which may contribute to the persistence and virulence of C. albicans in human tissue. 43508 pfam03584: Herpesvirus ICP4-like protein N-terminal region. The immediate-early protein ICP4 (infected-cell polypeptide 4) is required for efficient transcription of early and late viral genes and is thus essential for productive infection. ICP4 is a large phosphoprotein that binds DNA in a sequence specific manner as a homodimer. ICP4 represses transcription from LAT, ICP4 and ORF-P, which have a high-affinity ICP4 binding site that spans the transcription initiation site. ICP4 proteins have two highly conserved regions; this family contains the N-terminal region that contains sites for DNA binding and homodimerisation. 43509 pfam03585: Herpesvirus ICP4-like protein C-terminal region. The immediate-early protein ICP4 (infected-cell polypeptide 4) is required for efficient transcription of early and late viral genes and is thus essential for productive infection. ICP4 is a large phosphoprotein that binds DNA in a sequence specific manner as a homodimer. ICP4 represses transcription from LAT, ICP4 and ORF-P, which have a high-affinity ICP4 binding site that spans the transcription initiation site. ICP4 proteins have two highly conserved regions; this family contains the C-terminal region that probably acts as an enhancer for the N-terminal region. 43510 pfam03586: Herpesvirus UL36 tegument protein. The UL36 open reading frame (ORF) encodes the largest herpes simplex virus type 1 (HSV-1) protein, a 270-kDa polypeptide designated VP1/2, which is also a component of the virion tegument. 
A null mutation in the UL36 gene of herpes simplex virus type 1 results in accumulation of unenveloped DNA-filled capsids in the cytoplasm of infected cells. This family only covers a small central part of this large protein. 43511 pfam03587: Suppressor Mra1. The suppressor Mra1, found in high copy number when Ras1 is mutated, recovers the mating deficiency caused by the decrease of Ras1 activity. Mutational analysis in yeast suggests that the suppressor Mra1 is essential for cell growth and promotes mating. Members of this family are essential for 40S ribosomal biogenesis as they interfere with a methylation reaction during the early stages of pre-rRNA processing necessary for the generation of the ribosomal subunits. 43512 pfam03588: Leucyl/phenylalanyl-tRNA protein transferase. 43513 pfam03589: Antitermination protein. 43514 pfam03590: Aspartate-ammonia ligase. 43515 pfam03591: AzlC protein. 43516 pfam03592: Terminase small subunit. Packaging of double-stranded viral DNA concatemers requires interaction of the prohead with virus DNA. This process is mediated by a phage-encoded DNA recognition and terminase protein. The terminase enzymes described so far, which are hetero-oligomers composed of a small and a large subunit, do not have a significant level of sequence homology. The small terminase subunit is thought to form a nucleoprotein structure that helps to position the terminase large subunit at the packaging initiation site. 43517 pfam03593: Nuclear movement protein. The nuclear movement protein, or NudC, was first identified as a nuclear distribution (nud) gene that regulates nuclear movement in the filamentous fungus. The mammalian homologue of NudC interacts with Lis1, a neuronal migration protein important during neocorticogenesis. Nuclear movement and neuronal migration are thought to use a common mechanism. 43518 pfam03594: Benzoate membrane transport protein. 43519 pfam03595: C4-dicarboxylate transporter/malic acid transport protein. 
43520 pfam03596: Cadmium resistance transporter. 43521 pfam03597: Cytochrome oxidase maturation protein cbb3-type. 43522 pfam03598: CO dehydrogenase/acetyl-CoA synthase complex beta subunit. 43523 pfam03599: CO dehydrogenase/acetyl-CoA synthase delta subunit. 43524 pfam03600: Citrate transporter. 43525 pfam03601: Conserved hypothetical protein 698. 43526 pfam03602: Conserved hypothetical protein 95. 43527 pfam03603: DNA polymerase III psi subunit. 43528 pfam03604: DNA directed RNA polymerase, 7 kDa subunit. 43529 pfam03605: Anaerobic c4-dicarboxylate membrane transporter. 43530 pfam03606: C4-dicarboxylate anaerobic carrier. 43531 pfam03607: Doublecortin. 43532 pfam03608: PTS system enzyme II sorbitol-specific factor. 43533 pfam03609: PTS system sorbose-specific iic component. 43534 pfam03610: PTS system fructose IIA component. 43535 pfam03611: PTS system Galactitol-specific IIC component. 43536 pfam03612: Sorbitol phosphotransferase enzyme II. 43537 pfam03613: PTS system mannose/fructose/sorbose family IID component. 43538 pfam03614: Repressor of phase-1 flagellin. 43539 pfam03615: GCM motif protein. 43540 pfam03616: Sodium/glutamate symporter. 43541 pfam03617: IBV 3A protein. The gene product of gene 3 from Avian infectious bronchitis virus. Currently, the function of this protein remains unknown. 43542 pfam03618: Domain of unknown function (DUF299). Family of bacterial proteins with no known function. 43543 pfam03619: Domain of unknown function. 43544 pfam03620: IBV 3C protein. Product of ORF 3C from Avian infectious bronchitis virus (IBV). Currently, the function of this protein remains unknown. 43545 pfam03621: MbtH-like protein. This domain is found in the MbtH protein, as well as at the N terminus of the antibiotic synthesis protein NIKP1. This domain is about 70 amino acids long and contains 3 fully conserved tryptophan residues. Many of the members of this family are found in known antibiotic synthesis gene clusters. 43546 pfam03622: IBV 3B protein. 
Product of ORF 3B from Avian infectious bronchitis virus (IBV). Currently, the function of this protein remains unknown. 43547 pfam03623: Focal adhesion targeting region. Focal adhesion kinase (FAK) is a tyrosine kinase found in focal adhesions, intracellular signaling complexes that are formed following engagement of the extracellular matrix by integrins. The C-terminal 'focal adhesion targeting' (FAT) region is necessary and sufficient for localising FAK to focal adhesions. The crystal structure of FAT shows it forms a four-helix bundle that resembles those found in two other proteins involved in cell adhesion, alpha-catenin and vinculin. The binding of FAT to the focal adhesion protein, paxillin, requires the integrity of the helical bundle, whereas binding to another focal adhesion protein, talin, does not. 43548 pfam03625: Domain of unknown function DUF302. Domain is found in an undescribed set of proteins. Normally occurs uniquely within a sequence, but is also found as a tandem repeat. Shows interesting phylogenetic distribution with the majority of examples in bacteria and archaea, but also in D. melanogaster. 43549 pfam03626: Prokaryotic Cytochrome C oxidase subunit IV. Cytochrome c oxidase (COX) is a multi-subunit enzyme complex that catalyses the final step of electron transfer through the respiratory chain on the mitochondrial inner membrane. This family is composed of cytochrome c oxidase subunit 4 from prokaryotes. 43550 pfam03627: PapG carbohydrate binding domain. PapG, the adhesin of the P-pili, is situated at the tip and is only a minor component of the whole pilus structure. A two-domain structure has been postulated for PapG; a carbohydrate binding N-terminus (this domain) and chaperone binding C-terminus. The carbohydrate-binding domain interacts with the receptor glycan. 43551 pfam03628: PapG chaperone-binding domain. PapG, the adhesin of the P-pili, is situated at the tip and is only a minor component of the whole pilus structure. 
A two-domain structure has been postulated for PapG; a carbohydrate binding N-terminus and chaperone binding C-terminus (this domain). The chaperone-binding domain is highly conserved, and is essential for the correct assembly of the pili structure when aided by the chaperone molecule PapD. 43552 pfam03629: Domain of unknown function (DUF303). Distribution of this domain seems limited to prokaryotes and viruses. . 43553 pfam03630: Fumble. Fumble is required for cell division in Drosophila. Mutants lacking fumble exhibit abnormalities in bipolar spindle organisation, chromosome segregation, and contractile ring formation. Analyses have demonstrated that fumble encodes three protein isoforms, all of which contain a domain with high similarity to the pantothenate kinases of A. nidulans and mouse. A role of fumble in membrane synthesis has been proposed. 43554 pfam03631: Ribonuclease BN-like family. This family contains integral membrane proteins with 5 to 6 predicted transmembrane spans. The family includes ribonuclease BN that is involved in tRNA maturation. This family of proteins does not seem to contain any completely conserved polar residues that would be expected in a nuclease enzyme, suggesting that many members of this family may not have this catalytic activity. 43555 pfam03632: Glycosyl hydrolase family 65 central catalytic domain. This family of glycosyl hydrolases contains vacuolar acid trehalase and maltose phosphorylase. Maltose phosphorylase (MP) is a dimeric enzyme that catalyses the conversion of maltose and inorganic phosphate into beta-D-glucose-1-phosphate and glucose. The central domain is the catalytic domain, which binds a phosphate ion that is proximal to the highly conserved Glu. The arrangement of the phosphate and the glutamate is thought to cause nucleophilic attack on the anomeric carbon atom. The catalytic domain also forms the majority of the dimerisation interface. 43556 pfam03633: Glycosyl hydrolase family 65, C-terminal domain.
This family of glycosyl hydrolases contains vacuolar acid trehalase and maltose phosphorylase. Maltose phosphorylase (MP) is a dimeric enzyme that catalyses the conversion of maltose and inorganic phosphate into beta-D-glucose-1-phosphate and glucose. The C-terminal domain forms a two layered jelly roll motif. This domain is situated at the base of the catalytic domain, however its function remains unknown. 43557 pfam03634: TCP family transcription factor. The cycloidea (cyc) and teosinte branched 1 (tb1) genes code for structurally related proteins implicated in the evolution of key morphological traits. However, the biochemical function of CYC and TB1 proteins remains to be demonstrated. One of the conserved regions is predicted to form a non-canonical basic-Helix-Loop-Helix (bHLH) structure. This domain is also found in two rice DNA-binding proteins, PCF1 and PCF2, where it has been shown to be involved in DNA-binding and dimerisation. This indicates a new family of transcription factors, which we have termed the TCP family after its first characterised members (TB1, CYC and PCFs).. 43558 pfam03635: Vacuolar protein sorting-associated protein 35. Vacuolar protein sorting-associated protein (Vps) 35 is one of around 50 proteins involved in protein trafficking. In particular, Vps35 assembles into a retromer complex with at least four other proteins Vps5, Vps17, Vps26 and Vps29. Vps35 contains a central region of weaker sequence similarity, thought to indicate the presence of at least three domains. 43559 pfam03636: Glycosyl hydrolase family 65, N-terminal domain. This family of glycosyl hydrolases contains vacuolar acid trehalase and maltose phosphorylase. Maltose phosphorylase (MP) is a dimeric enzyme that catalyses the conversion of maltose and inorganic phosphate into beta-D-glucose-1-phosphate and glucose. This domain is believed to be essential for catalytic activity, although its precise function remains unknown. 43560 pfam03637: Mob1/phocein family.
Mob1 is an essential Saccharomyces cerevisiae protein, identified from a two-hybrid screen, that binds Mps1p, a protein kinase essential for spindle pole body duplication and mitotic checkpoint regulation. Mob1 contains no known structural motifs; however MOB1 is a member of a conserved gene family and shares sequence similarity with a nonessential yeast gene, MOB2. Mob1 is a phosphoprotein in vivo and a substrate for the Mps1p kinase in vitro. Conditional alleles of MOB1 cause a late nuclear division arrest at restrictive temperature. This family also includes phocein, a rat protein that by yeast two hybrid interacts with striatin. 43561 pfam03638: Tesmin/TSO1-like CXC domain. This family includes proteins that have two copies of a cysteine rich motif as follows: C-X-C-X4-C-X3-YC-X-C-X6-C-X3-C-X-C-X2-C. The family includes Tesmin and TSO1. This family is called a CXC domain in. 43562 pfam03639: Glycosyl hydrolase family 81. Family of eukaryotic beta-1,3-glucanases. 43563 pfam03640: Secreted repeat of unknown function. This family occurs as tandem repeats in a set of lipoproteins. The alignment contains a Y-X4-D motif. 43564 pfam03641: Possible lysine decarboxylase. The members of this family share a highly conserved motif PGGXGTXXE that is probably functionally important. This family includes proteins annotated as lysine decarboxylases, although the evidence for this is not clear. 43565 pfam03642: MAP domain. This presumed 110 amino acid residue domain is found in multiple copies in MAP (MHC class II analogue protein). The protein has been found to bind a wide range of extracellular matrix proteins. 43566 pfam03643: Vacuolar protein sorting-associated protein 26. Vacuolar protein sorting-associated protein (Vps) 26 is one of around 50 proteins involved in protein trafficking. In particular, Vps26 assembles into a retromer complex with at least four other proteins Vps5, Vps17, Vps29 and Vps35. This family also contains Down syndrome critical region 3/A.
43567 pfam03644: Glycosyl hydrolase family 85. Family of endo-beta-N-acetylglucosaminidases. These enzymes work on a broad spectrum of substrates. 43568 pfam03645: Tctex-1 family. Tctex-1 is a dynein light chain. It has been shown that Tctex-1 can bind to the cytoplasmic tail of rhodopsin. C-terminal rhodopsin mutations responsible for retinitis pigmentosa inhibit this interaction. . 43569 pfam03646: FlaG protein. Although important for flagella the exact function of this protein is unknown. 43570 pfam03647: Uncharacterized protein family (UPF0136). This family of short membrane proteins are as yet uncharacterized. 43571 pfam03648: Glycosyl hydrolase family 67. Family of alpha-glucuronidase. Deletion mutants have indicated that the central region is responsible for the catalytic activity. Within this central domain, the invariant Glu and Asp (residues 391 and 364 respectively from B. stearothermophilus) are thought to form the catalytic centre. 43572 pfam03649: Uncharacterised protein family (UPF0014).. 43573 pfam03650: Uncharacterised protein family (UPF0041).. 43574 pfam03651: Partner of SLD five, PSF1. The GINS complex is essential for initiation of DNA replication in Xenopus egg extracts. This 100 kD stable complex includes Sld5, Psf1, Psf2, and Psf3. Homologues of these components are found also in yeasts and in humans. 43575 pfam03652: Uncharacterised protein family (UPF0081).. 43576 pfam03653: Uncharacterized protein family (UPF0093).. 43577 pfam03654: Aromatic-Rich Protein Family. This family may be related to polyketide synthases (pfam03364).. 43578 pfam03656: Uncharacterized protein family (UPF0108).. 43579 pfam03657: Uncharacterised protein family (UPF0113).. 43580 pfam03658: Uncharacterized protein family (UPF0125).. 43581 pfam03659: Glycosyl hydrolase family 71. Family of alpha-1,3-glucanases. 43582 pfam03660: PHF5-like protein. This family of proteins belongs to the superfamily of PHD-finger proteins.
At least one example, from mouse, may act as a chromatin-associated protein. 43583 pfam03661: Uncharacterized protein family (UPF0121). Uncharacterized integral membrane protein family. 43584 pfam03662: Glycosyl hydrolase family 79, N-terminal domain. Family of endo-beta-N-glucuronidase, or heparanase. Heparan sulfate proteoglycans (HSPGs) play a key role in the self-assembly, insolubility and barrier properties of basement membranes and extracellular matrices. Hence, cleavage of heparan sulfate (HS) affects the integrity and functional state of tissues and thereby fundamental normal and pathological phenomena involving cell migration and response to changes in the extracellular micro-environment. Heparanase degrades HS at specific intra-chain sites. The enzyme is synthesised as a latent approximately 65 kDa protein that is processed at the N-terminus into a highly active approximately 50 kDa form. Experimental evidence suggests that heparanase may facilitate both tumour cell invasion and neovascularization, both critical steps in cancer progression. The enzyme is also involved in cell migration associated with inflammation and autoimmunity. 43585 pfam03663: Glycosyl hydrolase family 76. Family of alpha-1,6-mannanases. 43586 pfam03664: Glycosyl hydrolase family 62. Family of alpha-L-arabinofuranosidase (EC 3.2.1.55). This enzyme hydrolyses aryl alpha-L-arabinofuranosides and cleaves arabinosyl side chains from arabinoxylan and arabinan. 43587 pfam03665: Uncharacterised protein family (UPF0172).. 43588 pfam03666: Uncharacterized protein family (UPF0171).. 43589 pfam03667: Uncharacterized protein family (UPF0174).. 43590 pfam03668: P-loop ATPase protein family. This family contains an ATP-binding site and could be an ATPase (personal obs:C Yeats).. 43591 pfam03669: Uncharacterised protein family (UPF0139).. 43592 pfam03670: Uncharacterized protein family (UPF0184).. 43593 pfam03671: Uncharacterized protein family (UPF0185).
This family contains a number of small uncharacterized proteins including BM-002. 43594 pfam03672: Uncharacterised protein family (UPF0154). This family contains a set of short bacterial proteins of unknown function. 43595 pfam03673: Uncharacterized protein family (UPF0128). The members of this family are about 240 amino acids in length. The proteins are as yet uncharacterized. 43596 pfam03674: Uncharacterised protein family (UPF0131). This family of proteins are uncharacterised, however BtrG is part of a butirosin-biosynthetic gene cluster from Bacillus circulans. 43597 pfam03675: Uncharacterized protein family (UPF0132). This is a family of small integral membrane proteins found in some archaebacteria. 43598 pfam03676: Uncharacterised protein family (UPF0183). This family of proteins includes Lin-10 from C. elegans. . 43599 pfam03677: Uncharacterized protein family (UPF0137). This family includes GP6-D a virulence plasmid encoded protein. 43600 pfam03678: Hexon, adenovirus major coat protein, C-terminal domain. Hexon is the major coat protein from adenovirus type 2. Hexon forms a homo-trimer. The 240 copies of the hexon trimer are organised so that 12 lie on each of the 20 facets. The central 9 hexons in a facet are cemented together by 12 copies of polypeptide IX. The penton complex, formed by the peripentonal hexons and base hexon (holding in place a fibre), lie at each of the 12 vertices. The N and C-terminal domains adopt the same PNGase F-like fold although they are significantly different in length. 43601 pfam03680: Uncharacterized protein family (UPF0148). This small family of uncharacterized proteins contains a potential zinc binding motif. 43602 pfam03681: Uncharacterised protein family (UPF0150). This family of small proteins are uncharacterised. In one member this domain is found next to a DNA binding helix-turn-helix domain pfam01402, which suggests that this is some kind of ligand binding domain. 
43603 pfam03682: Uncharacterised protein family (UPF0158).. 43604 pfam03683: Uncharacterized protein family (UPF0175). This family contains small proteins of unknown function. 43605 pfam03684: Uncharacterized protein family (UPF0179). The function of this family is unknown, however the proteins contain two cysteine clusters that may be iron sulphur redox centres. 43606 pfam03685: Uncharacterized protein family (UPF0147). This family of small proteins have no known function. 43607 pfam03686: Uncharacterized protein family (UPF0146). The function of this family of proteins is unknown. 43608 pfam03687: Uncharacterized protein family (UPF0164). This family of uncharacterized proteins are only found in Treponema pallidum. They contain a putative signal peptide so may be secreted proteins. 43609 pfam03688: Nepovirus coat protein, C-terminal domain. The members of this family are derived from nepoviruses. Together with comoviruses and picornaviruses, nepoviruses are classified in the picornavirus superfamily of plus strand single-stranded RNA viruses. This family aligns several nepovirus coat protein sequences. In several cases, this is found at the C-terminus of the RNA2-encoded viral polyprotein. The coat protein consists of three trapezoid-shaped beta-barrel domains, and forms a pseudo T = 3 icosahedral capsid structure. 43610 pfam03689: Nepovirus coat protein, N-terminal domain. The members of this family are derived from nepoviruses. Together with comoviruses and picornaviruses, nepoviruses are classified in the picornavirus superfamily of plus strand single-stranded RNA viruses. This family aligns several nepovirus coat protein sequences. In several cases, this is found at the C-terminus of the RNA2-encoded viral polyprotein. The coat protein consists of three trapezoid-shaped beta-barrel domains, and forms a pseudo T = 3 icosahedral capsid structure. 43611 pfam03690: Uncharacterized protein family (UPF0160). 
This family of proteins contains a large number of metal binding residues. The patterns are suggestive of a phosphoesterase function. The conserved DHH motif may mean this family is related to pfam01368. 43612 pfam03691: Uncharacterised protein family (UPF0167). The proteins in this family are about 200 amino acids long and each contain 3 CXXC motifs. 43613 pfam03692: Uncharacterised protein family (UPF0153). This family of proteins contain 8 conserved cysteines that may form a metal binding site. The function of these proteins is unknown but might be an Fe-S cluster as part of an oxidoreductase complex. 43614 pfam03693: Uncharacterized protein family (UPF0156). This family of proteins are about 80 amino acids in length and their function is unknown. The proteins contain a conserved GRY motif. 43615 pfam03694: Erg28 like protein. This is a family of integral membrane proteins, which may contain four transmembrane helices. Members of this family are thought to be involved in sterol C-4 demethylation. In S. cerevisiae they may tether Erg26p (sterol dehydrogenase/decarboxylase) and Erg27p (3-ketoreductase) to the endoplasmic reticulum or may facilitate interaction between these proteins. The family contains a conserved arginine and histidine that may be functionally important. 43616 pfam03695: Uncharacterised protein family (UPF0149). The proteins in this family are about 190 amino acids long. The function of these proteins is unknown. 43617 pfam03696: Uncharacterized protein family (UPF0169). Members of this family are predicted to be lipoproteins. The function of these proteins is unknown. 43618 pfam03698: Uncharacterised protein family (UPF0180). The members of this family are small uncharacterised proteins. 43619 pfam03699: Uncharacterised protein family (UPF0182). This family contains uncharacterised integral membrane proteins. 43620 pfam03700: Sorting nexin, N-terminal domain.
These proteins bind to the cytoplasmic domain of plasma membrane receptors and are involved in endocytic protein trafficking. The N-terminal domain appears to be specific to sorting nexins 1 and 2. 43621 pfam03701: Uncharacterised protein family (UPF0181). This family contains small proteins of about 50 amino acids of unknown function. The family includes YoaH. 43622 pfam03702: Uncharacterised protein family (UPF0075). The proteins in this family are about 370 amino acids long and have no known function. 43623 pfam03703: Bacterial membrane flanked domain. Domain found in uncharacterised family of membrane proteins. 1-3 copies found in each protein, with each copy flanked by transmembrane helices. 43624 pfam03704: Bacterial transcriptional activator domain. Found in the DNRI/REDD/AFSR family of regulators. This region of AFSR along with the C terminal region is capable of independently directing actinorhodin production. 43625 pfam03705: CheR methyltransferase, all-alpha domain. CheR proteins are part of the chemotaxis signaling mechanism in bacteria. CheR methylates the chemotaxis receptor at specific glutamate residues. CheR is an S-adenosylmethionine-dependent methyltransferase. 43626 pfam03706: Uncharacterised protein family (UPF0104). This family of proteins are integral membrane proteins. These proteins are uncharacterised but contain a conserved PG motif. Some members of this family are annotated as dolichol-P-glucose synthetase and contain a pfam00535 domain. 43627 pfam03707: Bacterial signalling protein N terminal repeat. Found as an N terminal triplet tandem repeat in bacterial signalling proteins. Family includes CoxC and CoxH from P.carboxydovorans. Each repeat contains two transmembrane helices. Domain is also described as the MHYT domain. 43628 pfam03708: Avian retrovirus envelope protein, gp85. Family of avian-specific viral glycoproteins that form a receptor-binding gp85 polypeptide that is linked through disulfide to a membrane-spanning gp37 spike.
Gp85 confers a high degree of subgroup specificity for interaction with distinct cell receptors. 43629 pfam03709: Orn/Lys/Arg decarboxylase, N-terminal domain. This domain has a flavodoxin-like fold, and is termed the "wing" domain because of its position in the overall 3D structure. 43630 pfam03710: Glutamate-ammonia ligase adenylyltransferase. Conserved repeated domain found in GlnE proteins. These proteins adenylate and deadenylate glutamine synthases: ATP + {L-Glutamate:ammonia ligase (ADP-forming)} = Diphosphate + Adenylyl-{L-Glutamate:Ammonia ligase (ADP-forming)}. The family is related to the pfam01909 domain. 43631 pfam03711: Orn/Lys/Arg decarboxylase, C-terminal domain. 43632 pfam03712: Copper type II ascorbate-dependent monooxygenase, C-terminal domain. The N and C-terminal domains of members of this family adopt the same PNGase F-like fold. 43633 pfam03713: Domain of unknown function (DUF305). Domain found in small family of bacterial secreted proteins with no known function. Also found in Paramecium bursaria chlorella virus 1. This domain is short and found in one or two copies. The domain has a conserved HH motif that may be functionally important. 43634 pfam03714: Bacterial pullanase-associated domain. Domain is found in pullanase - carbohydrate de-branching - proteins. It is found at either the N or the C terminus of the alpha-amylase active site region. This domain contains several conserved aromatic residues that are suggestive of a carbohydrate binding function. 43635 pfam03715: Uncharacterised protein family (UPF0120). At least one member, Nco2p from yeast, is required for a late step in 60S subunit export from the nucleus. It has also been shown to co-precipitate with Nug1p, a nuclear GTPase also required for ribosome nucleus export. 43636 pfam03716: WCCH motif. The WCCH motif is found in retrotransposons and Gemini viruses. A specific function has not been associated with this motif.
43637 pfam03717: Penicillin-binding Protein dimerisation domain. This domain is found at the N terminus of Class B High Molecular Weight Penicillin-Binding Proteins. Its function has not been precisely defined, but is strongly implicated in PBP polymerisation. The domain forms a largely disordered 'sugar tongs' structure. . 43638 pfam03718: Glycosyl hydrolase family 49. Family of dextranase (EC 3.2.1.11) and isopullulanase (EC 3.2.1.57). Dextranase hydrolyses alpha-1,6-glycosidic bonds in dextran polymers. 43639 pfam03719: Ribosomal protein S5, C-terminal domain. 43640 pfam03720: UDP-glucose/GDP-mannose dehydrogenase family, UDP binding domain. The UDP-glucose/GDP-mannose dehydrogenases are a small group of enzymes which possess the ability to catalyse the NAD-dependent 2-fold oxidation of an alcohol to an acid without the release of an aldehyde intermediate. 43641 pfam03721: UDP-glucose/GDP-mannose dehydrogenase family, NAD binding domain. The UDP-glucose/GDP-mannose dehydrogenases are a small group of enzymes which possess the ability to catalyse the NAD-dependent 2-fold oxidation of an alcohol to an acid without the release of an aldehyde intermediate. 43642 pfam03722: Hemocyanin, all-alpha domain. This family includes arthropod hemocyanins and insect larval storage proteins. 43643 pfam03723: Hemocyanin, ig-like domain. This family includes arthropod hemocyanins and insect larval storage proteins. 43644 pfam03724: Domain of unknown function (306). Small domain family found in proteins of unknown function. Some are secreted and implicated in motility in bacteria. Also occurs in Leishmania spp. as an essential gene. Over-expression in L.amazonensis increases virulence. A pair of cysteine residues show correlated conservation, suggesting that they form a disulphide bond. 43645 pfam03725: 3' exoribonuclease family, domain 2. This family includes 3'-5' exoribonucleases.
Ribonuclease PH contains a single copy of this domain, and removes nucleotide residues following the -CCA terminus of tRNA. Polyribonucleotide nucleotidyltransferase (PNPase) contains two tandem copies of the domain. PNPase is involved in mRNA degradation in a 3'-5' direction. The exosome is a 3'-5' exoribonuclease complex that is required for 3' processing of the 5.8S rRNA. Three of its five protein components contain a copy of this domain. A hypothetical protein from S. pombe appears to belong to an uncharacterised subfamily. This subfamily is found in both eukaryotes and archaebacteria. 43646 pfam03726: Polyribonucleotide nucleotidyltransferase, RNA binding domain. This family contains the RNA binding domain of Polyribonucleotide nucleotidyltransferase (PNPase). PNPase is involved in mRNA degradation in a 3'-5' direction. 43647 pfam03727: Hexokinase. Hexokinase (EC:2.7.1.1) contains two structurally similar domains represented by this family and pfam00349. Some members of the family have two copies of each of these domains. 43648 pfam03728: Viral DNA-binding protein, zinc binding domain. This family represents the zinc binding domain of the viral DNA-binding protein, a multi functional protein involved in DNA replication and transcription control. Two copies of this domain are found at the C-terminus of many members of the family. 43649 pfam03729: Short repeat of unknown function (DUF308). Family of short repeats that occurs in a limited number of membrane proteins. It may divide further in short repeats of around 7-10 residues of the pattern G-#-X(2)-#(2)-X (#=hydrophobic).. 43650 pfam03730: Ku70/Ku80 C-terminal arm. The Ku heterodimer (composed of Ku70 and Ku80) contributes to genomic integrity through its ability to bind DNA double-strand breaks and facilitate repair by the non-homologous end-joining pathway. This is the C terminal arm. This alpha helical region embraces the beta-barrel domain pfam02735 of the opposite subunit.
43651 pfam03731: Ku70/Ku80 N-terminal alpha/beta domain. The Ku heterodimer (composed of Ku70 and Ku80) contributes to genomic integrity through its ability to bind DNA double-strand breaks and facilitate repair by the non-homologous end-joining pathway. This is the amino terminal alpha/beta domain. This domain only makes a small contribution to the dimer interface. The domain comprises a six stranded beta sheet of the Rossmann fold. 43652 pfam03732: Retrotransposon gag protein. Gag or Capsid-like proteins from LTR retrotransposons. There is a central motif QGXXEXXXXXFXXLXXH that is common to Retroviridae gag-proteins, but is poorly conserved. 43653 pfam03733: Domain of unknown function (DUF307). Domain occurs as one or more copies in a small family of putative membrane proteins. 43654 pfam03734: ErfK/YbiS/YcfS/YnhG. This family of proteins are found in a range of bacteria. The conserved region contains a conserved histidine and cysteine, suggesting that these proteins have an enzymatic activity. Several members of this family contain peptidoglycan binding domains. So these proteins may use peptidoglycan or a precursor as a substrate. 43655 pfam03735: ENT domain. This presumed domain is named after Emsy N Terminus (ENT). Emsy is a protein that is amplified in breast cancer and interacts with BRCA2. The N terminus of this protein is found to be similar to other vertebrate and plant proteins of unknown function. This domain has a completely conserved histidine residue that may be functionally important. 43656 pfam03736: EPTP domain. Mutations in the LGI/Epitempin gene can result in a special form of epilepsy, autosomal dominant lateral temporal epilepsy. The Epitempin protein contains a large repeat in its C terminal section. This presumed domain has no known function, but may be secreted. 43657 pfam03737: Demethylmenaquinone methyltransferase.
Members of this family are demethylmenaquinone methyltransferases that convert demethylmenaquinone (DMK) to menaquinone (MK) in the final step of menaquinone biosynthesis. This region is also found at the C-terminus of the DlpA protein. 43658 pfam03738: Glutathionylspermidine synthase. This region contains the Glutathionylspermidine synthase enzymatic activity EC:6.3.1.8. This is the C-terminal region in bienzymes. Glutathionylspermidine (GSP) synthetases of Trypanosomatidae and Escherichia coli couple hydrolysis of ATP (to ADP and Pi) with formation of an amide bond between spermidine and the glycine carboxylate of glutathione (gamma-Glu-Cys-Gly). In the pathogenic trypanosomatids, this reaction is the penultimate step in the biosynthesis of the antioxidant metabolite, trypanothione (N1,N8-bis-(glutathionyl)spermidine), and is a target for drug design. 43659 pfam03739: Predicted permease YjgP/YjgQ family. Members of this family are predicted integral membrane proteins of unknown function. They are about 350 amino acids long and contain about 6 transmembrane regions. They are predicted to be permeases although there is no verification of this. 43660 pfam03740: Pyridoxal phosphate biosynthesis protein PdxJ. Members of this family belong to the PdxJ family that catalyses the condensation of 1-deoxy-d-xylulose-5-phosphate (DXP) and 1-amino-3-oxo-4-(phosphohydroxy)propan-2-one to form pyridoxine 5'-phosphate (PNP). This reaction is involved in de novo synthesis of pyridoxine (vitamin B6) and pyridoxal phosphate. 43661 pfam03741: Integral membrane protein TerC family. This family contains a number of integral membrane proteins that also contains the TerC protein. TerC has been implicated in resistance to tellurium. This protein may be involved in efflux of tellurium ions. The tellurite-resistant Escherichia coli strain KL53 was found during testing of the group of clinical isolates for antibiotics and heavy metal ion resistance.
The determinant of the tellurite resistance of the strain was located on a large conjugative plasmid. Analyses showed that the genes terB, terC, terD and terE are essential for conservation of the resistance. The members of the family contain a number of conserved aspartates that could be involved in binding to metal ions. 43662 pfam03742: PetN. PetN is a small hydrophobic protein, crucial for cytochrome b6-f complex assembly and/or stability. . 43663 pfam03743: Bacterial conjugation TrbI-like protein. Although not essential for conjugation, the TrbI protein greatly increases the conjugational efficiency. 43664 pfam03744: 6-carboxyhexanoate--CoA ligase. This family contains the enzyme 6-carboxyhexanoate--CoA ligase EC:6.2.1.14. This enzyme is involved in the first step of biotin synthesis, where it converts pimelate into pimeloyl-CoA. The enzyme requires magnesium as a cofactor and forms a homodimer. 43665 pfam03745: Domain of unknown function (DUF309). This domain is found in eubacterial and archaebacterial proteins of unknown function. The proteins contain a motif HXXXEXX(W/Y) where X can be any amino acid. This motif is likely to be functionally important and may be involved in metal binding. 43666 pfam03746: LamB/YcsF family. This family includes LamB. The lam locus of Aspergillus nidulans consists of two divergently transcribed genes, lamA and lamB, involved in the utilisation of lactams such as 2-pyrrolidinone. Both genes are under the control of the positive regulatory gene amdR and are subject to carbon and nitrogen metabolite repression. The exact molecular function of the proteins in this family is unknown. 43667 pfam03747: ADP-ribosylglycohydrolase. This family includes enzymes that reverse ADP-ribosylation; for example, ADP-ribosylarginine hydrolase EC:3.2.2.19 cleaves ADP-ribose-L-arginine. The family also includes dinitrogenase reductase activating glycohydrolase.
Most surprisingly, the family also includes jellyfish crystallins; these proteins appear to have lost the presumed active site residues. 43668 pfam03748: Flagellar basal body-associated protein FliL. This FliL protein controls the rotational direction of the flagella during chemotaxis. FliL is a cytoplasmic membrane protein associated with the basal body. 43669 pfam03749: Sugar fermentation stimulation protein. This family contains sugar fermentation stimulation proteins, which are probably regulatory factors involved in maltose metabolism. SfsA has been shown to bind DNA and it contains a helix-turn-helix motif that probably binds DNA at its C-terminus. 43670 pfam03750: Protein of unknown function (DUF310). This family contains a number of archaeal proteins that are completely uncharacterized. The proteins are between 130 and 160 amino acids long. Their C-terminus contains several conserved residues. 43671 pfam03751: Spore maturation protein B. Spore maturation protein B (SpmB) is involved in spore core dehydration in Bacillus subtilis. Spore dehydration is important for heat resistance and for processing of the spore germination protease GPR into an active form. SpmB might be involved in import or export from the forespore, or in modification of the cortex peptidoglycan structure. SpmB is predicted to be an integral membrane protein. 43672 pfam03752: Short repeats of unknown function. This set of repeats is found in a small family of secreted proteins of no known function, though they are possibly involved in signal transduction. ALF stands for Alanine-rich (AL) - conserved Phenylalanine (F).. 43673 pfam03753: Human herpesvirus 6 immediate early protein. The proteins in this family are poorly characterized, but an investigation has indicated that the immediate early protein is required for the down-regulation of MHC class I expression in dendritic cells. Human herpesvirus 6 immediate early protein is also referred to as U90.
43674 pfam03754: Domain of unknown function (DUF313). Family of proteins from Arabidopsis thaliana with uncharacterized function. 43675 pfam03755: YicC-like family, N-terminal region. Family of bacterial proteins. Although poorly characterised, the members of this protein family have been demonstrated to play a role in stationary phase survival. These proteins are not essential during stationary phase. 43676 pfam03756: A-factor biosynthesis repeat. The AfsA family are key enzymes in A-factor biosynthesis, which is essential for streptomycin production and resistance. 43677 pfam03757: Uncharacterized protein (DUF314).. 43678 pfam03758: Senescence marker protein-30 (SMP-30). SMP-30, also known as regucalcin, seems to play a critical role in the highly differentiated functions of the liver and kidney and to exert a major impact on Ca2+ homeostasis. 43679 pfam03759: Domain of unknown function (DUF315). Family of plant hypothetical proteins. 43680 pfam03760: Late embryogenesis abundant (LEA) group 1. Family members are conserved along the entire coding region, especially within the hydrophobic internal 20 amino acid motif, which may be repeated. 43681 pfam03761: Domain of unknown function (DUF316). Family of uncharacterized proteins from C. elegans. 43682 pfam03762: Vitelline membrane outer layer protein I (VOMI). VOMI binds tightly to ovomucin fibrils of the egg yolk membrane. The structure consists of three beta-sheets forming Greek key motifs, which are related by an internal pseudo three-fold symmetry. Furthermore, the structure of VOMI has strong similarity to the structure of the delta-endotoxin, as well as a carbohydrate-binding site in the top region of the common fold. 43683 pfam03763: Remorin, C-terminal region. Remorin binds both simple and complex galacturonides. The N-terminal region of remorin is proline rich, while the C-terminal region has been predicted to form a coiled-coil, that is expected to interact with other macromolecules, most likely DNA.
Functional similarities between the behaviour of the proteins and viral proteins involved in intercellular communication have been noted. 43684 pfam03764: Elongation factor G, domain IV. This domain is found in elongation factor G, elongation factor 2 and some tetracycline resistance proteins and adopts a ribosomal protein S5 domain 2-like fold. 43685 pfam03765: CRAL/TRIO, N-terminus. This all-alpha domain is found to the N-terminus of pfam00650. 43686 pfam03766: Remorin, N-terminal region. Remorin binds both simple and complex galacturonides. The N-terminal region of remorin is proline rich, while the C-terminal region has been predicted to form a coiled-coil that is expected to interact with other macromolecules, most likely DNA. Functional similarities between the behaviour of the proteins and viral proteins involved in intercellular communication have been noted. 43687 pfam03767: HAD superfamily, subfamily IIIB (Acid phosphatase). This family of proteins includes acid phosphatases and a number of vegetative storage proteins. 43688 pfam03768: Attacin, N-terminal region. This family includes attacin and sarcotoxin, but not diptericin (which shares similarity to the C-terminal region of attacin). All members of this family are insect antibacterial proteins which are induced by the fat body and subsequently secreted into the hemolymph where they act synergistically to kill the invading microorganism. 43689 pfam03769: Attacin, C-terminal region. This family includes attacin, sarcotoxin and diptericin. All members of this family are insect antibacterial proteins which are induced by the fat body and subsequently secreted into the hemolymph where they act synergistically to kill the invading microorganism. 43690 pfam03770: Inositol polyphosphate kinase. ArgRIII has been demonstrated to be an inositol polyphosphate kinase. 43691 pfam03771: Domain of unknown function (DUF317). 
This is a sequence family found in a set of bacterial proteins with no known function. This domain is currently only found in Streptomyces bacteria. Most proteins contain two copies of this domain. 43692 pfam03772: Competence protein. Members of this family are integral membrane proteins with 6 predicted transmembrane helices. Some members of this family have been shown to be essential for bacterial competence in uptake of extracellular DNA. These proteins may transport DNA across the cell membrane. These proteins contain a highly conserved motif in the amino terminal transmembrane region that has two histidines that may form a metal binding site. 43693 pfam03773: Predicted permease. The integral membrane proteins in this family are predicted to be permeases of unknown specificity. 43694 pfam03775: Septum formation inhibitor MinC, C-terminal domain. In Escherichia coli FtsZ assembles into a Z ring at midcell while assembly at polar sites is prevented by the min system. MinC, a component of this system, is an inhibitor of FtsZ assembly that is positioned within the cell by interaction with MinDE. MinC is an oligomer, probably a dimer. The C terminal half of MinC is the most conserved and interacts with MinD. The N terminal half is thought to interact with FtsZ. 43695 pfam03776: Septum formation topological specificity factor MinE. The E. coli minicell locus was shown to code for three gene products (MinC, MinD, and MinE) whose coordinate action is required for proper placement of the division septum. The minE gene codes for a topological specificity factor that, in wild-type cells, prevents the division inhibitor from acting at internal division sites while permitting it to block septation at polar sites. 43696 pfam03777: Small secreted domain (DUF320). Small domain found in a family of secreted Streptomyces proteins. It occurs singly or as a pair. Many of the domains have two cysteines that may form a disulphide bridge. 43697 pfam03778: Protein of unknown function (DUF321). 
This family may be related to the FARP (FMRFamide) family, pfam01581. Currently this repeat is only detectable in Arabidopsis thaliana. 43698 pfam03779: SPW repeat. A short repeat found in a small family of membrane-bound proteins. This repeat contains a conserved SPW motif in the first of two transmembrane helices. 43699 pfam03780: Protein of unknown function (DUF322). This is a family of small proteins. It includes a protein identified as an alkaline shock protein, so it may be involved in stress response. 43700 pfam03781: Domain of unknown function (DUF323). This presumed domain is found in bacterial proteins. In some cases these proteins also contain a protein kinase domain. The function of this domain is unknown. 43701 pfam03782: AMOP domain. This domain may have a role in cell adhesion. It is called the AMOP domain after Adhesion associated domain in MUC4 and Other Proteins. This domain is extracellular and contains a number of cysteines that probably form disulphide bridges. 43702 pfam03783: Curli production assembly/transport component CsgG. CsgG is an outer membrane-located lipoprotein that is highly resistant to protease digestion. During assembly of curli, an adhesive surface fibre, CsgG is required to maintain the stability of CsgA and CsgB. 43703 pfam03784: Cyclotide family. This family contains a set of cyclic peptides with a variety of activities. The structure consists of a distorted triple-stranded beta-sheet and a cysteine-knot arrangement of the disulfide bonds. 43704 pfam03785: Peptidase family C25, C terminal Ig-like domain. 43705 pfam03786: D-mannonate dehydratase (UxuA). UxuA (this family) and UxuB are required for hexuronate degradation. 43706 pfam03787: Protein of unknown function (DUF324). The members of this family have no known function. They are around 300 amino acids in length and have two conserved motifs. 
At the N terminus is a PXXIG motif, and in the central region is a more strongly conserved motif, YXPGXXXKGXXR, where X can be any amino acid. 43707 pfam03788: LrgA family. This family is uncharacterised. It contains the protein LrgA that has been hypothesised to export murein hydrolases. 43708 pfam03789: ELK domain. This domain is required for the nuclear localisation of these proteins. All of these proteins are members of the Tale/Knox homeodomain family, a subfamily within homeobox pfam00046. 43709 pfam03790: KNOX1 domain. The MEINOX region is comprised of two domains, KNOX1 and KNOX2. KNOX1 plays a role in suppressing target gene expression. KNOX2, essential for function, is thought to be necessary for homo-dimerisation. 43710 pfam03791: KNOX2 domain. The MEINOX region is comprised of two domains, KNOX1 and KNOX2. KNOX1 plays a role in suppressing target gene expression. KNOX2, essential for function, is thought to be necessary for homo-dimerisation. 43711 pfam03792: PBX domain. The PBX domain is a bipartite acidic domain. 43712 pfam03793: PASTA domain. This domain is found at the C termini of several Penicillin-binding proteins and bacterial serine/threonine kinases. It binds the beta-lactam stem, which implicates it in sensing D-alanyl-D-alanine - the PBP transpeptidase substrate. It is a small globular fold consisting of 3 beta-sheets and an alpha-helix. The name PASTA is derived from PBP and Serine/Threonine kinase Associated domain. 43713 pfam03794: Domain of Unknown function. This domain normally occurs as tandem repeats and is found in bacteria, yeast and plants. It contains two fully conserved histidines and one glutamate residue. Members of the family include DnrN, NorA and ScdA, which have been implicated in NO response and cell wall physiology. 43714 pfam03795: YCII-related domain. The majority of proteins in this family consist of a single copy of this domain, though it is also found as a repeat. 
A strongly conserved histidine and an aspartate suggest that the domain has an enzymatic function. 43715 pfam03796: DnaB-like helicase C terminal domain. The hexameric helicase DnaB unwinds the DNA duplex at the Escherichia coli chromosome replication fork. Although the mechanism by which DnaB both couples ATP hydrolysis to translocation along DNA and denatures the duplex is unknown, a change in the quaternary structure of the protein involving dimerisation of the N-terminal domain has been observed and may occur during the enzymatic cycle. This C-terminal domain contains an ATP-binding site and is therefore probably the site of ATP hydrolysis. 43716 pfam03797: Autotransporter beta-domain. Secretion of protein products occurs by a number of different pathways in bacteria. One of these pathways, known as the type V pathway, was first described for the IgA1 protease. The protein component that mediates secretion through the outer membrane is contained within the secreted protein itself, hence the proteins secreted in this way are called autotransporters. This family corresponds to the presumed integral membrane beta-barrel domain that transports the protein. This domain is found at the C terminus of the proteins it occurs in. The N terminus contains the variable passenger domain that is translocated across the membrane. Once the passenger domain is exported it is cleaved auto-catalytically in some proteins; in others a different protease is used, and in some cases no cleavage occurs. 43717 pfam03798: Longevity-assurance protein (LAG1). Members of this family are involved in determining life span. The molecular mechanisms by which LAG1 determines longevity are unclear, although some evidence suggests a participation in ceramide synthesis. 43718 pfam03799: Cell division protein FtsQ. FtsQ is one of several cell division proteins. FtsQ interacts with other Fts proteins. The precise function of FtsQ is unknown. 43719 pfam03800: Nuf2 family. 
Members of this family are components of the mitotic spindle. It has been shown that Nuf2 from yeast is part of a complex called the Ndc80p complex. This complex is thought to bind to the microtubules of the spindle. An Arabidopsis protein not previously identified as a member has been included in this family. The match is not strong, but in common with other members of this family it contains a coiled-coil to the C terminus of this region. 43720 pfam03801: HEC/Ndc80p family. Members of this family are components of the mitotic spindle. It has been shown that Ndc80/HEC from yeast is part of a complex called the Ndc80p complex. This complex is thought to bind to the microtubules of the spindle. 43721 pfam03802: Apo-citrate lyase phosphoribosyl-dephospho-CoA transferase. 43722 pfam03803: Scramblase. Scramblase is palmitoylated and contains a potential protein kinase C phosphorylation site. Scramblase exhibits Ca2+-activated phospholipid scrambling activity in vitro. There are also possible SH3 and WW binding motifs. Scramblase is involved in the redistribution of phospholipids after cell activation or injury. 43723 pfam03804: Viral domain of unknown function. 43724 pfam03805: Cytoadherence-linked asexual protein. Clag (cytoadherence linked asexual gene) is a malaria surface protein which has been shown to be involved in the binding of Plasmodium falciparum infected erythrocytes to host endothelial cells, a process termed cytoadherence. The cytoadherence phenomenon is associated with the sequestration of infected erythrocytes in the blood vessels of the brain, cerebral malaria. Clag is a multi-gene family in Plasmodium falciparum with at least 9 members identified to date. Orthologous proteins in the rodent malaria species Plasmodium chabaudi (Lawson D Unpubl. obs.) suggest that the gene family is found in other malaria species and may play a more generic role in cytoadherence. 43725 pfam03806: AbgT putative transporter family. 
43726 pfam03807: NADP oxidoreductase coenzyme F420-dependent. 43727 pfam03808: Glycosyl transferase WecB/TagA/CpsF family. 43728 pfam03809: Putative hexulose-6-phosphate isomerase. 43729 pfam03810: Importin-beta N-terminal domain. 43730 pfam03811: Insertion element protein. 43731 pfam03812: 2-keto-3-deoxygluconate permease. 43732 pfam03813: Nrap protein. Members of this family are nucleolar RNA-associated proteins (Nrap) which are highly conserved from yeast (Saccharomyces cerevisiae) to human. In the mouse, Nrap is ubiquitously expressed and is specifically localised in the nucleolus. Nrap is a large nucleolar protein (of more than 1000 amino acids). Nrap appears to be associated with ribosome biogenesis by interacting with pre-rRNA primary transcript. 43733 pfam03814: Potassium-transporting ATPase A subunit. 43734 pfam03815: LCCL domain. 43735 pfam03816: Cell envelope-related transcriptional attenuator domain. 43736 pfam03817: Malonate transporter MadL subunit. 43737 pfam03818: Malonate/sodium symporter MadM subunit. 43738 pfam03819: MazG nucleotide pyrophosphohydrolase domain. This domain is about 100 amino acid residues in length. It is found in the MazG protein from E. coli. It contains four conserved negatively charged residues that probably form an active site or metal binding site. This domain is found in isolation in some proteins as well as associated with pfam00590. This domain is clearly related to pfam01503, another pyrophosphohydrolase involved in histidine biosynthesis. This family may be structurally related to the NUDIX domain pfam00293 (Bateman A pers. obs.). 43739 pfam03820: Tricarboxylate carrier. 43740 pfam03821: Golgi 4-transmembrane spanning transporter. 43741 pfam03822: NAF domain. 43742 pfam03823: Neurokinin B. 43743 pfam03824: High-affinity nickel-transport protein. High-affinity nickel transporters are involved in the incorporation of nickel into H2-uptake hydrogenase and urease enzymes. 
They are essential for the expression of catalytically active hydrogenase and urease. Ion uptake is dependent on proton motive force. HoxN in Alcaligenes eutrophus is thought to be an integral membrane protein with seven transmembrane helices. The family also includes a cobalt transporter. 43744 pfam03825: Nucleoside H+ symporter. 43745 pfam03826: OAR domain. 43746 pfam03827: Orexin receptor type 2. 43747 pfam03828: PAP/25A associated domain. 43748 pfam03829: PTS system glucitol/sorbitol-specific IIA component. 43749 pfam03830: PTS system sorbose subfamily IIB component. 43750 pfam03831: PhnA protein. 43751 pfam03832: Protein kinase A anchor. 43752 pfam03833: DNA polymerase II large subunit DP2. 43753 pfam03834: DNA repair protein rad10. 43754 pfam03835: DNA repair protein Rad4. 43755 pfam03836: RasGAP C-terminus. 43756 pfam03837: RecT family. The DNA single-strand annealing proteins (SSAPs), such as RecT, Red-beta, ERF and Rad52, function in RecA-dependent and RecA-independent DNA recombination pathways. This family includes proteins related to RecT. 43757 pfam03838: Recombination protein U. 43758 pfam03839: Translocation protein Sec62. 43759 pfam03840: Preprotein translocase SecG subunit. 43760 pfam03841: L-seryl-tRNA selenium transferase. 43761 pfam03842: Silicon transporter. 43762 pfam03843: Outer membrane lipoprotein Slp family. 43763 pfam03845: Spore germination protein. 43764 pfam03846: Cell division inhibitor SulA. 43765 pfam03847: Transcription initiation factor TFIID subunit A. 43766 pfam03848: Tellurite resistance protein TehB. 43767 pfam03849: Transcription factor Tfb2. 43768 pfam03850: Transcription factor Tfb4. 43769 pfam03851: UV-endonuclease UvdE. 43770 pfam03852: DNA mismatch endonuclease Vsr. 43771 pfam03853: YjeF-related protein N-terminus. 43772 pfam03854: P-11 zinc finger. 43773 pfam03855: M-factor. The M-factor is a pheromone produced upon nitrogen starvation. The production of M-factor is increased by the pheromone signal. 
The protein undergoes post-translational modification to remove the C-terminal signal peptide; the carboxy-terminal cysteine residue is carboxy-methylated and S-alkylated with a farnesyl residue. 43774 pfam03856: Beta-glucosidase (SUN family). Members of this family include Nca3, Sun4 and Sim1. This is a family of yeast proteins, involved in a diverse set of functions (DNA replication, aging, mitochondrial biogenesis and cell septation). BGLA from Candida wickerhamii has been characterized as a Beta-glucosidase EC:3.2.1.21. 43775 pfam03857: Colicin immunity protein. Colicin immunity proteins are plasmid encoded proteins necessary for protecting the cell against colicins. Colicins are toxins released by bacteria during times of stress. 43776 pfam03858: Crustacean neurohormone H. These proteins are referred to as precursor-related peptides as they are typically co-transcribed and translated with the CHH neurohormone (pfam01147). However, in some species this neuropeptide is synthesised as a separate protein. Furthermore, neurohormone H can undergo proteolysis to give rise to 5 different neuropeptides. 43777 pfam03859: CG-1 domain. CG-1 domains are highly conserved domains of about 130 amino-acid residues containing a predicted bipartite NLS and named after a partial cDNA clone isolated from parsley encoding a sequence-specific DNA-binding protein. CG-1 domains are associated with CAMTA proteins (for CAlModulin-binding Transcription Activator) that are transcription factors containing a calmodulin-binding domain and ankyrin (ANK) motifs. 43778 pfam03860: Domain of Unknown Function (DUF326). This family is a small cysteine-rich repeat. The cysteines mostly follow a C-X(2)-C-X(3)-C-X(2)-C-X(3) pattern, though they often appear at other positions in the repeat as well. 43779 pfam03861: ANTAR domain. ANTAR (AmiR and NasR transcription antitermination regulators) is an RNA-binding domain found in bacterial transcription antitermination regulatory proteins. 
The majority of the domain consists of a coiled-coil. 43780 pfam03862: SpoVA protein. Members of this family are all transcribed from the spoVA operon. These proteins are poorly characterised, but are thought to be involved in dipicolinic acid transport into the developing forespore during sporulation. 43781 pfam03863: Phage maturation protein. 43782 pfam03864: Phage major capsid protein E. Major capsid protein E is involved with the stabilisation of the condensed form of the DNA molecule in phage heads. 43783 pfam03865: Hemolysin activator HlyB. Hemolysin (HlyA) and related toxins are secreted across both the cytoplasmic and outer membranes of Gram-negative bacteria in a process which proceeds without a periplasmic intermediate. HlyA is directed by an uncleaved C-terminal targeting signal and the HlyD and HlyB translocator proteins. 43784 pfam03866: Hydrophobic abundant protein (HAP). Expression of HAP is thought to be developmentally regulated and possibly involved in spherule cell wall formation. 43785 pfam03867: Fushi tarazu (FTZ), N-terminal region. This region contains the important motif (LXXLL) necessary for the interaction of FTZ with the nuclear receptor FTZ-F1. FTZ is thought to represent a category of LXXLL motif-dependent co-activators for nuclear receptors. 43786 pfam03868: Ribosomal protein L6, N-terminal domain. 43787 pfam03869: Arc-like DNA binding domain. Arc repressor acts by the cooperative binding of two Arc repressor dimers to a 21-base-pair operator site. Each Arc dimer uses an antiparallel beta-sheet to recognise bases in the major groove. 43788 pfam03870: RNA polymerase Rpb8. Rpb8 is a subunit common to the three yeast RNA polymerases, pol I, II and III. Rpb8 interacts with the largest subunit Rpb1, and with Rpb3 and Rpb11, two smaller subunits. 43789 pfam03871: RNA polymerase Rpb5, N-terminal domain. 
Rpb5 has a bipartite structure which includes a eukaryote-specific N-terminal domain and a C-terminal domain resembling the archaeal RNAP subunit H. The N-terminal domain is involved in DNA binding and is part of the jaw module in the RNA pol II structure. This module is important for positioning the downstream DNA. 43790 pfam03872: Anti sigma-E protein RseA, N-terminal domain. Sigma-E is important for the induction of proteins involved in heat shock response. RseA binds sigma-E via its N-terminal domain, sequestering sigma-E and preventing transcription from heat-shock promoters. The C-terminal domain is located in the periplasm, and may interact with other proteins that signal periplasmic stress. 43791 pfam03873: Anti sigma-E protein RseA, C-terminal domain. Sigma-E is important for the induction of proteins involved in heat shock response. RseA binds sigma-E via its N-terminal domain, sequestering sigma-E and preventing transcription from heat-shock promoters. The C-terminal domain is located in the periplasm, and may interact with other proteins that signal periplasmic stress. 43792 pfam03874: RNA polymerase Rpb4. 43793 pfam03875: Statherin. Statherin functions biologically to inhibit the nucleation and growth of calcium phosphate minerals. The N-terminus of statherin is highly charged, the glutamic acids of which have been shown to be important in the recognition of hydroxyapatite. 43794 pfam03876: RNA polymerase Rpb7, N-terminal domain. Rpb7 binds to Rpb4 to form a heterodimer. This complex is thought to interact with the nascent RNA strand during Pol II elongation. 43795 pfam03878: Hrf1 family. This family includes a number of eukaryotic proteins. It is an integral membrane protein, conserved in at least 1 copy in all sequenced eukaryotes. The gene name in S. pombe is hrf1+ for Heavy metal Resistance Factor 1 (unpublished). 43796 pfam03879: Cgr1 family. Members of this family are coiled-coil proteins that are involved in pre-rRNA processing. 
43797 pfam03880: DbpA RNA binding domain. This RNA binding domain is found at the C-terminus of a number of DEAD helicase proteins. It is sufficient to confer specificity for hairpin 92 of 23S rRNA, which is part of the ribosomal A-site. However, several members of this family lack specificity for 23S rRNA. These proteins can generally be distinguished by a basic region that extends beyond this domain. 43798 pfam03881: Fructosamine kinase. This family includes eukaryotic fructosamine-3-kinase enzymes. The family also includes bacterial members that have not been characterised but probably have a similar or identical function. 43799 pfam03882: KicB killing factor. The kicA and kicB genes are found upstream of mukB. It has been suggested that the kicB gene encodes a killing factor and the kicA gene codes for a protein that suppresses the killing function of the kicB gene product. It was also demonstrated that KicA and KicB can function as a post-segregational killing system, when the genes are transferred from the E. coli chromosome onto a plasmid. 43800 pfam03883: Protein of unknown function (DUF328). Members of this family are functionally uncharacterised. They are about 250 amino acids in length. 43801 pfam03884: Domain of unknown function (DUF329). The function of this short domain is unknown. It contains four conserved cysteines and may therefore be involved in zinc binding. 43802 pfam03885: Protein of unknown function (DUF327). The proteins in this family are around 140-170 residues in length. The proteins contain many conserved residues, with the most conserved motifs found in the central and C-terminal region. The function of these proteins is unknown. 43803 pfam03886: Protein of unknown function (DUF330). The proteins in this family are uncharacterised. The proteins are 170-190 amino residues in length. 43804 pfam03887: YfbU domain. This presumed domain is about 160 residues long. It is found in archaebacteria and eubacteria. 
In one member it is associated with a helix-turn-helix domain. This suggests that this may be a ligand binding domain. 43805 pfam03888: MucB/RseB family. Members of this family are regulators of the anti-sigma E protein RseD. 43806 pfam03889: Domain of unknown function. Members of this family are uncharacterised proteins from a number of bacterial species. The proteins range in size from 50-70 residues. 43807 pfam03890: Domain of unknown function (DUF332). This family consists of uncharacterised proteins of about 90 amino acid residues. The family contains several conserved histidines and aspartates that are characteristic of metal dependent enzymes (Bateman A pers. obs.). 43808 pfam03891: Domain of unknown function (DUF333). This small domain of about 70 residues is found in a number of bacterial proteins. The proteins containing this domain are uncharacterized. 43809 pfam03892: Nitrate reductase cytochrome c-type subunit (NapB). The napB gene encodes a dihaem cytochrome c, the small subunit of a heterodimeric periplasmic nitrate reductase. 43810 pfam03893: Lipase 3 N-terminal region. N terminal region to pfam01764, found on a subset of Lipase 3 containing proteins. 43811 pfam03894: D-xylulose 5-phosphate/D-fructose 6-phosphate phosphoketolase. This bacterial enzyme splits fructose-6-P and/or xylulose-5-P with the aid of inorganic phosphate into either acetyl-P and erythrose-4-P and/or acetyl-P and glyceraldehyde-3-P EC:4.1.2.9, EC:4.1.2.22. This family is distantly related to transketolases e.g. pfam02779. 43812 pfam03895: YadA-like C-terminal region. This region represents the C-terminal 120 amino acids of a family of surface-exposed bacterial proteins. YadA, an adhesin from Yersinia, was the first member of this family to be characterised. UspA2 from Moraxella was second. The Eib immunoglobulin-binding proteins from E. coli were third, followed by the DsrA proteins of Haemophilus ducreyi and others. 
These proteins are homologous at their C-termini and have predicted signal sequences, but they diverge elsewhere. The C-terminal 9 amino acids, consisting of alternating hydrophobic amino acids ending in F or W, comprise a targeting motif for the outer membrane of the Gram negative cell envelope. This region is important for oligomerisation. 43813 pfam03896: Translocon-associated protein (TRAP), alpha subunit. The alpha-subunit of the TRAP complex (TRAP alpha) is a single-spanning membrane protein of the endoplasmic reticulum (ER) which is found in proximity of nascent polypeptide chains translocating across the membrane. 43814 pfam03897: Carotene hydroxylase. Beta-carotene hydroxylase is involved in zeaxanthin synthesis by hydroxylating beta-carotene, but the enzyme may be involved in other pathways. 43815 pfam03898: Satellite tobacco necrosis virus coat protein. 43816 pfam03899: ATP synthase I chain. 43817 pfam03900: Porphobilinogen deaminase, C-terminal domain. 43818 pfam03901: Alg9-like mannosyltransferase family. Members of this family are mannosyltransferase enzymes. At least some members are localised in endoplasmic reticulum and involved in GPI anchor biosynthesis. 43819 pfam03902: Gal4-like dimerisation domain. 43820 pfam03903: Phage T4 tail fibre. 43821 pfam03904: Domain of unknown function (DUF334). Staphylococcus aureus plasmid proteins with no characterized function. 43822 pfam03905: Coronavirus non-structural protein NS4. 43823 pfam03906: Phage T7 tail fibre protein. The bacteriophage T7 tail complex consists of a conical tail-tube surrounded by six kinked tail-fibres, which are oligomers of the viral protein gp17. 43824 pfam03907: Spo7-like protein. S. cerevisiae Spo7 has an unknown function, but has a role in formation of a spherical nucleus and meiotic division. 43825 pfam03908: Sec20. Sec20 is a membrane glycoprotein associated with secretory pathway. 43826 pfam03909: BSD domain. This domain contains a distinctive -FW- motif. 
It is found in a family of eukaryotic transcription factors as well as a set of proteins of unknown function. 43827 pfam03910: Adenovirus minor core protein PV. 43828 pfam03911: Sec61beta family. This family consists of homologues of Sec61beta - a component of the Sec61/SecYEG protein secretory system. The domain is found in eukaryotes and archaea and is possibly homologous to the bacterial SecG. 43829 pfam03912: Photosystem II reaction centre W protein, PsbW. PsbW is directly assembled in dimeric PSII supercomplexes. The negatively charged N-terminal region is essential for this process. 43830 pfam03913: Amb V Allergen. 43831 pfam03914: CBF/Mak21 family. 43832 pfam03915: Actin interacting protein 3. 43833 pfam03916: Polysulphide reductase, NrfD. NrfD is an integral transmembrane protein with loops in both the periplasm and the cytoplasm. NrfD is thought to participate in the transfer of electrons from the quinone pool into the terminal components of the Nrf pathway. 43834 pfam03917: Eukaryotic glutathione synthase, ATP binding domain. 43835 pfam03918: Cytochrome C biogenesis protein. Members of this family include NrfF, CcmH, CycL, Ccl2. 43836 pfam03919: mRNA capping enzyme, C-terminal domain. 43837 pfam03920: Groucho/TLE N-terminal Q-rich domain. The N-terminal domain of the Groucho/TLE co-repressor proteins is involved in oligomerisation. 43838 pfam03921: Intercellular adhesion molecule (ICAM), N-terminal domain. ICAMs normally function to promote intercellular adhesion and signalling. However, the N-terminal domain of the receptor binds to the rhinovirus 'canyon' surrounding the icosahedral 5-fold axes, during the viral attachment process. This family is part of the Ig superfamily and is therefore related to the family ig (pfam00047). 43839 pfam03922: OmpW family. This family includes outer membrane protein W (OmpW) proteins from a variety of bacterial species. This protein may form the receptor for S4 colicins in E. coli. 
43840 pfam03923: Uncharacterized lipoprotein. The function of this presumed lipoprotein is unknown. The family includes E. coli YajG. 43841 pfam03924: CHASE domain. This domain is found in the extracellular portion of receptor-like proteins - such as serine/threonine kinases and adenylyl cyclases. Predicted to be a ligand binding domain. 43842 pfam03925: SeqA protein. The binding of SeqA protein to hemimethylated GATC sequences is important in the negative modulation of chromosomal initiation at oriC, and in the formation of SeqA foci necessary for Escherichia coli chromosome segregation. SeqA tetramers are able to aggregate or multimerise in a reversible, concentration-dependent manner. Apart from its function in the control of DNA replication, SeqA may also be a specific transcription factor. 43843 pfam03926: Putative metallopeptidase (SprT family). This family of uncharacterised proteins may be zinc metallopeptidases. 43844 pfam03927: NapD protein. Uncharacterized protein involved in formation of periplasmic nitrate reductase. 43845 pfam03928: Domain of unknown function (DUF336). This family contains uncharacterised sequences, including several GlcG proteins. The alignment contains many conserved motifs that are suggestive of cofactor binding and enzymatic activity. 43846 pfam03929: Uncharacterized iron-regulated membrane protein (DUF337). 43847 pfam03930: Recombinase Flp protein N-terminus. 43848 pfam03931: Skp1 family, tetramerisation domain. 43849 pfam03932: CutC family. Copper transport in Escherichia coli is mediated by the products of at least six genes, cutA, cutB, cutC, cutD, cutE, and cutF. A mutation in one or more of these genes results in an increased copper sensitivity. Members of this family are between 200 and 300 amino acids in length and are found in both eukaryotes and bacteria. 43850 pfam03933: Matrix metalloprotease, N-terminal domain. This family is found N-terminal to the catalytic domain of matrixin. 
43851 pfam03934: General secretion pathway protein K. Members of this family are involved in the general secretion pathway. The family includes proteins such as ExeK, PulK, OutX and XcpX. 43852 pfam03935: Beta-glucan synthesis-associated protein (SKN1). This family consists of the beta-glucan synthesis-associated proteins KRE6 and SKN1. Beta1,6-Glucan is a key component of the yeast cell wall, interconnecting cell wall proteins, beta1,3-glucan, and chitin. It has been postulated that the synthesis of beta1,6-glucan begins in the endoplasmic reticulum with the formation of protein-bound primer structures and that these primer structures are extended in the Golgi complex by two putative glucosyltransferases that are functionally redundant, Kre6 and Skn1. This is followed by maturation steps at the cell surface and by coupling to other cell wall macromolecules. 43853 pfam03936: Terpene synthase family, metal binding domain. It has been suggested that this gene family be designated tps (for terpene synthase). It has been split into six subgroups on the basis of phylogeny, called tpsa-tpsf. tpsa includes vetispiradiene synthase, 5-epi-aristolochene synthase, and (+)-delta-cadinene synthase. tpsb includes (-)-limonene synthase. tpsc includes kaurene synthase A. tpsd includes taxadiene synthase, pinene synthase, and myrcene synthase. tpse includes kaurene synthase B. tpsf includes linalool synthase. 43854 pfam03937: TPR repeat region. This family represents a set of three divergent TPR repeats found in a small group of uncharacterised proteins. 43855 pfam03938: Outer membrane protein (OmpH-like). This family includes outer membrane proteins such as OmpH among others. 43856 pfam03939: Ribosomal protein L23, N-terminal domain. The N-terminal domain appears to be specific to the eukaryotic ribosomal proteins L25, L23, and L23a. 43857 pfam03940: Male specific sperm protein. This family of Drosophila proteins is typified by the repetitive motif C-G-P. 
43858 pfam03941: Inner centromere protein, ARK binding region. This region of the inner centromere protein has been found to be necessary and sufficient for binding to aurora-related kinase. This interaction has been implicated in the coordination of chromosome segregation with cell division in yeast. 43859 pfam03942: DTW domain. This presumed domain is found in bacterial and eukaryotic proteins. Its function is unknown. The domain contains multiple conserved motifs including a DTXW motif that this domain has been named after. 43860 pfam03943: TAP C-terminal domain. The vertebrate Tap protein is a member of the NXF family of shuttling transport receptors for nuclear export of mRNA. Tap has a modular structure, and its most C-terminal domain is important for binding to FG repeat-containing nuclear pore proteins (FG-nucleoporins) and is sufficient to mediate nuclear shuttling. The structure of the C-terminal domain is composed of four helices. The structure is related to the UBA domain. 43861 pfam03944: delta endotoxin. This family contains insecticidal toxins produced by Bacillus species of bacteria. During spore formation the bacteria produce crystals of this protein. When an insect ingests these proteins they are activated by proteolytic cleavage. The N terminus is cleaved in all of the proteins and a C terminal extension is cleaved in some members. Once activated the endotoxin binds to the gut epithelium and causes cell lysis leading to death. This activated region of the delta endotoxin is composed of three structural domains. The N-terminal helical domain is involved in membrane insertion and pore formation. The second and third domains are involved in receptor binding. 43862 pfam03945: delta endotoxin, N-terminal domain. This family contains insecticidal toxins produced by Bacillus species of bacteria. During spore formation the bacteria produce crystals of this protein. When an insect ingests these proteins they are activated by proteolytic cleavage. 
The N terminus is cleaved in all of the proteins and a C terminal extension is cleaved in some members. Once activated the endotoxin binds to the gut epithelium and causes cell lysis leading to death. This activated region of the delta endotoxin is composed of three structural domains. The N-terminal helical domain is involved in membrane insertion and pore formation. The second and third domains are involved in receptor binding. 43863 pfam03946: Ribosomal protein L11, N-terminal domain. The N-terminal domain of Ribosomal protein L11 adopts an alpha/beta fold and is followed by the RNA binding C-terminal domain. 43864 pfam03947: Ribosomal Proteins L2, C-terminal domain. 43865 pfam03948: Ribosomal protein L9, C-terminal domain. 43866 pfam03949: Malic enzyme, NAD binding domain. 43867 pfam03950: tRNA synthetases class I (E and Q), anti-codon binding domain. Other tRNA synthetase sub-families are too dissimilar to be included. This family includes only glutamyl and glutaminyl tRNA synthetases. In some organisms, a single glutamyl-tRNA synthetase aminoacylates both tRNA(Glu) and tRNA(Gln). 43868 pfam03951: Glutamine synthetase, beta-Grasp domain. 43869 pfam03952: Enolase, N-terminal domain. 43870 pfam03953: Tubulin/FtsZ family, C-terminal domain. This family includes the tubulin alpha, beta and gamma chains, as well as the bacterial FtsZ family of proteins. Members of this family are involved in polymer formation. FtsZ is the polymer-forming protein of bacterial cell division. It is part of a ring in the middle of the dividing cell that is required for constriction of cell membrane and cell envelope to yield two daughter cells. FtsZ and tubulin are GTPases. FtsZ can polymerise into tubes, sheets, and rings in vitro and is ubiquitous in eubacteria and archaea. Tubulin is the major component of microtubules. 43871 pfam03954: Hepatic lectin, N-terminal domain. 43872 pfam03955: Adenovirus hexon-associated protein (IX). 
Hexon (PF01065) is the major coat protein from adenovirus type 2. Hexon forms a homo-trimer. The 240 copies of the hexon trimer are organised so that 12 lie on each of the 20 facets. The central 9 hexons in a facet are cemented together by 12 copies of polypeptide IX. 43873 pfam03956: Membrane protein of unknown function (DUF340). Members of this family contain a conserved core of four predicted transmembrane segments. Some members have an additional pair of N-terminal transmembrane helices. The functions of the proteins in this family are unknown. 43874 pfam03957: Jun-like transcription factor. 43875 pfam03958: Bacterial type II/III secretion system short domain. This is a short, often repeated, domain found in bacterial type II/III secretory system proteins. All previous NolW-like domains fall into this family. 43876 pfam03959: Domain of unknown function (DUF341). 43877 pfam03960: ArsC family. This family is related to glutaredoxins pfam00462. 43878 pfam03961: Protein of unknown function (DUF342). This family of bacterial proteins has no known function. The proteins are in the region of 500-600 amino acid residues in length. 43879 pfam03962: Mnd1 family. This family of proteins includes MND1 from S. cerevisiae. The mnd1 protein forms a complex with hop2 to promote homologous chromosome pairing and meiotic double-strand break repair. 43880 pfam03963: Flagellar hook capping protein. FlgD is known to be absolutely required for hook assembly, yet it has not been detected in the mature flagellum. It appears to act as a hook-capping protein to enable assembly of hook protein subunits. 43881 pfam03964: Chorion family 2. The chorion genes of Drosophila are amplified in response to developmental signals in the follicle cells of the ovary. 43882 pfam03965: Penicillinase repressor. The penicillinase repressor negatively regulates expression of the penicillinase gene. 
The N-terminal region of this protein is involved in operator recognition, while the C-terminal is responsible for dimerisation of the protein. 43883 pfam03966: Protein of unknown function (DUF343). This family of short proteins has no known function. The bacterial members are about 60-70 amino acids in length and the eukaryotic examples are about 120 amino acids in length. The C terminus contains the strongest conservation. 43884 pfam03967: Photosynthetic reaction centre, H-chain N-terminal region. The family corresponds to the N-terminal cytoplasmic domain. 43885 pfam03968: OstA-like protein. This family of proteins is mostly uncharacterised. However, the family does include E. coli OstA, which has been characterised as an organic solvent tolerance protein. 43886 pfam03969: AFG1-like ATPase. This family of proteins contains a P-loop motif and its members are predicted to be ATPases. 43887 pfam03970: Herpesvirus UL37 tegument protein. UL37 interacts with UL36, which is thought to be an important early step in tegumentation during virion morphogenesis in the cytoplasm. 43888 pfam03971: Monomeric isocitrate dehydrogenase. NADP(+)-dependent isocitrate dehydrogenase (ICD) is an important enzyme of the intermediary metabolism, as it controls the carbon flux within the citric acid cycle and supplies the cell with 2-oxoglutarate EC:1.1.1.42 and NADPH for biosynthetic purposes. 43889 pfam03972: MmgE/PrpD family. This family includes 2-methylcitrate dehydratase EC:4.2.1.79 (PrpD) that is required for propionate catabolism. It catalyses the third step of the 2-methylcitric acid cycle. 43890 pfam03973: Triabin. Triabin is a serine-protease inhibitor. 43891 pfam03974: Ecotin. Ecotin is a broad range serine protease inhibitor, which forms homodimers. The C-terminal region contains the dimerisation motif. 
Interestingly, the binding sites show a fluidity of protein contacts derived from ecotin's innate flexibility in fitting itself to proteases. 43892 pfam03975: CheD. This chemotaxis protein stimulates methylation of MCP proteins. 43893 pfam03976: Domain of unknown function (DUF344). This presumed domain is found in one or two copies per protein. The domain is about 230 amino acids in length and has many conserved motifs that are probably functionally important. 43894 pfam03977: Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit. Members of this family are integral membrane proteins. The decarboxylation reactions they catalyse are coupled to the vectorial transport of Na+ across the cytoplasmic membrane, thereby creating a sodium ion motive force that is used for ATP synthesis. 43895 pfam03978: Borrelia burgdorferi REV protein. This family consists of several REV proteins from Borrelia burgdorferi (Lyme disease spirochete). The function of REV is unknown, although it is known that the gene is induced during the ingestion of host blood, suggesting a role in the metabolic activation of borreliae to adapt to physiological stimuli. 43896 pfam03979: Sigma-70 factor, region 1.1. Region 1.1 modulates DNA binding by regions 2 and 4 when sigma is unbound by the core RNA polymerase. Region 1.1 is also involved in promoter binding. 43897 pfam03980: Nnf1. NNF1 is an essential yeast gene required for proper spindle orientation, nucleolar and nuclear envelope structure and mRNA export. 43898 pfam03981: Ubiquinol-cytochrome C chaperone. 43899 pfam03982: Diacylglycerol acyltransferase. The terminal step of triacylglycerol (TAG) formation is catalysed by the enzyme diacylglycerol acyltransferase (DAGAT). 43900 pfam03983: SLA1 homology domain 1, SHD1. NPFXD peptides specifically interact with the SHD1 domain. NPFXD is a clathrin-facilitated endocytic targeting signal. 
NPFXD was originally discovered in the cytoplasmic domain of the furin-like protease Kex2p. Sla1 is thought to function as an endocytic adaptor. 43901 pfam03984: Repeat of unknown function (DUF346). This repeat was found as seven tandem copies in one protein. It is predicted to be composed of beta-strands. Thus it is likely that it forms a beta-propeller structure. It is found in association with BNR repeats, which also form a beta-propeller. 43902 pfam03985: Paf1. Members of this family are components of the RNA polymerase II associated Paf1 complex. The Paf1 complex functions during the elongation phase of transcription in conjunction with Spt4-Spt5 and Spt16-Pob3i. 43903 pfam03986: Autophagocytosis associated protein, N-terminal domain. Autophagocytosis is a starvation-induced process responsible for transport of cytoplasmic proteins to the vacuole. 43904 pfam03987: Autophagocytosis associated protein, C-terminal domain. Autophagocytosis is a starvation-induced process responsible for transport of cytoplasmic proteins to the vacuole. 43905 pfam03988: Repeat of Unknown Function (DUF347). This repeat is found as four tandem repeats in a family of bacterial membrane proteins. Each repeat contains two transmembrane regions and a conserved tryptophan. 43906 pfam03989: DNA gyrase C-terminal domain, beta-propeller. This repeat is found as 6 tandem copies at the C-termini of GyrA and ParC DNA gyrases. It is predicted to form 4 beta strands and to probably form a beta-propeller structure. This region has been shown to bind DNA non-specifically and may stabilise the DNA-topoisomerase complex. 43907 pfam03990: Domain of unknown function (DUF348). This domain normally occurs as tandem repeats; however it is found as a single copy in the S. cerevisiae DNA-binding nuclear protein YCR593. 43908 pfam03991: Copper binding octapeptide repeat. This repeat is found at the amino terminus of prion proteins. It has been shown to bind to copper. 
43909 pfam03992: Antibiotic biosynthesis monooxygenase. This domain is found in monooxygenases involved in the biosynthesis of several antibiotics by Streptomyces species. Its occurrence as a repeat in Streptomyces coelicolor SCO1909 is suggestive that the other proteins function as multimers. There is also a conserved histidine which is likely to be an active site residue. 43910 pfam03993: Domain of Unknown Function (DUF349). This domain is found singly or as up to five tandem repeats in a small set of bacterial proteins. There are two or three alpha-helices, and possibly a beta-strand. 43911 pfam03994: Domain of Unknown Function (DUF350). This domain occurs in a small set of bacterial proteins. It has two transmembrane regions, and often occurs as tandem repeats. There are no conserved catalytic residues. 43912 pfam03995: Domain of Unknown Function (DUF351). This domain is currently only found in a small set of S. coelicolor secreted proteins. There are four conserved cysteines that probably form two disulphide bonds. Proteins 2SCK31.15C and SCO3675 also have probable beta-propellers at their C-termini. 43913 pfam03996: Hemagglutinin esterase. 43914 pfam03997: VPS28 protein. 43915 pfam03998: Utp11 protein. This protein is found to be part of a large ribonucleoprotein complex containing the U3 snoRNA. Depletion of the Utp proteins impedes production of the 18S rRNA, indicating that they are part of the active pre-rRNA processing complex. This large RNP complex has been termed the small subunit (SSU) processome. 43916 pfam03999: Microtubule associated protein (MAP65/ASE1 family). 43917 pfam04000: Sas10/Utp3 family. This family contains Sas10, which has been identified as a regulator of chromatin silencing. The family also contains Utp3, a component of the U3 ribonucleoprotein complex. The exact molecular function of this family is unknown. 43918 pfam04001: Domain of unknown function (DUF352). Domain of unknown function found in yeast. 
43919 pfam04002: RadC, DNA repair protein. RadC plays a role in repair of DNA damage after UV and X-ray irradiation in prokaryotes. The E. coli radC gene encodes a RecG-like DNA recombination/repair function. RadC may function specifically in recombinational repair that is associated with the replication fork. 43920 pfam04003: Dip2/Utp12 Family. This domain is found at the C-terminus of proteins containing WD40 repeats. These proteins are part of the U3 ribonucleoprotein; the yeast protein is called Utp12 or DIP2. 43921 pfam04004: Leo1-like protein. Members of this family are part of the Paf1/RNA polymerase II complex. The Paf1 complex probably functions during the elongation phase of transcription. 43922 pfam04005: Hus1-like protein. Hus1, Rad1, and Rad9 are three evolutionarily conserved proteins required for checkpoint control in fission yeast. These proteins are known to form a stable complex in vivo. The Hus1-Rad1-Rad9 complex may form a PCNA-like ring structure, and could function as a sliding clamp during checkpoint control. 43923 pfam04006: Mpp10 protein. This family includes proteins related to Mpp10 (M phase phosphoprotein 10). The U3 small nucleolar ribonucleoprotein (snoRNP) is required for three cleavage events that generate the mature 18S rRNA from the pre-rRNA. In Saccharomyces cerevisiae, depletion of Mpp10, a U3 snoRNP-specific protein, halts 18S rRNA production and impairs cleavage at the three U3 snoRNP-dependent sites. 43924 pfam04007: Protein of unknown function (DUF354). Members of this family are around 350 amino acids in length. They are found in archaebacteria and have no known function. 43925 pfam04008: Protein of unknown function (DUF355). Members of this family are around 160 amino acids in length and are mainly found in archaebacteria, with a small number of eubacterial examples. The high level of conservation in this family suggests some as yet unknown important biological function. 43926 pfam04009: Protein of unknown function (DUF356). 
Members of this family are around 120 amino acids in length and are found in some archaebacteria. The function of this family is unknown. However it contains a conserved motif IHPPAH that may be involved in its function. 43927 pfam04010: Protein of unknown function (DUF357). Members of this family are short (less than 100 amino acid) proteins found in archaebacteria. The function of these proteins is unknown. 43928 pfam04011: LemA family. The members of this family are related to the LemA protein. LemA contains an amino terminal predicted transmembrane helix. It has been predicted that the small amino terminus is extracellular. The exact molecular function of this protein is uncertain. 43929 pfam04012: PspA/IM30 family. This family includes PspA, a protein that suppresses sigma54-dependent transcription. The PspA protein, a negative regulator of the Escherichia coli phage shock psp operon, is produced when virulence factors are exported through secretins in many Gram-negative pathogenic bacteria and its homologue in plants, VIPP1, plays a critical role in thylakoid biogenesis, essential for photosynthesis. Activation of transcription by the enhancer-dependent bacterial sigma(54) containing RNA polymerase occurs through ATP hydrolysis-driven protein conformational changes enabled by activator proteins that belong to the large AAA(+) mechanochemical protein family. It has been shown that PspA directly and specifically acts upon and binds to the AAA(+) domain of the PspF transcription activator. 43930 pfam04013: Protein of unknown function (DUF358). The proteins in this family are around 200 amino acids long with the exception of a member that has an additional 100 amino acids at its amino terminus. The function of these bacterial proteins is unknown; however, they do contain several conserved histidines and aspartates that might form a metal binding site. 43931 pfam04014: SpoVT / AbrB like domain. One member of this family is AbrB from Bacillus subtilis. 
The product of the abrB gene is an ambiactive repressor and activator of the transcription of genes expressed during the transition state between vegetative growth and the onset of stationary phase and sporulation. AbrB is thought to interact directly with the transcription initiation regions of genes under its control. AbrB contains a helix-turn-helix structure, but this domain ends before the helix-turn-helix begins. The product of the Bacillus subtilis gene spoVT is another member of this family and is also a transcriptional regulator. DNA binding activity in the Bacillus 0 AbrB homologue requires hexamerisation. Another family member has been isolated from the archaeon Sulfolobus solfataricus and has been identified as a homologue of bacterial repressor-like proteins. The E. coli family member SohA or Prl1F appears to be bifunctional and is able to regulate its own expression as well as relieve the export block imposed by high-level synthesis of beta-galactosidase hybrid proteins. 43932 pfam04015: Domain of unknown function (DUF362). Sometimes present in iron-sulphur proteins. 43933 pfam04016: Domain of unknown function (DUF364). Archaeal domain of unknown function. 43934 pfam04017: Domain of unknown function (DUF366). Archaeal domain of unknown function. 43935 pfam04018: Domain of unknown function (DUF368). Predicted transmembrane domain of unknown function. Family members have between 6 and 9 predicted transmembrane segments. 43936 pfam04019: Protein of unknown function (DUF359). This family of archaebacterial proteins are about 170 amino acids in length. They have no known function. The most conserved portion of the protein contains the sequence GEEDL that may be important for its function. 43937 pfam04020: Membrane protein of unknown function. These proteins are predicted transmembrane proteins with probably four transmembrane spans. The function of these bacterial proteins is unknown. 
The sequences do not appear to contain any conserved polar residues that could form an active site. 43938 pfam04021: Protein of unknown function (DUF361). This family of archaeal proteins has no known function. They contain an amino terminal motif QXSXEXXXL that is likely to be functionally important. 43939 pfam04022: Staphylocoagulase repeat. 43940 pfam04023: FeoA domain. This family includes FeoA, a small protein probably involved in Fe2+ transport. This presumed short domain is also found at the C-terminus of a variety of metal dependent transcriptional regulators. This suggests that this domain may be metal-binding. In most cases this is likely to be either iron or manganese. 43941 pfam04024: PspC domain. This family includes Phage shock protein C (PspC) that is thought to be a transcriptional regulator. The presumed domain is 60 amino acid residues in length. 43942 pfam04025: Domain of unknown function (DUF370). Bacterial domain of unknown function. 43943 pfam04026: SpoVG. Stage V sporulation protein G. Essential for sporulation and specific to stage V sporulation in Bacillus megaterium and subtilis. In B. subtilis, expression decreases after 30-60 minutes of cold shock. 43944 pfam04027: Domain of unknown function (DUF371). Archaeal domain of unknown function. 43945 pfam04028: Domain of unknown function (DUF374). Bacterial domain of unknown function. 43946 pfam04029: 2-phosphosulpholactate phosphatase. Thought to catalyse 2-phosphosulpholactate = sulpholactate + phosphate. Probable magnesium cofactor. Involved in the second step of coenzyme M biosynthesis. Inhibited by vanadate in Methanococcus jannaschii. Also known as the ComB family. 43947 pfam04030: D-arabinono-1,4-lactone oxidase. This domain is specific to D-arabinono-1,4-lactone oxidase EC:1.1.3.37, which is involved in the final step of the D-erythroascorbic acid biosynthesis pathway. 43948 pfam04031: Las1-like. 
Las1 is an essential nuclear protein involved in cell morphogenesis and cell surface growth. 43949 pfam04032: RNAse P Rpr2/Rpp21 subunit domain. This family contains a ribonuclease P subunit of humans and yeast. Other members of the family include the probable archaeal homologues. This subunit possibly binds the precursor tRNA. 43950 pfam04033: Domain of unknown function (DUF365). Archaeal domain of unknown function. 43951 pfam04034: Domain of unknown function (DUF367). 43952 pfam04035: Archaeal DNA-directed RNA polymerase subunit E'' (RpoE'' or RpoE2). Catalyses the transcription of DNA into RNA using the four ribonucleoside triphosphates as substrates. In Sulfolobus acidocaldarius, RpoE2 is one of 13 subunits in the RNA polymerase. RpoE2 in Methanococcus jannaschii contains a predicted C4-type zinc finger at positions 4 to 19 and this sequence has been noted as a potential metal binding motif in S. acidocaldarius. It is possible that family members contain a C4 zinc finger. 43953 pfam04036: Domain of unknown function (DUF372). Domain of unknown function. 43954 pfam04037: Domain of unknown function (DUF382). This domain is specific to the human splicing factor 3b subunit 2 and its orthologues. Splicing factor 3b subunit 2 or SAP145 is a suppressor of U2 snRNA mutations. Pre-mRNA splicing is catalysed by a large ribonucleoprotein complex called the spliceosome. Spliceosomes are multi-component enzymes that catalyse pre-mRNA splicing and form step-wise by the ordered interaction of UsnRNPs and non-snRNP proteins with short conserved regions of the pre-mRNA at the 5' and 3' splice sites and branch site. 43955 pfam04038: Domain of unknown function (DUF381). Archaeal domain of unknown function. Strongly conserved YPLM motif. 43956 pfam04039: Domain related to MnhB subunit of Na+/H+ antiporter. Possible subunit of Na+/H+ antiporter. Predicted integral membrane protein, usually four transmembrane regions in this domain. 
Often found in bacterial NADH dehydrogenase subunit. 43957 pfam04040: Domain of unknown function (DUF375). Domain of unknown function. 43958 pfam04041: Domain of unknown function (DUF377). 43959 pfam04042: DNA polymerase epsilon subunit B. DNA polymerase epsilon is essential for cell viability and chromosomal DNA replication in budding yeast. In addition, DNA polymerase epsilon may be involved in DNA repair and cell-cycle checkpoint control. The enzyme consists of at least four subunits in mammalian cells as well as in yeast. The largest subunit of DNA polymerase epsilon is responsible for polymerase activity. In mouse, the DNA polymerase epsilon subunit B is the second largest subunit of the DNA polymerase. A part of the N-terminal was found to be responsible for the interaction with SAP18. Experimental evidence suggests that this subunit may recruit histone deacetylase to the replication fork to modify the chromatin structure. 43960 pfam04043: Plant invertase/pectin methylesterase inhibitor. This domain inhibits pectin methylesterases (PMEs) and invertases through formation of a non-covalent 1:1 complex. It has been implicated in the regulation of fruit development, carbohydrate metabolism and cell wall extension. It may also be involved in inhibiting microbial pathogen PMEs. It has been observed that it is often expressed as a large inactive preprotein. It is also found at the N-termini of PMEs predicted from DNA sequences (personal obs:C Yeats), suggesting that both PMEs and their inhibitor are expressed as a single polyprotein and subsequently processed. It has two disulphide bridges and is mainly alpha-helical. 43961 pfam04044: Nup133 nucleoporin. RNA undergoing nuclear export first encounters the basket of the nuclear pore. Nup133 is a nucleoporin accessible on the basket side of the pore. 43962 pfam04045: Arp2/3 complex, 34 kD subunit p34-Arc. 
Arp2/3 protein complex has been implicated in the control of actin polymerisation in cells. The human complex consists of seven subunits which include the actin related Arp2 and Arp3, and five others referred to as p41-Arc, p34-Arc, p21-Arc, p20-Arc, and p16-Arc. This family represents the p34-Arc subunit. 43963 pfam04046: PSP. Proline rich domain found in numerous spliceosome associated proteins. 43964 pfam04047: Periodic tryptophan protein 2 WD repeat associated presumed domain. 43965 pfam04048: Sec8 exocyst complex component specific domain. 43966 pfam04049: Anaphase promoting complex subunit 8 / Cdc23. The anaphase-promoting complex is composed of eight protein subunits, including BimE (APC1), CDC27 (APC3), CDC16 (APC6), and CDC23 (APC8). 43967 pfam04050: Up-frameshift suppressor 2. Transcripts harbouring premature signals for translation termination are recognised and rapidly degraded by eukaryotic cells through a pathway known as nonsense-mediated mRNA decay. In Saccharomyces cerevisiae, three trans-acting factors (Upf1 to Upf3) are required for nonsense-mediated mRNA decay. 43968 pfam04051: Transport protein particle (TRAPP) component, Bet3. TRAPP plays a key role in the targeting and/or fusion of ER-to-Golgi transport vesicles with their acceptor compartment. TRAPP is an 800 kDa complex that contains at least 10 subunits. 43969 pfam04052: TolB amino-terminal domain. TolB is an essential periplasmic component of the tol-dependent translocation system. The function of this amino terminal domain is uncertain. 43970 pfam04053: Coatomer WD associated region. 43971 pfam04054: CCR4-Not complex component, Not1. The Ccr4-Not complex is a global regulator of transcription that affects genes positively and negatively and is thought to regulate transcription factor TFIID. 43972 pfam04055: Radical SAM superfamily. 43973 pfam04056: Ssl1-like. Ssl1-like proteins are 40kDa subunits of the Transcription factor II H complex. 
43974 pfam04057: Replication factor-A protein 1, N-terminal domain. 43975 pfam04058: DNA polymerase alpha subunit B. The B subunit of the DNA polymerase alpha plays an essential role at the initial stage of DNA replication in S. cerevisiae and is phosphorylated in a cell cycle-dependent manner. 43976 pfam04059: RNA recognition motif 2. 43977 pfam04060: Putative Fe-S cluster. This family includes a domain with four conserved cysteines that probably form an Fe-S redox cluster. 43978 pfam04061: ORMDL family. Evidence suggests that ORMDLs are involved in protein folding in the ER. 43979 pfam04062: P21-ARC (ARP2/3 complex 21 kDa subunit). The seven component ARP2/3 actin-organising complex is involved in actin assembly and function. 43980 pfam04063: Domain of unknown function (DUF383). 43981 pfam04064: Domain of unknown function (DUF384). 43982 pfam04065: Not1 N-terminal domain, CCR4-Not complex component. 43983 pfam04066: Multiple resistance and pH regulation protein F (MrpF / PhaF). Members of the PhaF / MrpF family are predicted to be integral membrane proteins with three transmembrane regions, involved in regulation of pH. PhaF is part of a potassium efflux system involved in pH regulation. It is also involved in symbiosis in Rhizobium meliloti. MrpF is part of a Na+/H+ antiporter complex, also involved in pH homeostasis. MrpF is thought to be an efflux system for Na+ and cholate. The Mrp system in Bacilli may also have primary energisation capacities. 43984 pfam04068: Possible metal-binding domain in RNase L inhibitor, RLI. Possible metal-binding domain in endoribonuclease RNase L inhibitor. Found at the N-terminal end of RNase L inhibitor proteins, adjacent to the 4Fe-4S binding domain, fer4, pfam00037. Also often found adjacent to the DUF367 domain pfam04034 in uncharacterised proteins. 
The RNase L system plays a major role in the anti-viral and anti-proliferative activities of interferons, and could possibly play a more general role in the regulation of RNA stability in mammalian cells. Inhibitory activity requires concentration-dependent association of RLI with RNase L. 43985 pfam04069: Substrate binding domain of ABC-type glycine betaine transport system. Part of a high affinity multicomponent binding-protein-dependent transport system involved in bacterial osmoregulation. This domain is often fused to the permease component of the transporter complex. Family members are often integral membrane proteins or predicted to be attached to the membrane by a lipid anchor. Glycine betaine is involved in protection from high osmolarity environments, for example in Bacillus subtilis. The family member OpuBC is closely related, and involved in choline transport. Choline is necessary for the biosynthesis of glycine betaine. L-carnitine is important for osmoregulation in Listeria monocytogenes. Family also contains proteins binding l-proline (ProX), histidine (HisX) and taurine (TauA). 43986 pfam04070: Domain of unknown function (DUF378). Predicted transmembrane domain of unknown function. The majority of the family have two predicted transmembrane regions. 43987 pfam04071: Cysteine-rich small domain. Probable metal-binding domain. 43988 pfam04072: Leucine carboxyl methyltransferase. Family of leucine carboxyl methyltransferases EC:2.1.1.-. This family may need dividing, as the full alignment contains a significantly shorter mouse sequence. 43989 pfam04073: YbaK / prolyl-tRNA synthetases associated domain. This domain of unknown function is found in numerous prokaryote organisms. The structure of YbaK shows a novel fold. This domain also occurs in a number of prolyl-tRNA synthetases (proRS) from prokaryotes. Thus, the domain is thought to be involved in oligo-nucleotide binding, with possible roles in recognition/discrimination or editing of prolyl-tRNA. 
43990 pfam04074: Domain of unknown function (DUF386). This family consists of conserved hypothetical proteins, typically about 150 amino acids in length, with no known function. 43991 pfam04075: Domain of unknown function (DUF385). Family of Mycobacterium tuberculosis proteins. 43992 pfam04076: Domain of unknown function (DUF388). 43993 pfam04077: DsrH like protein. DsrH is involved in oxidation of intracellular sulphur in the phototrophic sulphur bacterium Chromatium vinosum D. 43994 pfam04078: Cell differentiation family, Rcd1-like. Two of the members in this family have been characterised as being involved in regulation of Ste11 regulated sex genes. Mammalian Rcd1 is a novel transcriptional cofactor that mediates retinoic acid-induced cell differentiation. 43995 pfam04079: Putative transcriptional regulators (Ypuh-like). This family of conserved bacterial proteins are thought to possibly be helix-turn-helix type transcriptional regulators. 43996 pfam04080: Per1-like. A member of this family has been implicated in protein processing in the endoplasmic reticulum. 43997 pfam04081: DNA polymerase delta, subunit 4. 43998 pfam04082: Fungal specific transcription factor domain. 43999 pfam04083: ab-hydrolase associated lipase region. 44000 pfam04084: Origin recognition complex subunit 2. All DNA replication initiation is driven by a single conserved eukaryotic initiator complex termed the origin recognition complex (ORC). The ORC is a six protein complex. The function of ORC is reviewed in. 44001 pfam04085: rod shape-determining protein MreC. MreC (murein formation C) is involved in the rod shape determination in E. coli, and more generally in cell shape determination of bacteria whether or not they are rod-shaped. 44002 pfam04086: Signal recognition particle, alpha subunit, N-terminal. 
SRP is a complex of six distinct polypeptides and a 7S RNA that is essential for transferring nascent polypeptide chains that are destined for export from the cell to the translocation apparatus of the endoplasmic reticulum (ER) membrane. SRP binds hydrophobic signal sequences as they emerge from the ribosome, and arrests translation. 44003 pfam04087: Domain of unknown function (DUF389). Family of hypothetical bacterial proteins with an undetermined function. 44004 pfam04088: Peroxin 13, N-terminal. Both termini of the Peroxin-13 are oriented to the cytosol. Peroxin-13 is required for peroxisomal association of peroxin-14. 44005 pfam04089: BRICHOS domain. The BRICHOS domain is about 100 amino acids long. It is found in a variety of proteins implicated in dementia, respiratory distress and cancer. 44006 pfam04090: RNA polymerase I specific initiation factor. 44007 pfam04091: Exocyst complex subunit Sec15-like. 44008 pfam04092: SRS domain. Toxoplasma gondii is a persistent protozoan parasite capable of infecting almost any warm-blooded vertebrate. The surface of Toxoplasma is coated with a family of developmentally regulated glycosylphosphatidylinositol (GPI)-linked proteins (SRSs), of which SAG1 is the prototypic member. SRS proteins mediate attachment to host cells and interface with the host immune response to regulate the virulence of the parasite. SAG1 is composed of two disulphide linked SRS domains. These have 6 cysteines that form 1-6,2-5 and 3-4 pairings. The structure of the immunodominant SAG1 antigen reveals a homodimeric configuration. The SRS domain is found in a single copy in the SAG2 proteins. This family of surface antigens are found in other apicomplexans. 44009 pfam04093: rod shape-determining protein MreD. MreD (murein formation D) is involved in the rod shape determination in E. coli, and more generally in cell shape determination of bacteria whether or not they are rod-shaped. 44010 pfam04094: Protein of unknown function (DUF390). 
This family of long proteins is currently only found in the rice genome. They have no known function. However, they may be some kind of transposable element. 44011 pfam04095: Nicotinate phosphoribosyltransferase (NAPRTase) family. Nicotinate phosphoribosyltransferase (EC:2.4.2.11) is the rate limiting enzyme that catalyses the first reaction in the NAD salvage synthesis. This family also includes Pre-B cell enhancing factor that is a cytokine. This family is related to Quinolinate phosphoribosyltransferase pfam01729. 44012 pfam04096: Nucleoporin autopeptidase. 44013 pfam04097: Nucleoporin interacting component. 44014 pfam04098: Rad52/22 family double-strand break repair protein. The DNA single-strand annealing proteins (SSAPs), such as RecT, Red-beta, ERF and Rad52, function in RecA-dependent and RecA-independent DNA recombination pathways. This family includes proteins related to Rad52. These proteins contain two helix-hairpin-helix motifs. 44015 pfam04099: Sybindin-like family. Sybindin is a physiological syndecan-2 ligand on dendritic spines, the small protrusions on the surface of dendrites that receive the vast majority of excitatory synapses. 44016 pfam04100: Vps53-like, N-terminal. Vps53 complexes with Vps52 and Vps54 to form a multi-subunit complex involved in regulating membrane trafficking events. 44017 pfam04101: Glycosyltransferase family 28 C-terminal domain. The glycosyltransferase family 28 includes monogalactosyldiacylglycerol synthase (EC 2.4.1.46) and UDP-N-acetylglucosamine transferase (EC 2.4.1.-). Structural analysis suggests the C-terminal domain contains the UDP-GlcNAc binding site. 44018 pfam04102: SlyX. The SlyX protein has no known function. It is short (less than 80 amino acids) and is found close to the slyD gene. The SlyX protein has a conserved PPH(Y/W) motif at its C-terminus. The protein may be a coiled-coil structure. 44019 pfam04103: CD20/IgE Fc receptor beta subunit family. 
This family includes the CD20 protein and the beta subunit of the high affinity receptor for IgE Fc. The high affinity receptor for IgE is a tetrameric structure consisting of a single IgE-binding alpha subunit, a single beta subunit, and two disulfide-linked gamma subunits. The alpha subunit of Fc epsilon RI and most Fc receptors are homologous members of the Ig superfamily. By contrast, the beta and gamma subunits from Fc epsilon RI are not homologous to the Ig superfamily. Both molecules have four putative transmembrane segments and a probable topology where both amino- and carboxy-termini protrude into the cytoplasm. 44020 pfam04104: Eukaryotic-type DNA primase, large subunit. DNA primase is the polymerase that synthesises small RNA primers for the Okazaki fragments made during discontinuous DNA replication. DNA primase is a heterodimer of two subunits, the small subunit Pri1 (48 kDa in yeast), and the large subunit Pri2 (58 kDa in the yeast S. cerevisiae). Both subunits participate in the formation of the active site, but the ATP binding site is located on the small subunit. Primase function has also been demonstrated for human and mouse primase subunits. 44021 pfam04106: Autophagy protein Apg5. Apg5 is directly required for the import of aminopeptidase I via the cytoplasm-to-vacuole targeting pathway. 44022 pfam04107: Glutamate-cysteine ligase family 2 (GCS2). Also known as gamma-glutamylcysteine synthetase and gamma-ECS (EC:6.3.2.2). This enzyme catalyses the first and rate limiting step in de novo glutathione biosynthesis. Members of this family are found in archaea, bacteria and plants. May and Leaver discuss the possible evolutionary origins of glutamate-cysteine ligase enzymes in different organisms and suggest that it evolved independently in different eukaryotes, from an ancestral bacterial enzyme. They also state that Arabidopsis thaliana gamma-glutamylcysteine synthetase is structurally unrelated to mammalian, yeast and Escherichia coli homologues. 
In plants, there are separate cytosolic and chloroplast forms of the enzyme. 44023 pfam04108: Autophagy protein Apg17. Apg17 is required for activating Apg1 protein kinases. 44024 pfam04109: Autophagy protein Apg9. In yeast, 15 Apg proteins coordinate the formation of autophagosomes. Autophagy is a bulk degradation process induced by starvation in eukaryotic cells. Apg9 plays a direct role in the formation of the cytoplasm to vacuole targeting and autophagic vesicles, possibly serving as a marker for a specialised compartment essential for these vesicle-mediated alternative targeting pathways. 44025 pfam04110: Autophagy protein Apg12. In yeast, 15 Apg proteins coordinate the formation of autophagosomes. Autophagy is a bulk degradation process induced by starvation in eukaryotic cells. Apg12 is covalently bound to Apg5. 44026 pfam04111: Autophagy protein Apg6. In yeast, 15 Apg proteins coordinate the formation of autophagosomes. Autophagy is a bulk degradation process induced by starvation in eukaryotic cells. Apg6/Vps30p has two distinct functions in the autophagic process, either associated with the membrane or in a retrieval step of the carboxypeptidase Y sorting pathway. 44027 pfam04112: Mak10 subunit, NatC N(alpha)-terminal acetyltransferase. NatC N(alpha)-terminal acetyltransferases contain Mak10p, Mak31p and Mak3p subunits. All three subunits are associated with each other to form the active complex. 44028 pfam04113: Gpi16 subunit, GPI transamidase component. GPI (glycosyl phosphatidyl inositol) transamidase is a multi-protein complex. Gpi16, Gpi8 and Gaa1 form a sub-complex of the GPI transamidase. GPI transamidase adds glycosylphosphatidylinositols (GPIs) to newly synthesised proteins. Gpi16 is an essential N-glycosylated transmembrane glycoprotein. Gpi16 is largely found on the lumenal side of the ER. It has a single C-terminal transmembrane domain and a small C-terminal, cytosolic extension with an ER retrieval motif. 
44029 pfam04114: Gaa1-like, GPI transamidase component. GPI (glycosyl phosphatidyl inositol) transamidase is a multi-protein complex. Gpi16, Gpi8 and Gaa1 form a sub-complex of the GPI transamidase. GPI transamidase adds glycosylphosphatidylinositols (GPIs) to newly synthesised proteins. 44030 pfam04115: Ureidoglycolate hydrolase. Ureidoglycolate hydrolase (EC:3.5.3.19) carries out the third step in the degradation of allantoin. 44031 pfam04116: Fatty acid hydroxylase. 44032 pfam04117: Mpv17 / PMP22 family. The 22-kDa peroxisomal membrane protein (PMP22) is a major component of peroxisomal membranes. PMP22 seems to be involved in pore forming activity and may contribute to the unspecific permeability of the organelle membrane. PMP22 is synthesised on free cytosolic ribosomes and then directed to the peroxisome membrane by specific targeting information. Mpv17 is a closely related peroxisomal protein involved in the development of early-onset glomerulosclerosis. 44033 pfam04118: Dopey, N-terminal. DopA is the founding member of the Dopey family and is required for correct cell morphology and spatiotemporal organisation of multicellular structures in the filamentous fungus Aspergillus nidulans. DopA homologues are found in mammals. S. cerevisiae DOP1 is essential for viability and affects cellular morphogenesis. 44034 pfam04119: Heat shock protein 9 / 12. These heat shock proteins (Hsp9 and Hsp12) are strongly expressed, an increase of 100 fold, upon entry into stationary phase in yeast. 44035 pfam04120: Low affinity iron permease. 44036 pfam04121: Nuclear pore protein 84 / 107. Nup84p forms a complex with five proteins, including Nup120p, Nup85p, Sec13p, and a Sec13p homologue. This Nup84p complex in conjunction with Sec13-type proteins is required for correct nuclear pore biogenesis. 44037 pfam04122: Putative cell wall binding repeat 2. This repeat is found in multiple tandem copies in proteins including amidase enhancers and adhesins. 
44038 pfam04123: Domain of unknown function (DUF373). Archaeal domain of unknown function. Predicted to be an integral membrane protein with six transmembrane regions. 44039 pfam04124: Dor1-like family. Dor1 is involved in vesicle targeting to the yeast Golgi apparatus and complexes with a number of other trafficking proteins, which include Sec34 and Sec35. 44040 pfam04126: Domain of unknown function (DUF369). 44041 pfam04127: DNA / pantothenate metabolism flavoprotein. The DNA/pantothenate metabolism flavoprotein (EC:4.1.1.36) affects synthesis of DNA, and pantothenate metabolism. 44042 pfam04128: Partner of SLD five, PSF2. A eukaryotic specific domain of undetermined function. The GINS complex is essential for initiation of DNA replication in Xenopus egg extracts. This 100 kD stable complex includes Sld5, Psf1, Psf2, and Psf3. Homologues of these components are found also in yeasts and in humans. 44043 pfam04129: Vps52 / Sac2 family. Vps52 complexes with Vps53 and Vps54 to form a multi-subunit complex involved in regulating membrane trafficking events. 44044 pfam04130: Spc97 / Spc98 family. The spindle pole body (SPB) functions as the microtubule-organising centre in yeast. Members of this family are spindle pole body (SPB) components such as Spc97 and Spc98 that form a complex with gamma-tubulin. 44045 pfam04131: Putative N-acetylmannosamine-6-phosphate epimerase. This family represents a putative ManNAc-6-P-to-GlcNAc-6P epimerase in the N-acetylmannosamine (ManNAc) utilisation pathway found mainly in pathogenic bacteria. 44046 pfam04132: Vacuolar protein sorting 36. Vps36 is involved in Golgi to endosome trafficking. 44047 pfam04133: Vacuolar protein sorting 55. Vps55 is involved in the secretion of the Golgi form of the soluble vacuolar carboxypeptidase Y, but not the trafficking of the membrane-bound vacuolar alkaline phosphatase. 
Both Vps55 and obesity receptor gene-related protein are important for functioning membrane trafficking to the vacuole/lysosome of eukaryotic cells. 44048 pfam04134: Protein of unknown function, DUF393. Members of this family have two highly conserved cysteine residues near their N-terminus. The function of these proteins is unknown. 44049 pfam04135: Nucleolar RNA-binding protein, Nop10p family. Nop10p is a nucleolar protein that is specifically associated with H/ACA snoRNAs. It is essential for normal 18S rRNA production and rRNA pseudouridylation by the ribonucleoprotein particles containing H/ACA snoRNAs (H/ACA snoRNPs). Nop10p is probably necessary for the stability of these RNPs. 44050 pfam04136: Sec34-like family. Sec34 and Sec35 form a sub-complex, in a seven protein complex that includes Dor1 (pfam04124). This complex is thought to be important for tethering vesicles to the Golgi. 44051 pfam04137: Endoplasmic Reticulum Oxidoreductin 1 (ERO1). Members of this family are required for the formation of disulphide bonds in the ER. 44052 pfam04138: GtrA-like protein. Members of this family are predicted to be integral membrane proteins with three or four transmembrane spans. They are involved in the synthesis of cell surface polysaccharides. The GtrA family is a subset of this family. GtrA is predicted to be an integral membrane protein with 4 transmembrane spans. It is involved in O antigen modification by Shigella flexneri bacteriophage X (SfX), but does not determine the specificity of glucosylation. Its function remains unknown, but it may play a role in translocation of undecaprenyl phosphate linked glucose (UndP-Glc) across the cytoplasmic membrane. Another member of this family is a DTDP-glucose-4-keto-6-deoxy-D-glucose reductase, which catalyses the conversion of dTDP-4-keto-6-deoxy-D-glucose to dTDP-D-fucose, which is involved in the biosynthesis of the serotype-specific polysaccharide antigen of Actinobacillus actinomycetemcomitans Y4 (serotype b). 
This family also includes the teichoic acid glycosylation protein, GtcA, which is a serotype-specific protein in some Listeria innocua and monocytogenes strains. Its exact function is not known, but it is essential for decoration of cell wall teichoic acids with glucose and galactose. 44053 pfam04139: Rad9. Rad9 is required for transient cell-cycle arrests and transcriptional induction of DNA repair in response to DNA damage. 44054 pfam04140: Isoprenylcysteine carboxyl methyltransferase (ICMT) family. The isoprenylcysteine o-methyltransferase (EC:2.1.1.100) family carry out carboxyl methylation of cleaved eukaryotic proteins that terminate in a CaaX motif. In Saccharomyces cerevisiae this methylation is carried out by Ste14p, an integral endoplasmic reticulum membrane protein. Ste14p is the founding member of the isoprenylcysteine carboxyl methyltransferase (ICMT) family, whose members share significant sequence homology. 44055 pfam04141: Protein of unknown function, DUF394. This family includes functionally uncharacterized proteins from such pathogenic bacteria as Helicobacter pylori, Campylobacter jejuni, and Vibrio cholerae. 44056 pfam04142: Nucleotide-sugar transporter. This family of membrane proteins transport nucleotide sugars from the cytoplasm into golgi vesicles. Members transport CMP-sialic acid, UDP-galactose and UDP-GlcNAc, for example. 44057 pfam04143: YeeE/YedE family (DUF395). This family includes YeeE and YedE from E. coli. These proteins are integral membrane proteins of unknown function. Many of these proteins contain two homologous regions that are represented by this family. This region contains several conserved glycines and an invariant cysteine that is probably an important functional residue. 44058 pfam04144: SCAMP family. 
In vertebrates, secretory carrier membrane proteins (SCAMPs) 1-3 constitute a family of putative membrane-trafficking proteins composed of cytoplasmic N-terminal sequences with NPF repeats, four central transmembrane regions (TMRs), and a cytoplasmic tail. SCAMPs probably function in endocytosis by recruiting EH-domain proteins to the N-terminal NPF repeats but may have additional functions mediated by their other sequences. 44059 pfam04145: Ctr copper transporter family. The redox active metal copper is an essential cofactor in critical biological processes such as respiration, iron transport, oxidative stress protection, hormone production, and pigmentation. A widely conserved family of high-affinity copper transport proteins (Ctr proteins) mediates copper uptake at the plasma membrane. A series of clustered methionine residues in the hydrophilic extracellular domain, and an MXXXM motif in the second transmembrane domain, are important for copper uptake. These methionines probably coordinate copper during the process of metal transport. 44060 pfam04146: YT521-B-like family. This family of poorly characterised proteins contains YT521-B, a putative splicing factor from rat. YT521-B is a tyrosine-phosphorylated nuclear protein that interacts with the nuclear transcriptosomal component scaffold attachment factor B, and the 68-kDa Src substrate associated during mitosis, Sam68. In vivo splicing assays demonstrated that YT521-B modulates alternative splice site selection in a concentration-dependent manner. 44061 pfam04147: Nop14-like family. Emg1 and Nop14 are novel proteins whose interaction is required for the maturation of the 18S rRNA and for 40S ribosome production. 44062 pfam04148: Protein of unknown function (DUF396). A family of conserved eukaryotic transmembrane proteins. 44063 pfam04149: Domain of unknown function (DUF397). The function of this family is unknown. It has been suggested that some members of this family are regulators of transcription. 
44064 pfam04150: Exportin-t. Exportin-t is a specific mediator of tRNA export. RanGTP-binding, importin beta-related factor with predominantly nuclear localisation. It shuttles rapidly between nucleus and cytoplasm and interacts with nuclear pore complexes. Exportin-t binds tRNA directly and with high affinity. 44065 pfam04151: Bacterial pre-peptidase C-terminal domain. This domain is normally found at the C-terminus of secreted bacterial peptidases. They are not present in the active peptidase. It is possible that they fulfill a similar role to the PKD (pfam00801) domain, which is also found in this context. Visual analysis suggests that PKD and PPC are distantly related (personal obs: Bateman A, Yeats C). 44066 pfam04152: Mre11 DNA-binding presumed domain. The Mre11 complex is a multi-subunit nuclease that is composed of Mre11, Rad50 and Nbs1/Xrs2, and is involved in checkpoint signalling and DNA replication. Mre11 has an intrinsic DNA-binding activity that is stimulated by Rad50 on its own or in combination with Nbs1. 44067 pfam04153: NOT2 / NOT3 / NOT5 family. NOT1, NOT2, NOT3, NOT4 and NOT5 form a nuclear complex that negatively regulates the basal and activated transcription of many genes. This family includes NOT2, NOT3 and NOT5. 44068 pfam04154: Chromosome condensation protein 3, C-terminal region. Cnd3 is a member of the five subunit condensin complex. Each subunit is essential for mitotic condensation. 44069 pfam04155: Ground-like domain. This family consists of the ground-like domain and is specific to C. elegans. It has been proposed that the ground-like domain containing proteins may bind and modulate the activity of Patched-like membrane molecules, reminiscent of the modulating activities of neuropeptides. 44070 pfam04156: IncA protein. Chlamydia trachomatis is an obligate intracellular bacterium that develops within a parasitophorous vacuole termed an inclusion. 
The inclusion is nonfusogenic with lysosomes but intercepts lipids from a host cell exocytic pathway. Initiation of chlamydial development is concurrent with modification of the inclusion membrane by a set of C. trachomatis-encoded proteins collectively designated Incs. One of these Incs, IncA, is functionally associated with the homotypic fusion of inclusions. 44071 pfam04157: EAP30. EAP30 is a subunit of the ELL complex. The ELL is an 80-kDa RNA polymerase II transcription factor. ELL interacts with three other proteins to form the complex known as ELL complex. The ELL complex is capable of increasing the catalytic rate of transcription elongation, but is unable to repress initiation of transcription by RNA polymerase II as is the case of ELL. EAP30 is thought to lead to the derepression of ELL's transcriptional inhibitory activity. 44072 pfam04158: Sof1-like domain. Sof1 is essential for cell growth and is a component of the nucleolar rRNA processing machinery. 44073 pfam04159: NB glycoprotein. The NB glycoprotein is found in Influenza type B virus. Its function is unknown. 44074 pfam04160: Orf-X protein. This short protein has no known function and is found in Jaagsiekte sheep retrovirus. Jaagsiekte sheep retrovirus (JSRV) is the etiological agent of a contagious lung tumour of sheep known as sheep pulmonary adenomatosis. JSRV exhibits a simple genetic organisation, characteristic of the type D and type B retroviruses, with the canonical retroviral sequences gag, pro, pol and env encoding the structural proteins of the virion. An additional open reading frame (orf-x) of approximately 500 bp overlaps pol. 44075 pfam04161: Arv1-like family. Arv1 is a transmembrane protein with potential zinc-binding motifs. ARV1 is a novel mediator of eukaryotic sterol homeostasis. 44076 pfam04162: Circovirus coat protein (VP1). Circoviruses are small circular single stranded viruses. 
This family includes the VP1 protein from the chicken anaemia virus which is the viral coat protein. 44077 pfam04163: Tht1-like nuclear fusion protein. 44078 pfam04164: Protein of unknown function, DUF400. This family includes functionally uncharacterized proteins from such pathogenic bacteria as Helicobacter pylori, Campylobacter jejuni, and Vibrio cholerae. The Helicobacter pylori protein consists of two copies of this domain. 44079 pfam04165: Protein of unknown function (DUF401). Members of this family are predicted to have 10 transmembrane regions. 44080 pfam04166: Pyridoxal phosphate biosynthetic protein PdxA. In Escherichia coli the coenzyme pyridoxal 5'-phosphate is synthesised de novo by a pathway that is thought to involve the condensation of 4-(phosphohydroxy)-L-threonine and 1-deoxy-D-xylulose, catalysed by the enzymes PdxA and PdxJ, to form either pyridoxine (vitamin B6) or pyridoxine 5'-phosphate. 44081 pfam04167: Protein of unknown function (DUF402). Family member FomD is a predicted protein from a fosfomycin biosynthesis gene cluster in Streptomyces wedmorensis. Its function is unknown. 44082 pfam04168: Bacterial domain of unknown function (DUF403). 44083 pfam04169: Domain of unknown function (DUF404). 44084 pfam04170: Uncharacterized lipoprotein NlpE involved in copper resistance. This family represents a bacterial outer membrane lipoprotein that is necessary for signalling by the Cpx pathway. This pathway responds to cell envelope disturbances and increases the expression of periplasmic protein folding and degradation factors. While the molecular function of the NlpE protein is unknown, it may be involved in detecting bacterial adhesion to abiotic surfaces. In Escherichia coli and Salmonella typhi, NlpE is also known to confer copper tolerance in copper-sensitive strains of Escherichia coli, and may be involved in copper efflux and delivery of copper to copper-dependent enzymes. 44085 pfam04171: Protein of unknown function (DUF405). 
Predicted to be an integral membrane protein. Several family members are annotated as potential transport proteins, but there is no experimental evidence to suggest the function of any family member. 44086 pfam04172: LrgB-like family. The two products of the lrgAB operon are potential membrane proteins, and LrgA and LrgB are both thought to control murein hydrolase activity and penicillin tolerance. 44087 pfam04173: DoxD-like family. DoxD is a subunit of the terminal quinol oxidase present in the plasma membrane of Acidianus ambivalens, with calculated molecular mass of 20.4 kDa. 44088 pfam04174: Domain of unknown function (DUF407). 44089 pfam04175: Protein of unknown function (DUF406). Members of this family appear to be found only in gamma proteobacteria. The function of this protein family is undetermined. 44090 pfam04176: TIP41-like family. The TOR signalling pathway activates a cell-growth program in response to nutrients. TIP41 interacts with TAP42 and negatively regulates the TOR signalling pathway. 44091 pfam04177: TAP42-like family. The TOR signalling pathway activates a cell-growth program in response to nutrients. TIP41 (pfam04176) interacts with TAP42 and negatively regulates the TOR signalling pathway. 44092 pfam04178: Got1-like family. Traffic through the yeast Golgi complex depends on a member of the syntaxin family of SNARE proteins, Sed5, present in early Golgi cisternae. Got1 is thought to facilitate Sed5-dependent fusion events. 44093 pfam04179: Initiator tRNA phosphoribosyl transferase. This enzyme (EC:2.4.2.-) modifies exclusively the initiator tRNA in position 64 using 5'-phosphoribosyl-1'-pyrophosphate as the modification donor. As the initiator tRNA participates both in the initiation and elongation of translation, the 2'-O-ribosyl phosphate modification discriminates the initiator tRNAs from the elongator tRNAs. 44094 pfam04180: Low temperature viability protein. 44095 pfam04181: Domain of Unknown Function (DUF408). 
44096 pfam04182: B-block binding subunit of TFIIIC. Yeast transcription factor IIIC (TFIIIC) is a multi-subunit protein complex that interacts with two control elements of class III promoters called the A and B blocks. This family represents the subunit within TFIIIC involved in B-block binding. 44097 pfam04183: IucA / IucC family. IucA and IucC catalyse discrete steps in biosynthesis of the siderophore aerobactin from N epsilon-acetyl-N epsilon-hydroxylysine and citrate. 44098 pfam04184: ST7 protein. The ST7 (for suppression of tumorigenicity 7) protein is thought to be a tumour suppressor. The molecular function of this protein is uncertain. 44099 pfam04185: Phosphoesterase family. This family includes both bacterial phospholipase C enzymes EC:3.1.4.3 and eukaryotic acid phosphatases EC:3.1.3.2. 44100 pfam04186: FxsA cytoplasmic membrane protein. This is a bacterial family of cytoplasmic membrane proteins. It includes two transmembrane regions. The molecular function of FxsA is unknown, but in Escherichia coli its over-expression has been shown to alleviate the exclusion of phage T7 in those cells with an F plasmid. 44101 pfam04187: Protein of unknown function, DUF399. No function is known for any member of this family. 44102 pfam04188: Protein of unknown function (DUF409). Family of eukaryotic membrane proteins with unknown function. 44103 pfam04189: Eukaryotic initiation factor 3, gamma subunit. eIF-3 is a multi-subunit complex that stimulates translation initiation in vitro at several different steps. This family corresponds to the gamma subunit of eIF3. 44104 pfam04190: Protein of unknown function (DUF410). Family of conserved eukaryotic proteins with undetermined function. 44105 pfam04191: Phospholipid methyltransferase. The S. cerevisiae phospholipid methyltransferase (EC:2.1.1.16) has a broad substrate specificity for unsaturated phospholipids. 44106 pfam04192: Utp21 specific WD40 associated putative domain. 
Utp21 is a subunit of U3 snoRNP, which is essential for synthesis of 18S rRNA. 44107 pfam04193: PQ loop repeat. Members of this family are all membrane bound proteins possessing a pair of repeats each spanning two transmembrane helices connected by a loop. The PQ motif found on loop 2 is critical for the localisation of cystinosin to lysosomes. However, the PQ motif appears not to be a general lysosome-targeting motif. It is thought likely to possess a more general function. Most probably this involves a glutamine residue. 44108 pfam04194: Programmed cell death protein 2, C-terminal putative domain. 44109 pfam04195: Putative gypsy type transposon. This family of plant genes is thought to be related to gypsy type transposons. 44110 pfam04196: Bunyavirus RNA dependent RNA polymerase. The bunyaviruses are enveloped viruses with a genome consisting of 3 ssRNA segments (called L, M and S). The nucleocapsid protein is encoded on the small (S) genomic RNA. The L segment codes for an RNA polymerase. This family contains the RNA dependent RNA polymerase on the L segment. 44111 pfam04197: Birnavirus RNA dependent RNA polymerase (VP1). Birnaviruses are dsRNA viruses. This family corresponds to the RNA dependent RNA polymerase. This protein is also known as VP1. All of the birnavirus VP1 proteins contain conserved RdRp motifs that reside in the catalytic "palm" domain of all classes of polymerases. However, the birnavirus RdRps lack the highly conserved Gly-Asp-Asp (GDD) sequence, a component of the proposed catalytic site of this enzyme family that exists in the conserved motif VI of the palm domain of other RdRps. 44112 pfam04198: Putative sugar-binding domain. This probable domain is found in bacterial transcriptional regulators such as DeoR and SorC. These proteins have an amino-terminal helix-turn-helix pfam00325 that binds to DNA. This domain is probably the ligand regulator binding region. 
SorC is regulated by sorbose and other members of this family are likely to be regulated by other sugar substrates. 44113 pfam04199: Putative cyclase. Proteins in this family are thought to be cyclase enzymes. They are found in proteins involved in antibiotic synthesis. However, they are also found in organisms that do not make antibiotics, pointing to a wider role for these proteins. The proteins contain a conserved motif HXGTHXDXPXH that is likely to form part of the active site. 44114 pfam04200: Lipoprotein associated domain. This presumed domain is about 100 amino acids in length. It is found in lipoproteins of unknown function and is greatly expanded in Mycoplasma pulmonis. The domain is found in up to five copies in some proteins. 44115 pfam04201: Tumour protein D52 family. The hD52 gene was originally identified through its elevated expression level in human breast carcinoma. Cloning of D52 homologues from other species has indicated that D52 may play roles in calcium-mediated signal transduction and cell proliferation. Two human homologues of hD52, hD53 and hD54, have also been identified, demonstrating the existence of a novel gene/protein family. These proteins have an amino terminal coiled-coil that allows members to form homo- and heterodimers with each other. 44116 pfam04202: Foot protein 3. Mytilus foot protein-3 (Mfp-3) is a highly polymorphic protein family located in the byssal adhesive plaques of blue mussels. 44117 pfam04203: Sortase family. The founder member of this family is S. aureus sortase, a transpeptidase that attaches surface proteins by the threonine of an LPXTG motif to the cell wall. 44118 pfam04204: Homoserine O-succinyltransferase. 44119 pfam04205: FMN-binding domain. This conserved region includes the FMN-binding site of the NqrC protein, as well as the NosR and NirI regulatory proteins. 44120 pfam04206: Tetrahydromethanopterin S-methyltransferase, subunit E. 
The N5-methyltetrahydromethanopterin: coenzyme M (EC:2.1.1.86) of Methanosarcina mazei Go1 is a membrane-associated, corrinoid-containing protein that uses a transmethylation reaction to drive an energy-conserving sodium ion pump. 44121 pfam04207: Tetrahydromethanopterin S-methyltransferase, subunit D. The N5-methyltetrahydromethanopterin: coenzyme M (EC:2.1.1.86) of Methanosarcina mazei Go1 is a membrane-associated, corrinoid-containing protein that uses a transmethylation reaction to drive an energy-conserving sodium ion pump. 44122 pfam04208: Tetrahydromethanopterin S-methyltransferase, subunit A. The N5-methyltetrahydromethanopterin: coenzyme M (EC:2.1.1.86) of Methanosarcina mazei Go1 is a membrane-associated, corrinoid-containing protein that uses a transmethylation reaction to drive an energy-conserving sodium ion pump. 44123 pfam04209: homogentisate 1,2-dioxygenase. Homogentisate dioxygenase cleaves the aromatic ring during the metabolic degradation of Phe and Tyr. Homogentisate dioxygenase deficiency causes alkaptonuria. The structure of homogentisate dioxygenase shows that the enzyme forms a hexamer arrangement comprised of a dimer of trimers. The active site iron ion is coordinated near the interface between the trimers. 44124 pfam04210: Tetrahydromethanopterin S-methyltransferase, subunit G. The N5-methyltetrahydromethanopterin: coenzyme M (EC:2.1.1.86) of Methanosarcina mazei Go1 is a membrane-associated, corrinoid-containing protein that uses a transmethylation reaction to drive an energy-conserving sodium ion pump. 44125 pfam04211: Tetrahydromethanopterin S-methyltransferase, subunit C. The N5-methyltetrahydromethanopterin: coenzyme M (EC:2.1.1.86) of Methanosarcina mazei Go1 is a membrane-associated, corrinoid-containing protein that uses a transmethylation reaction to drive an energy-conserving sodium ion pump. 44126 pfam04212: MIT domain. 44127 pfam04213: Htaa. 
This domain is found in HtaA, a secreted protein implicated in iron acquisition and transport. 44128 pfam04214: Protein of unknown function, DUF. The function of the members of this bacterial protein family is unknown. Some members may be involved in conferring cation resistance. 44129 pfam04215: Putative sugar-specific permease, SgaT/UlaA. This family consists of bacterial transmembrane proteins with a putative sugar-specific permease function, analogous to the IIC component of the PTS system (pfam02378). It has been suggested that this permease may form part of an L-ascorbate utilisation pathway, with proposed specificity for 3-keto-L-gulonate (formed by hydrolysis of L-ascorbate). 44130 pfam04216: Protein involved in formate dehydrogenase formation. The function of these proteins is unknown. They may possibly be involved in the formation of formate dehydrogenase. 44131 pfam04217: Protein of unknown function, DUF412. This family consists of bacterial uncharacterised proteins. 44132 pfam04218: CENP-B N-terminal DNA-binding domain. Centromere Protein B (CENP-B) is a DNA-binding protein localised to the centromere. Within the N-terminal 125 residues, there is a DNA-binding region, which binds to a corresponding 17bp CENP-B box sequence. CENP-B dimers either bind two separate DNA molecules or alternatively, they may bind two CENP-B boxes on one DNA molecule, with the intervening stretch of DNA forming a loop structure. The CENP-B DNA-binding domain consists of two repeating domains, RP1 and RP2. This family corresponds to RP1, which has been shown to consist of four helices in a helix-turn-helix structure. 44133 pfam04219: Protein of unknown function, DUF. 44134 pfam04220: Protein of unknown function, DUF414. This family includes several bacterial proteins of unknown function, although at least one member is a putative coproporphyrinogen III oxidase. 44135 pfam04221: RelB antitoxin. RelE and RelB form a toxin-antitoxin system. 
RelE represses translation, probably through binding ribosomes. RelB stably binds RelE, presumably deactivating it. 44136 pfam04222: Protein of unknown function, DUF. This is a bacterial family of uncharacterised proteins. 44137 pfam04223: Citrate lyase, alpha subunit (CitF). In citrate-utilising prokaryotes, citrate lyase EC:4.1.3.6 cleaves intracellular citrate into acetate and oxaloacetate, and is organised as a functional complex consisting of alpha, beta, and gamma subunits. The gamma subunit serves as an acyl carrier protein (ACP), and has a 2'-(5''-phosphoribosyl)-3'-dephospho-CoA prosthetic group. The citrate lyase is active only if this prosthetic group is acetylated; this acetylation is catalysed by an acetate:SH-citrate lyase ligase. The alpha subunit substitutes citryl for the acetyl group to form citryl-S-ACP. The beta subunit completes the reaction by cleaving the citryl to yield oxaloacetate and (regenerated) acetyl-S-ACP. This family represents the alpha subunit EC:2.8.3.10. 44138 pfam04224: Protein of unknown function, DUF417. This family of uncharacterised proteins appears to be restricted to proteobacteria. 44139 pfam04225: Opacity-associated protein A. This family includes the Haemophilus influenzae opacity-associated protein. This protein is required for efficient nasopharyngeal mucosal colonisation, and its expression is associated with a distinctive transparent colony phenotype. OapA is thought to be a secreted protein, and its expression exhibits high-frequency phase variation. 44140 pfam04226: Transglycosylase associated protein. Bacterial protein, predicted to be an integral membrane protein. Some family members have been annotated as transglycosylase associated proteins, but no experimental evidence is provided. 44141 pfam04227: Indigoidine synthase A like protein. Indigoidine is a blue pigment synthesised by Erwinia chrysanthemi implicated in pathogenicity and protection from oxidative stress. 
IdgA is involved in indigoidine biosynthesis, but its specific function is unknown. 44142 pfam04228: Putative neutral zinc metallopeptidase. Members of this family have a predicted zinc binding motif characteristic of neutral zinc metallopeptidases. 44143 pfam04229: Uncharacterised protein family (UPF0157). Also known as GrpB. 44144 pfam04230: Polysaccharide pyruvyl transferase. Pyruvyl-transferases involved in peptidoglycan-associated polymer biosynthesis. CsaB in Bacillus anthracis is necessary for the non-covalent anchoring of proteins containing an SLH (S-layer homology) domain to peptidoglycan-associated pyruvylated polysaccharides. WcaK and AmsJ are involved in the biosynthesis of colanic acid in Escherichia coli and of amylovoran in Erwinia amylovora. 44145 pfam04231: Endonuclease I. Bacterial periplasmic or secreted endonuclease I (EC:3.1.21.1). E. coli endonuclease I (EndoI) is a sequence independent endonuclease located in the periplasm. It is inhibited by different RNA species. It is thought to normally generate double strand breaks in DNA, except in the presence of high salt concentrations and RNA, when it generates single strand breaks in DNA. Its biological role is unknown. Other family members are known to be extracellular. This family also includes a non-specific, Mg2+ activated ribonuclease precursor. 44146 pfam04232: Stage V sporulation protein S (SpoVS). In Bacillus subtilis this protein interferes with sporulation at an early stage and this inhibitory effect is overcome by SpoIIB and SpoVG. SpoVS seems to play a positive role in allowing progression beyond stage V of sporulation. Null mutations in the spoVS gene block sporulation at stage V, impairing the development of heat resistance and coat assembly. 44147 pfam04233: Phage Mu protein F like protein. Members of this family are found in double-stranded DNA bacteriophages, and in some bacteria. A member of this family is required for viral head morphogenesis in bacteriophage SPP1. 
This family is possibly a minor head protein. This family may be related to the family TT_ORF1 (pfam02956). 44148 pfam04234: Copper resistance protein CopC. CopC is a bacterial blue copper protein that binds 1 atom of copper per protein molecule. Along with CopA, CopC mediates copper resistance by sequestration of copper in the periplasm. 44149 pfam04235: Protein of unknown function (DUF418). Probable integral membrane protein. 44150 pfam04236: Tc5 transposase C-terminal domain. This family corresponds to a C-terminal cysteine rich region that probably binds to a metal ion and could be DNA binding (pers. obs. A Bateman). 44151 pfam04237: Protein of unknown function (DUF419). 44152 pfam04238: Protein of unknown function (DUF420). Predicted membrane protein with four transmembrane helices. 44153 pfam04239: Protein of unknown function (DUF421). YDFR family. 44154 pfam04240: Protein of unknown function (DUF422). Predicted to be an integral membrane protein. 44155 pfam04241: Protein of unknown function (DUF423). Potential integral membrane protein. 44156 pfam04242: Protein of unknown function (DUF424). Archaeal protein of unknown function. 44157 pfam04243: tRNA m(1)G methyltransferase. Family member HYNA is the product of a novel gene expressed in human liver cancer tissue. This methyltransferase is responsible for methylation of tRNA at the N-1 position of guanosine to form m(1)G at position 9 in tRNA. 44158 pfam04244: Deoxyribodipyrimidine photolyase-related protein. This family appears to be related to pfam00875. 44159 pfam04245: 37-kD nucleoid-associated bacterial protein. 44160 pfam04246: Positive regulator of sigma(E), RseC/MucC. This bacterial family of integral membrane proteins represents a positive regulator of the sigma(E) transcription factor, namely RseC/MucC. 
The sigma(E) transcription factor is up-regulated by cell envelope protein misfolding, and regulates the expression of genes that are collectively termed ECF (devoted to Extra-Cellular Functions). In Pseudomonas aeruginosa, de-repression of sigma(E) is associated with the alginate-overproducing phenotype characteristic of chronic respiratory tract colonisation in cystic fibrosis patients. The mechanism by which RseC/MucC positively regulates the sigma(E) transcription factor is unknown. RseC is also thought to have a role in thiamine biosynthesis in Salmonella typhimurium. In addition, this family also includes an N-terminal part of RnfF, a Rhodobacter capsulatus protein, of unknown function, that is essential for nitrogen fixation. This protein also contains an ApbE domain pfam02424, which is itself involved in thiamine biosynthesis. 44161 pfam04247: Invasion gene expression up-regulator, SirB. SirB up-regulates Salmonella typhimurium invasion gene transcription. It is, however, not essential for the expression of these genes. Its function is unknown. 44162 pfam04248: Domain of unknown function (DUF427). 44163 pfam04250: Protein of unknown function (DUF429). 44164 pfam04251: Protein of unknown function (DUF430). 44165 pfam04252: Protein of unknown function (DUF431). 44166 pfam04253: Transferrin receptor-like dimerisation domain. This domain is involved in dimerisation of the transferrin receptor as shown in its crystal structure. 44167 pfam04254: Protein of unknown function (DUF432). Archaeal protein of unknown function. 44168 pfam04255: Protein of unknown function (DUF433). 44169 pfam04256: Protein of unknown function (DUF434). 44170 pfam04257: Exodeoxyribonuclease V, gamma subunit. The Exodeoxyribonuclease V enzyme is a multi-subunit enzyme comprised of the proteins RecB, RecC (this family) and RecD. 
This enzyme plays an important role in homologous genetic recombination, repair of double strand DNA breaks, and resistance to UV irradiation and chemical DNA damage. The enzyme (EC:3.1.11.5) catalyses ssDNA or dsDNA-dependent ATP hydrolysis, hydrolysis of ssDNA or dsDNA and unwinding of dsDNA. 44171 pfam04258: Signal peptide peptidase. The members of this family are membrane proteins. In some proteins this region is found associated with pfam02225. This family corresponds with Merops subfamily A22B, the type example of which is signal peptide peptidase. There is a sequence-similarity relationship with pfam01080. 44172 pfam04259: Small, acid-soluble spore protein, gamma-type. The SASP family is a family of small, glutamine and asparagine-rich peptides that store amino acids in the spores of Bacillus subtilis and related bacteria. 44173 pfam04260: Protein of unknown function (DUF436). Family of bacterial proteins with undetermined function. 44174 pfam04261: Dyp-type peroxidase family. This family of dye-decolourising peroxidases lacks a typical heme-binding region. 44175 pfam04262: Glutamate-cysteine ligase. Family of bacterial glutamate-cysteine ligases (EC:6.3.2.2) that carry out the first step of the glutathione biosynthesis pathway. 44176 pfam04263: Thiamin pyrophosphokinase, catalytic domain. Family of thiamin pyrophosphokinase (EC:2.7.6.2). Thiamin pyrophosphokinase (TPK) catalyses the transfer of a pyrophosphate group from ATP to vitamin B1 (thiamin) to form the coenzyme thiamin pyrophosphate (TPP). Thus, TPK is important for the formation of a coenzyme required for central metabolic functions. The structure of thiamin pyrophosphokinase suggests that the enzyme may operate by a mechanism of pyrophosphoryl transfer similar to those described for pyrophosphokinases functioning in nucleotide biosynthesis. 44177 pfam04264: YceI like family. E. coli YceI is a base-induced periplasmic protein. Its function has not yet been characterised. 
44178 pfam04265: Thiamin pyrophosphokinase, vitamin B1 binding domain. Family of thiamin pyrophosphokinase (EC:2.7.6.2). Thiamin pyrophosphokinase (TPK) catalyses the transfer of a pyrophosphate group from ATP to vitamin B1 (thiamin) to form the coenzyme thiamin pyrophosphate (TPP). Thus, TPK is important for the formation of a coenzyme required for central metabolic functions. The structure of thiamin pyrophosphokinase suggests that the enzyme may operate by a mechanism of pyrophosphoryl transfer similar to those described for pyrophosphokinases functioning in nucleotide biosynthesis. 44179 pfam04266: Protein of unknown function (DUF437). Archaeal protein of unknown function. 44180 pfam04267: Sarcosine oxidase, delta subunit family. Sarcosine oxidase is a hetero-tetrameric enzyme that contains both covalently bound FMN and non-covalently bound FAD and NAD(+). This enzyme catalyses the oxidative demethylation of sarcosine to yield glycine, H2O2, and 5,10-CH2-tetrahydrofolate (H4folate) in a reaction requiring H4folate and O2. 44181 pfam04268: Sarcosine oxidase, gamma subunit family. Sarcosine oxidase is a hetero-tetrameric enzyme that contains both covalently bound FMN and non-covalently bound FAD and NAD(+). This enzyme catalyses the oxidative demethylation of sarcosine to yield glycine, H2O2, and 5,10-CH2-tetrahydrofolate (H4folate) in a reaction requiring H4folate and O2. 44182 pfam04269: Protein of unknown function, DUF440. This family consists of uncharacterised bacterial proteins. 44183 pfam04270: Streptococcal histidine triad protein. All members of this family are proteins from Streptococcal species. The proteins are characterised by having a HxxHxH motif that usually occurs multiple times throughout the protein. 44184 pfam04271: DnaD-like domain. DnaD is a component of the PriA primosome. The PriA primosome functions to recruit the replication fork helicase onto the DNA. 44185 pfam04272: Phospholamban. 
The regulation of calcium levels across the membrane of the sarcoplasmic reticulum involves the interplay of many membrane proteins. Phospholamban is a 52 residue integral membrane protein that is involved in reversibly inhibiting the Ca(2+) pump and regulating the flow of Ca ions across the sarcoplasmic reticulum membrane during muscle contraction and relaxation. Phospholamban is thought to form a pentamer in the membrane. 44186 pfam04273: Protein of unknown function (DUF442). Family of uncharacterised proteins. 44187 pfam04274: Tagatose 1,6-diphosphate aldolase, (LacD). Tagatose 1,6-diphosphate aldolase (EC:4.1.2.40) is part of the tagatose-6-phosphate pathway of galactose-6-phosphate degradation. 44188 pfam04275: Phosphomevalonate kinase. Phosphomevalonate kinase (EC:2.7.4.2) catalyses the phosphorylation of 5-phosphomevalonate into 5-diphosphomevalonate, an essential step in isoprenoid biosynthesis via the mevalonate pathway. This family represents the animal type of the enzyme. The other is the ERG8 type, found in plants and fungi, and some bacteria (see pfam00288). 44189 pfam04276: Protein of unknown function (DUF443). Family of uncharacterised proteins. 44190 pfam04277: Oxaloacetate decarboxylase, gamma chain. 44191 pfam04278: Tic22-like family. The preprotein translocation at the inner envelope membrane of chloroplasts so far involves five proteins: Tic110, Tic55, Tic40, Tic22 (this family) and Tic20. The molecular function of these proteins has not yet been established. 44192 pfam04279: Intracellular septation protein A. 44193 pfam04280: Tim44-like domain. Tim44 is an essential component of the machinery that mediates the translocation of nuclear-encoded proteins across the mitochondrial inner membrane. Tim44 is thought to bind phospholipids of the mitochondrial inner membrane both by electrostatic interactions and by penetrating the polar head group region. 
This family includes the C-terminal region of Tim44 that has been shown to form a stable proteolytic fragment in yeast. This region is also found in a set of smaller bacterial proteins. The molecular function of the bacterial members of this family is unknown but transport seems likely. 44194 pfam04281: Mitochondrial import receptor subunit Tom22. The mitochondrial protein translocase family, which is responsible for movement of nuclear encoded preproteins into mitochondria, is very complex with at least 19 components. These proteins include several chaperone proteins, four proteins of the outer membrane translocase (Tom) import receptor, five proteins of the Tom channel complex, five proteins of the inner membrane translocase (Tim) and three "motor" proteins. This family represents the Tom22 proteins. 44195 pfam04282: Family of unknown function (DUF438). 44196 pfam04283: Protein of unknown function (DUF439). Archaeal protein of unknown function. 44197 pfam04284: Protein of unknown function (DUF441). Predicted to be an integral membrane protein. 44198 pfam04285: Protein of unknown function (DUF444). Bacterial protein of unknown function. One family member is predicted to contain a von Willebrand factor (vWF) type A domain (Smart:VWA). 44199 pfam04286: Protein of unknown function (DUF445). Predicted to be a membrane protein. 44200 pfam04287: Domain of unknown function, DUF446. This family includes an N-terminal region of unknown function from the Erwinia carotovora exoenzyme regulation regulon orf1 protein, which also contains an RNA pseudouridylate synthase domain pfam00849. 44201 pfam04288: MukE-like family. Bacterial protein involved in chromosome partitioning, MukE. 44202 pfam04289: Protein of unknown function (DUF447). Archaeal protein of unknown function. 44203 pfam04290: Tripartite ATP-independent periplasmic transporters, DctQ component. 
The function of the members of this family is unknown, but DctQ homologues are invariably found in the tripartite ATP-independent periplasmic transporters. 44204 pfam04291: Spore maturation protein A. Spore maturation protein A (SpmA) is involved in spore core dehydration in Bacillus subtilis. Spore dehydration is important for heat resistance and for processing of the spore germination protease GPR into an active form. SpmA might be involved in import or export from the forespore, or in modification of the cortex peptidoglycan structure. SpmA is predicted to be an integral membrane protein. 44205 pfam04292: D-galactarate dehydratase / Altronate hydrolase, N terminus. Family members include the N termini of D-galactarate dehydratase (EC:4.2.1.42) which is thought to catalyse the reaction D-galactarate = 5-keto-4-deoxy-D-glucarate + H2O, and altronate hydrolase (altronic acid hydratase, EC:4.2.1.7), which catalyses D-altronate = 2-keto-2-deoxygluconate + H2O. As purified, both enzymes are catalytically inactive in the absence of added Fe2+, Mn2+, and beta-mercaptoethanol. Synergistic activation of altronate hydrolase activity is seen in the presence of both iron and manganese ions, suggesting that the enzyme may have two ion binding sites. Mn2+ appears to be part of the enzyme active centre, but the function of the single bound Fe2+ ion is unknown. The hydratase has no Fe-S core. 44206 pfam04293: SpoVR like protein. One family member is Bacillus subtilis stage V sporulation protein R, which is involved in spore cortex formation. Little is known about cortex biosynthesis, except that it depends on several sigma E controlled genes, including spoVR. 44207 pfam04294: VanW like protein. Family members include vancomycin resistance protein W (VanW). Genes encoding members of this family have been found in vancomycin resistance gene clusters vanB and vanG. The function of VanW is unknown. 44208 pfam04295: D-galactarate dehydratase / Altronate hydrolase, C terminus. 
Family members include the C termini of D-galactarate dehydratase (EC:4.2.1.42) which is thought to catalyse the reaction D-galactarate = 5-keto-4-deoxy-D-glucarate + H2O, and altronate hydrolase (altronic acid hydratase, EC:4.2.1.7), which catalyses D-altronate = 2-keto-2-deoxygluconate + H2O. As purified, both enzymes are catalytically inactive in the absence of added Fe2+, Mn2+, and beta-mercaptoethanol. Synergistic activation of altronate hydrolase activity is seen in the presence of both iron and manganese ions, suggesting that the enzyme may have two ion binding sites. Mn2+ appears to be part of the enzyme active centre, but the function of the single bound Fe2+ ion is unknown. The hydratase has no Fe-S core. 44209 pfam04296: Protein of unknown function (DUF448). 44210 pfam04297: Putative helix-turn-helix protein, YlxM / p13 like. Members of this family are predicted to contain a helix-turn-helix motif, for example residues 37-55 in Mycoplasma mycoides p13. Genes encoding family members are often part of operons that encode components of the SRP pathway, and this protein may regulate the expression of an operon related to the SRP pathway. 44211 pfam04298: Putative neutral zinc metallopeptidase. Zinc metallopeptidase zinc binding regions have been predicted in all family members by a pattern match. 44212 pfam04299: Negative transcriptional regulator. In Bacillus subtilis, family member PAI 2 is involved in the negative regulation of protease synthesis and sporulation. 44213 pfam04300: F-box associated region. Members of this family are associated with F-box domains, hence the name FBA. This domain is probably involved in binding other proteins that will be targeted for ubiquitination. One member has been shown to be involved in binding to N-glycosylated proteins. 44214 pfam04301: Protein of unknown function (DUF452). 44215 pfam04302: Protein of unknown function (DUF451). Putative lipoprotein. 44216 pfam04303: Protein of unknown function (DUF453). 
FldA is thought to be involved in the degradation of the polyaromatic hydrocarbon fluorene by Sphingomonas sp. LB126. 44217 pfam04304: Protein of unknown function (DUF454). Predicted membrane protein. 44218 pfam04305: Protein of unknown function (DUF455). 44219 pfam04306: Protein of unknown function (DUF456). Putative membrane protein. 44220 pfam04307: Predicted membrane-bound metal-dependent hydrolase (DUF457). Family of predicted membrane-bound metal-dependent hydrolases. 44221 pfam04308: Protein of unknown function (DUF458). Family of uncharacterised eubacterial proteins. 44222 pfam04309: Glycerol-3-phosphate responsive antiterminator. Intracellular glycerol is usually converted to glycerol-3-phosphate in an ATP-requiring phosphorylation reaction catalysed by glycerol kinase (GlpK). Glycerol-3-phosphate activates the antiterminator GlpP. 44223 pfam04310: MukB N-terminal. This family represents the N-terminal region of MukB, one of a group of bacterial proteins essential for the movement of nucleoids from mid-cell towards the cell quarters (i.e. chromosome partitioning). The structure of the N-terminal domain consists of an antiparallel six-stranded beta sheet surrounded by one helix on one side and by five helices on the other side. It contains an exposed Walker A loop in an unexpected helix-loop-helix motif (in other proteins, Walker A motifs generally adopt a P loop conformation as part of a strand-loop-helix motif embedded in a conserved topology of alternating helices and (parallel) beta strands). 44224 pfam04311: Protein of unknown function (DUF459). Putative periplasmic protein. 44225 pfam04312: Protein of unknown function (DUF460). Archaeal protein of unknown function. 44227 pfam04314: Protein of unknown function (DUF461). Putative membrane or periplasmic protein. 44228 pfam04315: Protein of unknown function, DUF462. This family consists of bacterial proteins of uncharacterized function. 44229 pfam04316: Anti-sigma-28 factor, FlgM. 
FlgM binds and inhibits the activity of the transcription factor sigma 28. Inhibition of sigma 28 prevents the expression of genes from flagellar transcriptional class 3, which include genes for the filament and chemotaxis. Correctly assembled basal body-hook structures export FlgM, relieving inhibition of sigma 28 and allowing expression of class 3 genes. NMR studies show that free FlgM is mostly unfolded, which may facilitate its export. The C terminal half of FlgM adopts a tertiary structure when it binds to sigma 28. All mutations in FlgM that prevent sigma 28 inhibition affect the C-terminal domain, which is the region thought to constitute the binding domain. A minimal binding domain has been identified between Glu 64 and Arg 88 in Salmonella typhimurium. The N-terminal portion remains unstructured and may be necessary for recognition by the export machinery. 44230 pfam04317: YcjX-like family, DUF463. Some members of this family are thought to possess an ATP-binding domain towards their N-terminus. 44231 pfam04318: Protein of unknown function (DUF468). Family of uncharacterized yeast proteins. 44232 pfam04319: NifZ domain. This short protein is found in the nif (nitrogen fixation) operon. Its function is unknown but is probably involved in nitrogen fixation or regulating some component of this process. This 75 residue region is presumed to be a domain. It is found in isolation in some members and in the amino terminal half of the longer NifZ proteins. 44233 pfam04320: Protein with unknown function (DUF469). Family of bacterial proteins with no known function. 44234 pfam04321: RmlD substrate binding domain. L-rhamnose is a saccharide required for the virulence of some bacteria. Its precursor, dTDP-L-rhamnose, is synthesised by four different enzymes, the final one of which is RmlD. The RmlD substrate binding domain is responsible for binding a sugar nucleotide. 44235 pfam04322: Protein of unknown function (DUF473). Family of uncharacterized Archaeal proteins. 
44236 pfam04323: Protein of unknown function (DUF474). Family of uncharacterized Archaeal/Bacterial proteins. 44237 pfam04324: BFD-like [2Fe-2S] binding domain. The two Fe ions are each coordinated by two conserved cysteine residues. This domain occurs alone in small proteins such as Bacterioferritin-associated ferredoxin (BFD). The function of BFD is not known, but it may be a general redox and/or regulatory component involved in the iron storage or mobilisation functions of bacterioferritin in bacteria. This domain is also found in nitrate reductase proteins in association with Nitrite and sulphite reductase 4Fe-4S domain (pfam01077), Nitrite/Sulfite reductase ferredoxin-like half domain (pfam03460) and Pyridine nucleotide-disulphide oxidoreductase (pfam00070). It is also found in NifU nitrogen fixation proteins, in association with NifU-like N terminal domain (pfam01592) and NifU-like domain (pfam01106). 44238 pfam04325: Protein of unknown function (DUF465). Family members are found in small bacterial proteins, and also in the heavy chains of eukaryotic myosin and kinesin, C terminal of the motor domain (Myosin pfam00063, Kinesin pfam00225). Members of this family may form coiled coil structures. 44239 pfam04326: Divergent AAA domain. This family is related to the pfam00004 family, and presumably has the same function (ATP-binding). 44240 pfam04327: Protein of unknown function (DUF464). 44241 pfam04328: Protein of unknown function (DUF466). Small bacterial protein of unknown function. 44242 pfam04329: Family of unknown function (DUF470). This sequence is usually found in association with DUF471 and DUF472, and occasionally also with UPF0104 (pfam03706) in integral membrane proteins. Together, DUF470, DUF471 and DUF472 make up the C terminal portion of Staphylococcus aureus FmtC / MprF, which is involved in resistance to defensins by the lysinylation of membrane phospholipids. 
DUF470, DUF471 and DUF472 also occur adjacent to the OB-fold nucleic acid binding domain (pfam01336) and tRNA synthetase class II (pfam00152) in Lysyl-tRNA synthases. 44243 pfam04330: Family of unknown function (DUF471). This sequence is usually found in association with DUF470 and DUF472, and occasionally also with UPF0104 (pfam03706) in integral membrane proteins. Together, DUF470, DUF471 and DUF472 make up the C terminal portion of Staphylococcus aureus FmtC / MprF, which is involved in resistance to defensins by the lysinylation of membrane phospholipids. DUF470, DUF471 and DUF472 also occur N terminal to the OB-fold nucleic acid binding domain (pfam01336) and tRNA synthetase class II (pfam00152) in Lysyl-tRNA synthases. 44244 pfam04331: Family of unknown function (DUF472). This sequence is usually found in association with DUF470 and DUF471, and occasionally also with UPF0104 (pfam03706) in integral membrane proteins. Together, DUF470, DUF471 and DUF472 make up the C terminal portion of Staphylococcus aureus FmtC / MprF, which is involved in resistance to defensins by the lysinylation of membrane phospholipids. DUF470, DUF471 and DUF472 also occur N terminal to the OB-fold nucleic acid binding domain (pfam01336) and tRNA synthetase class II (pfam00152) in Lysyl-tRNA synthases. 44245 pfam04332: Protein of unknown function (DUF475). Predicted to be an integral membrane protein with multiple membrane spans. 44246 pfam04333: VacJ like lipoprotein. VacJ is required for the intercellular spreading of Shigella flexneri. It is attached to the outer membrane by a lipid anchor. 44247 pfam04334: Protein of unknown function (DUF478). This family contains uncharacterized proteins encoded on Trypanosoma kinetoplast minicircles. 44248 pfam04335: VirB8 protein. VirB8 is a bacterial virulence protein with cytoplasmic, transmembrane, and periplasmic regions. It is thought that it is a primary constituent of a DNA transporter. 
The periplasmic region interacts with VirB9, VirB10, and itself. 44249 pfam04336: Protein of unknown function, DUF479. This family includes several bacterial proteins of uncharacterised function. 44250 pfam04337: Protein of unknown function, DUF480. This family consists of several proteins of uncharacterised function. 44251 pfam04338: Protein of unknown function, DUF481. This family includes several proteins of uncharacterised function. 44252 pfam04339: Protein of unknown function, DUF482. This family contains several proteins of uncharacterised function. 44253 pfam04340: Protein of unknown function, DUF484. This family consists of several proteins of uncharacterised function. 44254 pfam04341: Protein of unknown function, DUF485. This family includes several putative integral membrane proteins. 44255 pfam04342: Protein of unknown function, DUF486. This family contains several proteins of uncharacterised function. 44256 pfam04343: Protein of unknown function, DUF488. This family includes several proteins of uncharacterised function. 44257 pfam04344: Chemotaxis phosphatase, CheZ. This family represents the bacterial chemotaxis phosphatase, CheZ. This protein forms a dimer characterised by a long four-helix bundle, composed of two helices from each monomer. CheZ dephosphorylates CheY in a reaction that is essential to maintain a continuous chemotactic response to environmental changes. It is thought that CheZ's conserved residue Gln 147 orientates a water molecule for nucleophilic attack at the CheY active site. 44258 pfam04345: Chorismate lyase. Chorismate lyase catalyses the first step in ubiquinone synthesis, i.e. the removal of pyruvate from chorismate, to yield 4-hydroxybenzoate. 44259 pfam04346: Ethanolamine utilisation protein, EutH. EutH is a bacterial membrane protein whose molecular function is unknown. It has been suggested that it may act as an ethanolamine transporter, responsible for carrying ethanolamine from the periplasm to the cytoplasm. 
44260 pfam04347: Flagellar biosynthesis protein, FliO. FliO is an essential component of the flagellum-specific protein export apparatus. It is an integral membrane protein. Its precise molecular function is unknown. 44261 pfam04348: LppC putative lipoprotein. This family includes several bacterial outer membrane antigens, whose molecular function is unknown. 44262 pfam04349: Periplasmic glucan biosynthesis protein, MdoG. This family represents MdoG, a protein that is necessary for the synthesis of periplasmic glucans. The function of MdoG remains unknown. It has been suggested that it may catalyse the addition of branches to a linear glucan backbone. 44263 pfam04350: Pilus assembly protein, PilO. PilO proteins are involved in the assembly of pilin. However, the precise function of this family of proteins is not known. 44264 pfam04351: Pilus assembly protein, PilQ. PilQ is essential for the biogenesis of type IV pili. Its precise function is unknown, but it has been suggested that it may act as a pilus channel in the final stages of pilus assembly. 44265 pfam04352: ProQ activator of osmoprotectant transporter ProP. This family includes ProQ, which is required for full activation of the osmoprotectant transporter, ProP, in Escherichia coli. 44266 pfam04353: Regulator of RNA polymerase sigma(70) subunit, Rsd/AlgQ. This family includes bacterial transcriptional regulators that are thought to act through an interaction with the conserved region 4 of the sigma(70) subunit of RNA polymerase. The Pseudomonas aeruginosa homologue, AlgQ, positively regulates virulence gene expression and is associated with the mucoid phenotype observed in Pseudomonas aeruginosa isolates from cystic fibrosis patients. 44267 pfam04354: ZipA, C-terminal FtsZ-binding domain. This family represents the ZipA C-terminal domain. ZipA is involved in septum formation in bacterial cell division. Its C-terminal domain binds FtsZ, a major component of the bacterial septal ring. 
The structure of this domain is an alpha-beta fold with three alpha helices and a beta sheet of six antiparallel beta strands. The major loops protruding from the beta sheet surface are thought to form a binding site for FtsZ. 44268 pfam04355: SmpA / OmlA family. Bacterial outer membrane lipoprotein, possibly involved in maintaining the structural integrity of the cell envelope. The lipid attachment site is a conserved N terminal cysteine residue. Sometimes found adjacent to the OmpA domain (pfam00691). 44269 pfam04356: Protein of unknown function (DUF489). Protein of unknown function, cotranscribed with purB in Escherichia coli, but with function unrelated to purine biosynthesis. 44270 pfam04357: Family of unknown function (DUF490). 44271 pfam04358: DsrC like protein. One family member has been observed to co-purify with Desulfovibrio vulgaris dissimilatory sulfite reductase, and many members of this family are annotated as the third (gamma) subunit of dissimilatory sulphite reductase. However, this protein appears to be only loosely associated with the sulfite reductase, which suggests that DsrC may not be an integral part of the dissimilatory sulphite reductase. Members of this family are found in organisms such as E. coli and H. influenzae which do not contain dissimilatory sulphite reductases but can synthesise assimilatory sirohaem sulphite and nitrite reductases. It is speculated that DsrC may be involved in the assembly, folding or stabilisation of sirohaem proteins. The strictly conserved cysteine in the C terminus suggests that DsrC may have a catalytic function in the metabolism of sulphur compounds. 44272 pfam04359: Protein of unknown function (DUF493). 44273 pfam04360: Serglycin. Serglycin is the most prevalent proteoglycan produced in haemopoietic cells. Serglycin is a proteinase resistant secretory granule proteoglycan. 44274 pfam04361: Protein of unknown function (DUF494). 
Members of this family of uncharacterised proteins are often named Smg. 44275 pfam04362: Protein of unknown function (DUF495). The initiating methionine is known to be cleaved from Escherichia coli protein YggX. YggX is also known to be highly abundant in E. coli K-12. 44276 pfam04363: Protein of unknown function (DUF496). 44277 pfam04364: DNA polymerase III chi subunit, HolC. The DNA polymerase III holoenzyme (EC:2.7.7.7) is the polymerase responsible for the replication of the Escherichia coli chromosome. The holoenzyme is composed of the DNA polymerase III core, the sliding clamp, and the DnaX clamp loading complex. The DnaX complex contains either the tau or gamma product of gene dnaX, complexed to delta.delta' and to chi psi. Chi forms a 1:1 heterodimer with psi. The chi psi complex functions by increasing the affinity of tau and gamma for delta.delta', allowing a functional clamp-loading complex to form at physiological subunit concentrations. Psi is responsible for the interaction with DnaX (gamma/tau), but psi is insoluble unless it is in a complex with chi. 44278 pfam04365: Protein of unknown function (DUF497). 44279 pfam04366: Family of unknown function (DUF500). Proteins in this family often also contain an SH3 domain (pfam00018), or a FYVE zinc finger (pfam01363). 44280 pfam04367: Protein of unknown function (DUF502). Predicted to be an integral membrane protein. 44281 pfam04368: Protein of unknown function (DUF507). Bacterial protein of unknown function. 44282 pfam04369: Lactococcin-like family. Family of bacteriocins from lactic acid bacteria. 44283 pfam04370: Domain of unknown function (DUF508). Family of uncharacterized proteins from C. elegans. 44284 pfam04371: Porphyromonas-type peptidyl-arginine deiminase. Peptidyl-arginine deiminase (PAD) enzymes catalyse the deimination of the guanidino group from carboxy-terminal arginine residues of various peptides to produce ammonia. 
PAD from Porphyromonas gingivalis (PPAD) appears to be evolutionarily unrelated to mammalian PAD (pfam03068), which is a metalloenzyme. PPAD is thought to belong to the same superfamily as aminotransferase and arginine deiminase, and to form an alpha/beta propeller structure. This family has previously been named PPADH (Porphyromonas peptidyl-arginine deiminase homologues). The predicted catalytic residues in PPAD are Asp130, Asp187, His236, Asp238 and Cys351. These are absolutely conserved with the exception of Asp187 which is absent in two family members. PPAD is also able to catalyse the deimination of free L-arginine, but has primarily peptidyl-arginine specificity. It may have a FMN cofactor. 44285 pfam04373: Protein of unknown function (DUF511). Bacterial protein of unknown function. 44286 pfam04375: HemX. This family consists of several bacterial HemX proteins. The hemX gene is not essential for haem synthesis in B. subtilis. HemX is a polytopic membrane protein which by an unknown mechanism down-regulates the level of HemA. 44287 pfam04376: Arginine-tRNA-protein transferase, N terminus. This family represents the N terminal region of the enzyme arginine-tRNA-protein transferase (EC 2.3.2.8), which catalyses the post-translational conjugation of arginine to the N terminus of a protein. In eukaryotes, this functions as part of the N-end rule pathway of protein degradation by conjugating a destabilising amino acid to the amino terminal aspartate or glutamate of a protein, targeting the protein for ubiquitin-dependent proteolysis. N terminal cysteine is sometimes modified. In S cerevisiae, Cys20, 23, 94 and/or 95 are thought to be important for activity. Of these, only Cys 94 appears to be completely conserved in this family. 44288 pfam04377: Arginine-tRNA-protein transferase, C terminus. 
This family represents the C terminal region of the enzyme arginine-tRNA-protein transferase (EC 2.3.2.8), which catalyses the post-translational conjugation of arginine to the N terminus of a protein. In eukaryotes, this functions as part of the N-end rule pathway of protein degradation by conjugating a destabilising amino acid to the amino terminal aspartate or glutamate of a protein, targeting the protein for ubiquitin-dependent proteolysis. N terminal cysteine is sometimes modified. 44289 pfam04378: Protein of unknown function (DUF519). Bacterial family of unknown function, possibly secreted. 44290 pfam04379: Protein of unknown function (DUF525). Members of this family include the bacterial protein ApaG and the C termini of some F-box proteins (pfam00646). F-box proteins contain a carboxy-terminal domain that interacts with protein substrates, so this family may be involved in protein-protein interaction. The function of ApaG proteins is unknown, but mutations in the Salmonella typhimurium ApaG homologue corD gives a phenotype of low-level cobalt resistance and decreased magnesium efflux by effects on the CorA magnesium transport system. 44291 pfam04380: Protein of unknown function (DUF526).. 44292 pfam04381: Putative exonuclease, RdgC. Members of the RdgC family may have exonuclease activity. RdgC is required for efficient pilin variation in Neisseria gonorrhoeae, suggesting that it may be involved in recombination reactions. In Escherichia coli, RdgC is required for growth in recombination-deficient exonuclease-depleted strains. Under these conditions, RdgC may act as an exonuclease to remove collapsed replication forks, in the absence of the normal repair mechanisms. 44293 pfam04382: SAB domain. This presumed domain is found in proteins containing FERM domains pfam00373. This domain is found to bind to both spectrin and actin, hence the name SAB (Spectrin and Actin Binding) domain. 44294 pfam04383: KilA-N domain. 
The amino-terminal module of the D6R/N1R proteins defines a novel, conserved DNA-binding domain (the KilA-N domain) that is found in a wide range of proteins of large bacterial and eukaryotic DNA viruses. The KilA-N domain is suggested to be homologous to the fungal DNA-binding APSES domain. The KilA-N and APSES domains may also share a common fold with the nucleic acid-binding modules of the LAGLIDADG nucleases and the amino-terminal domains of the tRNA endonuclease. 44295 pfam04384: Protein of unknown function (DUF528). Small bacterial protein of unknown function. 44296 pfam04385: Protein of unknown function, DUF529. This family represents a repeated region found in several Theileria parva proteins. 44297 pfam04386: Stringent starvation protein B. Escherichia coli stringent starvation protein B (SspB) is thought to enhance the specificity of degradation of tmRNA-tagged proteins by the ClpXP protease. The tmRNA tag, also known as ssrA, is an 11-aa peptide that is added to the C terminus of proteins stalled during translation and targets them for degradation by ClpXP and ClpAP. SspB is a cytoplasmic protein that specifically binds to residues 1-4 and 7 of the tag. Binding of SspB enhances degradation of tagged proteins by ClpX, and masks sequence elements important for ClpA interactions, inhibiting degradation by ClpA. However, more recent work has cast doubt on the importance of SspB in wild-type cells. SspB is encoded in an operon whose synthesis is stimulated by carbon, amino acid, and phosphate starvation. SspB may play a special role during nutrient stress, for example by ensuring rapid degradation of the products of stalled translation, without causing a global increase in degradation of all ClpXP substrates. 44298 pfam04387: Protein tyrosine phosphatase-like protein, PTPLA. This family includes the mammalian protein tyrosine phosphatase-like protein, PTPLA. 
A significant variation of PTPLA from other protein tyrosine phosphatases is the presence of proline instead of catalytic arginine at the active site. It is thought that PTPLA proteins have a role in the development, differentiation, and maintenance of a number of tissue types. 44299 pfam04388: Hamartin protein. This family includes the hamartin protein which is thought to function as a tumour suppressor. The hamartin protein interacts with the tuberin protein (pfam03542). Tuberous sclerosis complex (TSC) is an autosomal dominant disorder and is characterised by the presence of hamartomas in many organs, such as brain, skin, heart, lung, and kidney. It is caused by mutation of either the TSC1 or the TSC2 tumour suppressor gene. TSC1 encodes a protein, hamartin, containing two coiled-coil regions, which have been shown to mediate binding to tuberin. The TSC2 gene codes for tuberin (pfam03542). These two proteins function within the same pathway(s) regulating cell cycle, cell growth, adhesion, and vesicular trafficking. 44300 pfam04389: Peptidase family M28. 44301 pfam04390: Rare lipoprotein B family. The Escherichia coli family member has been named Rare lipoprotein B (RplB). Thioglyceride and N-fatty acyl residues may be attached to the N-terminal cysteine, which is conserved in this family. RplB is speculated to be involved in cell duplication. 44302 pfam04391: Protein of unknown function (DUF533). Some family members may be secreted or integral membrane proteins. 44303 pfam04392: Protein of unknown function (DUF534). Putative secreted protein of unknown function. 44304 pfam04393: Protein of unknown function (DUF535). Family member Shigella flexneri VirK is a virulence protein required for the expression, or correct membrane localisation, of IcsA (VirG) on the bacterial cell surface. This family also includes Pasteurella haemolytica lapB, which is thought to be membrane-associated. 44305 pfam04394: Protein of unknown function, DUF536. 
This family aligns the C-terminal region from several bacterial proteins of unknown function that may be involved in a theta-type replication mechanism. 44306 pfam04395: Poxvirus B22R protein. 44307 pfam04396: Protein of unknown function, DUF537. This family represents a conserved region of unknown function within plant proteins. Some family members have one or more zinc-finger motifs towards the C-terminus. 44308 pfam04397: LytTr DNA-binding domain. This domain is found in a variety of bacterial transcriptional regulators. The domain binds to a specific DNA sequence pattern. 44309 pfam04398: Protein of unknown function, DUF538. This family consists of several plant proteins of unknown function. 44310 pfam04399: Glutaredoxin 2, C terminal domain. Glutaredoxins are a multifunctional family of glutathione-dependent disulphide oxidoreductases. Unlike other glutaredoxins, glutaredoxin 2 (Grx2) cannot reduce ribonucleotide reductase. Grx2 has significantly higher catalytic activity in the reduction of mixed disulphides with glutathione (GSH) compared with other glutaredoxins. The active site residues (Cys9-Pro10-Tyr11-Cys12, in Escherichia coli Grx2), which are found at the interface between the N- and C-terminal domains are identical to other glutaredoxins, but there is no other similarity between glutaredoxin 2 and other glutaredoxins. Grx2 is structurally similar to glutathione-S-transferases (GST), but there is no obvious sequence similarity. The inter-domain contacts are mainly hydrophobic, suggesting that the two domains are unlikely to be stable on their own. Both domains are needed for correct folding and activity of Grx2. It is thought that the primary function of Grx2 is to catalyse reversible glutathionylation of proteins with GSH in cellular redox regulation including the response to oxidative stress. 44311 pfam04400: Protein of unknown function (DUF539). Putative periplasmic protein. 44312 pfam04401: Protein of unknown function (DUF540). 
Uncharacterised bacterial integral membrane protein, possibly involved in cysteine biosynthesis. Speculated to be involved in sulphate transport. 44313 pfam04402: Protein of unknown function (DUF541). Members of this family have so far been found in bacteria and mouse. However, possible family members have also been identified in translated rat (Genbank:AW144450) and human (Genbank:AI478629) ESTs. A mouse family member has been named SIMPL (signalling molecule that associates with mouse pelle-like kinase). SIMPL appears to facilitate and/or regulate complex formation between IRAK/mPLK (IL-1 receptor-associated kinase) and IKK (inhibitor of kappa-B kinase) containing complexes, and thus regulate NF-kappa-B activity. Separate experiments demonstrate that a mouse family member (named LaXp180) binds the Listeria monocytogenes surface protein ActA, which is a virulence factor that induces actin polymerisation. It may also bind stathmin, a protein involved in signal transduction and in the regulation of microtubule dynamics. In bacteria its function is unknown, but it is thought to be located in the periplasm or outer membrane. 44314 pfam04403: Paraquat-inducible protein A. Paraquat is a superoxide radical-generating agent. The promoter for the pqiA gene is also inducible by other known superoxide generators. This is predicted to be a family of integral membrane proteins, possibly located in the inner membrane. This family is related to NADH dehydrogenase subunit 2 (pfam00361). 44315 pfam04404: ERF superfamily. The DNA single-strand annealing proteins (SSAPs), such as RecT, Red-beta, ERF and Rad52, function in RecA-dependent and RecA-independent DNA recombination pathways. This family includes proteins related to ERF. 44316 pfam04405: Domain of Unknown function (DUF542). This domain is always found in conjunction with the HHE domain (pfam03794) at the N-terminus. 44317 pfam04406: Type IIB DNA topoisomerase. 
Type II DNA topoisomerases are ubiquitous enzymes that catalyse the ATP-dependent transport of one DNA duplex through a second DNA segment via a transient double-strand break. Type II DNA topoisomerases are now subdivided into two sub-families, type IIA and IIB DNA topoisomerases. TP6A_N is present in type IIB topoisomerase and is thought to be involved in DNA binding owing to its sequence similarity to E. coli catabolite activator protein (CAP). 44318 pfam04407: Protein of unknown function (DUF531). Family of hypothetical archaeal proteins. 44319 pfam04408: Helicase associated domain (HA2). This presumed domain is about 90 amino acid residues in length. It is found in a diverse set of RNA helicases. Its function is unknown; however, it seems likely to be involved in nucleic acid binding. 44320 pfam04409: Protein of unknown function (DUF530). Family of hypothetical archaeal proteins. 44321 pfam04410: Gar1 protein RNA binding region. Gar1 is a small nucleolar RNP that is required for pre-mRNA processing and pseudouridylation. It is co-immunoprecipitated with the H/ACA families of snoRNAs. This family represents the conserved central region of Gar1. This region is necessary and sufficient for normal cell growth, and specifically binds two snoRNAs, snR10 and snR30. This region is also necessary for nucleolar targeting, and it is thought that the protein is co-transported to the nucleolus as part of a nucleoprotein complex. In humans, Gar1 is also a component of telomerase in vivo. 44322 pfam04411: Protein of unknown function (DUF524). Family of hypothetical prokaryotic proteins. 44323 pfam04412: Protein of unknown function (DUF521). Family of hypothetical proteins. 44324 pfam04413: 3-Deoxy-D-manno-octulosonic-acid transferase (kdotransferase). Members of this family transfer activated sugars to a variety of substrates, including glycogen, fructose-6-phosphate and lipopolysaccharides. Members of the family transfer UDP, ADP, GDP or CMP linked sugars. 
The Glycos_transf_N region is flanked at the N-terminus by a signal peptide and at the C-terminus by Glycos_transf_1 (pfam00534). The eukaryotic glycogen synthases may be distant members of this bacterial family. 44325 pfam04414: Protein of unknown function (DUF516). Family of hypothetical proteins. Present in prokaryotes and Arabidopsis. 44326 pfam04415: Protein of unknown function (DUF515). Family of hypothetical Archaeal proteins. 44327 pfam04416: Protein of unknown function (DUF509). Family of hypothetical archaeal proteins. 44328 pfam04417: Protein of unknown function (DUF501). Family of uncharacterised bacterial proteins. 44329 pfam04418: Domain of unknown function (DUF543). This family of short eukaryotic proteins has no known function. Most of the members of this family are only 80 amino acid residues long. However the Arabidopsis homologue is over 300 residues long. The presumed domain contains a conserved amino terminal cysteine and a conserved motif GXGXGXG in the carboxy terminal half that may be functionally important. 44330 pfam04419: 4F5 protein family. Members of this family are short proteins that are rich in aspartate, glutamate, lysine and arginine. Although the function of these proteins is unknown, they are found to be ubiquitously expressed. 44331 pfam04420: CHD5-like protein. Members of this family are probably coiled-coil proteins that are similar to the CHD5 (Congenital heart disease 5) protein. The exact molecular function of these eukaryotic proteins is unknown. 44332 pfam04421: Mss4 protein. 44333 pfam04422: Coenzyme F420 hydrogenase/dehydrogenase, beta subunit N terminus. Coenzyme F420 hydrogenase (EC:1.12.99.1) reduces the low-potential two-electron acceptor coenzyme F420. This family contains the N termini of F420 hydrogenase and dehydrogenase beta subunits. The N terminus of Methanobacterium formicicum formate dehydrogenase beta chain (EC:1.2.1.2) is also a member of this family. 
This region is often found in association with the 4Fe-4S binding domain, fer4 (pfam00037).. 44334 pfam04423: Rad50 zinc hook motif. The Mre11 complex (Mre11 Rad50 Nbs1) is central to chromosomal maintenance and functions in homologous recombination, telomere maintenance and sister chromatid association. The Rad50 coiled-coil region contains a dimer interface at the apex of the coiled coils in which pairs of conserved Cys-X-X-Cys motifs form interlocking hooks that bind one Zn ion. This alignment includes the zinc hook motif and a short stretch of coiled-coil on either side. 44335 pfam04424: Protein of unknown function (DUF544). Eukaryotic protein of unknown function. 44336 pfam04425: Bul1 N terminus. This family contains the N terminus of Saccharomyces cerevisiae Bul1. Bul1 binds the ubiquitin ligase Rsp5, via an N terminal PPSY motif. The complex containing Bul1 and Rsp5 is involved in intracellular trafficking of the general amino acid permease Gap1, degradation of Rog1 in cooperation with Bul2 and GSK-3, and mitochondrial inheritance. Bul1 may contain HEAT repeats. 44337 pfam04426: Bul1 C terminus. This family contains the C terminus of Saccharomyces cerevisiae Bul1. Bul1 binds the ubiquitin ligase Rsp5, via an N terminal PPSY motif. The complex containing Bul1 and Rsp5 is involved in intracellular trafficking of the general amino acid permease Gap1, degradation of Rog1 in cooperation with Bul2 and GSK-3, and mitochondrial inheritance. Bul1 may contain HEAT repeats. 44338 pfam04427: Brix domain. 44339 pfam04428: Choline kinase N terminus. Found N terminal to choline/ethanolamine kinase regions (pfam01633) in some plant and fungal choline kinase enzymes (EC:2.7.1.32). This region is only found in some members of the choline kinase family, and is therefore unlikely to contribute to catalysis. 44340 pfam04429: Protein of unknown function (DUF492). Protein of unknown function. 44341 pfam04430: Protein of unknown function (DUF498). 
Family of uncharacterised proteins. Possibly involved in DNA repair. 44342 pfam04431: Pectate lyase, N terminus. This region is found N terminal to the pectate lyase domain (pfam00544) in some plant pectate lyase enzymes. 44343 pfam04432: Coenzyme F420 hydrogenase/dehydrogenase, beta subunit C terminus. Coenzyme F420 hydrogenase (EC:1.12.99.1) reduces the low-potential two-electron acceptor coenzyme F420. This family contains the C termini of F420 hydrogenase and dehydrogenase beta subunits. The N terminus of Methanobacterium formicicum formate dehydrogenase beta chain (EC:1.2.1.2) is also a member of this family. This region is often found in association with the 4Fe-4S binding domain, fer4 (pfam00037). 44344 pfam04433: SWIRM domain. This SWIRM domain is a small alpha-helical domain of about 85 amino acid residues found in chromosomal proteins. This domain is predicted to be a protein-protein interaction unit. 44345 pfam04434: SWIM zinc finger. 44346 pfam04435: Domain of unknown function (DUF545). Family of uncharacterised C. elegans proteins. The region represented by this family is found to be repeated up to four times in some proteins. 44347 pfam04437: RINT-1 / TIP-1 family. This family includes RINT-1, a Rad50 interacting protein which participates in radiation induced checkpoint control, as well as the TIP-1 protein from yeast that seems to be involved in a complex with Sec20p that is required for Golgi transport. 44348 pfam04438: HIT zinc finger. This presumed zinc finger contains up to 6 cysteine residues that could coordinate zinc. The domain is named after the HIT protein. This domain is also found in the Thyroid receptor interacting protein 3 (TRIP-3), which specifically interacts with the ligand binding domain of the thyroid receptor. 44349 pfam04439: Streptomycin adenylyltransferase. Also known as aminoglycoside 6-adenylyltransferase (EC:2.7.7.-), this protein confers resistance to aminoglycoside antibiotics. 
44350 pfam04440: Dysbindin (Dystrobrevin binding protein 1). Dysbindin is an evolutionarily conserved 40-kDa coiled-coil-containing protein that binds to alpha- and beta-dystrobrevin in muscle and brain. Dystrophin and alpha-dystrobrevin are co-immunoprecipitated with dysbindin, indicating that dysbindin is DPC-associated in muscle. Dysbindin co-localises with alpha-dystrobrevin at the sarcolemma and is up-regulated in dystrophin-deficient muscle. In the brain, dysbindin is found primarily in axon bundles and especially in certain axon terminals, notably mossy fibre synaptic terminals in the cerebellum and hippocampus. Dysbindin may have implications for the molecular pathology of Duchenne muscular dystrophy and may provide an alternative route for anchoring dystrobrevin and the DPC to the muscle membrane. Genetic variation in the human dysbindin gene is also thought to be associated with schizophrenia. 44351 pfam04441: Poxvirus early transcription factor (VETF), large subunit. The poxvirus early transcription factor (VETF), in addition to the viral RNA polymerase, is required for efficient transcription of early genes in vitro. VETF is a heterodimeric protein that binds specifically to early gene promoters. The heterodimer is composed of an 82 kDa subunit (this family) and a 70 kDa subunit. 44352 pfam04442: Cytochrome c oxidase assembly protein CtaG / Cox11. Cytochrome c oxidase assembly protein is essential for the assembly of functional cytochrome oxidase protein. In eukaryotes it is an integral protein of the mitochondrial inner membrane. Cox11 is essential for the insertion of Cu(I) ions to form the CuB site. This is essential for the stability of other structures in subunit I, for example haems a and a3, and the magnesium/manganese centre. Cox11 is probably only required in sub-stoichiometric amounts relative to the structural units. The C terminal region of the protein is known to form a dimer. 
Each monomer coordinates one Cu(I) ion via three conserved cysteine residues (111, 208 and 210) in Saccharomyces cerevisiae. Met 224 is also thought to play a role in copper transfer or stabilising the copper site. 44353 pfam04443: Acyl-protein synthetase, LuxE. LuxE is an acyl-protein synthetase found in bioluminescent bacteria. LuxE catalyses the formation of an acyl-protein thioester from a fatty acid and a protein. This is the second step in the bioluminescent fatty acid reduction system, which converts tetradecanoic acid to the aldehyde substrate of the luciferase-catalysed bioluminescence reaction. A conserved cysteine found at position 364 in Photobacterium phosphoreum LuxE is thought to be acylated during the transfer of the acyl group from the synthetase subunit to the reductase. The carboxyl terminal of the synthetase is thought to act as a flexible arm to transfer acyl groups between the sites of activation and reduction. This family also includes Vibrio cholerae RBFN protein, which is involved in the biosynthesis of the O-antigen component 3-deoxy-L-glycero-tetronic acid. 44354 pfam04444: Catechol dioxygenase N terminus. This family consists of the N termini of catechol, chlorocatechol or hydroxyquinol 1,2-dioxygenase proteins. This region is always found adjacent to the dioxygenase domain (pfam00775). 44355 pfam04445: Protein of unknown function (DUF548). Protein of unknown function found in proteobacteria. In Salmonella typhimurium, expression of this protein is regulated by heat shock. 44356 pfam04446: Family of unknown function (DUF549). Family of uncharacterised eukaryotic proteins. 44357 pfam04447: Protein of unknown function (DUF550). This family represents the amino terminus of a protein of unknown function, found in dsDNA viruses with no RNA stage, including bacteriophages lambda and P22, and also in some Escherichia coli prophages. 44358 pfam04448: Protein of unknown function (DUF551). 
This family represents the carboxy terminus of a protein of unknown function, found in dsDNA viruses with no RNA stage, including bacteriophages lambda and P22, and also in some Escherichia coli prophages. 44359 pfam04449: CS1 type fimbrial major subunit. Fimbriae, also known as pili, form filaments radiating from the surface of the bacterium to a length of 0.5-1.5 micrometres. They enable the cell to colonise host epithelia. This family constitutes the major subunits of CS1 like pili, including CS2 and CFA1 from Escherichia coli, and also the Cable type II pilin major subunit from Burkholderia cepacia. The major subunit of CS1 pili is called CooA. Periplasmic CooA is mostly complexed with the assembly protein CooB. In addition, a small pool of CooA multimers and CooA-CooD complexes exists, but the functional significance is unknown. A member of this family has also been identified in Salmonella typhi and Salmonella enterica. 44360 pfam04450: Plant Basic Secretory Protein. These basic secretory proteins (BSPs) are believed to be part of the plant's defence mechanism against pathogens. 44361 pfam04451: Iridovirus major capsid protein. This family includes the major capsid protein of iridoviruses, chlorella virus and Spodoptera ascovirus, which are all dsDNA viruses with no RNA stage. This is the most abundant structural protein and can account for up to 45% of virion protein. In Chlorella virus PBCV-1 the major capsid protein is a glycoprotein. 44362 pfam04452: Protein of unknown function (DUF558). Bacterial protein of unknown function, occasionally found adjacent to the diacylglycerol kinase catalytic domain DAGKc (pfam00781). 44363 pfam04453: Organic solvent tolerance protein. Family involved in organic solvent tolerance in bacteria. The region contains several highly conserved, potentially catalytic, residues. 44364 pfam04454: Linocin_M18 bacteriocin protein. Many Gram-positive bacteria produce antimicrobial peptides, generally termed bacteriocins. 
These peptides are usually cationic, less than 50 amino acid residues long, contain an amphiphilic or hydrophobic region, and often kill their target cells by permeabilising the cell membrane. Antimicrobial peptides with these characteristics are also produced by plants and a wide variety of animals, including humans, and are thus widely distributed in nature. The Linocin_M18 region is found mostly in eubacteria, though homologous sequences have been identified in archaea. 44365 pfam04455: LOR/SDH bifunctional enzyme conserved region. Lysine-oxoglutarate reductase/Saccharopine dehydrogenase (LOR/SDH) is a bifunctional enzyme. This conserved region is commonly found immediately N-terminal to Saccharop_dh (pfam03435) in eukaryotes. 44366 pfam04456: Protein of unknown function (DUF503). Family of hypothetical bacterial proteins. 44367 pfam04457: Protein of unknown function (DUF504). Family of uncharacterised proteins. 44368 pfam04458: Protein of unknown function (DUF505). Family of uncharacterized prokaryotic proteins. 44369 pfam04459: Protein of unknown function (DUF512). Family of uncharacterized prokaryotic proteins. 44370 pfam04460: Protein of unknown function (DUF517). Family of hypothetical bacterial proteins. Possible zinc finger at N-terminus. This family may be related to pfam00384 (personal obs: Yeats C). 44371 pfam04461: Protein of unknown function (DUF520). Family of uncharacterised proteins. 44372 pfam04462: Protein of unknown function (DUF522). Family of hypothetical prokaryotic proteins. 44373 pfam04463: Protein of unknown function (DUF523). Family of uncharacterised bacterial proteins. 44374 pfam04464: CDP-Glycerol:Poly(glycerophosphate) glycerophosphotransferase. Wall-associated teichoic acids are a heterogeneous class of phosphate-rich polymers that are covalently linked to the cell wall peptidoglycan of gram-positive bacteria. 
They consist of a main chain of phosphodiester-linked polyols and/or sugar moieties attached to peptidoglycan via a linkage unit. CDP-glycerol:poly(glycerophosphate) glycerophosphotransferase is responsible for the polymerisation of the main chain of the teichoic acid by sequential transfer of glycerol-phosphate units from CDP-glycerol to the linkage unit lipid. 44375 pfam04465: Protein of unknown function (DUF499). Family of uncharacterized hypothetical prokaryotic proteins. 44376 pfam04466: Phage terminase large subunit. Initiation of packaging of double-stranded viral DNA involves the specific interaction of the prohead with viral DNA in a process mediated by a phage-encoded terminase protein. The terminase enzymes are usually hetero-oligomers composed of a small and a large subunit. This region is found on the large subunit and possesses endonuclease and ATPase activities that require Mg2+ and a neutral or slightly basic reaction. This region is also found in bacterial sequences. 44377 pfam04467: Protein of unknown function (DUF483). Family of uncharacterized prokaryotic proteins. 44378 pfam04468: PSP1 C-terminal conserved region. This region is present in both eukaryotes and eubacteria. The yeast PSP1 protein is involved in suppressing mutations in the DNA polymerase alpha subunit. 44379 pfam04471: Restriction endonuclease. Prokaryotic family found in type II restriction enzymes containing the hallmark (D/E)-(D/E)XK active site. Presence of catalytic residues implicates this region in the enzymatic cleavage of DNA. 44380 pfam04472: Protein of unknown function (DUF552). Family of uncharacterized proteins. 44381 pfam04473: Protein of unknown function (DUF553). Family of uncharacterized archaeal proteins. 44382 pfam04474: Protein of unknown function (DUF554). Family of uncharacterised prokaryotic proteins. Multiple predicted transmembrane regions suggest that the region is membrane associated. 44383 pfam04475: Protein of unknown function (DUF555). 
Family of uncharacterized, hypothetical archaeal proteins. 44384 pfam04476: Protein of unknown function (DUF556). Family of uncharacterised, hypothetical prokaryotic proteins. 44385 pfam04477: Protein of unknown function (DUF557). Family of uncharacterized, hypothetical archaeal proteins. 44386 pfam04478: Mid2 like cell wall stress sensor. This family represents a region near the C terminus of Mid2, which contains a transmembrane region. The remainder of the protein sequence is serine-rich and of low complexity, and is therefore impossible to align accurately. Mid2 is thought to act as a mechanosensor of cell wall stress. The C terminal cytoplasmic region of Mid2 is known to interact with Rom2, a guanine nucleotide exchange factor (GEF) for Rho1, which is part of the cell wall integrity signalling pathway. 44387 pfam04479: RTA1 like protein. This family comprises fungal proteins with multiple transmembrane regions. RTA1 is involved in resistance to 7-aminocholesterol, while RTM1 confers resistance to an unknown toxic chemical in molasses. These proteins may bind to the toxic substance, and thus prevent toxicity. They are not thought to be involved in the efflux of xenobiotics. 44388 pfam04480: Protein of unknown function (DUF559). 44389 pfam04481: Protein of unknown function (DUF561). Protein of unknown function found in a cyanobacterium, and the chloroplasts of algae. 44390 pfam04482: Protein of unknown function (DUF564). Protein of unknown function found in algal chloroplasts and in a cyanobacterium. 44391 pfam04483: Protein of unknown function (DUF565). Predicted transmembrane protein found in plants, chloroplasts and cyanobacteria. This family is also known as YCF20. 44392 pfam04484: Family of unknown function (DUF566). Family of related proteins that is plant specific. 44393 pfam04485: Phycobilisome degradation protein nblA. 
In the cyanobacterium Synechococcus PCC 7942, nblA triggers degradation of light-harvesting phycobiliproteins in response to deprivation of nutrients including nitrogen, phosphorus and sulphur. The mechanism of nblA function is not known, but it has been hypothesised that nblA may act by disrupting phycobilisome structure, activating a protease or tagging phycobiliproteins for proteolysis. Members of this family have also been identified in the chloroplasts of some red algae. 44394 pfam04486: SchA / CurD like protein. Members of this family have only been identified in species of the Streptomyces genus. Two family members are known to be part of gene clusters involved in the synthesis of polyketide-based spore pigments, homologous to clusters involved in the synthesis of polyketide antibiotics. The function of this protein is unknown, but it has been speculated to contain a NAD(P) binding site. 44395 pfam04487: CITED. CITED proteins (CBP/p300-interacting transactivators with ED-rich tail) are characterized by a conserved 32-amino acid sequence at the C-terminus. CITED proteins do not bind DNA directly and are thought to function as transcriptional co-activators. 44396 pfam04488: Glycosyltransferase sugar-binding region containing DXD motif. The DXD motif is a short conserved motif found in many families of glycosyltransferases, which add a range of different sugars to other sugars, phosphates and proteins. DXD-containing glycosyltransferases all use nucleoside diphosphate sugars as donors and require divalent cations, usually manganese. The DXD motif is expected to play a carbohydrate binding role in sugar-nucleoside diphosphate and manganese dependent glycosyltransferases. 44397 pfam04489: Protein of unknown function (DUF570). Protein of unknown function, found in herpesvirus and cytomegalovirus. 44398 pfam04490: Poxvirus T4 protein, C terminus. This family of poxvirus proteins is thought to be retained in the endoplasmic reticulum. 
M-T4 of myxoma virus is thought to protect infected lymphocytes from apoptosis and modulate the inflammatory response to virus infection. 44399 pfam04491: Poxvirus T4 protein, N terminus. This family of poxvirus proteins is thought to be secreted, or retained in the endoplasmic reticulum if the protein also contains an additional C terminal region (pfam04490). M-T4 of myxoma virus is thought to protect infected lymphocytes from apoptosis and modulate the inflammatory response to virus infection. 44400 pfam04492: Bacteriophage replication protein O. Replication protein O is necessary for the initiation of bacteriophage DNA replication. Protein O interacts with the lambda replication origin, and also with replication protein P to form an oligomer. It is speculated that the N-terminal half interacts with the replication origin while the C terminal half mediates protein-protein interaction. 44401 pfam04493: Endonuclease V. Endonuclease V is specific for single-stranded DNA or for duplex DNA that contains uracil or that is damaged by a variety of agents. 44402 pfam04494: WD40 associated region in TFIID subunit. This region, possibly a domain, is found in subunits of transcription factor TFIID. The function of this region is unknown. 44403 pfam04495: GRASP55/65 family. GRASP55 (Golgi reassembly stacking protein of 55 kDa) and GRASP65 (a 65 kDa protein) are highly homologous. GRASP55 is a component of the Golgi stacking machinery. GRASP65 is an N-ethylmaleimide-sensitive membrane protein required for the stacking of Golgi cisternae in a cell-free system. 44404 pfam04496: Herpesvirus UL35 family. UL35 represents a true late gene which encodes a 12-kDa capsid protein. 44405 pfam04497: Poxvirus E2 protein. Protein E2 is a virally encoded protein that can complex with protein E1. Only when protein E1, a helicase, is bound by E2 can the origin of DNA replication be located. 
Protein E2 can also interact directly with host transcription factors in basal keratinocytes to promote viral transcription. 44406 pfam04498: Poxvirus nucleic acid binding protein VP8 / L4R. The 25 kDa product of Vaccinia virus gene L4R is also known as VP8. VP8 is found in the cores of Vaccinia virions and is essential for the formation of transcriptionally competent viral particles. It binds both single stranded and double stranded DNA and RNA with similar affinities. Binding is thought to involve cooperative interactions between protein subunits. The protein is proteolytically cleaved during viral assembly at an Ala-Gly-Ala site. Possible roles for VP8 include packaging and maintaining the DNA genome in a transcribable configuration; binding ssDNA during transcription initiation; and cooperation with I8R protein to unwind early promoter regions. VP8 may also function in either transcription elongation or release of mRNA molecules from viral particles. 44407 pfam04499: SIT4 phosphatase-associated protein. This family includes a conserved region from a group of yeast proteins that associate with the SIT4 phosphatase. This association is required for SIT4's role in G1 cyclin transcription and for bud formation. This family also includes homologous regions from other eukaryotes. 44408 pfam04500: FLYWCH zinc finger domain. Mutations in the mod(mdg4) gene have effects on variegation (PEV), the properties of insulator sequences, correct path-finding of growing nerve cells, meiotic pairing of chromosomes, and apoptosis. The occurrence of FLYWCH motifs in mod(mdg4) gene product and other proteins is discussed in. 44409 pfam04501: Baculovirus major capsid protein VP39. This family constitutes the 39 kDa major capsid protein of the Baculoviridae. 44410 pfam04502: Family of unknown function (DUF572). Family of eukaryotic proteins with undetermined function. 44411 pfam04503: Single-stranded DNA binding protein, SSDP. 
This is a family of eukaryotic single-stranded DNA binding proteins with specificity to a pyrimidine-rich element found in the promoter region of the alpha2(I) collagen gene. 44412 pfam04504: Protein of unknown function, DUF573. 44413 pfam04505: Interferon-induced transmembrane protein. This family includes the human leukocyte antigen CD225, which is an interferon inducible transmembrane protein, and is associated with interferon induced cell growth suppression. 44414 pfam04506: Rft protein. 44415 pfam04507: Protein of unknown function, DUF576. This family contains several uncharacterised staphylococcal proteins. 44416 pfam04508: Viral A-type inclusion protein repeat. The repeat is found in the A-type inclusion protein of the Poxvirus family. 44417 pfam04509: CheC-like family. The precise function of this family is unclear, but members of this family are involved in the flagellar motor switch. The region represented by this family is found in the CheC, CheX, CheA and FliY proteins. In some cases, this region is present as multiple copies. 44418 pfam04510: Family of unknown function (DUF577). Family of Arabidopsis thaliana proteins. Many of these members contain a repeated region. 44419 pfam04511: Der1-like family. The endoplasmic reticulum (ER) of the yeast Saccharomyces cerevisiae contains a proteolytic system able to selectively degrade misfolded lumenal secretory proteins. For examination of the components involved in this degradation process, mutants were isolated. They could be divided into four complementation groups. The mutations led to stabilisation of two different substrates for this process. The mutant classes were called 'der' for 'degradation in the ER'. DER1 was cloned by complementation of the der1-2 mutation. The DER1 gene codes for a novel, hydrophobic protein that is localised to the ER. Deletion of DER1 abolished degradation of the substrate proteins. 
The function of the Der1 protein seems to be specifically required for the degradation process associated with the ER. Interestingly, this family seems distantly related to the Rhomboid family of membrane peptidases, suggesting that this family may also mediate degradation of misfolded proteins (Bateman A pers. obs.). 44420 pfam04512: Baculovirus polyhedron envelope protein, PEP, N terminus. Polyhedra are large crystalline occlusion bodies containing nucleopolyhedrovirus virions, and surrounded by an electron-dense structure called the polyhedron envelope or polyhedron calyx. The polyhedron envelope (associated) protein PEP is thought to be an integral part of the polyhedron envelope. PEP is concentrated at the surface of polyhedra, and is thought to be important for the proper formation of the periphery of polyhedra. It is thought that PEP may stabilise polyhedra and protect them from fusion or aggregation. 44421 pfam04513: Baculovirus polyhedron envelope protein, PEP, C terminus. Polyhedra are large crystalline occlusion bodies containing nucleopolyhedrovirus virions, and surrounded by an electron-dense structure called the polyhedron envelope or polyhedron calyx. The polyhedron envelope (associated) protein PEP is thought to be an integral part of the polyhedron envelope. PEP is concentrated at the surface of polyhedra, and is thought to be important for the proper formation of the periphery of polyhedra. It is thought that PEP may stabilise polyhedra and protect them from fusion or aggregation. 44422 pfam04514: Bluetongue virus non-structural protein NS2. This family includes NS2 proteins from other members of the Orbivirus genus. NS2 is a non-specific single-stranded RNA-binding protein that forms large homomultimers and accumulates in viral inclusion bodies of infected cells. Three RNA binding regions have been identified in Bluetongue virus serotype 17 at residues 2-11, 153-166 and 274-286. NS2 multimers also possess nucleotidyl phosphatase activity. 
The precise function of NS2 is not known, but it may be involved in the transport and condensation of viral mRNAs. 44423 pfam04515: Protein of unknown function, DUF580. 44424 pfam04516: CP2 transcription factor. This family represents a conserved region in the CP2 transcription factor family. 44425 pfam04517: Microvirus lysis protein (E), C terminus. E protein causes host cell lysis by inhibiting MraY, a peptidoglycan biosynthesis enzyme. This leads to cell wall failure at septation. The N terminal transmembrane region matches the signal peptide model and must be omitted from the family. 44426 pfam04518: Protein of unknown function, DUF582. This family contains several uncharacterised chlamydial proteins. 44427 pfam04519: Protein of unknown function, DUF583. This family contains several uncharacterised hypothetical proteins. 44429 pfam04521: ssRNA positive strand viral 18kD cysteine rich protein. 44430 pfam04522: Protein of unknown function (DUF585). This region represents the N termini of bromovirus 2a protein, and is always found N terminal to a predicted RNA dependent RNA polymerase region (pfam00978). 44431 pfam04523: Herpes virus tegument protein U30. This family is named after the human herpesvirus protein, but has been characterized in cytomegalovirus as UL47. Cytomegalovirus UL47 is a component of the tegument, which is a protein layer surrounding the viral capsid. UL47 co-precipitates with UL48 and UL69 tegument proteins, and the major capsid protein UL86. A UL47-containing complex is thought to be involved in the release of viral DNA from the disassembling virus particle. 44432 pfam04524: Protein of unknown function, DUF586. This family contains a conserved region in several bacterial proteins of unknown function. 44433 pfam04525: Protein of unknown function (DUF567). Family of uncharacterised proteins. This family contains both plant and bacterial members. 44434 pfam04526: Protein of unknown function (DUF568). Family of uncharacterised plant proteins. 
44435 pfam04527: Drosophila Retinin like protein. Family of Drosophila proteins related to the C-terminal region of the Drosophila Retinin protein. Conserved region is found towards the C-terminus of the member proteins. 44436 pfam04528: Adenovirus early E4 34 kDa protein conserved region. Conserved region found in the Adenovirus E4 34 kDa protein. 44437 pfam04529: Herpesvirus U59 protein. The proteins in this family have no known function. Cytomegalovirus UL88 is also a member of this family. 44438 pfam04530: Viral Beta C/D like family. Family of ssRNA positive-strand viral proteins. Conserved region found in the Beta C and Beta D transcripts. 44439 pfam04531: Bacteriophage holin. This family of holins is found in several staphylococcal and streptococcal bacteriophages. Holins are a diverse family of proteins that cause bacterial membrane lysis during late-protein synthesis. It is thought that the temporal precision of holin-mediated lysis may occur through the buildup of a holin oligomer which causes the lysis. 44440 pfam04532: Protein of unknown function (DUF587). This family consists of the N termini of some human herpesvirus U58 proteins, and some cytomegalovirus UL87 proteins. This region is always found N terminal to the Pfam family UL87 (pfam03043), which has no known function. 44441 pfam04533: Herpes virus U44 protein. In cytomegalovirus this protein is known as UL71. This family of proteins has no known function. 44442 pfam04534: Herpesvirus UL56 protein. In herpes simplex virus type 2, UL56 is thought to be a tail-anchored type II membrane protein involved in vesicular trafficking. The C terminal hydrophobic region is required for association with the cytoplasmic membrane, and the N terminal proline-rich region is important for the translocation of UL56 to the Golgi apparatus and cytoplasmic vesicles. 44443 pfam04535: Domain of unknown function (DUF588). This family of plant proteins contains a domain that may have a catalytic activity. 
It has a conserved arginine and aspartate that could form an active site. These proteins are predicted to contain 3 or 4 transmembrane helices. 44444 pfam04536: Domain of unknown function (DUF477). The function of this presumed domain is unknown. It is found in both eukarya and eubacteria. 44445 pfam04537: Herpesvirus UL55 protein. In infected cells, UL55 is associated with the nuclear matrix, and found adjacent to compartments containing the capsid protein ICP35. UL55 was not detected in assembled virions. It is thought that UL55 may play a role in virion assembly or maturation. 44446 pfam04538: Brain expressed X-linked like family. 44447 pfam04539: Sigma-70 region 3. Region 3 forms a discrete compact three helical domain within the sigma-factor. Region 3 is not normally involved in the recognition of promoter DNA, but in some specific bacterial promoters containing an extended -10 promoter element, residues within region 3 play an important role. Region 3 is primarily involved in binding the core RNA polymerase in the holoenzyme. 44448 pfam04540: Herpesvirus UL51 protein. UL51 protein is a virion protein. In pseudorabies virus, UL51 was identified as a component of the capsid. In herpes simplex virus type 1 there is evidence for post-translational modification of UL51. 44449 pfam04541: Herpesvirus virion protein U34. This protein is known as R50 in cytomegalovirus. 44450 pfam04542: Sigma-70 region 2. Region 2 of sigma-70 is the most conserved region of the entire protein. All members of this class of sigma-factor contain region 2. The high conservation is due to region 2 containing both the -10 promoter recognition helix and the primary core RNA polymerase binding determinant. The core binding helix interacts with the clamp domain of the largest polymerase subunit, beta prime. The aromatic residues of the recognition helix, found at the C-terminus of this domain, are thought to mediate strand separation, thereby allowing transcription initiation. 
44451 pfam04543: Family of unknown function (DUF589). Family of uncharacterised proteins. 44452 pfam04544: Herpesvirus egress protein UL20. UL20 is predicted to be a transmembrane protein with multiple membrane spans. It is involved in the trans-cellular transport of enveloped virions, and is therefore important for viral egress. However, UL20 operates in different cellular compartments and different stages of egress in pseudorabies virus and herpes simplex virus. This is thought to be due to differences in egress pathways between these two viruses. 44453 pfam04545: Sigma-70, region 4. Region 4 of sigma-70 like sigma-factors is involved in binding to the -35 promoter element via a helix-turn-helix motif. Due to the way Pfam works, the threshold has been set artificially high to prevent overlaps with other helix-turn-helix families. Therefore there are many false negatives. 44454 pfam04546: Sigma-70, non-essential region. The domain is found in the primary vegetative sigma factor. The function of this domain is unclear, and it can be removed without loss of function. 44455 pfam04547: Protein of unknown function, DUF590. This family contains several uncharacterised eukaryotic proteins. 44456 pfam04548: AIG1 family. Arabidopsis protein AIG1 appears to be involved in plant resistance to bacteria. 44457 pfam04549: CD47 integrin associated protein. This family represents the CD47 leukocyte antigen. 44458 pfam04550: Phage holin family 2. Holins are a diverse family of proteins that cause bacterial membrane lysis during late-protein synthesis. It is thought that the temporal precision of holin-mediated lysis may occur through the buildup of a holin oligomer which causes the lysis. 44459 pfam04551: GcpE protein. In a variety of organisms, including plants and several eubacteria, isoprenoids are synthesised by the mevalonate-independent 2-C-methyl-D-erythritol 4-phosphate (MEP) pathway. 
Although different enzymes of this pathway have been described, the terminal biosynthetic steps of the MEP pathway have not been fully elucidated. The GcpE gene of Escherichia coli is involved in this pathway. 44460 pfam04552: Sigma-54, DNA binding domain. This DNA binding domain is based on peptide fragmentation data. This domain is proximal to DNA in the promoter/holoenzyme complex. Furthermore, this region contains a putative helix-turn-helix motif. At the C-terminus, there is a highly conserved region known as the RpoN box, which is the signature of the sigma-54 proteins. 44461 pfam04553: Tis11B like protein, C terminus. Members of this family always contain a tandem repeat of CCCH zinc fingers (pfam00642). Tis11B, Tis11D and their homologues are thought to be regulatory proteins involved in the response to growth factors. The function of the C terminus is unknown. 44462 pfam04554: Extensin-like region. 44463 pfam04555: Restriction endonuclease XhoI. This family consists of type II restriction enzymes (EC:3.1.21.4) that recognise the double-stranded sequence CTCGAG and cleave after C-1. 44464 pfam04556: DpnII restriction endonuclease. Members of this family are type II restriction enzymes (EC:3.1.21.4). They recognise the double-stranded unmethylated sequence GATC and cleave before G-1. 44465 pfam04557: Glutaminyl-tRNA synthetase, non-specific RNA binding region part 2. This is a region found N terminal to the catalytic domain of glutaminyl-tRNA synthetase (EC 6.1.1.18) in eukaryotes but not in Escherichia coli. This region is thought to bind RNA in a non-specific manner, enhancing interactions between the tRNA and enzyme, but is not essential for enzyme function. 44466 pfam04558: Glutaminyl-tRNA synthetase, non-specific RNA binding region part 1. This is a region found N terminal to the catalytic domain of glutaminyl-tRNA synthetase (EC 6.1.1.18) in eukaryotes but not in Escherichia coli. 
This region is thought to bind RNA in a non-specific manner, enhancing interactions between the tRNA and enzyme, but is not essential for enzyme function. 44467 pfam04559: Herpesvirus UL17 protein. UL17 protein is required for DNA cleavage and packaging in herpes viruses. It has been shown to associate with immature B-type capsids, and is required for the localisation of capsids and capsid proteins to the intranuclear sites where viral DNA is cleaved and packaged. In the virion, UL17 is a component of the tegument, which is a protein layer surrounding the viral capsid. 44468 pfam04560: RNA polymerase Rpb2, domain 7. RNA polymerases catalyse the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). Rpb2 is the second largest subunit of the RNA polymerase. This domain comprises the structural domains anchor and clamp. The clamp region (C-terminal) contains a zinc-binding motif. The clamp region is named due to its interaction with the clamp domain found in Rpb1. The domain also contains a region termed "switch 4". The switches within the polymerase are thought to signal different stages of transcription. 44469 pfam04561: RNA polymerase Rpb2, domain 2. RNA polymerases catalyse the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). Rpb2 is the second largest subunit of the RNA polymerase. This domain forms one of the two distinctive lobes of the Rpb2 structure. This domain is also known as the lobe domain. DNA has been demonstrated to bind to the concave surface of the lobe domain, and plays a role in maintaining the transcription bubble. Many of the bacterial members contain large insertions within this domain, a region known as dispensable region 1 (DRI). 44470 pfam04562: Dictyostelium spore coat protein, N terminus. 
The Dictyostelium spore coat is a polarised extracellular matrix composed of glycoproteins and cellulose. Four of the major coat glycoproteins exist as a multi-protein complex within the prespore vesicles before secretion. Of these, SP96 and SP70 are members of this family. The presence of SP96 and SP70 in the complex is necessary for the cellulose binding activity of the complex, which is in turn necessary for normal spore coat assembly. The function of this region of these proteins is not known. 44471 pfam04563: RNA polymerase beta subunit. RNA polymerases catalyse the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain forms one of the two distinctive lobes of the Rpb2 structure. This domain is also known as the protrusion domain. The other lobe (pfam04561) is nested within this domain. 44472 pfam04564: U-box domain. This domain is related to the Ring finger (pfam00097) but lacks the zinc binding residues. 44473 pfam04565: RNA polymerase Rpb2, domain 3. RNA polymerases catalyse the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). Domain 3 is also known as the fork domain and is proximal to the catalytic site. 44474 pfam04566: RNA polymerase Rpb2, domain 4. RNA polymerases catalyse the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). Domain 4 is also known as the external 2 domain. 44475 pfam04567: RNA polymerase Rpb2, domain 5. RNA polymerases catalyse the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). Domain 5 is also known as the external 2 domain. 
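The recognition and cleavage rules stated in this listing for the type II restriction enzyme families pfam04555 (CTCGAG, cut after C-1) and pfam04556 (GATC, cut before G-1) can be illustrated with a minimal sketch. It assumes a linear, single-strand view of the DNA and ignores methylation; the function name, the dictionary and the test sequences are hypothetical and not part of any database or library.

```python
# Recognition site and cut offset (number of site bases left of the cut),
# taken from the family descriptions; keys are the Pfam accessions.
SITES = {
    "pfam04555": ("CTCGAG", 1),  # cleave after C-1:  C^TCGAG
    "pfam04556": ("GATC", 0),    # cleave before G-1: ^GATC
}

def digest(seq, family):
    """Return top-strand fragments of a linear sequence (illustrative only)."""
    site, offset = SITES[family]
    seq = seq.upper()
    fragments, start = [], 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    # Drop empty fragments produced by a cut at the very start of the sequence.
    return [f for f in fragments if f]

print(digest("AACTCGAGTT", "pfam04555"))
print(digest("GATCAAGATC", "pfam04556"))
```

The offset convention keeps both rules in one table: the cut falls `offset` bases into the recognition site, so an offset of 0 cuts immediately before the site.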
44476 pfam04568: Mitochondrial ATPase inhibitor, IATP. ATP synthase inhibitor prevents the enzyme from switching to ATP hydrolysis during collapse of the electrochemical gradient, for example during oxygen deprivation. ATP synthase inhibitor forms a one to one complex with the F1 ATPase, possibly by binding at the alpha-beta interface. It is thought to inhibit ATP synthesis by preventing the release of ATP. The minimum inhibitory region for bovine inhibitor is from residues 39 to 72. The inhibitor has two oligomeric states, dimer (the active state) and tetramer. At low pH, the inhibitor forms a dimer via antiparallel coiled coil interactions between the C terminal regions of two monomers. At high pH, the inhibitor forms tetramers and higher oligomers by coiled coil interactions involving the N terminus and inhibitory region, thus preventing the inhibitory activity. 44477 pfam04569: Protein of unknown function. This family represents a conserved region in a number of uncharacterised plant proteins. 44478 pfam04570: Protein of unknown function (DUF581). Family of uncharacterised proteins. 44479 pfam04571: lipin, N-terminal conserved region. Mutations in the lipin gene lead to fatty liver dystrophy in mice. The protein has been shown to be phosphorylated by the TOR Ser/Thr protein kinases in response to insulin stimulation. The conserved region is found at the N-terminus of the member proteins. 44480 pfam04572: Alpha 1,4-glycosyltransferase conserved region. The glycosphingolipids (GSL) form part of eukaryotic cell membranes. They consist of a hydrophilic carbohydrate moiety linked to a hydrophobic ceramide tail embedded within the lipid bilayer of the membrane. Lactosylceramide, Gal1,4Glc1Cer (LacCer), is the common synthetic precursor to the majority of GSL found in vertebrates. Alpha 1,4-glycosyltransferases utilise UDP donors and transfer the sugar to a beta-linked acceptor. This region appears to be confined to higher eukaryotes. 
No function has yet been assigned to this region. 44481 pfam04573: Signal peptidase subunit. Translocation of polypeptide chains across the endoplasmic reticulum membrane is triggered by signal sequences. During translocation of the nascent chain through the membrane, the signal sequence of most secretory and membrane proteins is cleaved off. Cleavage occurs by the signal peptidase complex (SPC) which consists of four subunits in yeast and five in mammals. This family is common to yeast and mammals. 44482 pfam04574: Protein of unknown function (DUF592). This region is found in some SIR2 family proteins (Pfam: PF02146). 44483 pfam04575: Protein of unknown function (DUF560). Family of hypothetical bacterial proteins. 44484 pfam04576: Protein of unknown function, DUF593. 44485 pfam04577: Protein of unknown function (DUF563). Family of uncharacterised proteins. 44486 pfam04578: Protein of unknown function, DUF594. 44487 pfam04579: Keratin, high-sulphur matrix protein. Family of Keratin, high-sulfur matrix proteins. The keratin products of mammalian epidermal derivatives such as wool and hair consist of microfibrils embedded in a rigid matrix of other proteins. The matrix proteins include the high-sulphur and high-tyrosine keratins, having molecular weights of 6-20 kDa, whereas microfibrils contain the larger, low-sulphur keratins (40-56 kDa). 44488 pfam04580: Chordopoxvirinae D3 protein. Chordopoxvirinae D3 protein conserved region. Region occupies entire length of D3 protein. 44489 pfam04582: Reovirus sigma C capsid protein. 44490 pfam04583: Baculoviridae p74 conserved region. Baculoviruses are distinct from other virus families in that there are two viral phenotypes: budded virus (BV) and occlusion-derived virus (ODV). BVs disseminate viral infection throughout the tissues of the host and ODVs transmit baculovirus between insect hosts. GFP tagging experiments implicate p74 as an ODV envelope protein. 44491 pfam04584: Poxvirus A28 family. 
Family of conserved Poxvirus A28 family proteins. Conserved region spans entire protein in the majority of family members. 44492 pfam04585: Conjugal transfer protein. Family of proteins known to be involved in conjugal transfer. The TrbF protein is thought to compose part of the pilus required for transfer. 44493 pfam04586: Caudovirus prohead protease. Family of Caudovirus prohead proteases also found in a number of bacteria, possibly as the result of horizontal transfer. 44494 pfam04587: ADP-specific Phosphofructokinase/Glucokinase conserved region. In archaea a novel type of glycolytic pathway exists that is deviant from the classical Embden-Meyerhof pathway. This pathway utilises two novel proteins: an ADP-dependent Glucokinase and an ADP-dependent Phosphofructokinase. This conserved region is present at the C-terminal of both these proteins. Interestingly, this family contains sequences from higher eukaryotes. 44495 pfam04588: Hypoxia induced protein conserved region. This family is found in proteins thought to be involved in the response to hypoxia. Family members mostly come from diverse eukaryotic organisms; however, eubacterial members have been identified. This region is found at the N-terminus of the member proteins which are predicted to be transmembrane. 44496 pfam04589: RFX1 transcription activation region. The RFX family is a family of winged-helix DNA binding proteins. RFX1 is a regulatory factor essential for expression of MHC class II genes. This region is found N terminal to the RFX DNA binding region (pfam02257) in some mammalian RFX proteins, and is thought to activate transcription when associated with DNA. Deletion analysis has identified the region 233-351 in human RFX1 as being required for maximal activation. 44497 pfam04590: Protein of unknown function, DUF595. This family represents a conserved region, found in several Caenorhabditis elegans proteins. 44498 pfam04591: Protein of unknown function, DUF596. 
This family contains several uncharacterized proteins. 44499 pfam04592: Selenoprotein P, N terminal region. SelP is the only known eukaryotic selenoprotein that contains multiple selenocysteine (Sec) residues, and accounts for more than 50% of the selenium content of rat and human plasma. It is thought to be glycosylated. SelP may have antioxidant properties. It can attach to epithelial cells, and may protect vascular endothelial cells against peroxynitrite toxicity. The high selenium content of SelP suggests that it may be involved in selenium intercellular transport or storage. The promoter structure of bovine SelP suggests that it may be involved in countering heavy metal intoxication, and may also have a developmental function. The N-terminal region of SelP can exist independently of the C terminal region. Zebrafish selenoprotein Pb lacks the C terminal Sec-rich region, and a protein encoded by the rat SelP gene and lacking this region has also been reported. The N-terminal region contains a conserved SecxxCys motif, which is similar to the CysxxCys found in thioredoxins. It is speculated that the N terminal region may adopt a thioredoxin fold and catalyse redox reactions. The N-terminal region also contains a His-rich region, which is thought to mediate heparin binding. Binding to heparan proteoglycans could account for the membrane binding properties of SelP. 44500 pfam04593: Selenoprotein P, C terminal region. SelP is the only known eukaryotic selenoprotein that contains multiple selenocysteine (Sec) residues, and accounts for more than 50% of the selenium content of rat and human plasma. It is thought to be glycosylated. SelP may have antioxidant properties. It can attach to epithelial cells, and may protect vascular endothelial cells against peroxynitrite toxicity. The high selenium content of SelP suggests that it may be involved in selenium intercellular transport or storage. 
The promoter structure of bovine SelP suggests that it may be involved in countering heavy metal intoxication, and may also have a developmental function. The N terminal region always contains one Sec residue, and this is separated from the C terminal region (9-16 Sec residues) by a histidine-rich sequence. The large number of Sec residues in the C-terminal portion of SelP suggests that it may be involved in selenium transport or storage. However, it is also possible that this region has a redox function. 44501 pfam04594: Non-SMC condensin subunit, XCAP-D2/Cnd1. Condensin is a multi-subunit protein complex that acts as an essential regulator of chromosome condensation. It contains both SMC (structural maintenance of chromosomes) and non-SMC subunits. This family represents one of the non-SMC subunits, known as Cnd1 in Schizosaccharomyces pombe, and XCAP-D2 in Xenopus laevis. This subunit is phosphorylated at several sites by Cdc2. This phosphorylation process increases the supercoiling activity of condensin. 44502 pfam04595: Poxvirus I6-like family. This family includes I6 proteins as well as the related F5L proteins. 44503 pfam04596: Poxvirus protein F15. 44504 pfam04597: Ribophorin I. Ribophorin I is an essential subunit of oligosaccharyltransferase (OST), which is also known as Dolichyl-diphosphooligosaccharide--protein glycosyltransferase, (EC:2.4.1.119). OST catalyses the transfer of an oligosaccharide from dolichol pyrophosphate to selected asparagine residues of nascent polypeptides as they are translocated into the lumen of the rough endoplasmic reticulum. Ribophorin I and OST48 are thought to be responsible for OST catalytic activity. Both yeast and mammalian proteins are glycosylated but the sites are not conserved. Glycosylation may contribute towards general solubility but is unlikely to be involved in a specific biochemical function. Most family members are predicted to have a transmembrane helix at the C terminus of this region. 
44505 pfam04598: DFNA5 protein. The precise function of this protein is unknown. A deletion/insertion mutation is associated with a form of autosomal dominant non-syndromic hearing impairment. In addition, this protein has also been found to contribute to acquired etoposide resistance in melanoma cells. 44506 pfam04599: Poxvirus G5 protein. 44507 pfam04600: Protein of unknown function (DUF571). Family of hypothetical bacterial proteins. 44508 pfam04601: Protein of unknown function (DUF569). Family of hypothetical proteins. Some family members contain two copies of the region. 44509 pfam04602: Mycobacterial cell wall arabinan synthesis protein. Arabinosyltransferase is involved in the arabinogalactan (AG) biosynthesis pathway in mycobacteria. AG is a component of the macromolecular assembly of the mycolyl-AG-peptidoglycan complex of the cell wall. This enzyme has important clinical applications as it is believed to be the target of the antimycobacterial drug Ethambutol. 44510 pfam04603: Ran-interacting Mog1 protein. Segregation of nuclear and cytoplasmic processes facilitates regulation of many eukaryotic cellular functions such as gene expression and cell cycle progression. Trafficking through the nuclear pore requires a number of highly conserved soluble factors that escort macromolecular substrates into and out of the nucleus. The Mog1 protein has been shown to interact with RanGTP, which stimulates guanine nucleotide release, suggesting Mog1 regulates the nuclear transport functions of Ran. The human homologue of Mog1 is thought to be alternatively spliced. 44511 pfam04604: Type-A lantibiotic. Lantibiotics are antibiotic peptides distinguished by the presence of the rare thioether amino acids lanthionine and/or methyllanthionine. They are produced by Gram-positive bacteria as gene-encoded precursor peptides and undergo post-translational modification to generate the mature peptide. 
Based on their structural and functional features lantibiotics are currently divided into two major groups: the flexible amphiphilic type-A and the rather rigid and globular type-B. Type-A lantibiotics act primarily by pore formation in the bacterial membrane by a mechanism involving the interaction with specific docking molecules such as the membrane precursor lipid II. 44512 pfam04605: Virulence-associated protein D (VapD) conserved region. Family of bacterial proteins associated with virulence. Conserved region is found at the N-terminus of the VapD protein. 44513 pfam04606: Phage transcriptional activator, Ogr/Delta. This is a viral family of phage zinc-binding transcriptional activators, which also contains cryptic members in some bacterial genomes. The P4 phage delta protein contains two such domains attached covalently, while the P2 phage Ogr proteins possess one domain but function as dimers. All the members of this family have the following consensus sequence: C-X(2)-C-X(3)-A-(X)2-R-X(15)-C-X(4)-C-X(3)-F. 44514 pfam04607: Region found in RelA / SpoT proteins. This region of unknown function is found in RelA and SpoT of Escherichia coli, and their homologues in plants and in other eubacteria. RelA is a guanosine 3',5'-bis-pyrophosphate (ppGpp) synthetase (EC:2.7.6.5) while SpoT is thought to be a bifunctional enzyme catalysing both ppGpp synthesis and degradation (ppGpp 3'-pyrophosphohydrolase, (EC:3.1.7.2)). This region is often found in association with HD (pfam01966), a metal-dependent phosphohydrolase, TGS (pfam02824) which is a possible nucleotide-binding region, and the ACT regulatory domain (pfam01842). 44515 pfam04608: Phosphatidylglycerophosphatase A. This family represents a family of bacterial phosphatidylglycerophosphatases (EC:3.1.3.27), known as PgpA. It appears that bacteria possess several phosphatidylglycerophosphatases, and thus, PgpA is not essential in Escherichia coli. 44516 pfam04609: Methyl-coenzyme M reductase operon protein C. 
Methyl coenzyme M reductase (MCR) catalyses the final step in methanogenesis. MCR is composed of three subunits, alpha (pfam02249), beta (pfam02241) and gamma (pfam02240). Genes encoding the beta (mcrB) and gamma (mcrG) subunits are separated by two open reading frames coding for two proteins C and D. The function of proteins C and D (this family) is unknown. 44517 pfam04610: TrbL/VirB6 plasmid conjugal transfer protein. 44518 pfam04611: Mating type protein A alpha Y mating type dependent binding region. This region is important for the mating type dependent binding of Y protein to the A alpha Z protein of another mating type in Schizophyllum commune. 44519 pfam04612: General secretion pathway, M protein. This is a family of membrane proteins involved in the secretion of a number of molecules in Gram-negative bacteria. The precise function of these proteins is unknown, though in Vibrio cholerae, the EpsM protein interacts with the EpsL protein, and also forms homodimers. 44520 pfam04613: UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase, LpxD. UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase (EC 2.3.1.-) catalyses an early step in lipid A biosynthesis: UDP-3-O-(3-hydroxytetradecanoyl)glucosamine + (R)-3-hydroxytetradecanoyl-[acyl carrier protein] -> UDP-2,3-bis(3-hydroxytetradecanoyl)glucosamine + [acyl carrier protein]. Members of this family also contain a hexapeptide repeat (pfam00132). This family constitutes the non-repeating region of LPXD proteins. 44521 pfam04614: Pex19 protein family. 44522 pfam04615: Utp14 protein. This protein is found to be part of a large ribonucleoprotein complex containing the U3 snoRNA. Depletion of the Utp proteins impedes production of the 18S rRNA, indicating that they are part of the active pre-rRNA processing complex. This large RNP complex has been termed the small subunit (SSU) processome. 44523 pfam04616: Glycosyl hydrolases family 43. 44524 pfam04617: Hox9 activation region. 
This family constitutes the N termini of the paralogous homeobox proteins HoxA9, HoxB9, HoxC9 and HoxD9. The N terminal region is thought to act as a transcription activation region. Activation may be by interaction with proteins such as Btg proteins, which are thought to recruit a multi-protein Ccr4-like complex. 44525 pfam04618: HD-ZIP protein N terminus. This family consists of the N termini of plant homeobox-leucine zipper proteins. Its function is unknown. 44526 pfam04619: Dr-family adhesin. This family of adhesins binds to the Dr blood group antigen component of decay-accelerating factor. This mediates adherence of uropathogenic Escherichia coli to the urinary tract. This family contains both fimbriated and afimbriated adherence structures. This protein also confers the phenotype of mannose-resistant hemagglutination, which can be inhibited by chloramphenicol. The N terminal portion of the protein is thought to be responsible for chloramphenicol sensitivity. 44527 pfam04620: Flagellar filament outer layer protein FlaA. Periplasmic flagella are the organelles of spirochete motility, and are structurally different from the flagella of other motile bacteria. They reside inside the cell within the periplasmic space, and confer motility in viscous gel-like media such as connective tissue. The flagella are composed of an outer sheath of FlaA proteins and a core filament of FlaB proteins. Each species usually has several FlaA protein species. 44528 pfam04621: PEA3 subfamily ETS-domain transcription factor N terminus. The N terminus of the PEA3 transcription factors is implicated in transactivation and in inhibition of DNA binding. Transactivation is potentiated by activation of the Ras/MAP kinase and protein kinase A signalling cascades. The N terminal region contains conserved MAP kinase phosphorylation sites. 44529 pfam04622: ERG2 and Sigma1 receptor like protein. This family consists of the fungal C-8 sterol isomerase and mammalian sigma1 receptor. 
C-8 sterol isomerase (delta-8--delta-7 sterol isomerase) catalyses a reaction in ergosterol biosynthesis, which results in unsaturation at C-7 in the B ring of sterols. Sigma 1 receptor is a low molecular mass mammalian protein located in the endoplasmic reticulum, which interacts with endogenous steroid hormones, such as progesterone and testosterone. It also binds the sigma ligands, which are a set of chemically unrelated drugs including haloperidol, pentazocine, and ditolylguanidine. Sigma1 effectors are not well understood, but sigma1 agonists have been observed to affect NMDA receptor function, the alpha-adrenergic system and opioid analgesia. 44530 pfam04623: Adenovirus E1B protein N-terminus. This family constitutes the amino termini of E1B 55 kDa (pfam01696). E1B 55K binds p53, the tumour suppressor protein, converting it from a transcriptional activator which responds to damaged DNA into an unregulated repressor of genes with a p53 binding site. This protects the virus against p53 induced host antiviral responses and prevents apoptosis as induced by the adenovirus E1A protein. The role of the N terminus in the function of E1B is not known. 44531 pfam04624: Dec-1 repeat. The defective chorion-1 gene (dec-1) in Drosophila encodes follicle cell proteins necessary for proper eggshell assembly. Multiple products of the dec-1 gene are formed by alternative RNA splicing and proteolytic processing. Cleavage products include S80 (80 kDa) which is incorporated into the eggshell, and further proteolysis of S80 gives S60 (60 kDa). This repeat is usually found in 12 copies in the central region of the protein. Its function is unknown. Length polymorphisms of Dec-1 have been observed in wild-type strains, and are caused by changes in the numbers of the first five repeats. 44532 pfam04625: DEC-1 protein, N terminal region. The defective chorion-1 gene (dec-1) in Drosophila encodes follicle cell proteins necessary for proper eggshell assembly. 
Multiple products of the dec-1 gene are formed by alternative RNA splicing and proteolytic processing. Cleavage products include S80 (80 kDa) which is incorporated into the eggshell, and further proteolysis of S80 gives S60 (60 kDa). 44533 pfam04626: Dec-1 protein, C terminal region. The defective chorion-1 gene (dec-1) in Drosophila encodes follicle cell proteins necessary for proper eggshell assembly. Multiple products of the dec-1 gene are formed by alternative RNA splicing and proteolytic processing. Cleavage products include S80 (80 kDa) which is incorporated into the eggshell, and further proteolysis of S80 gives S60 (60 kDa). Alternative splicing generates different carboxy terminal ends in different protein isoforms, so this region is the most C terminal region that is present in the main isoforms. 44534 pfam04627: Mitochondrial ATP synthase epsilon chain. This family constitutes the mitochondrial ATP synthase epsilon subunit. This is not to be confused with the bacterial epsilon subunit, which is homologous to the mitochondrial delta subunit (pfam00401 and pfam02823). The epsilon subunit is located in the extrinsic membrane section F1, which is the catalytic site of ATP synthesis. The epsilon subunit was not well ordered in the crystal structure of bovine F1, but it is known to be located in the stalk region of F1. The epsilon subunit is thought to be involved in the regulation of ATP synthase, since a null mutation increased oligomycin sensitivity and decreased inhibition by inhibitor protein IF1. 44535 pfam04628: Sedlin, N-terminal conserved region. Mutations in this protein are associated with the X-linked spondyloepiphyseal dysplasia tarda syndrome (OMIM:313400). This family represents an N-terminal conserved region. 44536 pfam04629: Islet cell autoantigen ICA69, C-terminal domain. This family includes a 69 kD protein which has been identified as an islet cell autoantigen in type I diabetes mellitus. Its precise function is unknown. 
44537 pfam04630: Phage major tail protein. 44538 pfam04631: Baculovirus hypothetical protein. This family includes several hypothetical baculoviral proteins, with predicted molecular weights of approximately 44 kD. 44539 pfam04632: Fusaric acid resistance protein conserved region. This family includes a conserved region found in two proteins associated with fusaric acid resistance, from Burkholderia cepacia and from Klebsiella oxytoca. The function of this region is unknown. 44540 pfam04633: Herpesvirus BMRF2 protein. 44541 pfam04634: Protein of unknown function, DUF600. This conserved region is found in several uncharacterised proteins from Gram positive bacteria. 44542 pfam04635: Protein of unknown function, DUF598. This family contains several uncharacterised proteins. 44543 pfam04636: PA26 p53-induced protein (sestrin). PA26 is a p53-inducible protein. Its function is unknown. 44544 pfam04637: Herpesvirus phosphoprotein 85 (HHV6-7 U14/HCMV UL25). This family includes UL25 proteins from HCMV, as well as U14 proteins from HHV6 and HHV7. These 85 kD phosphoproteins appear to act as structural antigens, but their precise function is otherwise unknown. 44545 pfam04638: Pox virus protein O1. The function of these viral proteins is not known. 44546 pfam04639: Baculoviral E56 protein, specific to ODV envelope. This family represents the E56 protein, which localises to the occlusion derived virus (ODV) envelope, but not to the budded virus (BV) envelope. 44547 pfam04640: Protein of unknown function, DUF597. This family includes a conserved region in several uncharacterised plant proteins. 44548 pfam04641: Protein of unknown function, DUF602. This family represents several uncharacterised eukaryotic proteins. 44549 pfam04642: Protein of unknown function, DUF601. This family represents a conserved region found in several uncharacterized plant proteins. 44550 pfam04643: Motilin/ghrelin-associated peptide. 
This family represents a peptide sequence that lies C-terminal to motilin/ghrelin on the respective precursor peptide. Its function is unknown. 44551 pfam04644: Motilin/ghrelin. Motilin is a gastrointestinal regulatory polypeptide produced by motilin cells in the duodenal epithelium. It is released into the general circulation at about 100-min intervals during the inter-digestive state and is the most important factor in controlling the inter-digestive migrating contractions. Motilin also stimulates endogenous release of the endocrine pancreas. This family also includes ghrelin, a growth hormone secretagogue synthesised by endocrine cells in the stomach. Ghrelin stimulates growth hormone secretagogue receptors in the pituitary. These receptors are distinct from the growth hormone-releasing hormone receptors, and thus provide a means of controlling pituitary growth hormone release by the gastrointestinal system. 44552 pfam04645: Protein of unknown function, DUF603. This family includes several uncharacterized proteins from Borrelia species. 44553 pfam04646: Protein of unknown function, DUF604. This family includes a conserved region found in several uncharacterised plant proteins. 44554 pfam04647: Accessory gene regulator B. The agr locus consists of two transcripts: RNAII and RNAIII. RNAII encodes four genes (agrA, B, C, and D) whose gene products assemble a quorum sensing system. AgrB and AgrD are essential for the production of the autoinducing peptide which functions as a signal for quorum sensing. AgrB is a transmembrane protein. 44555 pfam04648: Yeast mating factor alpha hormone. The hormone is excreted into the culture medium by haploid cells of the alpha mating type and acts on cells of the opposite mating type (type A). It inhibits DNA synthesis in type A cells, synchronising them with type alpha, and so mediates the conjugation process. 44556 pfam04649: Mycoplasma hyorhinis VlpA repeat. 
This repeat is found in the extracellular (C-terminal) region of the variant surface antigen A (VlpA) of Mycoplasma hyorhinis. Mutations that change the number of repeats in the protein are involved in antigenic variation and immune evasion of this swine pathogen. 44557 pfam04650: YSIRK type signal peptide. Many surface proteins found in Streptococcus, Staphylococcus, and related lineages share apparently homologous signal sequences. A motif resembling [YF]SIRKxxxGxxS[VIA] appears at the start of the transmembrane domain. The GxxS motif appears perfectly conserved, suggesting a specific function and not just homology. There is a strong correlation between proteins carrying this region at the N-terminus and those carrying the Gram-positive anchor domain with the LPXTG sortase processing site at the C-terminus. 44558 pfam04651: Poxvirus A12 protein. 44559 pfam04652: Protein of unknown function, DUF605. This family includes several uncharacterised eukaryotic proteins. 44560 pfam04653: 26S proteasome non-ATPase regulatory subunit Nin1/mts3. The 26S protease (or 26S proteasome) is responsible for degrading ubiquitin conjugates. It consists of 19S regulatory complexes associated with the ends of 20S proteasomes. The 19S regulatory complex is composed of about 20 different polypeptides and confers ATP-dependence and substrate specificity to the 26S enzyme. The conserved region occurs at the C-terminal of the Nin1-like regulatory subunit. 44561 pfam04654: Protein of unknown function, DUF599. This family includes several uncharacterised proteins. 44562 pfam04655: Aminoglycoside/hydroxyurea antibiotic resistance kinase. The aminoglycoside phosphotransferases achieve inactivation of their antibiotic substrates by phosphorylation utilising ATP. Likewise hydroxyurea is inactivated by phosphorylation of the hydroxy group in the hydroxylamine moiety. 44563 pfam04656: Pox virus E6 protein. Family of pox virus E6 proteins. 44564 pfam04657: Protein of unknown function, DUF606. 
This family includes several uncharacterised bacterial proteins. 44565 pfam04658: TAFII55 protein conserved region. The general transcription factor, TFIID, consists of the TATA-binding protein (TBP) associated with a series of TBP-associated factors (TAFs) that together participate in the assembly of the transcription preinitiation complex. TAFII55 binds to TAFII250 and inhibits its acetyltransferase activity. The exact role of TAFII55 is currently unknown. The conserved region is situated towards the N-terminus of the protein. 44566 pfam04659: Archaeal flagella protein. Family of archaeal flaD and flaE proteins. Conserved region found at N-terminus of flaE but towards the C-terminus of flaD. 44567 pfam04660: Nanovirus coat protein. Family of conserved Nanoviral coat proteins. 44568 pfam04661: Poxvirus I3 ssDNA-binding protein. 44569 pfam04662: Luteovirus P0 protein. This family of proteins may be involved in suppression of post-transcriptional gene silencing (PTGS), a plant defence mechanism. 44570 pfam04663: Phenol hydroxylase conserved region. Under aerobic conditions, phenol is usually hydroxylated to catechol and degraded via the meta or ortho pathways. Two types of phenol hydroxylase are known: one is a multi-component enzyme, the other is a single-component monooxygenase. This region is found in both types of enzymes. 44571 pfam04664: Opioid growth factor receptor (OGFr) conserved region. Opioid peptides act as growth factors in neural and non-neural cells and tissues, in addition to serving in neurotransmission/neuromodulation in the nervous system. The Opioid growth factor receptor is an integral membrane protein associated with the nucleus. The conserved region is situated at the N-terminus of the member proteins with a series of imperfect repeats lying immediately to its C-terminus. 44572 pfam04665: Poxvirus A32 protein. The A32 protein is thought to be involved in viral DNA packaging. 44573 pfam04666: N-Acetylglucosaminyltransferase-IV (GnT-IV) conserved region. 
Complex-type oligosaccharides are synthesised through elongation by glycosyltransferases after trimming of the precursor oligosaccharides transferred to proteins in the endoplasmic reticulum. N-Acetylglucosaminyltransferases (GnTs) take part in the formation of branches in the biosynthesis of complex-type sugar chains. In vertebrates, six GnTs, designated as GnT-I to -VI, which catalyse the transfer of GlcNAc to the core mannose residues of Asn-linked sugar chains, have been identified. GnT-IV (EC:2.4.1.145) catalyses the transfer of GlcNAc from UDP-GlcNAc to the GlcNAc1-2Man1-3 arm of core oligosaccharide [Gn2(22)core oligosaccharide] and forms GlcNAc1-4(GlcNAc1-2)Man1-3 structure on the core oligosaccharide (Gn3(2,4,2)core oligosaccharide). In some members the conserved region occupies all but the very N-terminus, where there is a signal sequence on all members. For other members the conserved region does not occupy the entire protein but is still towards the N-terminus of the protein. 44574 pfam04667: cAMP-regulated phosphoprotein/endosulfine conserved region. Conserved region found in both cAMP-regulated phosphoprotein 19 (ARPP-19) and Alpha/Beta endosulfine. No function has yet been assigned to ARPP-19. Endosulfine is the endogenous ligand for the ATP-dependent potassium (K ATP) channels which occupy a key position in the control of insulin release from the pancreatic beta cell by coupling cell polarity to metabolism. In both cases the region occupies the majority of the protein. 44575 pfam04668: Twisted gastrulation (Tsg) protein conserved region. Tsg was identified in Drosophila as being required to specify the dorsal-most structures in the embryo, for example amnioserosa. 
Biochemical experiments have revealed three key properties of Tsg: it can synergistically inhibit Dpp/BMP action in both Drosophila and vertebrates by forming a tripartite complex between itself, SOG/chordin and a BMP ligand; Tsg seems to enhance the Tld/BMP-1-mediated cleavage rate of SOG/chordin and may change the preference of site utilisation; Tsg can promote the dissociation of chordin cysteine-rich-containing fragments from the ligand to inhibit BMP signalling. 44576 pfam04669: Protein of unknown function (DUF579). Family of uncharacterised plant proteins. 44577 pfam04670: Gtr1/RagA G protein conserved region. GTR1 was first identified in S. cerevisiae as a suppressor of a mutation in RCC1. Biochemical analysis revealed that Gtr1 is in fact a G protein of the Ras family. The RagA/B proteins are the human homologues of Gtr1. Included in this family is the human Rag C, a novel protein that has been shown to interact with RagA/B. 44578 pfam04671: Erythrocyte membrane-associated giant protein antigen 332. To date many different Plasmodium antigens recognised by hyperimmune human sera have been cloned, sequenced and characterised. The majority contain tandemly repeated amino acid sequences which make up a considerable portion of the protein sequence. It has been suggested that these repeat-containing antigens may provide an immunological 'smokescreen' to the parasite in order to evade the human immune system. This repeat is found exclusively in the Plasmodium falciparum Ag332 protein and occupies most of its length. 44579 pfam04672: Protein of unknown function (DUF574). Family of uncharacterized proteins. 44580 pfam04673: Polyketide synthesis cyclase. This family represents a number of cyclases involved in polyketide synthesis in a number of actinobacterial species. 44581 pfam04674: Phosphate-induced protein 1 conserved region. Family of conserved plant proteins. Conserved region identified in a phosphate-induced protein of unknown function. 
44582 pfam04675: DNA ligase N terminus. This region is found in many but not all ATP-dependent DNA ligase enzymes (EC:6.5.1.1). It is thought to be involved in DNA binding and in catalysis. In human DNA ligase I, and in Saccharomyces cerevisiae, this region was necessary for catalysis, and separated from the amino terminus by targeting elements. In vaccinia virus this region was not essential for catalysis, but deletion decreased the affinity for nicked DNA and decreased the rate of strand joining at a step subsequent to enzyme-adenylate formation. 44583 pfam04676: Protein similar to CwfJ C-terminus 2. This region is found in the C terminus of Schizosaccharomyces pombe protein CwfJ. CwfJ is part of the Cdc5p complex involved in mRNA splicing. 44584 pfam04677: Protein similar to CwfJ C-terminus 1. This region is found in the C terminus of Schizosaccharomyces pombe protein CwfJ. CwfJ is part of the Cdc5p complex involved in mRNA splicing. 44585 pfam04678: Protein of unknown function, DUF607. This family represents a conserved region found in several uncharacterized eukaryotic proteins. 44586 pfam04679: ATP dependent DNA ligase C terminal region. This region is found in many but not all ATP-dependent DNA ligase enzymes (EC:6.5.1.1). It is thought to constitute part of the catalytic core of ATP dependent DNA ligase. 44587 pfam04680: Opioid growth factor receptor repeat. Proline-rich repeat found only in a human opioid growth factor receptor. 44588 pfam04681: Blastomyces yeast-phase-specific protein. The molecular function of this protein is not known. Its expression is specific to the high temperature, unicellular yeast morphology (as opposed to the lower temperature, multicellular mycelium form). 44589 pfam04682: Herpesvirus BTRF1 protein conserved region. Herpesvirus protein. 44590 pfam04683: Adhesion regulating molecule conserved region. Family of eukaryotic proteins involved in cell adhesion. Members are involved in gastrulation and metastatic cancer formation. 
Experimental evidence suggests that members are transmembrane and possibly glycoproteins. 44591 pfam04684: BAF1 / ABF1 chromatin reorganising factor. ABF1 is a sequence-specific DNA binding protein involved in transcription activation, gene silencing and initiation of DNA replication. ABF1 is known to remodel chromatin, and it is proposed that it mediates its effects on transcription and gene expression by modifying local chromatin architecture. These functions require a conserved stretch of 20 amino acids in the C-terminal region of ABF1 (amino acids 639 to 662 in S. cerevisiae). The N-terminal two thirds of the protein are necessary for DNA binding, and the N-terminus (amino acids 9 to 91 in S. cerevisiae) is thought to contain a novel zinc-finger motif which may stabilise the protein structure. 44592 pfam04685: Protein of unknown function, DUF608. This family represents a conserved region with a pankaryotic distribution in a number of uncharacterised proteins. 44593 pfam04686: Streptomyces sporulation and cell division protein, SsgA. The precise function of SsgA is unknown. It has been found to be essential for spore formation, and to stimulate cell division. 44594 pfam04687: Microvirus H protein (pilot protein). A single molecule of H protein is found on each of the 12 spikes on the microvirus shell. H is involved in the ejection of the phage DNA, and at least one copy is injected into the host's periplasmic space along with the ssDNA viral genome. Part of H is thought to lie outside the shell, where it recognises lipopolysaccharide from virus-sensitive strains. Part of H may lie within the capsid, since mutations in H can influence the DNA ejection mechanism by affecting the DNA-protein interactions. H may span the capsid through the hydrophilic channels formed by G proteins. 44595 pfam04688: Phage lysis protein, holin. This family constitutes holin proteins from the dsDNA Siphoviridae group bacteriophages. 
Most bacteriophages require an endolysin and a holin for host lysis. During late gene expression, holins accumulate and oligomerise in the host cell membrane. They then suddenly trigger to permeabilise the membrane, which causes lysis by allowing endolysin to attack the peptidoglycan. There are thought to be at least 35 different families of holin genes. 44596 pfam04689: DNA binding protein S1FA. S1FA is a DNA-binding protein found in plants that specifically recognises the negative promoter element S1F. 44597 pfam04690: YABBY protein. YABBY proteins are a group of plant-specific transcription factors involved in the specification of abaxial polarity in lateral organs. 44598 pfam04691: Apolipoprotein C-I (ApoC-1). Apolipoprotein C-I (ApoC-1) is a water-soluble protein component of plasma lipoprotein. It solubilises lipids and regulates lipid metabolism. ApoC-1 transfers among HDL (high density lipoprotein), VLDL (very low-density lipoprotein) and chylomicrons. ApoC-1 activates lecithin:cholesterol acyltransferase (LCAT), inhibits cholesteryl ester transfer protein, can inhibit hepatic lipase and phospholipase 2 and can stimulate cell growth. ApoC-1 delays the clearance of beta-VLDL by inhibiting its uptake via the LDL receptor-related pathway. ApoC-1 has been implicated in hypertriglyceridemia and Alzheimer's disease. 44599 pfam04692: Platelet-derived growth factor, N terminal region. This family consists of the amino terminal regions of platelet-derived growth factor (PDGF, pfam00341) A and B chains. 44600 pfam04693: Archaeal putative transposase ISC1217. 44601 pfam04694: Coronavirus ORF3 protein. 44602 pfam04695: Peroxisomal membrane anchor protein (Pex14p) conserved region. Family of peroxisomal membrane anchor proteins which bind the PTS1 (peroxisomal targeting signal) receptor and are required for the import of PTS1-containing proteins into peroxisomes. Loss of functional Pex14p results in defects in both the PTS1 and PTS2-dependent import pathways. 
Deletion analysis of this conserved region implicates it in selective peroxisome degradation. In the majority of members this region is situated at the N-terminus of the protein. 44603 pfam04696: pinin/SDK/memA protein conserved region. Members of this family have very varied localisations within the eukaryotic cell. pinin is known to localise at the desmosomes and is implicated in anchoring intermediate filaments to the desmosomal plaque. SDK2/3 is a dynamically localised nuclear protein thought to be involved in modulation of alternative pre-mRNA splicing. memA is a tumour marker preferentially expressed in human melanoma cell lines. A common feature of the members of this family is that they may all participate in regulating protein-protein interactions. 44604 pfam04697: pinin/SDK conserved region. SDK2/3 is localised in nuclear speckles, whereas pinin is known to localise at the desmosomes, where it is thought to be involved in anchoring intermediate filaments to the desmosomal plaque. The role of SDK2/3 in the nucleus is thought to be concerned with modulation of alternative pre-mRNA splicing. pinin has also been implicated as a tumour suppressor. The conserved region is found at the N-terminus of the member proteins. 44605 pfam04698: Myelin-associated oligodendrocytic basic protein (MOBP). MOBP is abundantly expressed in central nervous system myelin, and shares several characteristics with myelin basic protein (MBP), in terms of regional distribution and function. MOBP has been shown to be essential for normal arrangement of the radial component in central nervous system myelin. 44606 pfam04699: ARP2/3 complex 16 kDa subunit (p16-Arc). The Arp2/3 protein complex has been implicated in the control of actin polymerisation. The human complex consists of seven subunits which include the actin related proteins Arp2 and Arp3, and five others referred to as p41-Arc, p34-Arc, p21-Arc, p20-Arc, and p16-Arc. The precise function of p16-Arc is currently unknown. 
Its structure consists of a single domain containing a bundle of seven alpha helices. 44607 pfam04700: Structural glycoprotein p40/gp41 conserved region. Family of viral structural glycoproteins. 44608 pfam04701: Pox virus D2 protein. 44609 pfam04702: Vicilin N terminal region. This region is found in plant seed storage proteins, N-terminal to the Cupin domain (pfam00190). In Macadamia integrifolia, this region is processed into peptides of approximately 50 amino acids containing a C-X-X-X-C-(10-12)X-C-X-X-X-C motif. These peptides exhibit antimicrobial activity in vitro. 44610 pfam04703: FaeA-like protein. This family represents a number of fimbrial protein transcription regulators found in Gram-negative bacteria. These proteins are thought to facilitate binding of the leucine-rich regulatory protein to regulatory elements, possibly by inhibiting deoxyadenosine methylation of these elements by deoxyadenosine methylase. 44611 pfam04704: Zfx / Zfy transcription activation region. Zfx and Zfy are transcription factors implicated in mammalian sex determination. This region is found N terminal to multiple copies of a C2H2 Zinc finger (pfam00096). This region has been shown to activate transcription when fused to a GAL4 DNA binding domain. 44612 pfam04705: Thiostrepton-resistance methylase, N terminus. This region is found in some members of the SpoU-type rRNA methylase family (pfam00588). 44613 pfam04706: Dickkopf N-terminal cysteine-rich region. Dickkopf proteins are a class of Wnt antagonists. They possess two conserved cysteine-rich regions. This family represents the N-terminal one. The C-terminal region has been found to share significant sequence similarity to the colipase fold, pfam01114, pfam02740. 44614 pfam04707: MSF1-like conserved region. This family includes a conserved region found in the yeast YLR168C gene MSF1 product. The function of this protein is unknown, though it is thought to be involved in intra-mitochondrial protein sorting. 
This region is also found in a number of other eukaryotic proteins. 44615 pfam04708: Poxvirus F16 protein. 44616 pfam04709: Anti-Mullerian hormone, N terminal region. Anti-Mullerian hormone, AMH is a signalling molecule involved in male and female sexual differentiation. Defects in synthesis or action of AMH cause persistent Mullerian duct syndrome (PMDS), a rare form of male pseudohermaphroditism. This family represents the N terminal part of the protein, which is not thought to be essential for activity. AMH contains a TGF-beta domain (pfam00019), at the C terminus. 44617 pfam04710: Pellino. Pellino is involved in Toll-like signalling pathways, and associates with the kinase domain of the Pelle Ser/Thr kinase. 44618 pfam04711: Apolipoprotein A-II (ApoA-II). Apolipoprotein A-II (ApoA-II) is the second major apolipoprotein of high density lipoprotein in human plasma. Mature ApoA-II is present as a dimer of two 77-amino acid chains joined by a disulphide bridge. ApoA-II regulates many steps in HDL metabolism, but its role in coronary heart disease is unclear. In bovine serum, the ApoA-II homologue is present in almost free form. Bovine ApoA-II shows antimicrobial activity against Escherichia coli and yeasts in phosphate buffered saline (PBS). 44619 pfam04712: Radial spokehead-like protein. This family includes the radial spoke head proteins RSP4 and RSP6 from Chlamydomonas reinhardtii, and several eukaryotic homologues, including mammalian RSHL1, the protein product of a familial ciliary dyskinesia candidate gene. 44620 pfam04713: Poxvirus protein I5. 44621 pfam04714: BCL7, N-terminal conserved region. Members of the BCL family have significant sequence similarity at their N-terminus, represented in this family. The function of BCL7 proteins is unknown. They may be involved in early development. In addition, BCL7B is commonly hemizygously deleted in patients with Williams syndrome. 44622 pfam04715: Anthranilate synthase component I, N terminal region. 
Anthranilate synthase (EC:4.1.3.27) catalyses the first step in the biosynthesis of tryptophan. Component I catalyses the formation of anthranilate using ammonia and chorismate. The catalytic site lies in the adjacent region, described in the chorismate binding enzyme family (pfam00425). This region is involved in feedback inhibition by tryptophan. This family also contains a region of Para-aminobenzoate synthase component I (EC 4.1.3.-). 44623 pfam04716: ETC complex I subunit conserved region. Family of eukaryotic NADH-ubiquinone oxidoreductase subunits (EC:1.6.5.3) (EC:1.6.99.3) from complex I of the electron transport chain initially identified in Neurospora crassa as a 29.9 kDa protein. The conserved region is found at the N-terminus of the member proteins. 44624 pfam04717: Phage-related baseplate assembly protein. Family of phage baseplate assembly proteins responsible for forming the small spike at the end of the tail. Also found in bacteria, probably the result of horizontal transmission. 44625 pfam04718: Mitochondrial ATP synthase g subunit. The Fo sector of the ATP synthase is a membrane bound complex which mediates proton transport. It is composed of nine different polypeptide subunits (a, b, c, d, e, f, g, F6, A6L). The function of subunit g is currently unknown. The conserved region covers all but the very N-terminus of the member sequences. No prokaryotic members have been identified thus far. 44626 pfam04719: hTAFII28-like protein conserved region. The general transcription factor, TFIID, consists of the TATA-binding protein (TBP) associated with a series of TBP-associated factors (TAFs) that together participate in the assembly of the transcription preinitiation complex. The conserved region is found at the C-terminal of most member proteins. The crystal structure of hTAFII28 with hTAFII18 shows that this region is involved in the binding of these two subunits. 
The conserved region contains four alpha helices and three loops arranged as in histone H3. 44627 pfam04720: Protein of unknown function (DUF506). Family of uncharacterized plant proteins. 44628 pfam04721: Domain of unknown function (DUF750). Domain of unknown function that is found in eukaryotic proteins. There can be between 1 and 3 copies of this domain per protein. The domain is commonly associated with Transglutaminase-like domains (pfam01841). This domain may be involved in protein binding. 44629 pfam04722: Ssu72-like protein. The yeast Ssu72 is an essential protein that may be involved in transcription start site specification. Ssu72 is stably associated with yeast cleavage and polyadenylation factor CPF. There is evidence that it bridges the CPF subunits Pta1p and Ydh1p/Cft2p, the general transcription factor TFIIB, and RNAP II via Rpb2p. 44630 pfam04723: Glycine reductase complex selenoprotein A. Found in clostridia, this protein contains one active site selenocysteine and catalyses the reductive deamination of glycine, which is coupled to the esterification of orthophosphate resulting in the formation of ATP. A member of this family may also exist in Treponema denticola. 44631 pfam04724: Glycosyltransferase family 17. This family represents beta-1,4-mannosyl-glycoprotein beta-1,4-N-acetylglucosaminyltransferase (EC:2.4.1.144). This enzyme transfers the bisecting GlcNAc to the core mannose of complex N-glycans. The addition of this residue is regulated during development and has functional consequences for receptor signalling, cell adhesion, and tumour progression. 44632 pfam04725: Photosystem II 10 kDa polypeptide PsbR. This protein is associated with the oxygen-evolving complex of photosystem II. Its function in photosynthesis is not known. The C-terminal hydrophobic region functions as a thylakoid transfer signal but is not removed. 44633 pfam04726: Microvirus J protein. 
This small protein is involved in DNA packaging, interacting with DNA via its hydrophobic carboxyl terminus. In bacteriophage phi-X174, J is present in 60 copies, and forms an S-shaped polypeptide chain without any secondary structure. It is thought to interact with DNA through simple charge interactions. 44634 pfam04727: Protein of unknown function, DUF609. This family represents a conserved domain which is found in a number of eukaryotic proteins including CED-12, ELMO I and ELMO II. ELMO1 is a component of signalling pathways that regulate phagocytosis and cell migration and is the mammalian orthologue of the C. elegans gene, ced-12. CED-12 is required for the engulfment of dying cells and cell migration. In mammalian cells, ELMO1 interacts with Dock180 as part of the CrkII/Dock180/Rac pathway responsible for phagocytosis and cell migration. ELMO1 is ubiquitously expressed, although its expression is highest in the spleen, an organ rich in immune cells. ELMO1 has a PH domain and a polyproline sequence motif at its C terminus which are not present in this alignment. 44635 pfam04728: Repeated sequence found in lipoprotein LPP. This repeating sequence is found in the enterobacterial outer membrane lipoprotein LPP. 44636 pfam04729: Anti-silencing protein, ASF1-like. This family includes the yeast ASF1 protein, which derepresses transcriptionally silenced genes. The human ASF1 homologue has been found to possess histone chaperone activity, which may explain the derepressing function of this family. 44637 pfam04730: Agrobacterium VirD5 protein. The virD operon in Agrobacterium encodes a site-specific endonuclease, and a number of other poorly characterized products. This family represents the VirD5 protein. 44638 pfam04731: Caudal like protein activation region. This family consists of the amino termini of proteins belonging to the caudal-related homeobox protein family. This region is thought to mediate transcription activation. 
The level of activation caused by mouse Cdx2 is affected by phosphorylation at serine 60 via the mitogen-activated protein kinase pathway. Caudal family proteins are involved in the transcriptional regulation of multiple genes expressed in the intestinal epithelium, and are important in differentiation and maintenance of the intestinal epithelial lining. Caudal proteins always have a homeobox DNA binding domain (pfam00046). 44639 pfam04732: Intermediate filament head (DNA binding) region. This family represents the N-terminal head region of intermediate filaments. Intermediate filament heads bind DNA. Vimentin heads are able to alter nuclear architecture and chromatin distribution, and the liberation of heads by HIV-1 protease may play an important role in HIV-1 associated cytopathogenesis and carcinogenesis. Phosphorylation of the head region can affect filament stability. The head has been shown to interact with the rod domain of the same protein. 44640 pfam04733: Coatomer epsilon subunit. This family represents the epsilon subunit of the coatomer complex, which is involved in the regulation of intracellular protein trafficking between the endoplasmic reticulum and the Golgi complex. 44641 pfam04734: Neutral/alkaline non-lysosomal ceramidase. This family represents a group of neutral/alkaline ceramidases found in both bacteria and eukaryotes. 44642 pfam04735: Baculovirus DNA helicase. 44643 pfam04736: Eclosion hormone. Eclosion hormone is an insect neuropeptide that triggers the performance of ecdysis behaviour, which causes shedding of the old cuticle at the end of a molt. 44644 pfam04737: Lantibiotic dehydratase, N terminus. Lantibiotics are ribosomally synthesised antimicrobial agents derived from ribosomally synthesised peptides. They are produced by bacteria of the Firmicutes phylum, and include mutacin, subtilin, and nisin. 
Lantibiotic peptides contain thioether bridges termed lanthionines that are thought to be generated by dehydration of serine and threonine residues followed by addition of cysteine residues. This family constitutes the N-terminus of the enzyme proposed to catalyse the dehydration step. 44645 pfam04738: Lantibiotic dehydratase, C terminus. Lantibiotics are ribosomally synthesised antimicrobial agents derived from ribosomally synthesised peptides. They are produced by bacteria of the Firmicutes phylum, and include mutacin, subtilin, and nisin. Lantibiotic peptides contain thioether bridges termed lanthionines that are thought to be generated by dehydration of serine and threonine residues followed by addition of cysteine residues. This family constitutes the C-terminus of the enzyme proposed to catalyse the dehydration step. 44646 pfam04739: 5'-AMP-activated protein kinase, beta subunit, complex-interacting region. This region is found in the beta subunit of the 5'-AMP-activated protein kinase complex, and its yeast homologues Sip1, Sip2 and Gal83, which are found in the SNF1 kinase complex. This region is sufficient for interaction of this subunit with the kinase complex, but is not solely responsible for the interaction, and the interaction partner is not known. The isoamylase N-terminal domain (pfam02922) is sometimes found in proteins belonging to this family. 44647 pfam04740: Bacillus transposase protein. This family of putative transposases includes mostly Bacillus members. However, we have also found a Bacillus subtilis bacteriophage SPbetac2 homologue, possibly arising as a result of horizontal transfer. 44648 pfam04741: InvH outer membrane lipoprotein. This family represents the Salmonella outer membrane lipoprotein InvH. The molecular function of this protein is unknown, but it is required for the localisation to outer membrane of InvG, which is involved in a type III secretion apparatus mediating host cell invasion. 
44649 pfam04742: Protein of unknown function, DUF611. This family includes several uncharacterised Bacillus halodurans proteins. 44650 pfam04743: BSRF1-like protein. Family of herpes virus proteins. 44651 pfam04744: Monooxygenase subunit B protein. Family of membrane associated monooxygenases (EC 1.13.12.-) which utilise O(2) to oxidise their substrate. Family members include both ammonia and methane monooxygenases involved in the oxidation of their respective substrates. These enzymes are multi-subunit complexes. This family represents the B subunit of the enzyme; the A subunit is thought to contain the active site. 44652 pfam04745: VITF-3 subunit protein. Family of Chordopoxvirus proteins composing one of the two subunits that make up VITF-3, a virally encoded complex necessary for intermediate stage transcription. 44653 pfam04746: Protein of unknown function (DUF575). Family of uncharacterized proteins. Contains several chlamydial members. 44654 pfam04747: Protein of unknown function, DUF612. This family includes several uncharacterized proteins from Caenorhabditis elegans. 44655 pfam04748: Divergent polysaccharide deacetylase. This family is divergently related to pfam01522 (personal obs:Yeats C). 44656 pfam04749: Protein of unknown function, DUF614. This family includes a number of uncharacterised eukaryotic proteins. 44657 pfam04750: FAR-17a/AIG1-like protein. This family includes the hamster androgen-induced FAR-17a protein, and its human homologue, the AIG1 protein. The function of these proteins is unknown. This family also includes homologous regions from a number of other metazoan proteins. 44658 pfam04751: Protein of unknown function (DUF615). This family of bacterial proteins has no known function. 44659 pfam04752: ChaC-like protein. The ChaC protein is thought to be associated with the putative ChaA Ca2+/H+ cation transport protein in Escherichia coli. Its function is not known. 
This family also includes homologous regions from several other bacterial and eukaryotic proteins. 44660 pfam04753: Coronavirus non-structural protein NS2. 44661 pfam04754: Putative transposase, YhgA-like. This family of putative transposases includes the YhgA sequence from Escherichia coli and several prokaryotic homologues. 44662 pfam04755: PAP_fibrillin. This family identifies a conserved region found in a number of plastid lipid-associated proteins (PAPs), and in a number of putative fibrillin proteins. 44663 pfam04756: OST3 / OST6 family. The proteins in this family are part of a complex of eight ER proteins that transfers core oligosaccharide from dolichol carrier to Asn-X-Ser/Thr motifs. This family includes both OST3 and OST6, each of which contains four predicted transmembrane helices. Disruption of OST3 and OST6 leads to a defect in the assembly of the complex. Hence, the function of these genes seems to be essential for recruiting a fully active complex necessary for efficient N-glycosylation. 44664 pfam04757: Pex2 / Pex12 amino terminal region. This region is the amino terminal part of the Pex2 and Pex12 peroxisomal biogenesis proteins. It contains two predicted transmembrane segments. This region is found to the C-terminus of a ring finger pfam00097. 44665 pfam04758: Ribosomal protein S30. 44666 pfam04759: Protein of unknown function, DUF617. This family represents a conserved region in a number of uncharacterized plant proteins. 44667 pfam04760: Translation initiation factor IF-2, N-terminal region. This conserved feature at the N-terminus of bacterial translation initiation factor IF2 has recently had its structure solved. It shows structural similarity to the tRNA anticodon Stem Contact Fold domains of the methionyl-tRNA and glutaminyl-tRNA synthetases, and a similar fold is also found in the B5 domain of the phenylalanine-tRNA synthetase. 44668 pfam04761: Lactococcus bacteriophage putative transcription regulator. 
This family represents a number of putative transcription repressor proteins found in several Lactococcus bacteriophages. Horizontal transfer may account for the presence of similar proteins in Lactococcus. 44669 pfam04762: IKI3 family. Members of this family are components of the elongator multi-subunit component of a novel RNA polymerase II holoenzyme for transcriptional elongation. 44670 pfam04763: Protein of unknown function (DUF562). Family of uncharacterized proteins. 44671 pfam04764: Protein of unknown function (DUF613). Family of chloroplast proteins of unknown function. Some members have two copies of the conserved region. 44672 pfam04765: Protein of unknown function (DUF616). Family of uncharacterized proteins. 44673 pfam04766: Nucleopolyhedrovirus p26 protein. Family of Baculovirus p26 proteins. 44674 pfam04767: DNA-binding 11 kDa phosphoprotein. Family of poxvirus proteins required for virus morphogenesis. Protein function necessary for proteolytic processing of the major viral structural proteins, P4a and P4b. 44675 pfam04768: Protein of unknown function (DUF619). This region of unknown function is found at the C-terminus of Neurospora crassa acetylglutamate synthase (amino-acid acetyltransferase, EC: 2.3.1.1). It is also found C-terminal to the amino acid kinase region (pfam00696) in some fungal acetylglutamate kinase enzymes. 44676 pfam04769: Mating-type protein MAT alpha 1. This family includes Saccharomyces cerevisiae mating type protein alpha 1. Mat alpha 1 is a transcription activator which activates mating-type alpha-specific genes. MAT alpha 1 and MCM 1 bind cooperatively to PQ elements upstream of alpha-specific genes. Alpha 1 interacts in vivo with STE12, linking expression of alpha-specific genes to the alpha-pheromone (pfam04648) response pathway. 44677 pfam04770: ZF-HD protein dimerisation region. 
The proteins in this family are plant transcription factors, and have been named ZF-HD for zinc finger homeodomain proteins, on the basis of similarity to proteins of known structure. This region is thought to be involved in the formation of homo and heterodimers, and may form a zinc finger. 44678 pfam04771: Chicken anaemia virus VP-3 protein. This protein is found in the nucleus of infected cells and may act as a transcriptional regulator. It induces apoptosis, and is also known as apoptin. 44679 pfam04772: Influenza B matrix protein 2 (BM2). M2 is synthesised in the late phase of infection and incorporated into the virion. It may be phosphorylated in vivo. The function of BM2 is unknown. 44680 pfam04773: FecR protein. FecR is involved in regulation of iron dicitrate transport. In the absence of citrate FecR inactivates FecI. FecR is probably a sensor that recognises iron dicitrate in the periplasm. 44681 pfam04774: Hyaluronan / mRNA binding family. This family includes the HABP4 family of hyaluronan-binding proteins, and the PAI-1 mRNA-binding protein, PAI-RBP1. HABP4 has been observed to bind hyaluronan (a glycosaminoglycan), but it is not known whether this is its primary role in vivo. It has also been observed to bind RNA, but with a lower affinity than that for hyaluronan. PAI-1 mRNA-binding protein specifically binds the mRNA of type-1 plasminogen activator inhibitor (PAI-1), and is thought to be involved in regulation of mRNA stability. However, in both cases, the sequence motifs predicted to be important for ligand binding are not conserved throughout the family, so it is not known whether members of this family share a common function. 44682 pfam04775: Acyl-CoA thioester hydrolase / Bile acid-CoA amino acid N-acyltransferase. This family consists of the amino termini of acyl-CoA thioester hydrolase and bile acid-CoA:amino acid N-acyltransferase (BAAT). This region is not thought to contain the active site of either enzyme. 
Thioesterase isoforms have been identified in peroxisomes, cytoplasm and mitochondria, where they are thought to have distinct functions in lipid metabolism. For example, in peroxisomes, the hydrolase acts on bile-CoA esters. 44683 pfam04776: Protein of unknown function (DUF626). Protein of unknown function, currently only identified in Arabidopsis thaliana. 44684 pfam04777: Erv1 / Alr family. Biogenesis of Fe/S clusters involves a number of essential mitochondrial proteins. Erv1p of Saccharomyces cerevisiae mitochondria is required for the maturation of Fe/S proteins in the cytosol. The ALR (augmenter of liver regeneration) represents a mammalian orthologue of yeast Erv1p. Both Erv1p and full-length ALR are located in the mitochondrial intermembrane space and are thought to operate downstream of the mitochondrial ABC transporter. 44685 pfam04778: LMP repeated region. This family consists of a repeated sequence element found in the LMP group of surface-located membrane proteins of Mycoplasma hominis. The number of repeats in the protein affects the tendency of cells to spontaneously aggregate. Agglutination may be an important factor in colonisation. Non-agglutinating microorganisms might easily be distributed whereas aggregation might provide a better chance to avoid an antibody response since some of the epitopes may be buried. 44686 pfam04780: Protein of unknown function (DUF629). This family represents a region of several plant proteins of unknown function. A C2H2 zinc finger is predicted in this region in some family members, but the spacing between the cysteine residues is not conserved throughout the family. 44687 pfam04781: Protein of unknown function (DUF627). This family represents the N-terminal region of several plant proteins of unknown function. 44688 pfam04782: Protein of unknown function (DUF632). This plant protein may be a leucine zipper, but there is no experimental evidence for this. 44689 pfam04783: Protein of unknown function (DUF630). 
This region is sometimes found at the N-terminus of putative plant bZIP proteins. Its function is not known. 44690 pfam04784: Protein of unknown function, DUF547. Family of uncharacterised proteins from C. elegans and A. thaliana. 44691 pfam04785: Rhabdovirus matrix protein M2. M protein is involved in condensing and targeting the ribonucleoprotein (RNP) coil to the plasma membrane. M interacts specifically with the transmembrane spike protein (G) and is important for the incorporation of G protein into budding virions. 44692 pfam04786: ssDNA binding protein. Family of Baculovirus ssDNA binding proteins. 44693 pfam04787: Late protein H7. Family of poxvirus late H7 proteins. 44694 pfam04788: Protein of unknown function (DUF620). Family of uncharacterised proteins. 44695 pfam04789: Protein of unknown function (DUF621). Family of uncharacterized proteins. 44696 pfam04790: Sarcoglycan complex subunit protein. The dystrophin glycoprotein complex (DGC) is a membrane-spanning complex that links the interior cytoskeleton to the extracellular matrix in muscle. The sarcoglycan complex is a subcomplex within the DGC and is composed of several muscle-specific, transmembrane proteins (alpha-, beta-, gamma-, delta- and zeta-sarcoglycan). The sarcoglycans are asparagine-linked glycosylated proteins with single transmembrane domains. This family contains beta, gamma and delta members. 44697 pfam04791: LMBR1-like membrane protein. Members of this family are integral membrane proteins that are around 500 residues in length. LMBR1 is not involved in preaxial polydactyly, as originally thought. Vertebrate members of this family may play a role in limb development. A member of this family has been shown to be a lipocalin membrane receptor. 44698 pfam04792: V antigen (LcrV) protein. Yersinia pestis, the aetiologic agent of plague, secretes a set of environmentally regulated, plasmid pCD1-encoded virulence proteins termed Yops and V antigen (LcrV) by a type III secretion mechanism. 
LcrV is a multifunctional protein that has been shown to act at the level of secretion control by binding the Ysc inner-gate protein LcrG and to modulate the host immune response by altering cytokine production. LcrV is also necessary for full induction of low-calcium response (LCR) stimulon virulence gene transcription. Family members are not confined to Yersinia pestis. 44699 pfam04793: BRRF1-like protein. Family of herpesvirus proteins including Epstein-Barr virus protein BBRF1. 44700 pfam04794: YdjC-like protein. Family of YdjC-like proteins. This region is possibly involved in the cleavage of cellobiose-phosphate. 44701 pfam04795: PAPA-1-like conserved region. Family of proteins with a conserved region found in PAPA-1, a PAP-1 binding protein. 44702 pfam04796: Plasmid encoded RepA protein. Family of plasmid encoded proteins involved in plasmid replication. The role of RepA in the replication process is not clearly understood. 44703 pfam04797: Herpesvirus dUTPase protein. This family of proteins is found in herpesviruses. This family includes proteins called ORF10 and ORF11 amongst others. However, these proteins seem to be related to other dUTPases (pfam00692), suggesting that these proteins are also dUTPases (Bateman A pers. obs.). 44704 pfam04798: Baculovirus 19 kDa protein conserved region. Family of Baculovirus proteins of approximate mass 19 kDa. 44705 pfam04799: fzo-like conserved region. Family of putative transmembrane GTPase. The fzo protein is a mediator of mitochondrial fusion. This conserved region is also found in the human mitofusin protein. 44706 pfam04800: ETC complex I subunit conserved region. Family of pankaryotic NADH-ubiquinone oxidoreductase subunits (EC:1.6.5.3) (EC:1.6.99.3) from complex I of the electron transport chain initially identified in Neurospora crassa as a 21 kDa protein. 44707 pfam04801: Sin-like protein conserved region. Family of higher eukaryotic proteins. 
SIN was identified as a protein that interacts specifically with SXL (sex lethal) in a yeast two-hybrid assay. The interaction is mediated by one of the SXL RNA binding domains. 44708 pfam04802: Protein of unknown function (DUF625). Family of uncharacterised proteins. 44709 pfam04803: Cor1/Xlr/Xmr conserved region. Cor1 is a component of the chromosome core in the meiotic prophase chromosomes. Xlr is a lymphoid cell specific protein. Xmr is abundantly transcribed in testis in a tissue-specific and developmentally regulated manner. The protein is located in the nuclei of spermatocytes, early in the prophase of the first meiotic division, and later becomes concentrated in the XY nuclear subregion where it is in particular associated with the axes of sex chromosomes. 44710 pfam04804: Non-structural protein NSM. Family of plant infecting tospovirus NSM proteins. 44711 pfam04805: E10-like protein conserved region. Family of poxvirus proteins. 44712 pfam04806: EspF protein. The enteropathogenic Escherichia coli EspF secreted protein induces host cell apoptosis. Its proline-rich structure suggests that it may act by binding to SH3 domains or EVH1 domains of host cell signalling proteins. 44713 pfam04807: Geminivirus AC4/5 conserved region. 44714 pfam04808: Citrus tristeza virus (CTV) P23 protein. This family consists of protein P23 from the citrus tristeza virus, which is a member of the Closteroviridae. CTV viruses produce more positive than negative RNA strands, and P23 controls this asymmetrical RNA accumulation. Amino acids 42-180 are essential for function and are thought to contain RNA-binding and zinc finger domains. 44715 pfam04809: HupH hydrogenase expression protein, C-terminal conserved region. This family represents a C-terminal conserved region found in these bacterial proteins necessary for hydrogenase synthesis. Their precise function is unknown. 44716 pfam04810: Sec23/Sec24 zinc finger. 
COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is found to be a zinc-binding domain. 44717 pfam04811: Sec23/Sec24 trunk domain. COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is known as the trunk domain and has an alpha/beta vWA fold and forms the dimer interface. 44718 pfam04812: Hepatocyte nuclear factor 1 (HNF-1), beta isoform C terminus. This family consists of a region found within the alpha isoform and at the C terminus of the beta isoform of the homeobox-containing transcription factor of HNF-1. Different isoforms of HNF-1 are generated by the differential use of polyadenylation sites and by alternative splicing. The C-terminal region of HNF-1 is responsible for the activation of transcription. Mutations and polymorphisms in HNF-1 cause the type 3 form of maturity-onset diabetes of the young (MODY3). 44719 pfam04813: Hepatocyte nuclear factor 1 (HNF-1), alpha isoform C terminus. This family consists of an alternative C terminus of homeobox-containing transcription factor HNF-1, found in the HNF-1A isoform. Different isoforms of HNF-1 are generated by the differential use of polyadenylation sites and by alternative splicing. The C-terminal region of HNF-1 is responsible for the activation of transcription, and HNF-1A, which has this C-terminal extension, transactivates less well than the B and C isoforms. Mutations and polymorphisms in HNF-1 cause the type 3 form of maturity-onset diabetes of the young (MODY3). 44720 pfam04814: Hepatocyte nuclear factor 1 (HNF-1), N terminus. 
This family consists of the N terminus of homeobox-containing transcription factor HNF-1. This region contains a dimerisation sequence, and an acidic region that may be involved in transcription activation. Mutations and the common Ala/Val 98 polymorphism in HNF-1 cause the type 3 form of maturity-onset diabetes of the young (MODY3). 44721 pfam04815: Sec23/Sec24 helical domain. COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is composed of five alpha helices. 44722 pfam04816: Family of unknown function (DUF633). The proteins in this family are uncharacterised and have no known function. 44723 pfam04817: Umbravirus long distance movement (LDM) family. The long distance movement protein of Umbraviruses mediates the movement of viral RNA through the phloem of infected plants. 44724 pfam04818: Protein of unknown function, DUF618. This family represents a conserved region found in a number of uncharacterised eukaryotic proteins. 44725 pfam04819: Family of unknown function (DUF716). This family is equally distributed in both metazoa and plants. Annotation associated with some members suggests that it may be involved in response to viral attack in plants. However, no clear function has been assigned to this family. 44726 pfam04820: Tryptophan halogenase. Tryptophan halogenase catalyses the chlorination of tryptophan to form 7-chlorotryptophan. This is the first step in the biosynthesis of pyrrolnitrin, an antibiotic with broad-spectrum anti-fungal activity. Tryptophan halogenase is NADH-dependent. 44727 pfam04821: Timeless protein. The timeless gene in Drosophila melanogaster and its homologues in a number of other insects and mammals (including human) are involved in circadian rhythm control. 
This family includes a related proteins from a number of fungal species. 44728 pfam04822: Protein of unknown function, DUF622. This family includes several uncharacterised mouse proteins. 44729 pfam04823: Herpesvirus UL49 tegument protein. 44730 pfam04824: Conserved region of Rad21 / Rec8 like protein. This family represents a conserved region found in eukaryotic cohesins of the Rad21, Rec8 and Scc1 families. Members of this family mediate sister chromatid cohesion during mitosis and meiosis, as part of the cohesin complex. Cohesion is necessary for homologous recombination (including double-strand break repair) and correct chromatid segregation. These proteins may also be involved in chromosome condensation. Dissociation at the metaphase to anaphase transition causes loss of cohesion and chromatid segregation. 44731 pfam04825: N terminus of Rad21 / Rec8 like protein. This family represents a conserved N-terminal region found in eukaryotic cohesins of the Rad21, Rec8 and Scc1 families. Members of this family mediate sister chromatid cohesion during mitosis and meiosis, as part of the cohesin complex. Cohesion is necessary for homologous recombination (including double-strand break repair) and correct chromatid segregation. These proteins may also be involved in chromosome condensation. Dissociation at the metaphase to anaphase transition causes loss of cohesion and chromatid segregation. 44732 pfam04826: Protein of unknown function (DUF634). Mammalian protein of unknown function. 44733 pfam04827: Protein of unknown function (DUF635). This family of plant proteins have no known function. The alignment boundary may not be reliable. 44734 pfam04828: Protein of unknown function (DUF636). This family of proteins has no known function, but several strongly conserved cysteine residues. 44735 pfam04829: Possible hemagglutinin (DUF638). This family represents a conserved region found in a bacterial protein which may be a hemagglutinin or hemolysin. 
44736 pfam04830: Possible hemagglutinin (DUF637). This family represents a conserved region found in a bacterial protein which may be a hemagglutinin or hemolysin. 44737 pfam04831: Popeye protein conserved region. The function of Popeye proteins is not well understood. They are predominantly expressed in cardiac and skeletal muscle. This family represents a conserved region which includes three potential transmembrane domains. 44738 pfam04832: SOUL heme-binding protein. This family represents a group of putative heme-binding proteins. Our family includes archaeal and bacterial homologues. 44739 pfam04833: Phytochelatin synthetase-like conserved region. Family of plant proteins believed to be phytochelatin synthetases. This enzyme is found in certain plants and yeast and is responsible for the production of phytochelatins, small glutamic acid, cysteine and glycine-rich peptides, produced in response to cadmium stress. 44740 pfam04834: Early E3 14.5 kDa protein. The E3B 14.5 kDa was first identified in Human adenovirus type 5. It is an integral membrane protein oriented with its C terminus in the cytoplasm. It functions to down-regulate the epidermal growth factor receptor and prevent tumour necrosis factor cytolysis. It achieves this through the interaction with E3 10.4 kDa protein. 44741 pfam04835: A9 protein conserved region. Family of Chordopoxvirus A9 proteins. 44742 pfam04836: Interferon-related protein conserved region. Family of proteins thought to be involved in regulating gene activity in the proliferative and/or differentiative pathways induced by NGF. 44743 pfam04837: MbeB-like, N-term conserved region. This family represents an N-terminal conserved region of MbeB/MobB proteins. These proteins are essential for specific plasmid transfer. 44744 pfam04838: Baculoviridae late expression factor 5. 44745 pfam04839: Plastid and cyanobacterial ribosomal protein (PSRP-3 / Ycf65). 
This small acidic protein is found in 30S ribosomal subunit of cyanobacteria and plant plastids. In plants it has been named plastid-specific ribosomal protein 3 (PSRP-3), and in cyanobacteria it is named Ycf65. Plastid-specific ribosomal proteins may mediate the effects of nuclear factors on plastid translation. The acidic PSRPs are thought to contribute to protein-protein interactions in the 30S subunit, and are not thought to bind RNA. 44746 pfam04840: Vps16, C-terminal region. This protein forms part of the Class C vacuolar protein sorting (Vps) complex. Vps16 is essential for vacuolar protein sorting, which is essential for viability in plants, but not yeast. The Class C Vps complex is required for SNARE-mediated membrane fusion at the lysosome-like yeast vacuole. It is thought to play essential roles in membrane docking and fusion at the Golgi-to-endosome and endosome-to-vacuole stages of transport. The role of VPS16 in this complex is not known. 44747 pfam04841: Vps16, N-terminal region. This protein forms part of the Class C vacuolar protein sorting (Vps) complex. Vps16 is essential for vacuolar protein sorting, which is essential for viability in plants, but not yeast. The Class C Vps complex is required for SNARE-mediated membrane fusion at the lysosome-like yeast vacuole. It is thought to play essential roles in membrane docking and fusion at the Golgi-to-endosome and endosome-to-vacuole stages of transport. The role of VPS16 in this complex is not known. 44748 pfam04842: Plant protein of unknown function (DUF639). Plant protein of unknown function. 44749 pfam04843: Herpesvirus tegument protein, N-terminal conserved region. 44750 pfam04844: Protein of unknown function, DUF623. This family represents a conserved region found in a number of uncharacterised plant proteins. 44751 pfam04845: PurA ssDNA and RNA-binding protein. This family represents most of the length of the protein. 44752 pfam04846: Herpesvirus pp38 phosphoprotein. 
This protein represents a conserved region found in most herpesvirus pp38 phosphoproteins. 44753 pfam04847: Calcipressin. Calcipressin is also known as calcineurin-binding protein, since it inhibits calcineurin-mediated transcriptional modulation by binding to calcineurin's catalytic domain. 44754 pfam04848: Poxvirus A22 protein. 44755 pfam04849: HAP1 N-terminal conserved region. This family represents an N-terminal conserved region found in several huntingtin-associated protein 1 (HAP1) homologues. HAP1 binds to huntingtin in a polyglutamine repeat-length-dependent manner. However, its possible role in the pathogenesis of Huntington's disease is unclear. This family also includes a similar N-terminal conserved region from hypothetical protein products of ALS2CR3 genes found in the human juvenile amyotrophic lateral sclerosis critical region 2q33-2q34. 44756 pfam04850: Baculovirus E66 occlusion-derived virus envelope protein. 44757 pfam04851: Type III restriction enzyme, res subunit. This family represents the res subunit of type III restriction enzymes (EC:3.1.21.5). 44758 pfam04852: Protein of unknown function (DUF640). This family represents a conserved region found in plant proteins including Resistance protein-like protein. 44759 pfam04853: Plant neutral invertase. This family represents a number of plant neutral invertases (EC:3.2.1.26). 44760 pfam04854: Protein of unknown function, DUF624. This family includes several uncharacterised bacterial proteins. 44761 pfam04855: SNF5 / SMARCB1 / INI1. SNF5 is a component of the yeast SWI/SNF complex, which is an ATP-dependent nucleosome-remodelling complex that regulates the transcription of a subset of yeast genes. SNF5 is a key component of all SWI/SNF-class complexes characterized so far. This family consists of the conserved region of SNF5, including a direct repeat motif. SNF5 is essential for the assembly, promoter targeting and chromatin remodelling activity of the SWI-SNF complex. 
SNF5 is also known as SMARCB1, for SWI/SNF-related, matrix-associated, actin-dependent regulator of chromatin, subfamily b, member 1, and also INI1 for integrase interactor 1. Loss-of-function mutations in SNF5 are thought to contribute to oncogenesis in malignant rhabdoid tumours (MRTs). 44762 pfam04856: Securin sister-chromatid separation inhibitor. Securin is also known as pituitary tumour-transforming gene product. Over-expression of securin is associated with a number of tumours, and it has been proposed that this may be due to erroneous chromatid separation leading to chromosome gain or loss. 44763 pfam04857: CAF1 family ribonuclease. The major pathways of mRNA turnover in eukaryotes initiate with shortening of the polyA tail. CAF1 encodes a critical component of the major cytoplasmic deadenylase in yeast. Caf1p is required for normal mRNA deadenylation in vivo and localises to the cytoplasm. Caf1p copurifies with a Ccr4p-dependent polyA-specific exonuclease activity. Some members of this family include an inserted RNA binding domain pfam01424. This family of proteins is related to other exonucleases pfam00929 (Bateman A pers. obs.). 44764 pfam04858: TH1 protein. TH1 is a highly conserved but uncharacterised metazoan protein. No homologue has been identified in Caenorhabditis elegans. TH1 binds specifically to A-Raf kinase. 44765 pfam04859: Plant protein of unknown function (DUF641). Plant protein of unknown function. 44766 pfam04860: Phage portal protein. Bacteriophage portal proteins form a dodecamer that is located at a five-fold vertex of the viral capsid. The portal complex forms a channel through which the viral DNA is packaged into the capsid, and exits during infection. The portal protein is thought to rotate during DNA packaging. Portal proteins from different phage show little sequence homology, so this family does not represent all portal proteins. 44767 pfam04861: Circovirus VP2 protein. 
Circoviruses are small circular single stranded viruses. This family includes the VP2 protein from the chicken anaemia virus. 44768 pfam04862: Protein of unknown function, DUF642. This family represents a conserved region found in a number of uncharacterised plant proteins. 44769 pfam04863: Alliinase EGF-like domain. Allicin is a thiosulphinate that gives rise to dithiines, allyl sulphides and ajoenes, the three groups of active compounds in Allium species. Allicin is synthesised from sulfoxide cysteine derivatives by alliinase (EC:4.4.1.4), whose C-S lyase activity cleaves C(beta)-S(gamma) bonds. It is thought that this enzyme forms part of a primitive plant defence system. This family represents the N-terminal EGF-like domain. 44770 pfam04864: Allinase, C-terminal domain. Allicin is a thiosulphinate that gives rise to dithiines, allyl sulphides and ajoenes, the three groups of active compounds in Allium species. Allicin is synthesised from sulfoxide cysteine derivatives by alliinase (EC:4.4.1.4), whose C-S lyase activity cleaves C(beta)-S(gamma) bonds. It is thought that this enzyme forms part of a primitive plant defence system. This family represents the C-terminal domain, which contributes to ligand binding. 44771 pfam04865: Baseplate J-like protein. The P2 bacteriophage J protein lies at the edge of the baseplate. This family also includes a number of bacterial homologues, which are thought to have been horizontally transferred. 44772 pfam04866: Rotavirus non-structural protein 6. 44773 pfam04867: Protein of unknown function (DUF643). Protein of unknown function found in Borrelia burgdorferi, the Lyme disease spirochete. 44774 pfam04868: Retinal cGMP phosphodiesterase, gamma subunit. Retinal rod and cone cGMP phosphodiesterases function as the effector enzymes in the vertebrate visual transduction cascade. 
This family represents the inhibitory gamma subunit, which is also expressed outside retinal tissues and has been shown to interact with the G-protein-coupled receptor kinase 2 signalling system to regulate the epidermal growth factor- and thrombin-dependent stimulation of p42/p44 mitogen-activated protein kinase in human embryonic kidney 293 cells. 44775 pfam04869: Uso1 / p115 like vesicle tethering protein, head region. Also known as General vesicular transport factor, Transcytosis associated protein (TAP) and Vesicle docking protein, this myosin-shaped molecule consists of an N-terminal globular head region, a coiled-coil tail which mediates dimerisation, and a short C-terminal acidic region. p115 tethers COP1 vesicles to the Golgi by binding the coiled coil proteins giantin (on the vesicles) and GM130 (on the Golgi), via its C-terminal acidic region. It is required for intercisternal transport in the golgi stack. This family consists of part of the head region. The head region is highly conserved, but its function is unknown. It does not seem to be essential for vesicle tethering. The N-terminal part of the head region, not within this family, contains context-detected Armadillo/beta-catenin-like repeats (pfam00514).. 44776 pfam04870: Protein of unknown function, DUF644. This family represents a conserved region found in a number of uncharacterised Caenorhabditis elegans proteins. 44777 pfam04871: Uso1 / p115 like vesicle tethering protein, C terminal region. Also known as General vesicular transport factor, Transcytosis associate protein (TAP) and Vesicle docking protein, this myosin-shaped molecule consists of an N-terminal globular head region, a coiled-coil tail which mediates dimerisation, and a short C-terminal acidic region. p115 tethers COP1 vesicles to the Golgi by binding the coiled coil proteins giantin (on the vesicles) and GM130 (on the Golgi), via its C-terminal acidic region. It is required for intercisternal transport in the golgi stack. 
This family consists of the acidic C-terminus, which binds to the golgins giantin and GM130. p115 is thought to juxtapose two membranes by binding giantin with one acidic region, and GM130 with another. 44778 pfam04872: Poxvirus L5 protein family. This family includes variola (smallpox) and vaccinia virus L5 proteins. However, not all proteins in this family are called L5. L5 is thought to contain a metal-binding region. 44779 pfam04873: Ethylene insensitive 3. Ethylene insensitive 3 (EIN3) proteins are a family of plant DNA-binding proteins that regulate transcription in response to the gaseous plant hormone ethylene, and are essential for ethylene-mediated responses including the triple response, cell growth inhibition, and accelerated senescence. . 44780 pfam04874: Mak16 protein. The precise function of this eukaryotic protein family is unknown. The yeast orthologues have been implicated in cell cycle progression and biogenesis of 60S ribosomal subunits. The Schistosoma mansoni Mak16 has been shown to target protein transport to the nucleolus. 44781 pfam04875: Protein of unknown function, DUF645. This family includes several uncharacterized proteins from Vibrio cholerae. 44782 pfam04876: Tenuivirus major non-capsid protein. This protein of unknown function accumulates in large amounts in tenuivirus infected cells. It is found in all forms of the inclusion bodies that are formed after infection. 44783 pfam04877: HrpZ. HrpZ from the plant pathogen Pseudomonas syringae binds to lipid bilayers and forms a cation-conducting pore in vivo. This pore-forming activity may allow nutrient release or delivery of virulence factors during bacterial colonisation of host plants. 44784 pfam04878: Baculovirus P48 protein. 44785 pfam04879: Molybdopterin oxidoreductase Fe4S4 domain. 44786 pfam04880: NUDE protein, C-terminal conserved region. This family represents the C-terminal conserved region of the NUDE proteins. NUDE proteins are involved in nuclear migration. 
44787 pfam04881: Adenovirus GP19K. This 19 kDa glycoprotein binds the major histocompatibility (MHC) class I antigens in the endoplasmic reticulum (ER). The ER retention signal at the C-terminus of GP19K causes retention of the complex in the ER, preventing lysis of the cell by cytotoxic T lymphocytes. 44788 pfam04882: Peroxin-3. Peroxin-3 is a peroxisomal protein. It is thought to be involve in membrane vesicle assembly prior to the translocation of matrix proteins. 44789 pfam04883: Bacteriophage protein of unknown function (DUF646). This family of proteins is found in the caudovirales. It may be a tail component. 44790 pfam04884: Protein of unknown function, DUF647. 44791 pfam04885: Stigma-specific protein, Stig1. This family represents the Stig1 cysteine rich plant protein. The STIG1 gene is developmentally regulated and expressed specifically in the stigmatic secretory zone. 44792 pfam04886: PT repeat. This short repeat is composed on the tetrapeptide XPTX. This repeat is found in a variety of proteins, however it is not clear if these repeats are homologous to each other. The alignment represents nine copies of this repeat. 44793 pfam04887: Poxvirus M2 protein. This family includes M2 protein from variola virus. The function of this protein is not known. 44794 pfam04888: Secretion system effector C (SseC) like family. SseC is a secreted protein that forms a complex together with SecB and SecD on the surface of Salmonella. All these proteins are secreted by the type III secretion system. Many mucosal pathogens use type III secretion systems for the injection of effector proteins into target cells. SecB, SseC and SecD are inserted into the target cell membrane. where they form a small pore or translocon. In addition to SseC, this family includes the bacterial secreted proteins PopB, PepB, YopB and EspD which are thought to be directly involved in pore formation, and type III secretion system translocon. . 44795 pfam04889: Cwf15/Cwc15 cell cycle control protein. 
This family represents Cwf15/Cwc15 (from Schizosaccharomyces pombe and Saccharomyces cerevisiae respectively) and their homologues. The function of these proteins is unknown, but they form part of the spliceosome and are thus thought to be involved in mRNA splicing. . 44796 pfam04890: Family of unknown function (DUF648). Family of hypothetical Chlamydia proteins. This family may well comprise of two domains, as some members only match the N-terminus. . 44797 pfam04891: NifQ. NifQ is involved in early stages of the biosynthesis of the iron-molybdenum cofactor (FeMo-co), which is an integral part of the active site of dinitrogenase. The conserved C-terminal cysteine residues may be involved in metal binding. 44798 pfam04892: VanZ like family. This family contains several examples of the VanZ protein, but also contains examples of phosphotransbutyrylases. 44799 pfam04893: Yip1 domain. The Yip1 integral membrane domain contains four transmembrane alpha helices. The domain is characterised by the motifs DLYGP and GY. The Yip1 protein is a golgi protein involved in vesicular transport that interacts with GTPases. 44800 pfam04894: Archaeal protein of unknown function (DUF650). This family represents the amino terminal region of an archaeal protein of unknown function. 44801 pfam04895: Archaeal protein of unknown function (DUF651). This family represents the carboxy terminal region of an archaeal protein of unknown function. 44802 pfam04896: Ammonia monooxygenase/methane monooxygenase, subunit C. Ammonia monooxygenase plays a key role in the nitrogen cycle and degrades a wide range of hydrocarbons and halogenated hydrocarbons. This family represents the AmoC subunit. It also includes the particulate methane monooxygenase subunit PmoC from methanotrophic bacteria. 44803 pfam04897: Glutamate synthase amidotransferase domain. 
The overall topology of the N-terminal amidotransferase domain from glutamate synthase is characterized by a four layer alpha/beta/beta/alpha architecture and is similar to other Ntn-amidotransferases. The amidotransferase domain from Fd-GltS contains the typical catalytic centre of Ntn-amidotransferases, and the N-terminal Cys-1 catalyses the hydrolysis of L-glutamine generating ammonia and the first molecule of L-glutamate. 44804 pfam04898: Glutamate synthase central domain. The central domain of glutamate synthase connects the amino terminal amidotransferase domain with the FMN-binding domain and has an alpha / beta overall topology. . 44805 pfam04899: MbeD/MobD like. The MbeD and MobD proteins are plasmid encoded, and are involved in the plasmids mobilisation and transfer in the presence of conjugative plasmids. 44806 pfam04900: Protein of unknown function, DUF652. This family includes several uncharacterized eukaryotic proteins. 44807 pfam04901: Receptor activity modifying family. The calcitonin-receptor-like receptor can function as either a calcitonin-gene-related peptide or an adrenomedullin receptor. The receptors function is modified by receptor-activity-modifying protein or RAMP. RAMPs are single-transmembrane-domain proteins. 44808 pfam04902: Conserved region in Nab1. Nab1 and Nab2 are co-repressors that specifically interact with and repress transcription mediated by the three members of the NGFI-A (Egr-1, Krox24, zif/268) family of transcription factors. This C-terminal region is found only in the Nab1 subfamily. 44809 pfam04903: Poxvirus interferon gamma receptor. 44810 pfam04904: NAB conserved region 1 (NCD1). Nab1 and Nab2 are co-repressors that specifically interact with and repress transcription mediated by the three members of the NGFI-A (Egr-1, Krox24, zif/268) family of transcription factors. This region consists of the N-terminal NAB conserved region 1, which interacts with the EGR1 inhibitory domain (R1). It may also mediate multimerisation. 
44811 pfam04905: NAB conserved region 2 (NCD2). Nab1 and Nab2 are co-repressors that specifically interact with and repress transcription mediated by the three members of the NGFI-A (Egr-1, Krox24, zif/268) family of transcription factors. This family consists of NAB conserved region 2, near the C-terminus of the protein. It is necessary for transcriptional repression by the Nab proteins. It is also required for transcription activation by Nab proteins at Nab-activated promoters. 44812 pfam04906: Tweety. The tweety (tty) gene has not been characterised at the protein level. However, it is thought to form a membrane protein with five potential membrane-spanning regions. A number of potential functions have been suggested in. 44813 pfam04908: SH3-binding, glutamic acid-rich protein. 44814 pfam04909: Amidohydrolase. These proteins are amidohydrolases that are related to pfam01979. 44815 pfam04910: Protein of unknown function, DUF654. This family includes a number of poorly characterised eukaryotic proteins. . 44816 pfam04911: ATP synthase j chain. 44817 pfam04912: Dynamitin. Dynamitin is a subunit of the microtubule-dependent motor complex and in implicated in cell adhesion by binding to macrophage-enriched myristoylated alanine-rice C kinase substrate (MacMARCKS).. 44818 pfam04913: Baculovirus Y142 protein. 44819 pfam04914: DltD C-terminal region. DltD is and integral membrane protein involved in the biosynthesis of D-alanyl-lipoteichoic acid. This is important in controlling the net ionic charge in lipoteichoic acid (LTA). This family is found in bacteria of the Bacillus/Clostridium group. DltD binds Dcp and ligates it with D-alanine. DltD does not ligate acyl carrier protein (ACP) with D-alanine. It also has thioesterase activity for mischarged D-alanyl-acyl carrier protein (ACP). DltD is thought to be responsible for discriminating between Dcp involved in the D-alanylation of LTA, and ACP involved in fatty acid biosynthesis. 
This family consists of the C-terminal region of DltD. 44820 pfam04915: DltD N-terminal region. DltD is and integral membrane protein involved in the biosynthesis of D-alanyl-lipoteichoic acid. This is important in controlling the net ionic charge in lipoteichoic acid (LTA). This family is found in bacteria of the Bacillus/Clostridium group. DltD binds Dcp and ligates it with D-alanine. DltD does not ligate acyl carrier protein (ACP) with D-alanine. It also has thioesterase activity for mischarged D-alanyl-acyl carrier protein (ACP). DltD is thought to be responsible for discriminating between Dcp involved in the D-alanylation of LTA, and ACP involved in fatty acid biosynthesis. This family consists of the N-terminal region of DltD. 44821 pfam04916: Laminin A. 44822 pfam04917: Bacterial shufflon protein, N-terminal constant region. This family represents the high-similarity N-terminal 'constant region' shared by shufflon proteins. 44823 pfam04918: DltD central region. DltD is and integral membrane protein involved in the biosynthesis of D-alanyl-lipoteichoic acid. This is important in controlling the net ionic charge in lipoteichoic acid (LTA). This family is found in bacteria of the Bacillus/Clostridium group. DltD binds Dcp and ligates it with D-alanine. DltD does not ligate acyl carrier protein (ACP) with D-alanine. It also has thioesterase activity for mischarged D-alanyl-acyl carrier protein (ACP). DltD is thought to be responsible for discriminating between Dcp involved in the D-alanylation of LTA, and ACP involved in fatty acid biosynthesis. This family consists of the central region of DltD. 44824 pfam04919: Protein of unknown function, DUF655. This family includes several uncharacterized archaeal proteins. 44825 pfam04920: Family of unknown function (DUF656). A family of hypothetical proteins from Beet necrotic yellow vein virus. 44826 pfam04921: XAP5 protein. This protein is found in a wide range of eukaryotes. Its function is uncertain. 
It is a nuclear protein and is suggested to be DNA binding. 44827 pfam04922: DIE2/ALG10 family. The ALG10 protein from Saccharomyces cerevisiae encodes the alpha-1,2 glucosyltransferase of the endoplasmic reticulum. This protein has been characterised in rat as potassium channel regulator 1. 44828 pfam04923: Ninjurin. Ninjurin (nerve injury-induced protein) is involved in nerve regeneration and in the formation and function in some tissues. 44829 pfam04924: Poxvirus A6 protein. 44830 pfam04925: SHQ1 protein. S. cerevisiae SHQ1 protein is required for SnoRNAs of the box H/ACA Quantitative accumulation (unpublished).. 44831 pfam04926: Poly(A) polymerase predicted RNA binding domain. Based on its similarity structurally to the RNA recognition motif this domain is thought to be RNA binding. 44832 pfam04927: Seed maturation protein. Plant seed maturation protein. 44834 pfam04929: Herpes DNA replication accessory factor. Replicative DNA polymerases are capable of polymerising tens of thousands of nucleotides without dissociating from their DNA templates. The high processivity of these polymerases is dependent upon accessory proteins that bind to the catalytic subunit of the polymerase or to the substrate. The Epstein-Barr virus (EBV) BMRF1 protein is an essential component of the viral DNA polymerase and is absolutely required for lytic virus replication. BMRF1 is also a transactivator. This family is predicted to have a UL42 like structure. 44835 pfam04930: FUN14 family. This family of short proteins are found in eukaryotes and some archaea. Although the function of these proteins is not known they may contain transmembrane helices. 44836 pfam04931: DNA polymerase V. This family includes the fifth essential DNA polymerase in yeast EC:2.7.7.7. Pol5p is localised exclusively to the nucleolus and binds near or at the enhancer region of rRNA-encoding DNA repeating units. 44837 pfam04932: O-Antigen Polymerase. 
This group of bacterial proteins is involved in the synthesis of O-antigen, a lipopolysaccharide found in the outer membrane in gram-negative bacteria. The enzyme is coded for by the gene wzy which is part of the O-antigen gene cluster. 44838 pfam04934: MED6 mediator sub complex component. Component of RNA polymerase II holoenzyme and mediator sub complex. 44839 pfam04935: Surfeit locus protein 6. The surfeit locus protein SURF-6 is shown to be a component of the nucleolar matrix and has a strong binding capacity for nucleic acids. 44840 pfam04936: Protein of unknown function (DUF 658). Protein of unknown function found in Lactococcus lactis bacteriophages. 44841 pfam04937: Protein of unknown function (DUF 659). Transposase-like protein with no known function. 44842 pfam04938: Survival motor neuron (SMN) interacting protein 1 (SIP1). Survival motor neuron (SMN) interacting protein 1 (SIP1) interacts with SMN protein and plays a crucial role in the biogenesis of spliceosomes. There is evidence that the protein is linked to spinal muscular atrophy (SMA) and amyotrophic lateral sclerosis(ALS) in humans. 44843 pfam04939: Ribosome biogenesis regulatory protein (RRS1). This family consists of several eukaryotic ribosome biogenesis regulatory (RRS1) proteins. RRS1 is a nuclear protein that is essential for the maturation of 25 S rRNA and the 60 S ribosomal subunit assembly in Saccharomyces cerevisiae. 44844 pfam04940: Sensors of blue-light using FAD. The BLUF domain has been shown to bind FAD in the AppA protein. AppA is involved in the repression of photosynthesis genes in response to blue-light. 44845 pfam04941: Late expression factor 8 (LEF-8). Late expression factor 8 (LEF-8) is one of the primary components of RNA polymerase produced by polyhedrosis viruses. LEF-8 shows homology to the second largest subunit of prokaryotic DNA-directed RNA polymerase. 44846 pfam04942: CC domain. 
This short domain contains four conserved cysteines that probably for two disulphide bonds. The domain is named after the characteristic CC motif. 44847 pfam04943: Poxvirus F11 protein. The protein F11 is an early virus protein. 44848 pfam04944: Uncharacterized BCR (COG3801). The function of these short bacterial proteins is unknown. 44849 pfam04945: YHS domain. This short presumed domain is about 50 amino acid residues long. It often contains two cysteines that may be functionally important. This domain is found in copper transporting ATPases, some phenol hydroxylases and in a set of uncharacterised membrane proteins. This domain is named after three of the most conserved amino acids it contains. The domain may be metal binding, possibly copper ions. This domain is duplicated in some copper transporting ATPases. 44850 pfam04946: DGPF domain. This domain is about 120 residues long. Although its function is unknown it is found fused to a sigma-70 factor family domain in one member. Suggesting that this domain plays a role in transcription initiation (Bateman A per. obs.). This domain is named after the most conserved motif in the alignment. 44851 pfam04947: Poxvirus Late Transcription Factor VLTF3 like. Members of this family are approximately 26 KDa, and are involved in trans-activator of late transcription. 44852 pfam04948: Poxvirus A51 protein. 44853 pfam04949: Family of unknown function (DUF662). Family of hypothetical eukaryotic proteins. 44854 pfam04950: Protein of unknown function (DUF663). This family contains several uncharacterised eukaryotic proteins. 44855 pfam04951: D-aminopeptidase. Bacillus subtilis DppA is a binuclear zinc-dependent, D-specific aminopeptidase. The structure reveals that DppA is a new example of a 'self-compartmentalising protease', a family of proteolytic complexes. Proteasomes are the most extensively studied representatives of this family. 
The DppA enzyme is composed of identical 30 kDa subunits organised in a decamer with 52 point-group symmetry. A 20 A wide channel runs through the complex, giving access to a central chamber holding the active sites. The structure shows DppA to be a prototype of a new family of metalloaminopeptidases characterised by the SXDXEG key sequence. The only known substrates are D-ala-D-ala and D-ala-gly-gly. 44856 pfam04952: Succinylglutamate desuccinylase / Aspartoacylase family. This family includes Succinylglutamate desuccinylase EC:3.1.-.- that catalyses the fifth and last step in arginine catabolism by the arginine succinyltransferase pathway. The family also include aspartoacylase EC:3.5.1.15 which cleaves acylaspartate into a fatty acid and aspartate. Mutations in human Aspartoacylase lead to Canavan disease. This family is probably structurally related to pfam00246 (Bateman A pers. obs.).. 44857 pfam04953: Citrate lyase, gamma subunit. In citrate-utilising prokaryotes, citrate lyase EC:4.1.3.6 cleaves intracellular citrate into acetate and oxaloacetate, and is organised as a functional complex consisting of alpha, beta, and gamma subunits. The gamma subunit serves as an acyl carrier protein (ACP), and has a 2'-(5''-phosphoribosyl)-3 '-dephospho-CoA prosthetic group. The citrate lyase is active only if this prosthetic group is acetylated; this acetylation is catalysed by an acetate:SH-citrate lyase ligase. The alpha subunit substitutes citryl for the acetyl group to form citryl-S-ACP. The beta subunit completes the reaction by cleaving the citryl to yield oxaloacetate and (regenerated) acetyl-S-ACP. This family represents the gamma subunit. 44858 pfam04954: Siderophore-interacting protein. 44859 pfam04955: HupE / UreJ protein. This family of proteins are hydrogenase / urease accessory proteins. The alignment contains many conserved histidines that are likely to be involved in nickel binding. 44860 pfam04956: Conjugal transfer protein TrbC. 
Conjugal transfer protein TrbC has been identified as a subunit of the pilus precursor in bacteria. The protein undergoes three processing steps before gaining its mature cyclic structure. 44861 pfam04957: Ribosome modulation factor. This protein associates with 70S ribosomes and converts them to a dimeric form (100S ribosomes) which appears during the transition from the exponential growth phase to the stationary phase of Escherichia coli cells. 44862 pfam04958: Arginine N-succinyltransferase beta subunit. Arginine N-succinyltransferase EC:2.3.1.109 catalyses the transfer of succinyl-CoA to arginine to produce succinylarginine. This is the first step in arginine catabolism by the arginine succinyltransferase pathway. 44863 pfam04959: Arsenite-resistance protein 2. Arsenite is a carcinogenic compound which can act as a co-mutagen by inhibiting DNA repair. Arsenite-resistance protein 2 is thought to play a role in arsenite resistance. 44864 pfam04960: Glutaminase. This family of enzymes deaminates glutamine to glutamate EC:3.5.1.2. 44865 pfam04961: Formiminotransferase-cyclodeaminase. Members of this family are thought to be formiminotransferase-cyclodeaminase enzymes EC:4.3.1.4. This domain is found in the C-terminus of the bifunctional animal members of the family. 44866 pfam04962: 5-keto 4-deoxyuronate isomerase. This enzyme EC:5.3.1.17 is involved in pectin degradation. 44867 pfam04963: Sigma-54 factor, core binding domain. This domain makes a direct interaction with the core RNA polymerase, to form an enhancer dependent holoenzyme. The centre of this domain contains a very weak similarity to a helix-turn-helix motif (excluded from the alignment), which may represent the other DNA binding domain. 44868 pfam04964: Flp/Fap pilin component. 44869 pfam04965: GPW / gp25 family. This protein may be a structural component of the outer wedge of the baseplate that has acidic lysozyme activity as suggested by sequence annotation. 
44870 pfam04966: Carbohydrate-selective porin, OprB family. 44871 pfam04967: HTH DNA binding domain. 44872 pfam04968: CHORD. CHORD represents a Zn binding domain. Silencing of the C. elegans CHORD-containing gene results in semisterility and embryo lethality, suggesting an essential function of the wild-type gene in nematode development. 44873 pfam04969: CS domain. The role of the CS domain is unclear. The CS and CHORD (pfam04968) domains are fused into a single polypeptide chain in metazoans but are found in separate proteins in plants, which is thought to be indicative of an interaction between CS and CHORD. 44874 pfam04970: NC domain. Members of this family are characterised by containing a well conserved NCEHF motif. The role of this domain is unclear. 44875 pfam04971: Lysis protein S. The lysis S protein is a cytotoxic protein forming holes in membranes causing cell lysis. The action of Lysis S is independent of the proportion of acidic phospholipids in the membrane. 44876 pfam04972: Putative phospholipid-binding domain. This domain is found in a family of osmotic shock protection proteins. It is also found in some Secretins and a group of potential haemolysins. Its likely function is attachment to phospholipid membranes. 44877 pfam04973: Nicotinamide mononucleotide transporter. Members of this family are integral membrane proteins that are involved in transport of nicotinamide mononucleotide. 44878 pfam04974: Archaeal flagellar protein F. 44879 pfam04975: Archaeal flagellar protein G. This family appears to be distantly related to pfam01917 and pfam04974 which are also components of the archaeal flagellum. 44880 pfam04976: DMSO reductase anchor subunit (DmsC). The terminal electron transfer enzyme Me2SO reductase of Escherichia coli is a heterotrimeric enzyme composed of a membrane extrinsic catalytic dimer (DmsAB) and a membrane intrinsic polytopic anchor subunit (DmsC). 44881 pfam04977: Septum formation initiator. DivIC from B. 
subtilis is necessary for both vegetative and sporulation septum formation. These proteins are mainly composed of an amino terminal coiled-coil. 44882 pfam04978: Protein of unknown function (DUF664). This family is commonly found in Streptomyces coelicolor and is of unknown function. These proteins contain several conserved histidines at their N-terminus that may form a metal binding site. 44883 pfam04979: Protein phosphatase inhibitor 2 (IPP-2). Protein phosphatase inhibitor 2 (IPP-2) is a phosphoprotein conserved among all eukaryotes, and it appears in both the nucleus and cytoplasm of tissue culture cells. 44884 pfam04981: NMD3 family. The NMD3 protein is involved in nonsense mediated mRNA decay. This amino terminal region contains four conserved CXXC motifs that could be metal binding. NMD3 is also involved in export of the 60S ribosomal subunit, which is mediated by the adapter protein Nmd3p in a Crm1p-dependent pathway. 44885 pfam04982: HPP family. These proteins are integral membrane proteins with four transmembrane spanning helices. The most conserved region of the alignment is a motif HPP. The function of these proteins is uncertain but they may be transporters. 44886 pfam04983: RNA polymerase Rpb1, domain 3. RNA polymerases catalyse the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 3, represents the pore domain. The 3' end of RNA is positioned close to this domain. The pore delimited by this domain is thought to act as a channel through which nucleotides enter the active site and/or where the 3' end of the RNA may be extruded during back-tracking. 44887 pfam04984: Phage tail sheath protein. This family includes a variety of phage tail sheath proteins. 44888 pfam04985: Phage tail tube protein FII. 
The major structural components of the contractile tail of bacteriophage P2 are proteins FI and FII, which are believed to be the tail sheath and tube proteins, respectively. 44889 pfam04986: Putative transposase. Transposases are needed for efficient transposition of the insertion sequence or transposon DNA. This family includes transposases IS1294 and IS801. 44890 pfam04987: Phosphatidylinositolglycan class N (PIG-N). Phosphatidylinositolglycan class N (PIG-N) is a mammalian homologue of the yeast protein MCD4P and is expressed in the endoplasmic reticulum. PIG-N is essential for glycosylphosphatidylinositol anchor synthesis. Glycosylphosphatidylinositol (GPI)-anchored proteins are cell surface-localised proteins that serve many important cellular functions. 44891 pfam04988: A-kinase anchoring protein 95 (AKAP95). A-kinase (or PKA)-anchoring protein AKAP95 is implicated in mitotic chromosome condensation by acting as a targeting molecule for the condensin complex. The protein contains two zinc fingers which are thought to mediate the binding of AKAP95 to DNA. 44892 pfam04989: Cephalosporin hydroxylase. Members of this family are about 220 amino acids long. The CmcI protein is presumed to represent the cephalosporin-7--hydroxylase. However this has not been experimentally verified. 44893 pfam04990: RNA polymerase Rpb1, domain 7. RNA polymerases catalyse the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 7, represents a mobile module of the RNA polymerase. Domain 7 forms a substantial interaction with the lobe domain of Rpb2 (pfam04561). 44894 pfam04991: LICD Protein Family. The LICD family of proteins show high sequence similarity and are involved in phosphorylcholine metabolism. 
There is evidence to show that LicD2 mutants have a reduced ability to take up choline, have decreased ability to adhere to host cells and are less virulent. 44895 pfam04992: RNA polymerase Rpb1, domain 6. RNA polymerases catalyse the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 6, represents a mobile module of the RNA polymerase. Domain 6 forms part of the shelf module. This family appears to be specific to the largest subunit of RNA polymerase II. 44896 pfam04993: TfoX N-terminal domain. TfoX may play a key role in the development of genetic competence by regulating the expression of late competence-specific genes. This family corresponds to the N-terminal presumed domain of TfoX. The domain is found as an isolated domain in some proteins suggesting this is an autonomous domain. 44897 pfam04994: TfoX C-terminal domain. TfoX may play a key role in the development of genetic competence by regulating the expression of late competence-specific genes. This family corresponds to the C-terminal presumed domain of TfoX. The domain is found associated with pfam00383 in some members. It is also found as an isolated domain in some proteins suggesting this is an autonomous domain. 44898 pfam04995: Heme exporter protein D (CcmD). The CcmD protein is part of a C-type cytochrome biogenesis operon. The exact function of this protein is uncertain. It has been proposed that CcmC, CcmD and CcmE interact directly with each other, establishing a cytoplasm to periplasm haem delivery pathway for cytochrome c maturation. This protein is found fused to CcmE in some members. These proteins contain a predicted transmembrane helix. 44899 pfam04996: Succinylarginine dihydrolase. This enzyme transforms N(2)-succinylglutamate into succinate and glutamate. 
This is the fifth and last step in arginine catabolism by the arginine succinyltransferase pathway. 44900 pfam04997: RNA polymerase Rpb1, domain 1. RNA polymerases catalyse the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 1, represents the clamp domain, which is a mobile domain involved in positioning the DNA, maintenance of the transcription bubble and positioning of the nascent RNA strand. 44901 pfam04998: RNA polymerase Rpb1, domain 5. RNA polymerases catalyse the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 5, represents the discontinuous cleft domain that is required to form the central cleft or channel where the DNA is bound. 44902 pfam04999: Cell division protein FtsL. In Escherichia coli, nine gene products are known to be essential for assembly of the division septum. One of these, FtsL, is a bitopic membrane protein whose precise function is not understood. It has been proposed that FtsL interacts with the DivIC protein (pfam04977); however, this interaction may be indirect. 44903 pfam05000: RNA polymerase Rpb1, domain 4. RNA polymerases catalyse the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 4, represents the funnel domain. The funnel contains the binding site for some elongation factors. 44904 pfam05001: RNA polymerase Rpb1 C-terminal repeat. The repetitive C-terminal domain (CTD) of Rpb1 (RNA polymerase Pol II) plays a critical role in the regulation of gene expression. The activity of the CTD is dependent on its state of phosphorylation. 44905 pfam05002: SGS domain. 
This domain was thought to be unique to the SGT1-like proteins, but is also found in calcyclin binding proteins. 44906 pfam05003: Protein of unknown function (DUF668). Uncharacterised plant protein. 44907 pfam05004: Interferon-related developmental regulator (IFRD). Interferon-related developmental regulator (IFRD1) is the human homologue of the rat early response protein PC4 and its murine homologue TIS7. The exact function of IFRD1 is unknown but it has been shown that PC4 is necessary to muscle differentiation and that it might have a role in signal transduction. This family also contains IFRD2 and its murine equivalent SKMc15 which are highly expressed soon after gastrulation and in the hepatic primordium, suggesting an involvement in early hematopoiesis. 44908 pfam05005: Janus/Ocnus family (Ocnus). This family is comprised of the Ocnus, Janus-A and Janus-B proteins. These proteins have been found to be testes specific in Drosophila melanogaster. 44909 pfam05006: Protein of unknown function (DUF666). This family contains several uncharacterized viral proteins. 44910 pfam05007: Mannosyltransferase (PIG-M). PIG-M has a DXD motif. The DXD motif is found in many glycosyltransferases that utilise nucleotide sugars. It is thought that the motif is involved in the binding of a manganese ion that is required for association of the enzymes with nucleotide sugar substrates. 44911 pfam05008: Vesicle transport v-SNARE protein. V-SNARE proteins are required for protein traffic between eukaryotic organelles. The v-SNAREs on transport vesicles interact with t-SNAREs on target membranes in order to facilitate this. 44912 pfam05009: Epstein-Barr virus nuclear antigen 3 (EBNA-3). This family contains EBNA-3A, -3B, and -3C which are latent infection nuclear proteins important for Epstein-Barr virus (EBV)-induced B-cell immortalisation and the immune response to EBV infection. 44913 pfam05010: Transforming acidic coiled-coil-containing protein (TACC). 
This family contains the proteins TACC 1, 2 and 3, which are found concentrated in the centrosomes of eukaryotes and may play a conserved role in organising centrosomal microtubules. The human TACC proteins have been linked to cancer and TACC2 has been identified as a possible tumour suppressor (AZU-1). 44914 pfam05011: Lariat debranching enzyme, C-terminal domain. This presumed domain is found at the C-terminus of lariat debranching enzyme. This domain is always found in association with pfam00149. 44915 pfam05012: Prophage maintenance system killer protein. P1 lysogens of Escherichia coli carry the prophage as a stable low copy number plasmid. The frequency with which viable cells cured of prophage are produced is about 10(-5) per cell per generation. A significant part of this remarkable stability can be attributed to a plasmid-encoded mechanism that causes death of cells that have lost P1. In other words, the lysogenic cells appear to be addicted to the presence of the prophage. The plasmid withdrawal response depends on a gene named doc (death on curing) that is represented by this family. 44916 pfam05013: N-formylglutamate amidohydrolase. Formylglutamate amidohydrolase (FGase) catalyses the terminal reaction in the five-step pathway for histidine utilisation in Pseudomonas putida. By this action, N-formyl-L-glutamate (FG) is hydrolysed to produce L-glutamate plus formate. 44917 pfam05014: Nucleoside 2-deoxyribosyltransferase. Nucleoside 2-deoxyribosyltransferase EC:2.4.2.6 catalyses the cleavage of the glycosidic bonds of 2'-deoxyribonucleosides. 44918 pfam05015: Plasmid maintenance system killer protein. Several plasmids with proteic killer gene systems have been reported. All of them encode a stable toxin and an unstable antidote. Upon loss of the plasmid, the less stable inhibitor is inactivated more rapidly than the toxin, allowing the toxin to be activated. 
The activation of those systems results in cell filamentation and cessation of viable cell production. It has been verified that both the stable killer and the unstable inhibitor of the systems are short polypeptides. This family corresponds to the toxin. 44919 pfam05016: Plasmid stabilisation system protein. Members of this family are involved in plasmid stabilisation. The exact molecular function of this protein is not known. 44920 pfam05017: TMP repeat. This short repeat consists of the motif WXXh where X can be any residue and h is a hydrophobic residue. The repeat is named TMP after its occurrence in the tape measure protein (TMP). Tape measure protein is a component of phage tail and probably forms a beta-helix. Truncated forms of TMP lead to shortened tail fibres. This repeat is also found in non-phage proteins where it may play a structural role. 44921 pfam05018: Protein of unknown function (DUF667). This family of proteins are highly conserved in eukaryotes. Some proteins in the family are annotated as transcription factors. However, there is currently no support for this in the literature. 44922 pfam05019: Coenzyme Q (ubiquinone) biosynthesis protein Coq4. Coq4p was shown to peripherally associate with the matrix face of the mitochondrial inner membrane. The putative mitochondrial-targeting sequence present at the amino-terminus of the polypeptide efficiently imported it to mitochondria. The function of Coq4p is unknown, although its presence is required to maintain a steady-state level of Coq7p, another component of the Q biosynthetic pathway. 44923 pfam05020: NPL4 family, putative zinc binding region. The HRD4 gene was identical to NPL4, a gene previously implicated in nuclear transport. Using a diverse set of substrates and direct ubiquitination assays, analysis revealed that HRD4/NPL4 is required for a poorly characterised step in ER-associated degradation after ubiquitination of target proteins but before their recognition by the 26S proteasome. 
This region of the protein possibly contains two zinc binding motifs (Bateman A pers. obs.). Npl4p physically associates with Cdc48p via Ufd1p to form a Cdc48p-Ufd1p-Npl4p complex. The Cdc48-Ufd1-Npl4 complex functions in the recognition of several polyubiquitin-tagged proteins and facilitates their presentation to the 26S proteasome for processive degradation or even more specific processing. 44924 pfam05021: NPL4 family. The HRD4 gene was identical to NPL4, a gene previously implicated in nuclear transport. Using a diverse set of substrates and direct ubiquitination assays, analysis revealed that HRD4/NPL4 is required for a poorly characterised step in ER-associated degradation after ubiquitination of target proteins but before their recognition by the 26S proteasome. Npl4p physically associates with Cdc48p via Ufd1p to form a Cdc48p-Ufd1p-Npl4p complex. The Cdc48-Ufd1-Npl4 complex functions in the recognition of several polyubiquitin-tagged proteins and facilitates their presentation to the 26S proteasome for processive degradation or even more specific processing. 44925 pfam05022: SRP40, C-terminal domain. This presumed domain is found at the C-terminus of the S. cerevisiae SRP40 protein and its homologues. SRP40/nopp40 is a chaperone involved in nucleocytoplasmic transport. SRP40 is also a suppressor of mutant AC40 subunit of RNA polymerase I and III. 44926 pfam05023: Phytochelatin synthase. Phytochelatin synthase is the enzyme responsible for the synthesis of heavy-metal-binding peptides (phytochelatins) from glutathione and related thiols. 44927 pfam05024: N-acetylglucosaminyl transferase component (Gpi1). Glycosylphosphatidylinositol (GPI) represents an important anchoring molecule for cell surface proteins. The first step in its synthesis is the transfer of N-acetylglucosamine (GlcNAc) from UDP-N-acetylglucosamine to phosphatidylinositol (PI). 
This chemically simple step is genetically complex because three or four genes are required in both yeast (GPI1, GPI2 and GPI3) and mammals (GPI1, PIG A, PIG H and PIG C), respectively. 44928 pfam05025: RbsD / FucU transport protein family. The Escherichia coli high-affinity ribose-transport system consists of six proteins encoded by the rbs operon (rbsD, rbsA, rbsC, rbsB, rbsK and rbsR). Of the six components, RbsD is the only one whose function is unknown although it is thought that it somehow plays a critical role in PtsG-mediated ribose transport. This family also includes FucU a protein from the fucose biosynthesis operon that is presumably also involved in fucose transport by similarity to RbsD. 44929 pfam05026: Dcp2, box A domain. This presumed domain is always found to the amino terminal side of pfam00293. This domain appears to be specific to mRNA decapping protein 2 and its close homologues. This region has been termed Box A. 44930 pfam05027: TBP (TATA-binding protein) -interacting protein 120 (TIP120). TIP120A is thought to be a unique global transcription factor that can interact with TBP and can stimulate all classes of eukaryotic transcription. TIP120B is specifically expressed in the skeletal muscle and heart, it is speculated that this gene is required for muscle cells. 44931 pfam05028: Poly (ADP-ribose) glycohydrolase (PARG). Poly(ADP-ribose) glycohydrolase (PARG), is a ubiquitously expressed exo- and endoglycohydrolase which mediates oxidative and excitotoxic neuronal death. 44932 pfam05029: Timeless protein C terminal region. The timeless (tim) gene is essential for circadian function in Drosophila. Putative homologues of Drosophila tim have been identified in both mice and humans (mTim and hTIM, respectively). Mammalian TIM is not the true orthologue of Drosophila TIM, but is the likely orthologue of a fly gene, timeout (also called tim-2). 
mTim has been shown to be essential for embryonic development, but does not have substantiated circadian function. Some family members contain a SANT domain in this region. 44933 pfam05030: SSXT protein (N-terminal region). The SSXT or SS18 protein is involved in synovial sarcoma in humans. A SYT-SSX fusion gene resulting from the chromosomal translocation t(X;18) (p11;q11) is characteristic of synovial sarcomas. This translocation fuses the SSXT (SYT) gene from chromosome 18 to either of two homologous genes at Xp11, SSX1 or SSX2. 44934 pfam05031: Iron Transport-associated domain. This domain is involved in the transport of iron, possibly as a siderophore. 44935 pfam05032: Spo12 family. This family of proteins includes Spo12 from S. cerevisiae. The Spo12 protein plays a regulatory role in two of the most fundamental processes of biology, mitosis and meiosis, and yet its biochemical function remains elusive. Spo12 is a nuclear protein. Spo12 is a component of the FEAR (Cdc fourteen early anaphase release) regulatory network, that promotes Cdc14 release from the nucleolus during early anaphase. The FEAR network is comprised of the polo kinase Cdc5, the separase Esp1, the kinetochore-associated protein Slk19, and Spo12. 44936 pfam05033: Pre-SET motif. This protein motif is a zinc binding motif. It contains 9 conserved cysteines that coordinate three zinc ions. It is thought that this region plays a structural role in stabilising SET domains. 44937 pfam05034: Methylaspartate ammonia-lyase. Methylaspartate ammonia-lyase EC:4.3.1.2 catalyses the second step of fermentation of glutamate. It is a homodimer. This protein may contain a TIM barrel fold similar to the pfam01188 family (Bateman A pers. obs.). 44938 pfam05035: 2-keto-3-deoxy-galactonokinase. 2-keto-3-deoxy-galactonokinase EC:2.7.1.58 catalyses the second step in D-galactonate degradation. 44939 pfam05036: Sporulation related repeat. 
This 35 residue repeat is found in proteins involved in sporulation and cell division such as FtsN, DedD, and CwlM. This repeat might be involved in binding peptidoglycan (Bateman A pers. obs.). FtsN is an essential cell division protein with a simple bitopic topology: a short N-terminal cytoplasmic segment fused to a large carboxy periplasmic domain through a single transmembrane domain. These repeats lie at the periplasmic C-terminus. FtsN localises to the septum ring complex. CwlM is a cell wall hydrolase, so this repeat may help localise the protein to the cell wall. 44940 pfam05037: Protein of unknown function (DUF669). Family of uncharacterised phage proteins. 44941 pfam05038: Cytochrome b558 alpha-subunit. Cytochrome b-245 light chain (p22-phox) is one of the key electron transfer elements of the NADPH oxidase in phagocytes. 44942 pfam05039: Agouti protein. The agouti protein regulates pigmentation in the mouse hair follicle producing a black hair with a subapical yellow band. A highly homologous protein, agouti signal protein (ASIP), is present in humans and is expressed at highest levels in adipose tissue where it may play a role in energy homeostasis and possibly human pigmentation. 44943 pfam05040: Heparan sulfate 2-O-sulfotransferase (HS2ST). Heparan sulfate (HS) is a co-receptor for a number of growth factors, morphogens, and adhesion proteins. HS biosynthetic modifications may determine the strength and outcome of HS-ligand interactions. Mice that lack HS2ST undergo developmental failure only after midgestation, the most dramatic effect being the complete failure of kidney development. This family is related to pfam03567. 44944 pfam05041: Pecanex protein (C-terminus). This family consists of the C-terminal region of the pecanex protein homologues. The pecanex protein is encoded by a maternal-effect neurogenic gene found in Drosophila. 44945 pfam05042: Caleosin related protein. This family contains plant proteins related to caleosin. 
Caleosins contain calcium-binding domains and have an oleosin-like association with lipid bodies. Caleosins are present at relatively low levels and are mainly bound to microsomal membrane fractions at the early stages of seed development. As the seeds mature, overall levels of caleosins increased dramatically and they were associated almost exclusively with storage lipid bodies. This family is probably related to EF hands pfam00036. 44946 pfam05043: M protein trans-acting positive regulator (MGA). Mga is a DNA-binding protein that activates the expression of several important virulence genes in group A streptococcus in response to changing environmental conditions. The family also contains VirR like proteins which match only at the C-terminus of the alignment. 44947 pfam05044: Homeobox prospero-like protein (PROX1). The homeobox gene Prox1 is expressed in a subpopulation of endothelial cells that, after budding from veins, gives rise to the mammalian lymphatic system. Prox1 has been found to be an early specific marker for the developing liver and pancreas in the mammalian foregut endoderm. This family contains an atypical homeobox domain. 44948 pfam05045: Rhamnan synthesis protein F. This family consists of a group of proteins which are related to the Streptococcus rhamnose-glucose polysaccharide assembly protein (RgpF). Rhamnan backbones are found in several O polysaccharides of phytopathogenic bacteria and are regarded as pathogenic factors. 44949 pfam05046: Mitochondrial large subunit ribosomal protein (Img2). This family of proteins have been identified as part of the mitochondrial large ribosomal subunit in yeast. 44950 pfam05047: Mitochondrial ribosomal protein L51 / S25 / CI-B8 domain. The proteins in this family are located in the mitochondrion. The family includes ribosomal protein L51, and S25. This family also includes mitochondrial NADH-ubiquinone oxidoreductase B8 subunit (CI-B8) EC:1.6.5.3. 
It is not known whether all members of this family form part of the NADH-ubiquinone oxidoreductase and whether they are also all ribosomal proteins. 44951 pfam05048: Periplasmic copper-binding protein (NosD). NosD is a periplasmic protein which is thought to insert copper into the exported reductase apoenzyme (NosZ). 44952 pfam05049: Interferon-inducible GTPase (IIGP). Interferon-inducible GTPase (IIGP) is thought to play a role in intracellular defence. IIGP is predominantly associated with the Golgi apparatus and also localises to the endoplasmic reticulum and exerts a distinct role in IFN-induced intracellular membrane trafficking or processing. 44953 pfam05050: Protein of unknown function (DUF672). This family includes several proteins of unknown function and seems to be specific to C. elegans. 44954 pfam05051: Cytochrome C oxidase copper chaperone (COX17). Cox17 is essential for the assembly of functional cytochrome c oxidase (CCO) and for delivery of copper ions to the mitochondrion for insertion into the enzyme in yeast. 44955 pfam05052: MerE protein. The prokaryotic MerE (or URF-1) protein is part of the mercury resistance operon. The protein is thought not to have any direct role in conferring mercury resistance to the organism but may be a mercury resistance transposon. 44956 pfam05053: Menin. MEN1, the gene responsible for multiple endocrine neoplasia type 1, is a tumour suppressor gene that encodes a protein called Menin which may be an atypical GTPase stimulated by nm23. 44957 pfam05054: Protein of unknown function (DUF673). Family of uncharacterized viral proteins. 44958 pfam05055: Protein of unknown function (DUF677). This family consists of AT14A like proteins from Arabidopsis thaliana. At14a has a small domain that has sequence similarities to integrins from fungi, insects and humans. Transcripts of At14a are found in all Arabidopsis tissues, and the protein localises partly to the plasma membrane. 44959 pfam05056: Protein of unknown function (DUF674). 
This family is found in Arabidopsis thaliana and contains several uncharacterized proteins. 44960 pfam05057: Putative serine esterase (DUF676). This family of proteins are probably serine esterase type enzymes. 44961 pfam05058: ActA Protein. The ActA family is found in Listeria and is associated with motility. ActA protein acts as a scaffold to assemble and activate host cell actin cytoskeletal factors at the bacterial surface, resulting in directional actin polymerisation and propulsion of the bacterium through the cytoplasm of the host cell. 44962 pfam05059: Orbivirus VP4 core protein. Orbiviruses are double-stranded RNA viruses of which the bluetongue virus is a member. The core of bluetongue virus (BTV) is a multienzyme complex composed of two major proteins (VP7 and VP3) and three minor proteins (VP1, VP4 and VP6) in addition to the viral genome. VP4 has been shown to perform all RNA capping activities and has both methyltransferase type 1 and type 2 activities associated with it. 44963 pfam05060: N-acetylglucosaminyltransferase II (MGAT2). UDP-N-acetyl-D-glucosamine:alpha-6-D-mannoside beta-1,2-N-acetylglucosaminyltransferase II (EC 2.4.1.143) (GnT II/MGAT2) is a Golgi resident enzyme that catalyses an essential step in the biosynthetic pathway leading from high mannose to complex N-linked oligosaccharides. Mutations in the MGAT2 gene lead to congenital disorder of glycosylation (CDG IIa). CDG IIa patients have an increased bleeding tendency, unrelated to coagulation factors. 44964 pfam05061: Poxvirus A11 Protein. Family of conserved Chordopoxvirinae A11 family proteins. Conserved region spans entire protein in the majority of family members. 44965 pfam05062: RICH domain. This presumed domain is about 85 residues in length and very rich in charged residues, hence the name RICH (Rich In CHarged residues). It is found in secreted proteins such as PspC, SpsA and IgA FC receptor from Streptococcus agalactiae. 
This domain could be involved in bacterial adherence or cell wall binding. 44966 pfam05063: MT-A70. MT-A70 is the S-adenosylmethionine-binding subunit of human mRNA:m6A methyl-transferase (MTase), an enzyme that sequence-specifically methylates adenines in pre-mRNAs. 44967 pfam05064: Nsp1-like C-terminal region. This family probably forms a coiled-coil. This important region of Nsp1 is involved in binding Nup82. 44968 pfam05065: Phage capsid family. Family of bacteriophage hypothetical proteins and capsid proteins. 44969 pfam05066: DNA-directed RNA polymerase delta subunit. The delta protein is a dispensable subunit of Bacillus subtilis RNA polymerase (RNAP) that has major effects on the biochemical properties of the purified enzyme. In the presence of delta, RNAP displays an increased specificity of transcription, a decreased affinity for nucleic acids, and an increased efficiency of RNA synthesis because of enhanced recycling. The delta protein contains two distinct regions, an N-terminal domain and a glutamate and aspartate residue-rich carboxyl-terminal region. 44970 pfam05067: Manganese containing catalase. Catalases are important antioxidant metalloenzymes that catalyse disproportionation of hydrogen peroxide, forming dioxygen and water. Two families of catalases are known, one having a heme cofactor, and this family that is a structurally distinct family containing non-heme manganese. 44971 pfam05068: Mannitol repressor. The mannitol operon of Escherichia coli, encoding the mannitol-specific enzyme II of the phosphotransferase system (MtlA) and mannitol phosphate dehydrogenase (MtlD), contains an additional downstream open reading frame which encodes the mannitol repressor (MtlR). 44972 pfam05069: Phage virion morphogenesis family. Protein S of phage P2 is thought to be involved in tail completion and stable head joining. 44973 pfam05071: NADH:ubiquinone oxidoreductase 17.2 kD subunit. 
This family contains the 17.2 kD subunit of complex I and its homologues. The family also contains a second related eukaryotic protein of unknown function. 44974 pfam05072: Herpesvirus UL43 protein. UL43 genes are expressed with true-late (gamma2) kinetics, and their products have been identified as a virion tegument component. 44975 pfam05073: Baculovirus P24 capsid protein. Baculovirus P24 is associated with nucleocapsids of budded and polyhedra-derived virions. 44976 pfam05074: Beta-tubulin cofactor D. Beta-tubulin cofactor D is essential for the folding of tubulin molecules. It also plays a role (along with co-factors C and E) in the assembly of the alpha/beta-tubulin heterodimer and can interact with native tubulin, stimulating it to hydrolyse GTP and thus acting together as a beta-tubulin GTPase activating protein (GAP). 44977 pfam05075: Protein of unknown function (DUF684). This family contains several uncharacterised proteins from Caenorhabditis elegans. 44978 pfam05076: Suppressor of fused protein (SUFU). SUFU, encoding the human orthologue of Drosophila suppressor of fused, appears to have a conserved role in the repression of Hedgehog signaling. SUFU exerts its repressor role by physically interacting with GLI proteins in both the cytoplasm and the nucleus. SUFU has been found to be a tumour-suppressor gene that predisposes individuals to medulloblastoma by modulating the SHH signaling pathway. 44979 pfam05077: Protein of unknown function (DUF678). This family contains several poxvirus proteins of unknown function. 44980 pfam05078: Protein of unknown function (DUF679). This family contains several uncharacterized plant proteins. 44981 pfam05079: Protein of unknown function (DUF680). This family contains several uncharacterized proteins which seem to be found exclusively in Rhizobium loti. 44982 pfam05080: Protein of unknown function (DUF681). This family contains several uncharacterized beak and feather disease virus proteins.
44983 pfam05081: Protein of unknown function (DUF682). This family consists of several uncharacterised baculovirus proteins. 44984 pfam05082: Protein of unknown function (DUF683). This family contains several uncharacterized bacterial proteins. These proteins are found in nitrogen fixation operons so are likely to play some role in this process. 44985 pfam05083: LST-1 protein. B144/LST1 is a gene encoded in the human major histocompatibility complex that produces multiple forms of alternatively spliced mRNA and encodes peptides fewer than 100 amino acids in length. B144/LST1 is strongly expressed in dendritic cells. Transfection of B144/LST1 into a variety of cells induces morphologic changes including the production of long, thin filopodia. 44986 pfam05084: Granule antigen protein (GRA6). This family contains the granule antigen protein GRA6 which is found in the parasitic protozoa Toxoplasma gondii and Neospora caninum. GRA6 protein plays an important role in the antigenicity and pathogenicity of these organisms. 44987 pfam05085: Protein of unknown function (DUF685). This family consists of several uncharacterized proteins from Borrelia burgdorferi (Lyme disease spirochete). There is some evidence to suggest that the proteins may be outer surface proteins. 44988 pfam05086: Dictyostelium (Slime Mold) REP protein. This family consists of REP proteins from Dictyostelium (Slime molds). REP protein is likely involved in transcription regulation and control of DNA replication, specifically amplification of plasmid at low copy numbers. The formation of homomultimers may be required for their regulatory activity. 44989 pfam05087: Rotavirus VP2 protein. Rotavirus particles consist of three concentric proteinaceous capsid layers. The innermost capsid (core) is made of VP2. The genomic RNA and the two minor proteins VP1 and VP3 are encapsidated within this layer. The N-terminus of rotavirus VP2 is necessary for the encapsidation of VP1 and VP3.
44990 pfam05088: Bacterial NAD-glutamate dehydrogenase. This family consists of several bacterial proteins which are closely related to NAD-glutamate dehydrogenase found in Streptomyces clavuligerus. Glutamate dehydrogenases (GDHs) are a broadly distributed group of enzymes that catalyse the reversible oxidative deamination of glutamate to ketoglutarate and ammonia. 44991 pfam05089: Alpha-N-acetylglucosaminidase (NAGLU). Alpha-N-acetylglucosaminidase is a lysosomal enzyme required for the stepwise degradation of heparan sulfate. Mutations in the alpha-N-acetylglucosaminidase (NAGLU) gene can lead to Mucopolysaccharidosis type IIIB (MPS IIIB; or Sanfilippo syndrome type B), characterised by neurological dysfunction but relatively mild somatic manifestations. 44992 pfam05090: Vitamin K-dependent gamma-carboxylase. Using reduced vitamin K, oxygen, and carbon dioxide, gamma-glutamyl carboxylase post-translationally modifies certain glutamates by adding carbon dioxide to the gamma position of those amino acids. In vertebrates, the modification of glutamate residues of target proteins is facilitated by an interaction between a propeptide present on target proteins and the gamma-glutamyl carboxylase. 44993 pfam05091: Eukaryotic translation initiation factor 3 subunit 7 (eIF-3). This family is made up of eukaryotic translation initiation factor 3 subunit 7 (eIF-3 zeta/eIF3 p66/eIF3d). Eukaryotic initiation factor 3 is a multi-subunit complex that is required for binding of mRNA to 40 S ribosomal subunits, stabilisation of ternary complex binding to 40 S subunits, and dissociation of 40 and 60 S subunits. These functions and the complex nature of eIF3 suggest multiple interactions with many components of the translational machinery. The gene coding for the protein has been implicated in cancer in mammals. 44994 pfam05092: Protein of unknown function (DUF686). This family consists of several uncharacterized Baculovirus proteins.
44995 pfam05093: Protein of unknown function (DUF689). This family contains several uncharacterised eukaryotic proteins. 44996 pfam05094: Late expression factor 9 (LEF-9). Late expression factor 9 (LEF-9) is one of the primary components of RNA polymerase produced by baculoviruses. LEF-9 is homologous to the largest beta-subunit of prokaryotic DNA-directed RNA polymerase. 44997 pfam05095: Protein of unknown function (DUF687). This family contains several uncharacterised Chlamydia proteins. 44998 pfam05096: Glutamine cyclotransferase. This family of enzymes (EC:2.3.2.5) catalyses the cyclization of free L-glutamine and N-terminal glutaminyl residues in proteins to pyroglutamate (5-oxoproline) and pyroglutamyl residues respectively. This family includes plant and bacterial enzymes and seems unrelated to the mammalian enzymes. 44999 pfam05097: Protein of unknown function (DUF688). This family contains several uncharacterized proteins found in Arabidopsis thaliana. 45000 pfam05098: Late expression factor 4 (LEF-4). Late expression factor 4 (LEF-4) is one of the Baculovirus late expression factor proteins. LEF-4 carries out all the enzymatic functions related to mRNA capping. 45001 pfam05099: Tellurite resistance protein TerB. This family contains the TerB tellurite resistance proteins from a number of bacteria. 45002 pfam05100: Phage minor tail protein L. 45003 pfam05101: Type IV secretory pathway, VirB3-like protein. This family includes the Type IV secretory pathway VirB3 protein, which is found associated with bacterial inner and outer membranes. 45004 pfam05102: holin, BlyA family. BlyA is a small holin, encoded by a prophage, that is found in Borrelia circular plasmids. BlyA contains two largely hydrophobic helices and a highly charged C-terminus and is membrane associated. 45005 pfam05103: DivIVA protein.
The Bacillus subtilis divIVA1 mutation causes misplacement of the septum during cell division, resulting in the formation of small, circular, anucleate minicells. Inactivation of divIVA produces a minicell phenotype, whereas overproduction of DivIVA results in a filamentation phenotype. These proteins appear to contain coiled-coils. 45006 pfam05104: Ribosome receptor lysine/proline rich region. This highly conserved region is found towards the C-terminus of the transmembrane domain. The function is unclear. 45007 pfam05105: Holin family. Phage holins and lytic enzymes are both necessary for bacterial lysis and virus dissemination. This family also includes TcdE/UtxA involved in toxin secretion in Clostridium difficile. 45008 pfam05106: Phage holin family (Lysis protein S). This family represents one of a large number of mutually dissimilar families of phage holins. Holins act against the host cell membrane to allow lytic enzymes of the phage to reach the bacterial cell wall. This family includes the product of the S gene of phage lambda. 45009 pfam05107: Family of unknown function (DUF694). Family of hypothetical bacterial proteins. 45010 pfam05108: Protein of unknown function (DUF690). This family contains several uncharacterised bacterial membrane proteins. 45011 pfam05109: Herpes virus major outer envelope glycoprotein (BLLF1). This family consists of the BLLF1 viral late glycoprotein, also termed gp350/220. It is the most abundantly expressed glycoprotein in the viral envelope of the Herpesviruses and is the major antigen responsible for stimulating the production of neutralising antibodies in vivo. 45012 pfam05110: AF-4 proto-oncoprotein. This family consists of AF4 (Proto-oncogene AF4) and FMR2 (Fragile X E mental retardation syndrome) nuclear proteins. These proteins have been linked to human diseases such as acute lymphoblastic leukaemia and mental retardation.
The family also contains a Drosophila AF4 protein homologue Lilliputian which contains an AT-hook domain. Lilliputian represents a novel pair-rule gene that acts in cytoskeleton regulation, segmentation and morphogenesis in Drosophila. 45013 pfam05111: Ameloblastin precursor (Amelin). This family consists of several mammalian Ameloblastin precursor (Amelin) proteins. Matrix proteins of tooth enamel consist mainly of amelogenin but also of non-amelogenin proteins, which, although their volumetric percentage is low, have an important role in enamel mineralisation. One of the non-amelogenin proteins is ameloblastin, also known as amelin and sheathlin. Ameloblastin (AMBN) is one of the enamel sheath proteins which is thought to have a role in determining the prismatic structure of growing enamel crystals. 45014 pfam05112: Baculovirus P47 protein. This family consists of several Baculovirus P47 proteins. P47 is one of the primary components of the Baculovirus-encoded RNA polymerase, which initiates transcription from late and very late promoters. 45015 pfam05113: Protein of unknown function (DUF693). This family consists of several uncharacterized proteins from Borrelia burgdorferi (Lyme disease spirochete). 45016 pfam05114: Protein of unknown function (DUF692). This family consists of several uncharacterised bacterial proteins. 45017 pfam05115: Cytochrome B6-F complex subunit VI (PetL). This family consists of several Cytochrome B6-F complex subunit VI (PetL) proteins found in several plant species. PetL is one of the small subunits which make up the cytochrome b(6)f complex. PetL is strictly required neither for the accumulation nor for the function of cytochrome b6f; in its absence, however, the complex becomes unstable in vivo in aging cells and labile in vitro. It has been suggested that the N-terminus of the protein is likely to lie in the thylakoid lumen. 45018 pfam05116: Sucrose-6F-phosphate phosphohydrolase.
This family consists of Sucrose-6F-phosphate phosphohydrolase proteins found in plants and cyanobacteria. Sucrose-6(F)-phosphate phosphohydrolase catalyses the final step in the pathway of sucrose biosynthesis. 45019 pfam05117: Family of unknown function (DUF695). Family of uncharacterised bacterial proteins. 45020 pfam05118: Aspartyl/Asparaginyl beta-hydroxylase. Iron (II)/2-oxoglutarate (2-OG)-dependent oxygenases catalyse oxidative reactions in a range of metabolic processes. Proline 3-hydroxylase hydroxylates proline at position 3, the first of a 2-OG oxygenase catalysing oxidation of a free alpha-amino acid. The structure of proline 3-hydroxylase contains the conserved motifs present in other 2-OG oxygenases including a jelly roll strand core and residues binding iron and 2-oxoglutarate, consistent with divergent evolution within the extended family. This family represents the arginine, asparagine and proline hydroxylases. The aspartyl/asparaginyl beta-hydroxylase (EC:1.14.11.16) specifically hydroxylates one aspartic or asparagine residue in certain epidermal growth factor-like domains of a number of proteins. 45021 pfam05119: Phage terminase, small subunit. 45022 pfam05120: Gas vesicle protein G. These proteins are involved in the formation of gas vesicles. 45023 pfam05121: Gas vesicle protein K. These proteins are involved in the formation of gas vesicles. 45024 pfam05122: Mobile element transfer protein. These proteins are involved in transferring a group of integrating conjugative DNA elements, such as pSAM2 from Streptomyces ambofaciens. Their precise role is not known. 45025 pfam05123: S-layer like family, N-terminal region. 45026 pfam05124: S-layer like family, C-terminal region. 45027 pfam05125: Phage major capsid protein, P2 family. 45028 pfam05126: Phage minor capsid protein.
The protein is suggested to be the head-tail connector, or portal protein, on the basis of its position in the phage gene order, its presence in mature phage, its size, and its conservation across a number of complete genomes of tailed phage that lack other candidate portal proteins. 45029 pfam05127: Putative ATPase (DUF699). This putative domain is about 350 amino acid residues long and appears to have a P-loop motif, suggesting this is an ATPase. This domain is often associated with pfam00583. This domain is found in isolation in some members. 45030 pfam05128: Family of unknown function (DUF697). Family of bacterial hypothetical proteins. 45031 pfam05129: Putative zinc binding domain (DUF701). This family of short proteins contains a putative zinc binding domain with four conserved cysteines. 45032 pfam05130: FlgN protein. This family includes the FlgN protein and export chaperone involved in flagellar synthesis. 45033 pfam05131: Pep3/Vps18/deep orange family. This region is found in a number of proteins identified as involved in Golgi function and vacuolar sorting. The molecular function of this region is unknown. The members of this family contain a C-terminal ring finger domain. 45034 pfam05132: RNA polymerase III RPC4. Specific subunit for Pol III, the tRNA specific polymerase. 45035 pfam05133: Phage portal protein, SPP1 Gp6-like. This protein forms a hole, or portal, that enables DNA passage during packaging and ejection. It also forms the junction between the phage head (capsid) and the tail proteins. During SPP1 morphogenesis, Gp6 participates in the procapsid assembly reaction. 45036 pfam05134: General secretion pathway protein L (GspL). This family consists of general secretion pathway protein L sequences from several gram-negative bacteria. The general secretion pathway of gram-negative bacteria is responsible for extracellular secretion of a number of different proteins, including proteases and toxins.
This pathway supports secretion of proteins across the cell envelope in two distinct steps, in which the second step, involving translocation through the outer membrane, is assisted by at least 13 different gene products. GspL is predicted to contain a large cytoplasmic domain and has been shown to interact with the autophosphorylating cytoplasmic membrane protein GspE. It is thought that the tri-molecular complex of GspL, GspE and GspM might be involved in regulating the opening and closing of the secretion pore and/or transducing energy to the site of outer membrane translocation. 45037 pfam05135: Phage QLRG family, putative DNA packaging. The members of this family contain a conserved QLRG motif. This family of phage proteins is largely uncharacterised, although annotation of some members suggests a role in DNA packaging. 45038 pfam05136: Phage portal protein, lambda family. This protein forms a hole, or portal, that enables DNA passage during packaging and ejection. It also forms the junction between the phage capsid and the tail proteins. 45039 pfam05137: Fimbrial assembly protein (PilN). 45040 pfam05138: Phenylacetic acid catabolic protein. This family includes proteins such as PaaA and PaaC that are part of a catabolic pathway of phenylacetic acid. These proteins may form part of a dioxygenase complex. 45041 pfam05139: Erythromycin esterase. This family includes erythromycin esterase enzymes that confer resistance to the erythromycin antibiotic. 45042 pfam05140: ResB-like family. This family includes both ResB and cytochrome c biogenesis proteins. Mutations in ResB indicate that they are essential for growth. ResB is predicted to be a transmembrane protein. 45043 pfam05141: Pyoverdine/dityrosine biosynthesis protein. This family includes DIT1 that is involved in synthesising dityrosine. Dityrosine is a sporulation-specific component of the yeast ascospore wall that is essential for the resistance of the spores to adverse environmental conditions.
One member is involved in the biosynthesis of pyoverdine. 45044 pfam05142: Domain of unknown function (DUF702). Family of uncharacterised plant proteins. 45045 pfam05143: Uncharacterized BCR (DUF703). Proteins in this family have no known function. They contain many conserved aspartates that might suggest this is a metalloprotein. 45046 pfam05144: Phage replication protein CRI. The phage replication protein CRI, also known as Gene II, is essential for DNA replication. 45047 pfam05145: Putative ammonia monooxygenase. Members of this family are annotated by COGS as putative ammonia monooxygenase enzymes. 45048 pfam05146: Aha1 domain. The function of this presumed domain is unknown. It is found in a range of bacterial as well as eukaryotic proteins. This domain is found in Aha1, which is found to interact with Hsp90. It is not certain if this interaction is mediated by this domain. 45049 pfam05147: Lanthionine synthetase C-like protein. This family contains the lanthionine synthetase C-like proteins 1 and 2 which are related to the bacterial lanthionine synthetase components C (LanC). LANCL1 (P40 seven-transmembrane-domain protein) and LANCL2 (testes-specific adriamycin sensitivity protein) are thought to be peptide-modifying enzyme components in eukaryotic cells. Both proteins are produced in large quantities in the brain and testes and may have a role in the immune surveillance of these organs. 45050 pfam05148: Hypothetical methyltransferase. This family consists of several uncharacterised eukaryotic proteins which are related to methyltransferases pfam01209. 45051 pfam05149: Paraflagellar rod protein. This family consists of several eukaryotic paraflagellar rod component proteins. The eukaryotic flagellum represents one of the most complex macromolecular structures found in any organism and contains more than 250 proteins.
In addition to its locomotive role, the flagellum is probably involved in nutrient uptake since receptors for host low-density lipoproteins are localised on the flagellar membrane as well as on the flagellar pocket membrane. 45052 pfam05150: Legionella pneumophila major outer membrane protein precursor. This family consists of major outer membrane protein precursors from Legionella pneumophila. 45053 pfam05151: Photosystem II reaction centre M protein (PsbM). This family consists of several Photosystem II reaction centre M proteins (PsbM) from plants and cyanobacteria. During the photosynthetic light reactions in the thylakoid membranes of cyanobacteria, algae, and plants, photosystem II (PSII), a multi-subunit membrane protein complex, catalyses oxidation of water to molecular oxygen and reduction of plastoquinone. 45054 pfam05152: Protein of unknown function (DUF705). This family contains several uncharacterized Baculovirus proteins. 45055 pfam05153: Family of unknown function (DUF706). Family of eukaryotic proteins of uncharacterised function. Some members have a described putative function, but a common theme is not evident. 45056 pfam05154: TM2 domain. This family is composed of a pair of transmembrane alpha helices connected by a short linker. The function of this domain is unknown; however, it occurs in a wide range of protein contexts. 45057 pfam05155: Phage X family. This family is the product of Gene X. The function of this protein is unknown. 45058 pfam05156: RNA polymerase Rpa43 subunit. Subunit specific to RNA Pol I, which comprises 14 different subunits. Rpa43 is at least one of the subunits contacted by the transcription factor TIF-IA. 45059 pfam05157: GSPII_E N-terminal domain. This domain is found at the N-terminus of members of the general secretory system II protein E. Proteins in this subfamily are typically involved in Type IV pilus biogenesis, though some are involved in other processes; for instance aggregation in Myxococcus xanthus.
45060 pfam05158: RNA polymerase Rpc34 subunit. Subunit specific to RNA Pol III, the tRNA specific polymerase. The C34 subunit of yeast RNA Pol III is part of a subcomplex of three subunits which have no counterpart in the other two nuclear RNA polymerases. This subunit interacts with TFIIIB70 and therefore participates in Pol III recruitment. 45061 pfam05159: Capsule polysaccharide biosynthesis protein. This family includes export proteins involved in capsule polysaccharide biosynthesis, such as KpsS and LipB. 45062 pfam05160: DSS1/SEM1 family. This family contains SEM1 and DSS1 which are short acidic proteins. 45063 pfam05161: MOFRL family. The MOFRL (multi-organism fragment with rich Leucine) family exists in bacteria and eukaryotes. The function of this domain is not clear, although it exists in some putative enzymes such as reductases and kinases. 45064 pfam05162: Ribosomal protein L41. 45065 pfam05163: DinB family. DNA damage-inducible (din) genes in Bacillus subtilis are coordinately regulated and together compose a global regulatory network that has been termed the SOS-like or SOB regulon. This family includes DinB from B. subtilis. 45066 pfam05164: Family of unknown function (DUF710). Family of eubacterial hypothetical proteins. 45067 pfam05165: GGDN family. This protein family of unknown function is named GGDN after its most conserved motif. The proteins are 200-270 amino acids in length. 45068 pfam05166: Family of unknown function (DUF709). Family of eubacterial hypothetical proteins. 45069 pfam05167: Uncharacterized ACR (DUF711). The proteins in this family are functionally uncharacterised. The proteins are around 450 amino acids long. 45070 pfam05168: Protein of unknown function (DUF712). This family of proteins is functionally uncharacterised. 45071 pfam05169: Selenoprotein W-related family. Selenoprotein W contains selenium as selenocysteine in the primary protein structure and levels of this selenoprotein are affected by selenium.
The precise role of this family is unclear. 45072 pfam05170: AsmA family. The product of the AsmA gene is involved in the assembly of outer membrane proteins in Escherichia coli. AsmA mutations were isolated as extragenic suppressors of an OmpF assembly mutant. AsmA may have a role in LPS biogenesis. 45073 pfam05171: Haemin-degrading family. The Yersinia enterocolitica O:8 periplasmic binding-protein-dependent transport system consists of four proteins: the periplasmic haemin-binding protein HemT, the haemin permease protein HemU, the ATP-binding hydrophilic protein HemV and the haemin-degrading protein HemS (this family). 45074 pfam05172: MPPN (rrm-like) domain. The MPPN (Mitotic PhosphoProtein N' end) family is uncharacterised; however, it probably plays a role in the cell cycle because the family includes mitotic phosphoproteins. This family also includes a suppressor of thermosensitive mutations in the DNA polymerase delta gene (Pol III). The conserved central region appears to be distantly related to the pfam00076 domain, suggesting an RNA binding function for this protein (Bateman A. pers obs). 45075 pfam05173: Dihydrodipicolinate reductase, C-terminus. Dihydrodipicolinate reductase (DapB) reduces the alpha,beta-unsaturated cyclic imine, dihydro-dipicolinate. This reaction is the second committed step in the biosynthesis of L-lysine and its precursor meso-diaminopimelate, which are critical for both protein and cell wall biosynthesis. The C-terminal domain of DapB has been proposed to be the substrate-binding domain. 45076 pfam05174: Cysteine-rich D. radiodurans N terminus. This domain is found individually and at the N terminus of a few multi-domain proteins. 45077 pfam05175: Methyltransferase small domain. This domain is found in ribosomal RNA small subunit methyltransferase C as well as other methyltransferases. 45078 pfam05176: ATP10 protein. ATP 10 is essential for the assembly of a functional mitochondrial ATPase complex.
45079 pfam05177: RCSD region. Proteins containing this region include C. elegans UNC-89. This region is found repeated in UNC-89 and shows conservation in prolines, lysines and glutamic acids. Proteins with RCSD are involved in muscle M-line assembly, but the function of the RCSD region is not clear. 45080 pfam05178: Krr1 family. The yeast member of this family is found to be required for 40S ribosome biogenesis in the nucleolus. 45081 pfam05179: RNA pol II accessory factor, Cdc73 family. 45082 pfam05180: DNL zinc finger. This short presumed domain probably binds to zinc. It is found in a number of eukaryotic proteins of unknown function. The domain is named after a short C-terminal motif of D(N/H)L. 45083 pfam05181: XPA protein C-terminus. 45084 pfam05182: Fip1 motif. This short motif is about 40 amino acids in length. It is found in the Fip1 protein, a component of a yeast pre-mRNA polyadenylation factor that directly interacts with poly(A) polymerase. This region of Fip1 is needed for the interaction with the Yth1 subunit of the complex and for specific polyadenylation of the cleaved mRNA precursor. 45085 pfam05183: RNA dependent RNA polymerase. This family consists of eukaryotic RNA dependent RNA polymerases. These proteins are involved in post transcriptional gene silencing where they are thought to amplify dsRNA templates. 45086 pfam05184: Saposin-like type B, region 1. 45087 pfam05185: Skb1 methyltransferase. The human homologue of yeast Skb1 (Shk1 kinase-binding protein 1) is a protein methyltransferase. These proteins seem to play a role in Jak signalling. 45088 pfam05186: Dpy-30 motif. This motif is found in a wide variety of domain contexts. It is found in the Dpy-30 proteins, hence the motif's name. It is about 40 residues long and is probably formed of two alpha-helices. It may be a dimerisation motif analogous to pfam02197 (Bateman A pers obs). 45089 pfam05187: Electron transfer flavoprotein-ubiquinone oxidoreductase.
Electron-transfer flavoprotein-ubiquinone oxidoreductase (ETF-QO) in the inner mitochondrial membrane accepts electrons from electron-transfer flavoprotein which is located in the mitochondrial matrix and reduces ubiquinone in the mitochondrial membrane. The two redox centres in the protein, FAD and a [4Fe4S] cluster, are present in a 64-kDa monomer. 45090 pfam05188: MutS domain II. This domain is found in proteins of the MutS family (DNA mismatch repair proteins) and is found associated with pfam00488, pfam01624, pfam05192 and pfam05190. The MutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair; other members of the family include the eukaryotic MSH 1, 2, 3, 4, 5 and 6 proteins. These have various roles in DNA repair and recombination. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein. This domain corresponds to domain II in Thermus aquaticus MutS, and resembles RNAse-H-like domains (see pfam00075). 45091 pfam05189: RNA 3'-terminal phosphate cyclase (RTC), insert domain. RNA cyclases are a family of RNA-modifying enzymes that are conserved in all cellular organisms. They catalyse the ATP-dependent conversion of the 3'-phosphate to the 2',3'-cyclic phosphodiester at the end of RNA, in a reaction involving formation of the covalent AMP-cyclase intermediate. The structure of RTC demonstrates that RTCs comprise two domains. The larger domain contains an insert domain of approximately 100 amino acids. 45092 pfam05190: MutS family domain IV. This domain is found in proteins of the MutS family (DNA mismatch repair proteins) and is found associated with pfam01624, pfam05188, pfam05192 and pfam00488. The MutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair; other members of the family include the eukaryotic MSH 1, 2, 3, 4, 5 and 6 proteins.
These have various roles in DNA repair and recombination. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein. The aligned region corresponds in part with globular domain IV, which is involved in DNA binding, in Thermus aquaticus MutS as characterised in. 45093 pfam05191: Adenylate kinase, active site lid. Comparisons of adenylate kinases have revealed a particular divergence in the active site lid. In some organisms, particularly the Gram-positive bacteria, residues in the lid domain have been mutated to cysteines and these cysteine residues are responsible for the binding of a zinc ion. The bound zinc ion in the lid domain is clearly structurally homologous to zinc-finger domains. However, it is unclear whether the adenylate kinase lid is a novel zinc-finger DNA/RNA binding domain, or whether the lid-bound zinc serves a purely structural function. 45094 pfam05192: MutS domain III. This domain is found in proteins of the MutS family (DNA mismatch repair proteins) and is found associated with pfam00488, pfam05188, pfam01624 and pfam05190. The MutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair; other members of the family include the eukaryotic MSH 1, 2, 3, 4, 5 and 6 proteins. These have various roles in DNA repair and recombination. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein. The aligned region corresponds with domain III, which is central to the structure of Thermus aquaticus MutS as characterised in. 45095 pfam05193: Peptidase M16 inactive domain. Peptidase M16 consists of two structurally related domains. One is the active peptidase, whereas the other is inactive. The two domains hold the substrate like a clamp. 45096 pfam05194: UreE urease accessory protein, C-terminal domain. UreE is a urease accessory protein. Urease pfam00449 hydrolyses urea into ammonia and carbamic acid.
The C-terminal region of members of this family contains a His-rich nickel binding site. 45097 pfam05195: Aminopeptidase P, N-terminal domain. This domain is structurally very similar to the creatinase N-terminal domain (pfam01321). However, little or no sequence similarity exists between the two families. 45098 pfam05196: PTN/MK heparin-binding protein family, N-terminal domain. 45099 pfam05197: Domain of unknown function. This family of proteins has no known function. This region may contain transmembrane alpha helices. The domain is found in a variety of metazoan species. 45100 pfam05198: Translation initiation factor IF-3, N-terminal domain. 45101 pfam05199: GMC oxidoreductase. This domain is found associated with pfam00732. 45102 pfam05200: Glutamyl-tRNAGlu reductase, NAD(P) binding domain. This family uses NADPH as a cofactor. This family is related to other NADPH binding domains. 45103 pfam05201: Glutamyl-tRNAGlu reductase, N-terminal domain. 45104 pfam05202: Recombinase Flp protein. 45105 pfam05203: Hom_end-associated Hint. Homing endonucleases are encoded by mobile DNA elements that are found inserted within host genes in all domains of life. The crystal structure of the homing nuclease PI-Sce revealed two domains: an endonucleolytic centre resembling the C-terminal domain of Drosophila melanogaster Hedgehog protein, and a second domain containing the protein-splicing active site. This domain corresponds to the latter protein-splicing domain. 45106 pfam05204: Homing endonuclease. Homing endonucleases are encoded by mobile DNA elements that are found inserted within host genes in all domains of life. The crystal structure of the homing nuclease PI-Sce revealed two domains: an endonucleolytic centre resembling the C-terminal domain of Drosophila melanogaster Hedgehog protein, and a second domain containing the protein-splicing active site. This domain corresponds to the C-terminal domain, which has structural similarity to PF:PF01079.
45107 pfam05205: Chromatin remodelling complex component cps15. 45108 pfam05206: Protein of unknown function (DUF715). This family of eukaryotic proteins has no characterised function. The alignment contains some conserved cysteines and histidines that might form a zinc binding site (Bateman A pers obs). 45109 pfam05207: CSL zinc finger. This probable zinc binding motif contains four cysteines that probably chelate zinc. This domain is often found associated with a pfam00226 domain. The molecular function of these proteins is uncertain. This domain is named after the conserved motif of the final cysteine. 45110 pfam05208: ALG3 protein. The formation of N-glycosidic linkages of glycoproteins involves the ordered assembly of the common Glc3Man9GlcNAc2 core-oligosaccharide on the lipid carrier dolichyl pyrophosphate. Whereas early mannosylation steps occur on the cytoplasmic side of the endoplasmic reticulum with GDP-Man as donor, the final reactions from Man5GlcNAc2-PP-Dol to Man9GlcNAc2-PP-Dol on the lumenal side use Dol-P-Man. The ALG3 gene encodes the Dol-P-Man:Man5GlcNAc2-PP-Dol mannosyltransferase. 45111 pfam05209: Septum formation inhibitor MinC, N-terminal domain. In Escherichia coli, FtsZ assembles into a Z ring at midcell while assembly at polar sites is prevented by the min system. MinC, a component of this system, is an inhibitor of FtsZ assembly that is positioned within the cell by interaction with MinDE. MinC is an oligomer, probably a dimer. The C-terminal half of MinC is the most conserved and interacts with MinD. The N-terminal half is thought to interact with FtsZ. 45112 pfam05210: Sprouty protein (Spry). This family consists of eukaryotic Sprouty protein homologues. Sprouty proteins have been revealed as inhibitors of the Ras/mitogen-activated protein kinase (MAPK) cascade, a pathway crucial for developmental processes initiated by activation of various receptor tyrosine kinases. 
The sprouty gene has been found to be expressed in the brain, cochlea, nasal organs, teeth, salivary gland, lungs, digestive tract, kidneys and limb buds in mice. 45113 pfam05211: Neuraminyllactose-binding hemagglutinin precursor (NLBH). This family is comprised of several flagellar sheath adhesin proteins also called neuraminyllactose-binding hemagglutinin precursor (NLBH) or N-acetylneuraminyllactose-binding fibrillar hemagglutinin receptor-binding subunits. NLBH is found exclusively in Helicobacter, which are gut-colonising bacteria that bind to sialic acid-rich macromolecules present on the gastric epithelium. 45114 pfam05212: Protein of unknown function (DUF707). This family consists of several uncharacterised proteins from Arabidopsis thaliana. 45115 pfam05213: Coronavirus NS2A protein. This family contains a number of coronavirus non-structural proteins of unknown function. The family also includes a polymerase protein fragment from Berne virus and does not seem to be related to the pfam04753 Coronavirus NS2 family. 45116 pfam05214: Baculovirus P33. This family consists of a series of Baculovirus P33 protein homologues of unknown function. 45117 pfam05215: Spiralin. This family consists of Spiralin proteins found in spiroplasma bacteria. Spiroplasmas are helically shaped pathogenic bacteria related to the mycoplasmas. The surface of spiroplasma bacteria is crowded with the membrane-anchored lipoprotein spiralin, whose structure and function are unknown, although its cellular function is thought to be structural and mechanical rather than catalytic. 45118 pfam05216: UNC-50 family. This family contains several eukaryotic transmembrane proteins which are related to the C. elegans protein UNC-50. 45119 pfam05217: STOP protein. Neurons contain abundant subsets of highly stable microtubules that resist de-polymerising conditions such as exposure to the cold. 
Stable microtubules are thought to be essential for neuronal development, maintenance, and function. STOP is a major factor responsible for the intriguing stability properties of neuronal microtubules and is important for synaptic plasticity. Additionally, knowledge of STOP's function and properties may help in the neuroleptic treatment of illnesses such as schizophrenia, currently thought to result from synaptic defects. 45120 pfam05218: Protein of unknown function (DUF713). This family contains several proteins of unknown function from C. elegans. 45121 pfam05219: DREV methyltransferase. This family contains DREV protein homologues from several eukaryotes. The function of this protein is unknown. However, these proteins appear to be related to other methyltransferases (Bateman A pers obs). 45122 pfam05220: MgpC protein precursor. This family contains several Mycoplasma MgpC-like proteins. 45124 pfam05222: Alanine dehydrogenase/PNT, N-terminal domain. This family now also contains the lysine 2-oxoglutarate reductases. 45125 pfam05223: NTF2-like N-terminal transpeptidase domain. The structure of this domain from MecA is known, and is found to be similar to that found in NTF2 pfam02136. This domain seems unlikely to have an enzymatic function, and its role remains unknown. 45126 pfam05224: NDT80 / PhoG like DNA-binding family. This family includes the DNA-binding region of NDT80 as well as PhoG and its homologues. The family contains VIB-1, which is thought to be a regulator of conidiation in Neurospora crassa and shares a region of similarity to PHOG, a possible phosphate nonrepressible acid phosphatase in Aspergillus nidulans. It has been found that vib-1 is not the structural gene for nonrepressible acid phosphatase, but rather may regulate nonrepressible acid phosphatase activity. 45127 pfam05225: helix-turn-helix, Psq domain. This DNA-binding motif is found in four copies in the pipsqueak protein of Drosophila melanogaster. 
In pipsqueak this domain binds to the GAGA sequence. 45128 pfam05226: CHASE2 domain. CHASE2 is an extracellular sensory domain, which is present in various classes of transmembrane receptors that are parts of signal transduction pathways in bacteria. Specifically, CHASE2 domains are found in histidine kinases, adenylate cyclases, serine/threonine kinases and predicted diguanylate cyclases/phosphodiesterases. Environmental factors that are recognised by CHASE2 domains are not known at this time. 45129 pfam05227: CHASE3 domain. CHASE3 is an extracellular sensory domain, which is present in various classes of transmembrane receptors that are parts of signal transduction pathways in bacteria. Specifically, CHASE3 domains are found in histidine kinases, adenylate cyclases, methyl-accepting chemotaxis proteins and predicted diguanylate cyclases/phosphodiesterases. Environmental factors that are recognised by CHASE3 domains are not known at this time. 45130 pfam05228: CHASE4 domain. CHASE4 is an extracellular sensory domain, which is present in various classes of transmembrane receptors that are parts of signal transduction pathways in prokaryotes. Specifically, CHASE4 domains are found in histidine kinases in Archaea and in predicted diguanylate cyclases/phosphodiesterases in Bacteria. Environmental factors that are recognized by CHASE4 domains are not known at this time. 45131 pfam05229: Spore Coat Protein U domain. This domain is found in a bacterial family of spore coat proteins. 45132 pfam05230: MASE2 domain. 45133 pfam05231: MASE1 domain. 45134 pfam05232: Bacterial Transmembrane Pair family. This family represents a conserved pair of transmembrane helices. It appears to be found as two tandem repeats in a family of hypothetical proteins. 45135 pfam05233: PHB accumulation regulatory domain. The proteins this domain is found in are typically involved in regulating polymer accumulation in bacteria, particularly poly-beta-hydroxybutyrate (PHB). 
45136 pfam05234: UAF complex subunit Rrn10. The protein Rrn10 has been identified as a component of the Upstream Activating Factor (UAF), an RNA polymerase I (pol I) specific transcription stimulatory factor. 45137 pfam05235: CHAD domain. The CHAD domain is an alpha-helical domain functionally associated with the pfam01928 domains. It has conserved histidines that may chelate metals. 45138 pfam05236: Transcription initiation factor TFIID component TAF4 family. This region of similarity is found in Transcription initiation factor TFIID component TAF4. 45139 pfam05237: MoeZ/MoeB domain. This putative domain is found in the MoeZ protein and the MoeB protein. The domain has two CXXC motifs that are only partly conserved. 45140 pfam05238: CHL4 family. This family includes CHL4 that is involved in chromosome segregation. 45141 pfam05239: PRC-barrel domain. The PRC-barrel is an all beta barrel domain found in photosystem reaction centre subunit H of the purple bacteria and RNA metabolism proteins of the RimM group. PRC-barrels are approximately 80 residues long, and found widely represented in bacteria, archaea and plants. This domain is also present at the carboxyl terminus of the pan-bacterial protein RimM, which is involved in ribosomal maturation and processing of 16S rRNA. A family of small proteins conserved in all known euryarchaea are composed entirely of a single stand-alone copy of the domain. 45142 pfam05240: APOBEC-like C-terminal domain. This domain is found at the C-termini of the Apolipoprotein B mRNA editing enzyme. 45143 pfam05241: Emopamil binding protein. Emopamil binding protein (EBP) is a gene that encodes a nonglycosylated type I integral membrane protein of endoplasmic reticulum and shows high level expression in epithelial tissues. The EBP protein has emopamil binding domains, including the sterol acceptor site and the catalytic centre, which show Delta8-Delta7 sterol isomerase activity. 
Human sterol isomerase, a homologue of mouse EBP, is suggested not only to play a role in cholesterol biosynthesis, but also to affect lipoprotein internalisation. In humans, mutations of EBP are known to cause the genetic disorder of X-linked dominant chondrodysplasia punctata (CDPX2). This syndrome of humans is lethal in most males, and affected females display asymmetric hyperkeratotic skin and skeletal abnormalities. 45144 pfam05242: Glycosylation-dependent cell adhesion molecule 1 (GlyCAM-1). This family consists of the lactophorin precursors proteose peptone component 3 (PP3) and glycosylation-dependent cell adhesion molecule 1 (GlyCAM-1). GlyCAM-1 functions as a ligand for L-selectin, a saccharide-binding protein on the surface of circulating leukocytes, and mediates the trafficking of blood-borne lymphocytes into secondary lymph nodes. In this context, sulphation of the carbohydrates of GlyCAM-1 has been shown to be a critical structural requirement for recognition by L-selectin. GlyCAM-1 is also expressed in pregnant and lactating mammary glands of mouse and in an unknown site in the lung, in the bovine uterus and rat cochlea. 45145 pfam05243: Protein of unknown function (DUF734). This family consists of several uncharacterized baculovirus proteins of unknown function. 45146 pfam05244: Brucella outer membrane protein 2. This family consists of several outer membrane proteins (2a and 2b) from brucella bacteria. Brucellae are Gram-negative, facultative intracellular bacteria that can infect many species of animals and man. 45147 pfam05245: Conjugal transfer protein TrbD. This family consists of several bacterial conjugal transfer proteins, TrbD. TrbD contains a nucleotide binding motif and may provide energy for the export of DNA or the export of other Trb proteins. 45148 pfam05246: Protein of unknown function (DUF735). This family consists of several uncharacterized Borrelia burgdorferi (Lyme disease spirochete) proteins of unknown function. 
45149 pfam05247: Flagellar transcriptional activator (FlhD). This family consists of several bacterial flagellar transcriptional activator (FlhD) proteins. FlhD combines with FlhC to form a regulatory complex in E. coli; this complex has been shown to be a global regulator involved in many cellular processes as well as a flagellar transcriptional activator. 45150 pfam05248: Adenovirus E3A. 45151 pfam05249: Uncharacterised protein family (UPF0187). This family of proteins is functionally uncharacterised. 45152 pfam05250: Uncharacterized protein family (UPF0193). This family of proteins is functionally uncharacterized. 45153 pfam05251: Uncharacterized protein family (UPF0197). This family of proteins is functionally uncharacterized. 45154 pfam05252: Uncharacterised protein family (UPF0191). This family of proteins is functionally uncharacterised. 45155 pfam05253: Uncharacterized protein family (UPF0224). This family of proteins is functionally uncharacterized. 45156 pfam05254: Uncharacterised protein family (UPF0203). This family of proteins is functionally uncharacterised. 45157 pfam05255: Uncharacterized protein family (UPF0220). This family of proteins is functionally uncharacterized. 45158 pfam05256: Uncharacterised protein family (UPF0223). This family of proteins is functionally uncharacterised. 45159 pfam05257: CHAP domain. This domain corresponds to an amidase function. Many of these proteins are involved in cell wall metabolism of bacteria. This domain is found at the N-terminus of proteins, where it functions as a glutathionylspermidine amidase EC:3.5.1.78. 45160 pfam05258: Protein of unknown function (DUF721). This family contains several actinomycete proteins of unknown function. 45161 pfam05259: Herpesvirus glycoprotein L. This family consists of several herpesvirus glycoprotein L or UL1 proteins. Glycoprotein L is known to form a complex with glycoprotein H but the function of this complex is poorly understood. 45162 pfam05261: TraM protein. 
The TraM protein is an essential part of the DNA transfer machinery of the conjugative resistance plasmid R1 (IncFII). On the basis of mutational analyses, it was shown that the essential transfer protein TraM has at least two functions. First, a functional TraM protein was found to be required for normal levels of transfer gene expression. Second, experimental evidence was obtained that TraM stimulates efficient site-specific single-stranded DNA cleavage at the oriT, in vivo. Furthermore, a specific interaction of the cytoplasmic TraM protein with the membrane protein TraD was demonstrated, suggesting that the TraM protein creates a physical link between the relaxosomal nucleoprotein complex and the membrane-bound DNA transfer apparatus. 45163 pfam05262: Borrelia P83/100 protein. This family consists of several Borrelia P83/P100 antigen proteins. 45164 pfam05263: Protein of unknown function (DUF722). This family contains several bacteriophage proteins of unknown function. 45165 pfam05264: Choristoneura fumiferana antifreeze protein (CfAFP). This family consists of several antifreeze proteins from the insect Choristoneura fumiferana (Spruce budworm). Antifreeze proteins (AFPs) and antifreeze glycoproteins (AFGPs) are present in many organisms that must survive sub-zero temperatures. These proteins bind to seed ice crystals and inhibit their growth through an adsorption-inhibition mechanism. 45166 pfam05265: Protein of unknown function (DUF723). This family contains several uncharacterized proteins from Neisseria meningitidis. These proteins may have a role in DNA-binding. 45167 pfam05266: Protein of unknown function (DUF724). This family contains several uncharacterised proteins found exclusively in Arabidopsis thaliana. This region is often found associated with Agenet domains. 45168 pfam05267: Protein of unknown function (DUF725). This family contains several Drosophila proteins of unknown function. 45169 pfam05268: Phage tail fibre adhesin Gp38. 
This family contains several Gp38 proteins from T-even-like phages. Gp38, together with a second phage protein, gp57, catalyses the organisation of gp37 but is absent from the phage particle. Gp37 is responsible for receptor recognition. 45170 pfam05269: Bacteriophage CII protein. This family consists of several phage CII regulatory proteins. CII plays a key role in the lysis-lysogeny decision in bacteriophage lambda and related phages. 45171 pfam05270: Alpha-L-arabinofuranosidase B (ABFB). This family consists of several fungal alpha-L-arabinofuranosidase B proteins. L-Arabinose is a constituent of plant-cell-wall polysaccharides. It is found in a polymeric form in L-arabinan, in which the backbone is formed by 1,5-a-linked L-arabinose residues that can be branched via 1,2-a- and 1,3-a-linked L-arabinofuranose side chains. AbfB hydrolyses 1,5-a, 1,3-a and 1,2-a linkages in both oligosaccharides and polysaccharides, which contain terminal non-reducing L-arabinofuranoses in side chains. 45172 pfam05271: Tobravirus 2B protein. This family consists of several tobravirus 2B proteins. It is known that the 2B protein is required for transmission by both Paratrichodorus pachydermus and P. anemones nematodes. 45173 pfam05272: Virulence-associated protein E. This family contains several bacterial virulence-associated protein E like proteins. 45174 pfam05273: Poxvirus RNA polymerase 22 kDa subunit. This family consists of several poxvirus DNA-dependent RNA polymerase 22 kDa subunits. 45175 pfam05274: Occlusion-derived virus envelope protein E25. This family consists of several nucleopolyhedrovirus occlusion-derived virus envelope E25 proteins. 45176 pfam05275: Copper resistance protein B precursor (CopB). This family consists of several bacterial copper resistance proteins. Copper is essential and serves as cofactor for more than 30 enzymes yet a surplus of copper is toxic and leads to radical formation and oxidation of biomolecules. 
Therefore, copper homeostasis is a key requisite for every organism. CopB serves to extrude copper when it approaches toxic levels. 45177 pfam05276: SH3 domain-binding protein 5 (SH3BP5). This family consists of several eukaryotic SH3 domain-binding protein 5 or c-Jun N-terminal kinase (JNK)-interacting proteins (SH3BP5 or Sab). Sab binds to and serves as a substrate for JNK in vitro, and has been found to interact with the Src homology 3 (SH3) domain of Bruton's tyrosine kinase (Btk). Inspection of the sequence of Sab reveals the presence of two putative mitogen-activated protein kinase interaction motifs (KIMs) similar to that found in the JNK docking domain of the c-Jun transcription factor, and four potential serine-proline JNK phosphorylation sites in the C-terminal half of the molecule. 45178 pfam05277: Protein of unknown function (DUF726). This family consists of several uncharacterised eukaryotic proteins. 45179 pfam05278: Arabidopsis phospholipase-like protein (PEARLI 4). This family contains several phospholipase-like proteins from Arabidopsis thaliana which are homologous to PEARLI 4. 45180 pfam05279: Aspartyl beta-hydroxylase N-terminal region. This family includes the N-terminal regions of the junctin, junctate and aspartyl beta-hydroxylase proteins. Junctate is an integral ER/SR membrane calcium binding protein, which comes from an alternatively spliced form of the same gene that generates aspartyl beta-hydroxylase and junctin. Aspartyl beta-hydroxylase catalyses the post-translational hydroxylation of aspartic acid or asparagine residues contained within epidermal growth factor (EGF) domains of proteins. 45181 pfam05280: Flagellar transcriptional activator (FlhC). This family consists of several bacterial flagellar transcriptional activator (FlhC) proteins. FlhC combines with FlhD to form a regulatory complex in E. 
coli; this complex has been shown to be a global regulator involved in many cellular processes as well as a flagellar transcriptional activator. 45182 pfam05281: Neuroendocrine protein 7B2 precursor (Secretogranin V). The neuroendocrine protein 7B2 has a critical role in the proteolytic conversion and activation of proPC2, the enzyme responsible for the proteolytic conversion of many peptide hormone precursors. The 7B2 protein acts as an intracellular binding protein for proPC2, facilitates its maturation, and is required for its enzymatic activity. Processing of many important peptide precursors does not occur in 7B2 nulls. 7B2 null mice exhibit a unique form of Cushing's disease with many atypical symptoms, such as hypoglycemia. 45183 pfam05282: AAR2 protein. This family consists of several eukaryotic AAR2-like proteins. The yeast protein AAR2 is involved in splicing pre-mRNA of the a1 cistron and other genes that are important for cell growth. 45184 pfam05283: Multi-glycosylated core protein 24 (MGC-24). This family consists of several MGC-24 (or Cd164 antigen) proteins from eukaryotic organisms. MGC-24/CD164 is a sialomucin expressed in many normal and cancerous tissues. In humans, soluble and transmembrane forms of MGC-24 are produced by alternative splicing. 45185 pfam05284: Protein of unknown function (DUF736). This family consists of several uncharacterized bacterial proteins of unknown function. 45186 pfam05285: SDA1. This family consists of several SDA1 protein homologues. SDA1 is a Saccharomyces cerevisiae protein which is involved in the control of the actin cytoskeleton. The protein is essential for cell viability and is localised in the nucleus. 45187 pfam05286: Fertility inhibition protein (FINO). This family consists of several bacterial fertility inhibition (FINO) proteins. The conjugative transfer of F-like plasmids is repressed by FinO, an RNA binding protein. 
FinO interacts with the F-plasmid encoded traJ mRNA and its antisense RNA, FinP, stabilising FinP against endonucleolytic degradation and facilitating sense-antisense RNA recognition. 45188 pfam05287: PMG protein. This family consists of several mouse anagen-specific protein mKAP13 (PMG1 and PMG2). PMG1 and 2 contain characteristic repeats reminiscent of the keratin-associated proteins (KAPs). Both genes are expressed in growing hair follicles in skin as well as in sebaceous and eccrine sweat glands. Interestingly, expression is also detected in the mammary epithelium where it is limited to the onset of the pubertal growth phase and is independent of ovarian hormones. Their broad, developmentally controlled expression pattern, together with their unique amino acid composition, demonstrate that pmg-1 and pmg-2 constitute a novel KAP gene family participating in the differentiation of all epithelial cells forming the epidermal appendages. 45189 pfam05288: Poxvirus A3L Protein. This family consists of several poxvirus A3L or A2_5L proteins. 45190 pfam05289: Borrelia hemolysin accessory protein. This family consists of several borrelia hemolysin accessory proteins (BLYB). BLYB was thought to be an accessory protein, which was proposed to comprise a hemolysis system but it is now thought that BlyA and BlyB function instead as a prophage-encoded holin or holin-like system. 45191 pfam05290: Baculovirus immediate-early protein (IE-0). The Autographa californica multinucleocapsid nuclear polyhedrosis virus (AcMNPV) ie-1 gene product (IE-1) is thought to play a central role in stimulating early viral transcription. IE-1 has been demonstrated to activate several early viral gene promoters and to negatively regulate the promoters of two other AcMNPV regulatory genes, ie-0 and ie-2. 
It is thought that IE-1 negatively regulates the expression of certain genes by binding directly, or as part of a complex, to promoter regions containing a specific IE-1-binding motif (5'-ACBYGTAA-3') near their mRNA start sites. 45192 pfam05291: Bystin. Trophinin and tastin form a cell adhesion molecule complex that potentially mediates an initial attachment of the blastocyst to uterine epithelial cells at the time of implantation. Trophinin and tastin bind to an intermediary cytoplasmic protein called bystin. Bystin may be involved in implantation and trophoblast invasion because bystin is found with trophinin and tastin in the cells at human implantation sites and also in the intermediate trophoblasts at the invasion front in the placenta from early pregnancy. This family also includes the yeast protein ENP1. ENP1 is an essential protein in Saccharomyces cerevisiae and is localised in the nucleus. It is thought that ENP1 plays a direct role in the early steps of rRNA processing as enp1 defective yeast cannot synthesise 20S pre-rRNA and hence 18S rRNA, which leads to reduced formation of 40S ribosomal subunits. 45193 pfam05292: Malonyl-CoA decarboxylase (MCD). This family consists of several eukaryotic malonyl-CoA decarboxylase (MLYCD) proteins. Malonyl-CoA, in addition to being an intermediate in the de novo synthesis of fatty acids, is an inhibitor of carnitine palmitoyltransferase I, the enzyme that regulates the transfer of long-chain fatty acyl-CoA into mitochondria, where they are oxidised. After exercise, malonyl-CoA decarboxylase participates with acetyl-CoA carboxylase in regulating the concentration of malonyl-CoA in liver and adipose tissue, as well as in muscle. Malonyl-CoA decarboxylase is regulated by AMP-activated protein kinase (AMPK). 45194 pfam05293: African swine fever virus (ASFV) L11L protein. L11L is an integral membrane protein of the African swine fever virus (ASFV) which is expressed late in the virus replication cycle. 
The protein is thought to be non-essential for growth in vitro and for virus virulence in domestic swine. 45195 pfam05294: Scorpion short toxin. This family contains various secreted scorpion short toxins and seems to be unrelated to pfam00451. 45196 pfam05295: Luciferase. This family consists of dinoflagellate luciferase and luciferin binding proteins. Luciferase is involved in catalysing the light emitting reaction in bioluminescence and luciferin binding protein (LBP) is known to bind to luciferin (the substrate for luciferase) to stop it reacting with the enzyme and therefore switching off the bioluminescence function. The expression of these two proteins is controlled by a circadian clock at the translational level, with synthesis and degradation occurring on a daily basis. 45197 pfam05296: Mammalian taste receptor protein (TAS2R). This family consists of several forms of mammalian taste receptor proteins (TAS2Rs). TAS2Rs are G protein-coupled receptors expressed in subsets of taste receptor cells of the tongue and palate epithelia and are organised in the genome in clusters. The proteins are genetically linked to loci that influence bitter perception in mice and humans. 45198 pfam05297: Herpesvirus latent membrane protein 1 (LMP1). This family consists of several latent membrane protein 1 or LMP1s mostly from Epstein-Barr virus. LMP1 of EBV is a 62-65 kDa plasma membrane protein possessing six membrane spanning regions, a short cytoplasmic N-terminus and a long cytoplasmic carboxy tail of 200 amino acids. EBV latent membrane protein 1 (LMP1) is essential for EBV-mediated transformation and has been associated with several cases of malignancies. EBV-like viruses in Cynomolgus monkeys (Macaca fascicularis) have been associated with high lymphoma rates in immunosuppressed monkeys. 45199 pfam05298: Bombinin. This family consists of Bombinin and Maximin proteins from Bombina maxima (Chinese red belly toad). 
Two groups of antimicrobial peptides have been isolated from skin secretions of Bombina maxima. Peptides in the first group, named maximins 1, 2, 3, 4 and 5, are structurally related to bombinin-like peptides (BLPs). Unlike BLPs, sequence variations in maximins occurred all through the molecules. In addition to the potent antimicrobial activity, cytotoxicity against tumour cells and spermicidal action of maximins, maximin 3 possessed a significant anti-HIV activity. Maximins 1 and 3 have been found to be toxic to mice. Peptides in the second group, termed maximins H1, H2, H3 and H4, are homologous with bombinin H peptides. 45200 pfam05299: M61 glycyl aminopeptidase. Glycyl aminopeptidase is an unusual peptidase in that it has a preference for substrates with an N-terminal glycine or alanine. These proteins are found in Bacteria and in Archaea. 45201 pfam05300: Protein of unknown function (DUF737). This family consists of several uncharacterised mammalian proteins of unknown function. 45202 pfam05301: Protein of unknown function (DUF738). This family consists of several uncharacterised eukaryotic proteins of unknown function. 45203 pfam05302: Protein of unknown function (DUF720). This family consists of several uncharacterised Chlamydia proteins of unknown function. 45204 pfam05303: Protein of unknown function (DUF727). This family consists of several uncharacterised eukaryotic proteins of unknown function. 45205 pfam05304: Protein of unknown function (DUF728). This family consists of several uncharacterized tobravirus proteins of unknown function. 45206 pfam05305: Protein of unknown function (DUF732). This family consists of several uncharacterised Mycobacterium tuberculosis and leprae proteins of unknown function. 45207 pfam05306: Protein of unknown function (DUF733). This family consists of several uncharacterized Drosophila melanogaster proteins of unknown function. 45208 pfam05307: Bundlin. This family consists of several bundlin proteins from E. coli. 
Bundlin is a type IV pilin protein that is the only known structural component of enteropathogenic Escherichia coli bundle-forming pili (BFP). BFP play a role in virulence, antigenicity, autoaggregation, and localised adherence to epithelial cells. 45209 pfam05308: Protein of unknown function (DUF729). This family consists of several uncharacterised eukaryotic proteins of unknown function. 45210 pfam05309: TraE protein. This family consists of several bacterial sex pilus assembly and synthesis proteins (TraE). Conjugal transfer of plasmids from donor to recipient cells is a complex process in which a cell-to-cell contact plays a key role. Many genes encoded by self-transmissible plasmids are required for various processes of conjugation, including pilus formation, stabilisation of mating pairs, conjugative DNA metabolism, surface exclusion and regulation of transfer gene expression. The exact function of the TraE protein is unknown. 45211 pfam05310: Tenuivirus NS-3 Protein. This family consists of tenuivirus NS-3 (PV3 or GV3) proteins. The function of this protein is unknown although it is thought to be a replication protein. 45212 pfam05311: Baculovirus 33KDa late protein (PP31). Autographa californica nuclear polyhedrosis virus (AcMNPV) pp31 is a nuclear phosphoprotein that accumulates in the virogenic stroma, which is the viral replication centre in the infected-cell nucleus, binds to DNA, and serves as a late expression factor. 45213 pfam05312: Coenzyme PQQ synthesis protein C (PQQC). This family consists of several bacterial coenzyme PQQ synthesis protein C or PQQC proteins. Pyrroloquinoline quinone (PQQ) is the prosthetic group of several bacterial enzymes, including methanol dehydrogenase of methylotrophs and the glucose dehydrogenase of a number of bacteria. PQQC has been found to be required in the synthesis of PQQ but its function is unclear. 45214 pfam05313: Poxvirus P21 membrane protein. 
The P21 membrane protein of vaccinia virus, encoded by the A17L (or A18L) gene, has been reported to localise on the inner of the two membranes of the intracellular mature virus (IMV). It has also been shown that P21 acts as a membrane anchor for the externally located fusion protein P14 (A27L gene). 45215 pfam05314: Baculovirus occlusion-derived virus envelope protein EC27. This family consists of several baculovirus occlusion-derived virus envelope proteins (EC27 or E27). The ODV-E27 protein has distinct functional characteristics compared to cellular and viral cyclins. Depending on the cdk protein, and perhaps other viral or cellular proteins yet to be described, the kinase-EC27 complex may have either cyclin B- or D-like activity. 45216 pfam05315: ICEA Protein. This family consists of several ICEA proteins from Helicobacter pylori. Helicobacter pylori infection causes gastritis and peptic ulcer disease, and is classified as a definite carcinogen of gastric cancer. ICEA1 is speculated to be associated with peptic ulcer disease. 45217 pfam05316: Mitochondrial ribosomal protein (VAR1). This family consists of the yeast mitochondrial ribosomal proteins VAR1. Mitochondria possess their own ribosomes responsible for the synthesis of a small number of proteins encoded by the mitochondrial genome. In yeast the two ribosomal RNAs and a single ribosomal protein, VAR1, are products of mitochondrial genes, and the remaining approximately 80 ribosomal proteins are encoded in the nucleus. VAR1 along with 15S rRNA are necessary for the formation of mature 37S subunits. 45218 pfam05317: Thermopsin. This family consists of several thermopsin proteins from archaebacteria. Thermopsin is a thermostable acid protease which is capable of hydrolysing the following bonds: Leu-Val, Leu-Tyr, Phe-Phe, Phe-Tyr, and Tyr-Thr. The specificity of thermopsin is therefore similar to that of pepsin, that is, it prefers large hydrophobic residues at both sides of the scissile bond. 
45219 pfam05318: Tombusvirus movement protein. This family consists of several Tombusvirus movement proteins. These proteins allow the virus to move from cell-to-cell and allow host-specific systemic spread. . 45220 pfam05319: Baculovirus late expression factor 1. This family contains several baculovirus late expression factor 1 or LEF-1 proteins. Baculovirus LEF-1 is now known to be a DNA primase enzyme. 45221 pfam05320: Poxvirus DNA-directed RNA polymerase 19 kDa subunit. This family contains several DNA-directed RNA polymerase 19 kDa polypeptides. The Poxvirus DNA-directed RNA polymerase (EC: 2.7.7.6) catalyses DNA-template-directed extension of the 3'-end of an RNA strand by one nucleotide at a time. 45222 pfam05321: Haemolysin expression modulating protein. This family consists of haemolysin expression modulating protein (HHA) homologues. YmoA and Hha are highly similar bacterial proteins downregulating gene expression in Yersinia enterocolitica and Escherichia coli, respectively. 45223 pfam05322: NINE Protein. This family consists of NINE proteins from several bacteriophages and from E. coli. 45224 pfam05323: Poxvirus A21 Protein. This family consists of several poxvirus A21 proteins. 45225 pfam05324: Sperm antigen HE2. This family consists of several variants of the human and chimpanzee sperm antigen proteins (HE2 and EP2 respectively). The EP2 gene codes for a family of androgen-dependent, epididymis-specific secretory proteins.The EP2 gene uses alternative promoters and differential splicing to produce a family of variant messages. The translated putative protein variants differ significantly from each other. Some of these putative proteins have similarity to beta-defensins, a family of antimicrobial peptides. 45226 pfam05325: Protein of unknown function (DUF730). This family consists of several uncharacterized Arabidopsis thaliana proteins of unknown function. 45227 pfam05326: Seminal vesicle autoantigen (SVA). 
This family consists of seminal vesicle autoantigen and prolactin-inducible (PIP) proteins. Seminal vesicle autoantigen (SVA) is specifically present in the seminal plasma of mice. This 19-kDa secretory glycoprotein suppresses the motility of spermatozoa by interacting with phospholipid. PIP, has several known functions. In saliva, this protein plays a role in host defence by binding to microorganisms such as Streptococcus. PIP is an aspartyl proteinase and it acts as a factor capable of suppressing T-cell apoptosis through its interaction with CD4. 45228 pfam05327: RNA polymerase I specific transcription initiation factor RRN3. This family consists of several eukaryotic proteins which are homologous to the yeast RRN3 protein. RRN3 is one of the RRN genes specifically required for the transcription of rDNA by RNA polymerase I (Pol I) in Saccharomyces cerevisiae. 45229 pfam05328: CybS. This family consists of several eukaryotic succinate dehydrogenase [ubiquinone] cytochrome B small subunit, mitochondrial precursor (CybS) proteins. SDHD encodes the small subunit (cybS) of cytochrome b in succinate-ubiquinone oxidoreductase (mitochondrial complex II). Mitochondrial complex II is involved in the Krebs cycle and in the aerobic electron transport chain. It contains four proteins. The catalytic core consists of a flavoprotein and an iron-sulfur protein; these proteins are anchored to the mitochondrial inner membrane by the large subunit of cytochrome b (cybL) and cybS, which together comprise the heme-protein cytochrome b. Mutations in the SDHD gene can lead to hereditary paraganglioma, characterized by the development of benign, vascularised tumours in the head and neck. 45230 pfam05329: Protein of unknown function (DUF731). This family contains several uncharacterized plant proteins of unknown function, mostly from Arabidopsis thaliana. 45231 pfam05330: Protein of unknown function (DUF741). This family contains several uncharacterized human proteins. 
The function of this family is unknown, however, the family member FKSG56 is a hepatocellular carcinoma-associated antigen. 45232 pfam05331: Protein of unknown function (DUF742). This family consists of several uncharacterised Streptomyces proteins as well as one from Mycobacterium tuberculosis. The function of these proteins is unknown. 45233 pfam05332: Protein of unknown function (DUF743). This family consists of several uncharacterized Calicivirus proteins of unknown function. 45234 pfam05333: Protein of unknown function (DUF744). This family consists of several plant mitochondrial proteins of unknown function. 45235 pfam05334: Protein of unknown function (DUF719). This family consists of several eukaryotic proteins of unknown function. 45236 pfam05335: Protein of unknown function (DUF745). This family consists of several uncharacterised Drosophila melanogaster proteins of unknown function. 45237 pfam05336: Protein of unknown function (DUF718). This family consists of several uncharacterised bacterial proteins of unknown function. 45238 pfam05337: Macrophage colony stimulating factor-1 (CSF-1). Colony stimulating factor 1 (CSF-1) is a homodimeric polypeptide growth factor whose primary function is to regulate the survival, proliferation, differentiation, and function of cells of the mononuclear phagocytic lineage. This lineage includes mononuclear phagocytic precursors, blood monocytes, tissue macrophages, osteoclasts, and microglia of the brain, all of which possess cell surface receptors for CSF-1. The protein has also been linked with male fertility and mutations in the Csf-1 gene have been found to cause osteopetrosis and failure of tooth eruption. 45239 pfam05338: Protein of unknown function (DUF717). This family consists of several herpesvirus proteins of unknown function. 45240 pfam05339: Protein of unknown function (DUF739). This family contains several bacteriophage proteins. 
Three of the proteins in this family have been labelled putative cro repressor proteins. 45241 pfam05340: Protein of unknown function (DUF740). This family consists of several uncharacterised plant proteins of unknown function. 45242 pfam05341: Protein of unknown function (DUF708). This family consists of several uncharacterised nucleopolyhedrovirus proteins of unknown function. 45243 pfam05342: M26 IgA1-specific Metallo-endopeptidase. These peptidases, which cleave mammalian IgA, are found in Gram-positive bacteria. Often found associated with pfam00746, they may be attached to the cell wall. 45244 pfam05343: M42 glutamyl aminopeptidase. These peptidases are found in Archaea and Bacteria. The example in Lactococcus lactis, PepA, aids growth on milk. Pyrococcus horikoshii contains a thermostable de-blocking aminopeptidase member of this family that is used commercially for N-terminal protein sequencing. 45245 pfam05344: Domain of Unknown Function (DUF746). This is a short conserved region found in some transposons. 45246 pfam05345: Putative Ig domain. This alignment represents the conserved core region of a ~90-residue repeat found in several haemagglutinins and other cell surface proteins. Sequence similarities to pfam02494 and pfam00801 suggest an Ig-like fold (personal obs: C. Yeats), so this family may be similar in function to the pfam02639 and pfam02638 domains. 45247 pfam05346: Eukaryotic membrane protein (cytomegalovirus gH-receptor) family. This family of eukaryotic membrane proteins includes the putative receptor for human cytomegalovirus gH. The cellular function of this family remains unknown. 45248 pfam05347: Complex 1 protein (LYR family). This family of short proteins includes proteins from the NADH-ubiquinone oxidoreductase complex I. The family includes the B14 subunit from Cow, and the B22 subunit from human. We would predict that all the members of this family are components of Complex I. 
We have named this family LYR after a highly conserved tripeptide motif close to the N-terminus of these proteins. 45249 pfam05348: Proteasome maturation factor UMP1. UMP1 is a short-lived chaperone present in the precursor form of the 20S proteasome and absent in the mature complex. UMP1 is required for the correct assembly and enzymatic activation of the proteasome. UMP1 seems to be degraded by the proteasome upon its formation. 45250 pfam05349: GATA-type transcription activator, N-terminal. GATA transcription factors mediate cell differentiation in a diverse range of tissues. Mutations are often associated with certain congenital human disorders. The six classical vertebrate GATA proteins, GATA-1 to GATA-6, are highly homologous and have two tandem zinc fingers. The classical GATA transcription factors function as transcription activators. In lower metazoans GATA proteins carry a single canonical zinc finger. This family represents the N-terminal domain of the family of GATA transcription activators. 45251 pfam05350: Glycogen synthase kinase-3 binding. Glycogen synthase kinase-3 (GSK-3) sequentially phosphorylates four serine residues on glycogen synthase (GS), in the sequence SxxxSxxxSxxx-SxxxS(p), by recognising and phosphorylating the first serine in the sequence motif SxxxS(p) (where S(p) represents a phosphoserine). Interaction of GSK-3 with a peptide derived from GSK-3 binding protein (this family) prevents GSK-3 interaction with Axin. This interaction thereby inhibits the Axin-dependent phosphorylation of beta-catenin by GSK-3. 45252 pfam05351: GMP-PDE, delta subunit. GMP-PDE delta subunit was originally identified as a fourth subunit of rod-specific cGMP phosphodiesterase (PDE) (EC:3.1.4.35). The precise function of PDE delta subunit in the rod specific GMP-PDE complex is unclear. In addition, PDE delta subunit is not confined to photoreceptor cells but is widely distributed in different tissues. 
PDE delta subunit is thought to be a specific soluble transport factor for certain prenylated proteins, and Arl2-GTP is thought to be a regulator of PDE-mediated transport. 45253 pfam05352: Phage Connector (GP10). The head-tail connector of bacteriophage phi29 is composed of twelve 36 kDa subunits with 12-fold symmetry. It is the central component of a rotary motor that packages the genomic dsDNA into pre-formed proheads. This motor consists of the head-tail connector, surrounded by a phi29-encoded, 174-base RNA and a viral ATPase protein. 45254 pfam05353: Delta Atracotoxin. Delta atracotoxin produces potentially fatal neurotoxic symptoms in primates by slowing the inactivation of voltage-gated sodium channels. The structure of atracotoxin comprises a core beta region containing a triple-stranded beta-sheet, a thumb-like extension protruding from the beta region and a C-terminal helix. The beta region contains a cystine knot motif, a feature seen in other neurotoxic polypeptides. 45255 pfam05354: Phage Head-Tail Attachment. The phage head-tail attachment protein is required for the joining of phage heads and tails at the last step of morphogenesis. 45256 pfam05355: Apolipoprotein C-II. Apolipoprotein C-II (ApoC-II) is the major activator of lipoprotein lipase, a key enzyme in the regulation of triglyceride levels in human serum. 45257 pfam05356: Phage Coat protein B. The major coat protein in the capsid of filamentous bacteriophage forms a helical assembly of about 7000 identical protomers, with each protomer comprising 46 amino acids after cleavage of the signal peptide. Each protomer forms a slightly curved helix, and these helices combine to form a tubular structure that encapsulates the viral DNA. 45258 pfam05357: Phage Coat Protein A. Infection of Escherichia coli by filamentous bacteriophages is mediated by the minor phage coat protein A and involves two distinct cellular receptors, the F' pilus and the periplasmic protein TolA. 
These two receptors are contacted in a sequential manner, such that binding of TolA by the extreme N-terminal domain is conditional on a primary interaction of the second coat protein A domain with the F' pilus. 45259 pfam05358: DicB protein. DicB is part of the dic operon, which resides on cryptic prophage Kim. Under normal conditions, expression of dicB is actively repressed. When expression is induced, however, cell division rapidly ceases, and this division block is dependent on MinC with which it interacts. 45260 pfam05359: Domain of Unknown Function (DUF748). 45261 pfam05360: yiaA/B two helix domain. This domain consists of two transmembrane helices and a conserved linking section. 45262 pfam05361: PKC-activated protein phosphatase-1 inhibitor. Contractility of vascular smooth muscle depends on phosphorylation of myosin light chains, and is modulated by hormonal control of myosin phosphatase activity. Signaling pathways activate kinases such as PKC or Rho-dependent kinases that phosphorylate the myosin phosphatase inhibitor protein called CPI-17. Phosphorylation of CPI-17 at Thr-38 enhances its inhibitory potency 1000-fold, creating a molecular switch for regulating contraction. 45263 pfam05362: Lon protease (S16) C-terminal proteolytic domain. The Lon serine proteases must hydrolyse ATP to degrade protein substrates. In Escherichia coli, these proteases are involved in turnover of intracellular proteins, including abnormal proteins following heat-shock. The active site for protease activity resides in a C-terminal domain. The Lon proteases are classified as family S16 in Merops. 45264 pfam05363: Herpesvirus US12 family. US12 is a key factor in the evasion of cellular immune response against HSV-infected cells. Specific inhibition of the transporter associated with antigen processing (TAP) by US12 prevents peptide transport into the endoplasmic reticulum and subsequent loading of major histocompatibility complex (MHC) class I molecules. 
US12 is comprised of three helices and is associated with cellular membranes. 45265 pfam05364: Salmonella type III secretion SopE effector. Salmonella typhimurium employs a type III secretion system to inject bacterial toxins into the host cell cytosol. These toxins transiently activate Rho family GTP-binding protein-dependent signaling cascades to induce cytoskeletal rearrangements. SopE, one of these toxins, can activate Cdc42 in a Dbl-like fashion despite its lack of sequence similarity to Dbl-like proteins, the Rho-specific eukaryotic guanine nucleotide exchange factors. 45266 pfam05365: Ubiquinol-cytochrome C reductase, UQCRX/QCR9 like. The UQCRX/QCR9 gene encodes subunit 9/10 of complex III, a protein of about 7 kDa. Deletion of QCR9 results in the inability of cells to grow on a non-fermentable carbon source in yeast. 45267 pfam05366: Sarcolipin. Sarcolipin is a 31 amino acid integral membrane protein that regulates Ca-ATPase activity in skeletal muscle. 45268 pfam05367: Phage endonuclease I. The bacteriophage endonuclease I is a nuclease that is selective for the structure of the four-way Holliday DNA junction. 45269 pfam05368: NmrA-like family. NmrA is a negative transcriptional regulator involved in the post-translational modification of the transcription factor AreA. NmrA is part of a system controlling nitrogen metabolite repression in fungi. This family only contains a few sequences as iteration results in significant matches to other Rossmann fold families. 45270 pfam05369: Monomethylamine methyltransferase MtmB. Monomethylamine methyltransferase of the archaebacterium Methanosarcina barkeri contains a novel amino acid, pyrrolysine, encoded by the termination codon UAG. The structure reveals a homohexamer comprised of individual subunits with a TIM barrel fold. 45271 pfam05370: Domain of unknown function (DUF749). Archaeal domain of unknown function. 
The structure of this domain has been solved by a structural genomics group and comprises segregated helical and anti-parallel beta sheet regions. 45272 pfam05371: Phage major coat protein, Gp8. Class I phage major coat protein Gp8 or B. The coat protein is largely alpha-helix with a slight curve. 45273 pfam05372: Delta lysin family. Delta-lysin is a 26 amino acid, hemolytic peptide toxin secreted by Staphylococcus aureus. It is thought that delta-toxin forms an amphipathic helix upon binding to lipid bilayers. The precise mode of action of delta-lysin is unclear. 45274 pfam05373: L-proline 3-hydroxylase, C-terminal. Iron (II)/2-oxoglutarate (2-OG)-dependent oxygenases catalyse oxidative reactions in a range of metabolic processes. Proline 3-hydroxylase hydroxylates proline at position 3, the first example of a 2-OG oxygenase catalysing oxidation of a free alpha-amino acid. The structure contains conserved motifs present in other 2-OG oxygenases including a jelly roll strand core and residues binding iron and 2-oxoglutarate, consistent with divergent evolution within the extended family. The structure differs significantly from many other 2-OG oxygenases in possessing a discrete C-terminal helical domain. 45275 pfam05374: Mu-Conotoxin. Mu-conotoxins are peptide inhibitors of voltage-sensitive sodium channels. 45276 pfam05375: Pacifastin inhibitor (LCMII). Structures of members of this family show that they are comprised of a triple-stranded antiparallel beta-sheet connected by three disulfide bridges, which defines this as a novel family of serine protease inhibitors. 45277 pfam05377: Flagella accessory protein C (FlaC). Although archaeal flagella appear superficially similar to those of bacteria, they are quite distinct. In several archaea, the flagellin genes are followed immediately by the flagellar accessory genes flaCDEFGHIJ. The gene products may have a role in translocation, secretion, or assembly of the flagellum. 
FlaC is a protein whose exact role is unknown but it has been shown to be membrane-associated (by immuno-blotting fractionated cells). 45278 pfam05378: Hydantoinase/oxoprolinase N-terminal region. This family is found at the N-terminus of the pfam01968 family. 45279 pfam05379: Carlavirus endopeptidase. A peptidase involved in auto-proteolysis of a polyprotein from the plant pathogen blueberry scorch carlavirus (BBScV). Corresponds to Merops family C23. 45280 pfam05380: Pao retrotransposon peptidase. Corresponds to Merops family A17. These proteins are homologous to aspartic proteinases encoded by retroposons and retroviruses. 45281 pfam05381: Tymovirus endopeptidase. Corresponds to Merops family C21. The best-studied plant alpha-like virus proteolytic enzyme is the proteinase of turnip yellow mosaic virus (TYMV). The TYMV replicase protein undergoes auto-cleavage to yield two products. The auto-peptidase activity has been mapped to the central part of this polyprotein. 45282 pfam05382: Bacteriophage peptidoglycan hydrolase. At least one of the members of this family, the Pal protein from the pneumococcal bacteriophage Dp-1, has been shown to be an N-acetylmuramoyl-L-alanine amidase. According to the known modular structure of this and other peptidoglycan hydrolases from the pneumococcal system, the active site should reside at the N-terminal domain whereas the C-terminal domain binds to the choline residues of the cell wall teichoic acids. 45283 pfam05383: La domain. This presumed domain is found at the N-terminus of La RNA-binding proteins as well as other proteins. The function of this region is uncertain. 45284 pfam05384: Sensor protein DegS. This is a small family of Bacillus DegS proteins. 
The DegS-DegU two-component regulatory system of Bacillus subtilis controls various processes that characterise the transition from the exponential to the stationary growth phase, including the induction of extracellular degradative enzymes, expression of late competence genes and down-regulation of the sigma D regulon. The family also contains one sequence from Thermoanaerobacter tengcongensis which is described as a sensory transduction histidine kinase. 45285 pfam05385: Mastadenovirus early E4 13 kDa protein. This family consists of human and simian mastadenovirus early E4 13 kDa proteins. Human adenovirus type 9 (Ad9) is unique in eliciting exclusively estrogen-dependent mammary tumours in rats and in not requiring viral E1 region transforming genes for tumorigenicity. E4 codes for an oncoprotein essential for tumourigenesis by Ad9. 45286 pfam05386: TEP1 N-terminal domain. This short sequence region is found in four copies at the N-terminus of the TEP1 telomerase component. The functional significance of the region is uncertain. However the conservation of two histidines and a cysteine suggests it is a potential zinc binding domain. 45287 pfam05387: Chorion family 3. This family consists of several Drosophila chorion proteins S36 and S38. The chorion genes of Drosophila are amplified in response to developmental signals in the follicle cells of the ovary. 45288 pfam05388: Carboxypeptidase Y pro-peptide. This family is found at the N terminus of several carboxypeptidase Y proteins and contains a signal peptide and pro-peptide regions. 45289 pfam05389: Negative regulator of genetic competence (MecA). This family contains several bacterial MecA proteins. The development of competence in Bacillus subtilis is regulated by growth conditions and several regulatory genes. In complex media competence development is poor, and there is little or no expression of late competence genes. 
Mec mutations permit competence development and late competence gene expression in complex media, bypassing the requirements for many of the competence regulatory genes. The mecA gene product acts negatively in the development of competence. Null mutations in mecA allow expression of a late competence gene, comG, under conditions where it is not normally expressed, including in complex media and in cells mutant for several competence regulatory genes. Overexpression of MecA inhibits comG transcription. 45290 pfam05390: Yeast cell wall synthesis protein KRE9/KNH1. This family contains several KRE9 and KNH1 proteins, cell surface O-glycoproteins that are required for beta-1,6-glucan synthesis in yeast. 45291 pfam05391: Lsm interaction motif. This short motif is found at the C-terminus of Prp24 proteins and probably interacts with the Lsm proteins to promote U4/U6 formation. 45292 pfam05392: Cytochrome C oxidase chain VIIB. 45293 pfam05393: Human adenovirus early E3A glycoprotein. This family consists of several early glycoproteins from human adenoviruses. 45294 pfam05394: Avirulence protein. This family consists of several avirulence proteins from Pseudomonas syringae and Xanthomonas campestris. 45295 pfam05395: Protein phosphatase inhibitor 1/DARPP-32. This family consists of several mammalian protein phosphatase inhibitor 1 (IPP-1) and dopamine- and cAMP-regulated neuronal phosphoprotein (DARPP-32) proteins. Protein phosphatase inhibitor-1 is involved in signal transduction and is an endogenous inhibitor of protein phosphatase-1. It has been demonstrated that DARPP-32, if phosphorylated, can inhibit protein-phosphatase-1. DARPP-32 has a key role in many neurotransmitter pathways throughout the brain and has been shown to be involved in controlling receptors, ion channels and other physiological factors including the brain's response to drugs of abuse, such as cocaine, opiates and nicotine. 
DARPP-32 is reciprocally regulated by the two neurotransmitters that are most often implicated in schizophrenia - dopamine and glutamate. Dopamine activates DARPP-32 through the D1 receptor pathway and disables DARPP-32 through the D2 receptor. Glutamate, acting through the N-methyl-d-aspartate receptor, renders DARPP-32 inactive. A mutant form of DARPP-32 has been linked with gastric cancers. 45296 pfam05396: Phage T7 capsid assembly protein. 45297 pfam05397: Transcription regulatory protein GAL11. This family contains yeast GAL11 proteins. Gal11 and Sin4 proteins are yeast global transcription factors that regulate transcription of a variety of genes, both positively and negatively. Gal11 functions mainly in the activation of transcription, whereas Sin4 has an opposite role, yet they are reported to be present as a complex in the so-called RNA polymerase II holoenzyme. 45298 pfam05398: PufQ cytochrome subunit. This family consists of bacterial PufQ proteins. PufQ is required for bacteriochlorophyll biosynthesis, serving a regulatory function in the formation of photosynthetic complexes. 45299 pfam05399: Ecotropic viral integration site 2A protein (EVI2A). This family contains several mammalian ecotropic viral integration site 2A (EVI2A) proteins. The function of this protein is unknown although it is thought to be a membrane protein and may function as an oncogene in retrovirus induced myeloid tumours. 45300 pfam05400: Flagellar protein FliT. This family contains several bacterial flagellar FliT proteins. The flagellar proteins FlgN and FliT have been proposed to act as substrate specific export chaperones, facilitating incorporation of the enterobacterial hook-associated axial proteins (HAPs) FlgK/FlgL and FliD into the growing flagellum. In Salmonella typhimurium flgN and fliT mutants, the export of target HAPs is reduced, concomitant with loss of unincorporated flagellin into the surrounding medium. 45301 pfam05401: Nodulation protein S (NodS). 
This family consists of nodulation S (NodS) proteins. The products of the rhizobial nodulation genes are involved in the biosynthesis of lipochitin oligosaccharides (LCOs), which are host-specific signal molecules required for nodule formation. NodS is an S-adenosyl-L-methionine (SAM)-dependent methyltransferase involved in N-methylation of LCOs. NodS uses N-deacetylated chitooligosaccharides, the products of the NodBC proteins, as its methyl acceptors. 45302 pfam05402: Coenzyme PQQ synthesis protein D (PqqD). This family contains several bacterial coenzyme PQQ synthesis protein D (PqqD) sequences. This protein is required for coenzyme pyrrolo-quinoline-quinone (PQQ) biosynthesis. 45303 pfam05403: Plasmodium histidine-rich protein (HRPII/III). This family consists of several histidine-rich protein II and III sequences from Plasmodium falciparum. 45304 pfam05404: Translocon-associated protein, delta subunit precursor (TRAP-delta). This family consists of several eukaryotic translocon-associated protein, delta subunit precursors (TRAP-delta or SSR-delta). The exact function of this protein is unknown. 45305 pfam05405: Mitochondrial ATP synthase B chain precursor (ATP-synt_B). The Fo sector of the ATP synthase is a membrane bound complex which mediates proton transport. It is composed of nine different polypeptide subunits (a, b, c, d, e, f, g, F6, A6L). 45306 pfam05406: WGR domain. This domain is found in a variety of polyA polymerases as well as the E. coli molybdate metabolism regulator and other proteins of unknown function. I have called this domain WGR after the most conserved central motif of the domain. The domain is found in isolation in some proteins, and is between 70 and 80 residues in length. Might be a nucleic acid binding domain. 45307 pfam05407: Rubella virus endopeptidase. Corresponds to Merops family C27. Required for processing of the rubella virus replication protein. 45308 pfam05408: Foot-and-mouth virus L-proteinase. 
Corresponds to Merops family C28. Protein fold of the peptidase unit for members of this family resembles that of papain. The leader proteinase of foot and mouth disease virus (FMDV) cleaves itself from the growing polyprotein and also cleaves the host translation initiation factor 4GI (eIF4G), thus inhibiting 5'-cap dependent translation. 45309 pfam05409: Coronavirus endopeptidase C30. Corresponds to Merops family C30. These peptidases are involved in viral polyprotein processing in replication. 45310 pfam05410: Porcine arterivirus-type cysteine proteinase alpha. Corresponds to Merops family C31. These peptidases are involved in viral polyprotein processing in replication. 45311 pfam05411: Equine arteritis virus-type cysteine proteinase. Corresponds to Merops family C32. These peptidases are involved in viral polyprotein processing in replication. 45312 pfam05412: Equine arterivirus Nsp2-type cysteine proteinase. Corresponds to Merops family C33. These peptidases are involved in viral polyprotein processing in replication. 45313 pfam05413: Putative closterovirus papain-like endopeptidase. Corresponds to Merops family C34. Putative closterovirus papain-like endopeptidase from the apple chlorotic leaf spot closterovirus. 45314 pfam05414: Putative capillovirus papain-like endopeptidase. Corresponds to Merops family C35. The putative capillovirus papain-like endopeptidases from apple stem grooving virus and from cherry capillovirus are probably involved in processing the viral polyprotein. 45315 pfam05415: Beet necrotic yellow vein furovirus-type papain-like endopeptidase. Corresponds to Merops family C36. This protease is involved in processing the viral polyprotein. 45316 pfam05416: Southampton virus-type processing peptidase. Corresponds to Merops family C37. Norwalk-like viruses (NLVs), including the Southampton virus, cause acute non-bacterial gastroenteritis in humans. The NLV genome encodes three open reading frames (ORFs). 
ORF1 encodes a polyprotein, which is processed by the viral protease into six proteins. 45317 pfam05417: Hepatitis E cysteine protease. Corresponds to Merops family C41. This papain-like protease cleaves the viral polyprotein encoded by ORF1 of the hepatitis E virus (HEV). 45318 pfam05418: Apovitellenin I (Apo-VLDL-II). This family consists of several avian apovitellenin I sequences. As part of the avian reproductive effort, large quantities of triglyceride-rich very-low-density lipoprotein (VLDL) particles are transported by receptor-mediated endocytosis into the female germ cells. Although the oocytes are surrounded by a layer of granulosa cells harbouring high levels of active lipoprotein lipase, non-lipolysed VLDL is transported into the yolk. This is because VLDL particles from laying chickens are protected from lipolysis by apolipoprotein (apo)-VLDL-II, a potent dimeric lipoprotein lipase inhibitor. Apo-VLDL-II is produced in the liver and secreted into the blood stream when induced by estrogen production in female birds. 45319 pfam05419: GUN4-like. In Arabidopsis, GUN4 is required for the functioning of the plastid mediated repression of nuclear transcription that is involved in controlling the levels of magnesium-protoporphyrin IX. GUN4 binds the product and substrate of Mg-chelatase, an enzyme that produces Mg-Proto, and activates Mg-chelatase. GUN4 is thought to participate in plastid-to-nucleus signaling by regulating magnesium-protoporphyrin IX synthesis or trafficking. 45320 pfam05420: Cellulose synthase operon protein C C-terminus (BCSC_C). This family contains the C-terminal regions of several bacterial cellulose synthase operon C (BCSC) proteins. BCSC is involved in cellulose synthesis although the exact function of this protein is unknown. 45321 pfam05421: Protein of unknown function (DUF751). This family contains several plant, cyanobacterial and algal proteins of unknown function. 
The family is exclusively found in phototrophic organisms and may therefore play a role in photosynthesis (personal obs: Moxon SJ). 45322 pfam05422: Stress-activated map kinase interacting protein 1 (SIN1). This family consists of several stress-activated map kinase interacting protein 1 (MAPKAP1 OR SIN1) sequences. The fission yeast Sty1/Spc1 mitogen-activated protein (MAP) kinase is a member of the eukaryotic stress-activated MAP kinase (SAPK) family. Sin1 interacts with Sty1/Spc1. Cells lacking Sin1 display many, but not all, of the phenotypes of cells lacking the Sty1/Spc1 MAP kinase including sterility, multiple stress sensitivity and a cell-cycle delay. Sin1 is phosphorylated after stress but this is not Sty1/Spc1-dependent. 45323 pfam05423: Mycobacterium membrane protein. This family contains several membrane proteins from Mycobacterium species. 45324 pfam05424: Plasmodium Duffy binding protein. This family contains several Plasmodium Duffy binding proteins. Plasmodium vivax and Plasmodium knowlesi merozoites invade human erythrocytes that express Duffy blood group surface determinants. The Duffy receptor family is localised in micronemes, an organelle found in all organisms of the phylum Apicomplexa. 45325 pfam05425: Copper resistance protein D. Copper sequestering activity displayed by some bacteria is determined by copper-binding protein products of the copper resistance operon (cop). CopD, together with CopC, performs copper uptake into the cytoplasm. 45326 pfam05426: Alginate lyase. This family contains several bacterial alginate lyase proteins. Alginate is a family of 1-4-linked copolymers of beta-D-mannuronic acid (M) and alpha-L-guluronic acid (G). It is produced by brown algae and by some bacteria belonging to the genera Azotobacter and Pseudomonas. Alginate lyases catalyse the depolymerisation of alginates by beta-elimination, generating a molecule containing 4-deoxy-L-erythro-hex-4-enepyranosyluronate at the nonreducing end. 
45327 pfam05427: Acidic fibroblast growth factor binding (FIBP). Acidic fibroblast growth factor (aFGF) intracellular binding protein (FIBP) is a protein found mainly in the nucleus that is thought to be involved in the intracellular function of aFGF. 45328 pfam05428: Corticotropin-releasing factor binding protein (CRF-BP). This family consists of several eukaryotic corticotropin-releasing factor binding proteins (CRF-BP or CRH-BP). Corticotropin-releasing hormone (CRH) plays multiple roles in vertebrate species. In mammals, it is the major hypothalamic releasing factor for pituitary adrenocorticotropin secretion, and is a neurotransmitter or neuromodulator at other sites in the central nervous system. In non-mammalian vertebrates, CRH not only acts as a neurotransmitter and hypophysiotropin, it also acts as a potent thyrotropin-releasing factor, allowing CRH to regulate both the adrenal and thyroid axes, especially in development. CRH-BP is thought to play an inhibitory role in which it binds CRH and other CRH-like ligands and prevents the activation of CRH receptors. There is however evidence that CRH-BP may also exhibit diverse extra and intracellular roles in a cell specific fashion and at specific times in development. 45329 pfam05429: Leukocyte cell-derived chemotaxin 2 (LECT2). This family consists of several leukocyte cell-derived chemotaxin 2 (LECT2) proteins. LECT2 is a liver-specific protein which is thought to be linked to hepatocyte growth although the exact function of this protein is unknown. 45330 pfam05430: Protein of unknown function (DUF752). This family contains several uncharacterised bacterial proteins with no known function. 45331 pfam05431: Insecticidal Crystal Toxin, P42. Family of Bacillus insecticidal crystal toxins. Strains of Bacillus that have this insecticidal activity use a binary toxin comprised of two proteins, P51 and P42 (this family). 
Members of this family are highly conserved between strains of different serotypes and phage groups. 45332 pfam05432: Bone sialoprotein II (BSP-II). Bone sialoprotein (BSP) is a major structural protein of the bone matrix that is specifically expressed by fully-differentiated osteoblasts. The expression of bone sialoprotein (BSP) is normally restricted to mineralised connective tissues of bones and teeth where it has been associated with mineral crystal formation. However, it has been found that ectopic expression of BSP occurs in various lesions, including oral and extraoral carcinomas, in which it has been associated with the formation of microcrystalline deposits and the metastasis of cancer cells to bone. 45333 pfam05433: Rickettsia 17 kDa surface antigen. This family consists of several Rickettsia genus specific 17 kDa surface antigen proteins. 45334 pfam05434: TMEM9. This family contains several eukaryotic transmembrane proteins which are homologous to human transmembrane protein 9. The TMEM9 gene encodes a 183 amino-acid protein that contains an N-terminal signal peptide, a single transmembrane region, three potential N-glycosylation sites and three conserved cys-rich domains in the N-terminus, but no known functional domains. The protein is highly conserved between species from Caenorhabditis elegans to man and belongs to a novel family of transmembrane proteins. The exact function of TMEM9 is unknown although it has been found to be widely expressed and localised to the late endosomes and lysosomes. Members of this family contain pfam03128 repeats in their N-terminal region. 45335 pfam05435: Phi-29 DNA terminal protein GP3. This family consists of DNA terminal protein GP3 sequences from Phi-29 like bacteriophages. DNA terminal protein GP3 is linked to the 5' ends of both strands of the genome through a phosphodiester bond between the beta-hydroxyl group of a serine residue and the 5'-phosphate of the terminal deoxyadenylate. 
This protein is essential for DNA replication and is involved in the priming of DNA elongation. 45336 pfam05436: Mating factor alpha precursor N-terminus. This family contains the N-terminal regions of the Saccharomyces mating factor alpha precursor protein. All proteins in this family contain one or more copies of pfam04648 further toward their C terminus. 45337 pfam05437: Branched-chain amino acid transport protein (AzlD). This family consists of a number of bacterial and archaeal branched-chain amino acid transport proteins. AzlD is known to be involved in conferring resistance to 4-azaleucine although its exact role is uncertain. 45338 pfam05438: Thyrotropin-releasing hormone (TRH). This family consists of several thyrotropin-releasing hormone (TRH) proteins. Thyrotropin-Releasing Hormone (TRH; pyroGlu-His-Pro-NH2), originally isolated as a hypothalamic neuropeptide hormone, most likely acts also as a neuromodulator and/or neurotransmitter in the central nervous system (CNS). This interpretation is supported by the identification of a peptidase localised on the surface of neuronal cells which has been termed TRH-degrading ectoenzyme (TRH-DE) since it selectively inactivates TRH. TRH has been used clinically for the treatment of spinocerebellar degeneration and disturbance of consciousness in humans. 45339 pfam05439: Jumping translocation breakpoint protein (JTB). This family contains several jumping translocation breakpoint proteins or JTBs. Jumping translocation (JT) is an unbalanced translocation that comprises amplified chromosomal segments jumping to various telomeres. JTB, located at 1q21, has been found to fuse with the telomeric repeats of acceptor telomeres in a case of JT. hJTB (human JTB) encodes a trans-membrane protein that is highly conserved among divergent eukaryotic species. JT results in a hJTB truncation, which potentially produces an hJTB product devoid of the trans-membrane domain. 
hJTB is located in a gene-rich region at 1q21, called EDC (Epidermal Differentiation Complex). JTB has also been implicated in prostatic carcinomas. 45340 pfam05440: Tetrahydromethanopterin S-methyltransferase subunit B. The N5-methyltetrahydromethanopterin:coenzyme M methyltransferase (EC:2.1.1.86) of Methanosarcina mazei Go1 is a membrane-associated, corrinoid-containing protein that uses a transmethylation reaction to drive an energy-conserving sodium ion pump. 45341 pfam05441: Microvirus A* protein. This family contains several microvirus A* proteins. The A* protein binds to double stranded DNA and prevents its hydrolysis by nucleases. 45342 pfam05442: Microvirus A protein. Microvirus A protein is a specific endonuclease that cleaves the viral strand of supertwisted, closed circular DNA at a unique site in the A gene. The A protein also causes relaxation of supertwisted DNA and forms a complex with viral DNA that has a discontinuity in gene A of the viral strand. The C terminal region of the sequence contains the cleavage site for A/A* protein. 45343 pfam05443: ROS/MUCR transcriptional regulator protein. This family consists of several ROS/MUCR transcriptional regulator proteins. The ros chromosomal gene is present in octopine and nopaline strains of Agrobacterium tumefaciens as well as in Rhizobium meliloti. This gene encodes a 15.5-kDa protein that specifically represses the virC and virD operons in the virulence region of the Ti plasmid and is necessary for succinoglycan production. Sinorhizobium meliloti can produce two types of acidic exopolysaccharides, succinoglycan and galactoglucan, that are interchangeable for infection of alfalfa nodules. MucR from Sinorhizobium meliloti acts as a transcriptional repressor that blocks the expression of the exp genes responsible for galactoglucan production, thereby allowing the exclusive production of succinoglycan. 45344 pfam05444: Protein of unknown function (DUF753). 
This family contains sequences which are repeated in several uncharacterised proteins from Drosophila melanogaster. 45345 pfam05445: Poxvirus serine/threonine protein kinase. 45346 pfam05446: Trypanosoma brucei ESAG 6/7 protein. Trypanosoma brucei escapes destruction by the host immune system by regularly replacing its Variant Surface Glycoprotein (VSG) coat. The VSG is expressed in a VSG expression site, together with expression site associated gene (ESAG) 6 and 7, encoding the heterodimeric transferrin receptor (Tf-R). There are around 20 VSG expression sites, and trypanosomes can change the site that is active. Since ESAG6 and 7 in different expression sites differ somewhat in sequence, expression site switching results in the production of a slightly different Tf-R. 45347 pfam05447: Copper response defect 1 (CRD1). This family contains several copper response defect 1 (CRD1) homologues from various phototrophic organisms. CRD1 is required for the maintenance of photosystem I and its associated light-harvesting complexes in copper-deficient (-Cu) and oxygen-deficient (-O(2)) Chlamydomonas reinhardtii cells and is localised to the thylakoid membrane. The family also contains the Rubrivivax gelatinosus AcsF protein. 45348 pfam05448: Acetyl xylan esterase (AXE1). This family consists of several bacterial acetyl xylan esterase proteins. Acetyl xylan esterases are enzymes that hydrolyse the ester linkages of the acetyl groups in position 2 and/or 3 of the xylose moieties of natural acetylated xylan from hardwood. These enzymes are one of the accessory enzymes which are part of the xylanolytic system, together with xylanases, beta-xylosidases, alpha-arabinofuranosidases and methylglucuronidases; these are all required for the complete hydrolysis of xylan. 45349 pfam05449: Protein of unknown function (DUF754). This domain appears to be found in a group of prophage proteins. 45350 pfam05450: Nicastrin. 
Nicastrin and presenilin are two major components of the gamma-secretase complex, which executes the intramembrane proteolysis of type I integral membrane proteins such as the amyloid precursor protein (APP) and Notch. Nicastrin is synthesised in fibroblasts and neurons as an endoglycosidase-H-sensitive glycosylated precursor protein (immature nicastrin) and is then modified by complex glycosylation in the Golgi apparatus and by sialylation in the trans-Golgi network (mature nicastrin). 45351 pfam05451: Phytoreovirus nonstructural protein Pns10/11. This family consists of Phytoreovirus nonstructural proteins Pns10 and Pns11. Genome segment S11 of rice gall dwarf virus (RGDV), a member of Phytoreovirus, encodes a putative protein of 40 kDa that exhibits approximately 37% homology at the amino acid level to the nonstructural proteins Pns10 of rice dwarf and wound tumour viruses, which are other members of Phytoreovirus. 45352 pfam05452: Clavanin. This family consists of clavanin proteins from the haemocytes of the invertebrate Styela clava, a solitary tunicate. The family is made up of four alpha-helical antimicrobial peptides, clavanins A, B, C and D. The tunicate peptides resemble magainins in size, primary sequence and antibacterial activity. Synthetic clavanin A displays comparable antimicrobial activity to magainins and cecropins. The presence of alpha-helical antimicrobial peptides in the haemocytes of a urochordate suggests that such peptides are primeval effectors of innate immunity in the vertebrate lineage. 45353 pfam05453: toxin 6. This family consists of several scorpion toxins which act by blocking small conductance calcium activated potassium ion channels in their victim. 45354 pfam05454: Dystroglycan (Dystrophin-associated glycoprotein 1). Dystroglycan is one of the dystrophin-associated glycoproteins, which is encoded by a 5.5 kb transcript in human. 
The protein product is cleaved into two non-covalently associated subunits, [alpha] (N-terminal) and [beta] (C-terminal). In skeletal muscle the dystroglycan complex works as a transmembrane linkage between the extracellular matrix and the cytoskeleton. [alpha]-dystroglycan is extracellular and binds to merosin ([alpha]-2 laminin) in the basement membrane, while [beta]-dystroglycan is a transmembrane protein and binds to dystrophin, which is a large rod-like cytoskeletal protein, absent in Duchenne muscular dystrophy patients. Dystrophin binds to intracellular actin cables. In this way, the dystroglycan complex, which links the extracellular matrix to the intracellular actin cables, is thought to provide structural integrity in muscle tissues. The dystroglycan complex is also known to serve as an agrin receptor in muscle, where it may regulate agrin-induced acetylcholine receptor clustering at the neuromuscular junction. There is also evidence which suggests the function of dystroglycan as a part of the signal transduction pathway because it is shown that Grb2, a mediator of the Ras-related signal pathway, can interact with the cytoplasmic domain of dystroglycan. In general, aberrant expression of dystrophin-associated protein complex underlies the pathogenesis of Duchenne muscular dystrophy, Becker muscular dystrophy and severe childhood autosomal recessive muscular dystrophy. Interestingly, no genetic disease has been described for either [alpha]- or [beta]-dystroglycan. Dystroglycan is widely distributed in non-muscle tissues as well as in muscle tissues. During epithelial morphogenesis of kidney, the dystroglycan complex is shown to act as a receptor for the basement membrane. Dystroglycan expression in mouse brain and neural retina has also been reported. However, the physiological role of dystroglycan in non-muscle tissues has remained unclear. 45355 pfam05455: GvpH. 
This family consists of archaeal GvpH proteins which are thought to be involved in gas vesicle synthesis. 45356 pfam05456: Eukaryotic translation initiation factor 4E binding protein (EIF4EBP). This family consists of several eukaryotic translation initiation factor 4E binding proteins (EIF4EBP1, 2 and 3). Translation initiation in eukaryotes is mediated by the cap structure (m7GpppN, where N is any nucleotide) present at the 5' end of all cellular mRNAs, except organellar. The cap is recognised by eukaryotic initiation factor 4F (eIF4F), which consists of three polypeptides, including eIF4E, the cap-binding protein subunit. The interaction of the cap with eIF4E facilitates the binding of the ribosome to the mRNA. eIF4E activity is regulated in part by translational repressors, 4E-BP1, 4E-BP2 and 4E-BP3, which bind to it and prevent its assembly into eIF4F. 45357 pfam05457: Sulfolobus transposase. 45358 pfam05458: Cd27 binding protein (Siva). Siva binds to the CD27 cytoplasmic tail. It has a DD homology region, a box-B-like ring finger, and a zinc finger-like domain. Overexpression of Siva in various cell lines induces apoptosis, suggesting an important role for Siva in the CD27-transduced apoptotic pathway. Siva-1 binds to and inhibits BCL-X(L)-mediated protection against UV radiation-induced apoptosis. Indeed, the unique amphipathic helical region (SAH) present in Siva-1 is required for its binding to BCL-X(L) and sensitising cells to UV radiation. Natural complexes of Siva-1/BCL-X(L) are detected in HUT78 and murine thymocytes, suggesting a potential role for Siva-1 in regulating T cell homeostasis. This family contains both Siva-1 and the shorter Siva-2 lacking the sequence coded by exon 2. It has been suggested that Siva-2 could regulate the function of Siva-1. 45359 pfam05459: Herpesvirus transcriptional regulator family. This family includes UL69 and IE63 that are transcriptional regulator proteins. 45360 pfam05460: Origin recognition complex subunit 6 (ORC6). 
This family consists of several eukaryotic origin recognition complex subunit 6 (ORC6) proteins. All DNA replication initiation is driven by a single conserved eukaryotic initiator complex termed the origin recognition complex (ORC). The ORC is a six protein complex. The function of ORC is reviewed in. 45361 pfam05461: Apolipoprotein L. Apo L belongs to the high density lipoprotein family that plays a central role in cholesterol transport. The cholesterol content of membranes is important in cellular processes such as modulating gene transcription and signal transduction both in the adult brain and during neurodevelopment. There are six apo L genes located in close proximity to each other on chromosome 22q12 in humans. 22q12 is a confirmed high-susceptibility locus for schizophrenia and close to the region associated with velocardiofacial syndrome that includes symptoms of schizophrenia. 45362 pfam05462: Slime mold cyclic AMP receptor. This family consists of cyclic AMP receptor (CAR) proteins from slime molds. CAR proteins are responsible for controlling development in Dictyostelium discoideum. 45363 pfam05463: Sclerostin (SOST). This family contains several mammalian sclerostin (SOST) proteins. SOST is thought to suppress bone formation. Mutations of the SOST gene lead to sclerosteosis, a progressive sclerosing bone dysplasia with an autosomal recessive mode of inheritance. Radiologically, it is characterised by a generalised hyperostosis and sclerosis leading to a markedly thickened and sclerotic skull, with mandible, ribs, clavicles and all long bones also being affected. Due to narrowing of the foramina of the cranial nerves, facial nerve palsy, hearing loss and atrophy of the optic nerves can occur. Sclerosteosis is clinically and radiologically very similar to van Buchem disease, mainly differentiated by hand malformations and a large stature in sclerosteosis patients. 45364 pfam05464: Phi-29-like late genes activator (early protein GP4). 
This family consists of phi-29-like late genes activator (or early protein GP4). This protein is thought to be a positive regulator of late transcription and may function as a sigma like component of the host RNA polymerase. 45365 pfam05465: Halobacterial gas vesicle protein C (GVPC). This family consists of Halobacterium gas vesicle protein C sequences which are thought to confer stability to the gas vesicle membranes. 45366 pfam05466: Brain acid soluble protein 1 (BASP1 protein). This family consists of several brain acid soluble protein 1 (BASP1) or neuronal axonal membrane protein NAP-22. The BASP1 is a neuron enriched Ca(2+)-dependent calmodulin-binding protein of unknown function. 45367 pfam05467: Herpesvirus glycoprotein U47. 45368 pfam05468: Bacillus ATP synthase I. 45369 pfam05469: Eukaryotic translation initiation factor 3 subunit 8 C-terminus (eIF3c_C). The largest of the mammalian translation initiation factors, eIF3, consists of at least eight subunits ranging in mass from 35 to 170 kDa. eIF3 binds to the 40 S ribosome in an early step of translation initiation and promotes the binding of methionyl-tRNAi and mRNA. 45370 pfam05470: Eukaryotic translation initiation factor 3 subunit 8 N-terminus (eIF3c_N). The largest of the mammalian translation initiation factors, eIF3, consists of at least eight subunits ranging in mass from 35 to 170 kDa. eIF3 binds to the 40 S ribosome in an early step of translation initiation and promotes the binding of methionyl-tRNAi and mRNA. 45371 pfam05471: Podocalyxin. This family consists of several eukaryotic podocalyxin proteins. Podocalyxin is a major membrane protein of the glomerular epithelium and is thought to be involved in maintenance of the architecture of the foot processes and filtration slits characteristic of this unique epithelium by virtue of its high negative charge. 
Podocalyxin functions as an anti-adhesin that maintains an open filtration pathway between neighbouring foot processes in the glomerular epithelium by charge repulsion. 45372 pfam05472: DNA replication terminus site-binding protein (Ter protein). This family contains several bacterial Ter proteins. The Ter protein specifically binds to DNA replication terminus sites on the host and plasmid genome and then blocks progress of the DNA replication fork. 45373 pfam05473: Herpes simplex UL45 protein. This family consists of several UL45 proteins specifically found in the herpes simplex virus family. The herpes simplex virus UL45 gene encodes an 18 kDa virion envelope protein whose function remains unknown. It has been suggested that the 18 kDa UL45 gene product is required for efficient growth in the central nervous system at low doses and may play an important role under the conditions of a naturally acquired infection. 45374 pfam05474: Semenogelin. This family consists of several mammalian semenogelin (I and II) proteins. Freshly ejaculated human semen has the appearance of a loose gel in which the predominant structural protein components are the seminal vesicle secreted semenogelins (Sg). 45375 pfam05475: Chlamydia virulence protein PGP3-D. This family consists of Chlamydia virulence proteins which are thought to be required for growth within mammalian cells. 45376 pfam05476: PET122. The nuclear PET122 gene of S. cerevisiae encodes a mitochondrial-localised protein that activates initiation of translation of the mitochondrial mRNA from the COX3 gene, which encodes subunit III of cytochrome c oxidase. 45377 pfam05477: Surfeit locus protein 2 (SURF2). Surfeit locus protein 2 is part of a group of at least six sequence unrelated genes (Surf-1 to Surf-6). The six Surfeit genes have been classified as housekeeping genes, being expressed in all tissue types tested and not containing a TATA box in their promoter region. The exact function of SURF2 is unknown. 
45378 pfam05478: Prominin. The prominins are an emerging family of proteins that among the multispan membrane proteins display a novel topology. Mouse prominin and human prominin (mouse)-like 1 (PROML1) are predicted to contain five membrane spanning domains, with an N-terminal domain exposed to the extracellular space followed by four, alternating small cytoplasmic and large extracellular, loops and a cytoplasmic C-terminal domain. The exact function of prominin is unknown although in humans defects in PROM1, the gene coding for prominin, cause retinal degeneration. 45379 pfam05479: Photosystem I reaction centre subunit N (PSAN or PSI-N). This family contains several Photosystem I reaction centre subunit N (PSI-N) proteins. The protein has no known function although it is localised in the thylakoid lumen. PSI-N is a small extrinsic subunit at the lumen side and is very likely involved in the docking of plastocyanin. 45380 pfam05480: Staphylococcus haemolytic protein. This family consists of several different short Staphylococcal proteins; it contains SLUSH A, B and C proteins as well as haemolysin and gonococcal growth inhibitor. Some strains of the coagulase-negative Staphylococcus lugdunensis produce a synergistic hemolytic activity (SLUSH), phenotypically similar to the delta-hemolysin of S. aureus. Gonococcal growth inhibitor from Staphylococcus acts on the cytoplasmic membrane of the gonococcal cell causing cytoplasmic leakage and, eventually, death. 45381 pfam05481: Mycobacterium 19 kDa lipoprotein antigen. Most of the antigens of Mycobacterium leprae and M. tuberculosis that have been identified are members of stress protein families, which are highly conserved throughout many diverse species. Of the M. leprae and M. tuberculosis antigens identified by monoclonal antibodies, all except the 18-kDa M. leprae antigen and the 19-kDa M. tuberculosis antigen are strongly cross-reactive between these two species and are coded within very similar genes. 
45382 pfam05482: Serendipity locus alpha protein (SRY-A). The Drosophila serendipity alpha (sry alpha) gene is specifically transcribed at the blastoderm stage, from nuclear cycle 11 to the onset of gastrulation, in all somatic nuclei. SRY-A is required for the cellularisation of the embryo and is involved in the localisation of the actin filaments just prior to and during plasma membrane invagination. 45383 pfam05483: Synaptonemal complex protein 1 (SCP-1). Synaptonemal complex protein 1 (SCP-1) is the major component of the transverse filaments of the synaptonemal complex. Synaptonemal complexes are structures that are formed between homologous chromosomes during meiotic prophase. 45384 pfam05484: LRV protein FeS4 cluster. This iron sulphur cluster is found at the N-terminus of some proteins containing pfam01816 repeats. 45385 pfam05485: THAP domain. This THAP domain is a putative DNA-binding domain with a C2CH architecture that probably binds a zinc ion. 45386 pfam05486: Signal recognition particle 9 kDa protein (SRP9). This family consists of several eukaryotic SRP9 proteins. SRP9 together with the Alu-homologous region of 7SL RNA and SRP14 comprise the "Alu domain" of SRP, which mediates pausing of synthesis of ribosome associated nascent polypeptides that have been engaged by the targeting domain of SRP. 45387 pfam05488: PAAR motif. This motif is found usually in pairs in a family of bacterial membrane proteins. It is also found as a triplet of tandem repeats comprising the entire length in another family of hypothetical proteins. 45388 pfam05489: Phage Tail Protein X. This domain is found in a family of phage tail proteins. Visual analysis suggests that it is related to pfam01476 (personal obs: C Yeats). The functional annotation of family members further confirms this hypothesis. 45389 pfam05491: Holliday junction DNA helicase ruvB C-terminus. 
The RuvB protein makes up part of the RuvABC revolvasome which catalyses the resolution of Holliday junctions that arise during genetic recombination and DNA repair. Branch migration is catalysed by the RuvB protein that is targeted to the Holliday junction by the structure specific RuvA protein. This family consists of the C-terminal region of the RuvB protein which is thought to be the helicase DNA-binding domain. 45390 pfam05492: NAF1 domain. This domain is involved in snoRNP biogenesis. 45391 pfam05493: ATP synthase subunit H. ATP synthase subunit H is an extremely hydrophobic protein of approximately 9 kDa. This subunit may be required for assembly of vacuolar ATPase. 45392 pfam05494: Toluene tolerance, Ttg2. Toluene tolerance is mediated by increased cell membrane rigidity resulting from changes in fatty acid and phospholipid compositions, exclusion of toluene from the cell membrane, and removal of intracellular toluene by degradation. Many proteins are involved in these processes. This family is a transporter which shows similarity to ABC transporters. 45393 pfam05495: CHY zinc finger. Domains in this family are likely to bind zinc ions. They contain many conserved cysteine and histidine residues. We have named this domain after the N-terminal motif CXHY. This domain can be found in isolation in some proteins, but is also often associated with pfam00097. 45394 pfam05496: Holliday junction DNA helicase ruvB N-terminus. The RuvB protein makes up part of the RuvABC revolvasome which catalyses the resolution of Holliday junctions that arise during genetic recombination and DNA repair. Branch migration is catalysed by the RuvB protein that is targeted to the Holliday junction by the structure specific RuvA protein. This family contains the N-terminal region of the protein. 45395 pfam05497: Destabilase. 
Destabilase is an endo-epsilon(gamma-Glu)-Lys isopeptidase, which cleaves isopeptide bonds formed by transglutaminase (Factor XIIIa) between glutamine gamma-carboxamide and the epsilon-amino group of lysine. 45396 pfam05498: Rapid ALkalinization Factor (RALF). RALF, a 5-kDa ubiquitous polypeptide in plants, arrests root growth and development. 45397 pfam05499: DNA methyltransferase 1-associated protein 1 (DMAP1). DNA methylation can contribute to transcriptional silencing through several transcriptionally repressive complexes, which include methyl-CpG binding domain proteins (MBDs) and histone deacetylases (HDACs). The chief enzyme that maintains mammalian DNA methylation, DNMT1, can also establish a repressive transcription complex. The non-catalytic amino terminus of DNMT1 binds to HDAC2 and DMAP1 (for DNMT1 associated protein), and can mediate transcriptional repression. DMAP1 has intrinsic transcription repressive activity, and binds to the transcriptional co-repressor TSG101. DMAP1 is targeted to replication foci through interaction with the far N terminus of DNMT1 throughout S phase, whereas HDAC2 joins DNMT1 and DMAP1 only during late S phase, providing a platform for how histones may become deacetylated in heterochromatin following replication. 45398 pfam05500: NisC-like family. Lantibiotics are peptide-derived, post-translationally modified antimicrobials produced by several bacterial strains. Lantibiotic peptides contain thioether bridges termed lanthionines that are putatively generated by dehydration of Ser and Thr residues followed by addition of cysteine residues within the peptide. This family of proteins catalyse the formation of the thioether ring from the cysteines. 45399 pfam05501: Domain of unknown function (DUF755). This family is predominated by ORFs from Circoviridae. The function of this family remains to be determined. 45400 pfam05502: Dynactin p62 family. 
Dynactin is a multi-subunit complex and a required cofactor for most, or all, of the cellular processes powered by the microtubule-based motor cytoplasmic dynein. p62 binds directly to the Arp1 subunit of dynactin. 45401 pfam05503: Poxvirus G7-like. 45402 pfam05504: Spore germination B3/ GerAC like, C-terminal. The GerAC protein of the Bacillus subtilis spore is required for the germination response to L-alanine. Members of this family are thought to be located in the inner spore membrane. Although the function of this family is unclear, they are likely to encode the components of the germination apparatus that respond directly to this germinant, mediating the spore's response. 45403 pfam05505: Ebola nucleoprotein. This family consists of Ebola and Marburg virus nucleoproteins. These proteins are responsible for encapsidation of genomic RNA. It has been found that nucleoprotein DNA vaccines can offer protection from the virus. 45404 pfam05506: Domain of unknown function (DUF756). This domain is found, normally as a tandem repeat, at the C-terminus of bacterial phospholipase C proteins. 45405 pfam05507: Microfibril-associated glycoprotein (MAGP). This family consists of several mammalian microfibril-associated glycoprotein (MAGP) 1 and 2 proteins. MAGP1 and 2 are components of elastic fibres. MAGP-1 has been proposed to bind a C-terminal region of tropoelastin, the soluble precursor of elastin. MAGP-2 was found to interact with fibrillin-1 and -2, as well as fibulin-1, another component of elastic fibres; this suggests that MAGP-2 may be important in the assembly of microfibrils. 45406 pfam05508: RanGTP-binding protein. The small Ras-like GTPase Ran plays an essential role in the transport of macromolecules in and out of the nucleus and has been implicated in spindle and nuclear envelope formation during mitosis in higher eukaryotes. The S. cerevisiae ORF YGL164c encoding a novel RanGTP-binding protein, termed Yrb30p was identified. 
The protein competes with yeast RanBP1 (Yrb1p) for binding to the GTP-bound form of yeast Ran (Gsp1p) and is, like Yrb1p, able to form trimeric complexes with RanGTP and some of the karyopherins. 45407 pfam05509: TraY family. This family consists of several enterobacterial TraY proteins. TraY is involved in bacterial conjugation where it is required for efficient nick formation in the F plasmid. 45408 pfam05510: Sarcoglycan alpha/epsilon. Sarcoglycans are a subcomplex of transmembrane proteins which are part of the dystrophin-glycoprotein complex. They are expressed in the skeletal, cardiac and smooth muscle. Although numerous studies have been conducted on the sarcoglycan subcomplex in skeletal and cardiac muscle, the manner of the distribution and localisation of these proteins along the nonjunctional sarcolemma is not clear. This family contains alpha and epsilon members. 45409 pfam05511: Mitochondrial ATP synthase coupling factor 6. Coupling factor 6 (F6) is a component of mitochondrial ATP synthase which is required for the interactions of the catalytic and proton-translocating segments. 45410 pfam05512: AWPM-19-like family. Members of this family are 19 kDa membrane proteins. The levels of the plant protein AWPM-19 increase dramatically when there is an increased level of abscisic acid. The increased presence of this protein leads to greater tolerance of freezing. 45411 pfam05513: TraA. Conjugative transfer of a bacteriocin plasmid, pPD1, of Enterococcus faecalis is induced in response to a peptide sex pheromone, cPD1, secreted from plasmid-free recipient cells. cPD1 is taken up by a pPD1 donor cell and binds to an intracellular receptor, TraA. Once a recipient cell acquires pPD1, it starts to produce an inhibitor of cPD1, termed iPD1, which functions as a TraA antagonist and blocks self-induction in donor cells. TraA transduces the signal of cPD1 to the mating response. 45412 pfam05514: HR-like lesion-inducing. 
Family of plant proteins that are associated with the hypersensitive response (HR) pathway of defence against plant pathogens. 45413 pfam05515: Viral nucleic acid binding. This family is common to ssRNA positive-strand viruses, and its members are commonly described as nucleic acid binding proteins (NABP). 45414 pfam05516: Domain of Unknown Function (DUF757). Family of eukaryotic proteins with undetermined function. 45415 pfam05517: p25-alpha. This family encodes a 25 kDa protein that is phosphorylated by a Ser/Thr-Pro kinase. It has been described as a brain specific protein, but it is found in Tetrahymena thermophila. 45416 pfam05518: Totivirus coat protein. 45417 pfam05519: Merozoite surface protein 4/5 (MSP4/5). This family consists of Merozoite surface proteins 4 and 5 (MSP4/5). MSP4 is a protein with apparent molecular mass of 40 kDa that is synthesised by mature stage parasites and anchored to the merozoite membrane by a glycosylphosphatidylinositol moiety. MSP4 is immunogenic in both laboratory animals and during natural infection. Antibodies raised to this protein can inhibit parasite growth in vitro. Its homologue in the rodent malaria species Plasmodium yoelii, PyMSP4/5, is capable of conferring significant protection against lethal challenge in mice. All of these suggest that MSP4 is a candidate for inclusion in an effective asexual-stage malaria vaccine. 45418 pfam05520: Citrus tristeza virus P18 protein. 45419 pfam05521: Phage head-tail joining protein. 45420 pfam05522: Metallothionein. This family consists of metallothioneins from several worm and sea urchin species. Metallothioneins are low molecular weight, cysteine rich proteins known to be involved in heavy metal detoxification and homeostasis. 45421 pfam05523: WxcM-like, C-terminal. 45422 pfam05524: PEP-utilising enzyme, N-terminal. 45423 pfam05525: Branched-chain amino acid transport protein. 
This family consists of several bacterial branched-chain amino acid transport proteins which are responsible for the transport of leucine, isoleucine and valine via proton motive force. 45424 pfam05526: Rhodococcus equi virulence-associated protein. This family consists of several virulence-associated proteins from Rhodococcus equi. Rhodococcus equi is an important pulmonary pathogen of foals and is increasingly isolated from pneumonic infections and other infections in human immunodeficiency virus (HIV)-infected patients. Isolates from foals possess a large virulence plasmid, varying in size from 80 to 90 kb. Isolates lacking the plasmid are avirulent to foals. Little is known about the function of the plasmid apart from its encoding virulence-associated surface proteins. 45425 pfam05527: Domain of unknown function (DUF758). Family of eukaryotic proteins with unknown function, which are induced by tumour necrosis factor. 45426 pfam05528: Coronavirus gene 5 protein. Infectious bronchitis virus (IBV), a member of Coronaviridae family, has a single-stranded positive-sense RNA genome, which is 27 kb in length. Gene 5 contains two (5a and 5b) open reading frames. The function of the 5a and 5b proteins is unknown. 45427 pfam05529: B-cell receptor-associated protein 31-like. Bap31 is a polytopic integral protein of the endoplasmic reticulum membrane and a substrate of caspase-8. Bap31 is cleaved within its cytosolic domain, generating pro-apoptotic p20 Bap31. 45428 pfam05531: Nucleopolyhedrovirus P10 protein. This family consists of several nucleopolyhedrovirus P10 proteins which are thought to be involved in the morphogenesis of the polyhedra. 45429 pfam05532: CsbD-like. CsbD is a bacterial general stress response protein. Its expression is mediated by sigma-B, an alternative sigma factor. The role of CsbD in stress response is unclear. 45430 pfam05533: Beet yellows virus-type papain-like endopeptidase C42. 
Members of the Closteroviridae and Potyviridae families of plant positive-strand RNA viruses encode one or two papain-like leader proteinases, belonging to Merops peptidase family C42. 45431 pfam05534: HicB family. This family consists of several bacterial HicB related proteins. The function of HicB is unknown although it is thought to be involved in pilus formation. It has been speculated that HicB performs a function antagonistic to that of pili and yet is necessary for invasion of certain niches. 45432 pfam05535: Chromadorea ALT protein. This family consists of several ALT protein homologues found in nematodes. Lymphatic filariasis is a major tropical disease caused by the mosquito borne nematodes Brugia and Wuchereria. About 120 million people are infected and at risk of lymphatic pathology such as acute lymphangitis and elephantiasis. Expression of alt-1 and alt-2 is initiated midway through development in the mosquito, peaking in the infective larva and declining sharply following entry into the host. ALT-1 and the closely related ALT-2 have been found to be strong candidates for a future vaccine against human filariasis. 45433 pfam05536: Neurochondrin. This family contains several eukaryotic neurochondrin proteins. Neurochondrin induces hydroxyapatite resorptive activity in bone marrow cells resistant to bafilomycin A1, an inhibitor of macrophage- and osteoclast-mediated resorption. Expression of the gene is localised to chondrocyte, osteoblast, and osteocyte in the bone and to the hippocampus and Purkinje cell layer of cerebellum in the brain. 45434 pfam05537: Borrelia burgdorferi protein of unknown function (DUF759). This family consists of several uncharacterised proteins from the Lyme disease spirochete Borrelia burgdorferi. 45435 pfam05538: Campylobacter major outer membrane protein. This family consists of Campylobacter major outer membrane proteins. 
The major outer membrane protein (MOMP), a putative porin and a multifunction surface protein of Campylobacter jejuni, may play an important role in the adaptation of the organism to various host environments. 45436 pfam05539: Pneumovirinae attachment membrane glycoprotein G. 45437 pfam05540: Serpulina hyodysenteriae variable surface protein. This family consists of several variable surface proteins from Serpulina hyodysenteriae. 45438 pfam05541: Entomopoxvirus spheroidin protein. Entomopoxviruses (EPVs) are large (300-400 nm) oval-shaped viruses replicating in the cytoplasm of their insect host cells. At the end of their replicative cycle EPV virions are occluded in a highly expressed protein called spheroidin. This protein forms large (5-20 µm long) oval-shaped occlusion bodies (OBs) called spherules. The infectious cycle of EPVs begins with the ingestion by the insect host of the spherules, their dissolution by the alkaline reducing conditions of the midgut fluid and the release of virions in the midgut lumen. The infective particles first replicate in midgut epithelial cells, then pass the gut barrier to colonise the internal tissues, mainly the fat body cells. Whilst spheroidin has been demonstrated to be non-essential for viral replication, it plays an essential role in the natural biological cycle of the virus in protecting virions from adverse environmental conditions (e.g. UV degradation) and thus improving transmission efficacy. In this respect, spheroidins are functionally similar to polyhedrins of baculoviruses or cypoviruses. 45439 pfam05542: Protein of unknown function (DUF760). This family contains several uncharacterised plant proteins. 45440 pfam05543: Staphopain peptidase C47. Staphopains are one of four major families of proteinases secreted by the Gram-positive Staphylococcus aureus. These staphylococcal cysteine proteases are secreted as preproenzymes that are proteolytically cleaved to generate the mature enzyme. 
45441 pfam05544: Proline racemase. This family consists of proline racemase (EC 5.1.1.4) proteins which catalyse the interconversion of L- and D-proline in bacteria. This family also contains several similar eukaryotic proteins including a sequence with B-cell mitogenic properties, which has been characterised as a co-factor-independent proline racemase. 45442 pfam05545: Cbb3-type cytochrome oxidase component FixQ. This family consists of several Cbb3-type cytochrome oxidase components (FixQ/CcoQ). FixQ is found in nitrogen fixing bacteria. Since nitrogen fixation is an energy-consuming process, effective symbioses depend on operation of a respiratory chain with a high affinity for O2, closely coupled to ATP production. This requirement is fulfilled by a special three-subunit terminal oxidase (cytochrome terminal oxidase cbb3), which was first identified in Bradyrhizobium japonicum as the product of the fixNOQP operon. 45443 pfam05546: She9 / Mdm33 family. Members of this family are mitochondrial inner membrane proteins with a role in inner mitochondrial membrane organisation and biogenesis. 45444 pfam05547: Immune inhibitor A peptidase M6. The insect pathogenic Gram-positive Bacillus thuringiensis secretes immune inhibitor A, a metallopeptidase, which specifically cleaves host antibacterial proteins. A homologue of immune inhibitor A, PrtV, has been identified in the Gram-negative human pathogen Vibrio cholerae. 45445 pfam05548: Gametolysin peptidase M11. In the unicellular biflagellated alga, Chlamydomonas reinhardtii, gametolysin, a zinc-containing metallo-protease, is responsible for the degradation of the cell wall. Homologues of gametolysin have also been reported in the simple multicellular organism, Volvox. 45446 pfam05549: Allexivirus 40kDa protein. 45447 pfam05550: Pestivirus Npro endopeptidase C53. 
Unique to pestiviruses, the N-terminal protein encoded by the bovine viral diarrhoea virus genome is a cysteine protease (Npro) responsible for a self-cleavage that releases the N terminus of the core protein. This unique protease is dispensable for viral replication, and its coding region can be replaced by a ubiquitin gene directly fused in frame to the core. 45448 pfam05551: Protein of unknown function (DUF1519). This family consists of several putative homing endonuclease proteins of around 245 residues in length which appear to be found exclusively in Naegleria species. The function of this family is unclear. 45449 pfam05552: Conserved TM helix. This alignment represents a conserved transmembrane helix as well as some flanking sequence. It is often found in association with pfam00924. 45450 pfam05553: Cotton fibre expressed protein. This family consists of several plant proteins of unknown function. Three of the sequences (from Gossypium hirsutum) in this family are described as cotton fibre expressed proteins. The remaining sequences, found in Arabidopsis thaliana, are uncharacterised. 45451 pfam05554: Viral hemorrhagic septicemia virus non-virion protein. This family consists of several viral hemorrhagic septicemia virus non-virion (NV) proteins. The NV protein is a nonstructural protein absent from mature virions although it is present in infected cells. The function of this protein is unknown. 45452 pfam05555: Coxiella burnetii protein of unknown function (DUF762). This family consists of several uncharacterised proteins from the bacterium Coxiella burnetii. Coxiella burnetii is the causative agent of the Q fever disease. 45453 pfam05556: Calcineurin-binding protein (Calsarcin). This family consists of several mammalian calcineurin-binding proteins. 
The calcium- and calmodulin-dependent protein phosphatase calcineurin has been implicated in the transduction of signals that control the hypertrophy of cardiac muscle and slow fibre gene expression in skeletal muscle. Calsarcin-1 and calsarcin-2 are expressed in developing cardiac and skeletal muscle during embryogenesis, but calsarcin-1 is expressed specifically in adult cardiac and slow-twitch skeletal muscle, whereas calsarcin-2 is restricted to fast skeletal muscle. Calsarcins represent a novel family of sarcomeric proteins that link calcineurin with the contractile apparatus, thereby potentially coupling muscle activity to calcineurin activation. Calsarcin-3 is expressed specifically in skeletal muscle and is enriched in fast-twitch muscle fibres. Like calsarcin-1 and calsarcin-2, calsarcin-3 interacts with calcineurin, and the Z-disc proteins alpha-actinin, gamma-filamin, and telethonin. 45454 pfam05557: Mitotic checkpoint protein. This family consists of several eukaryotic mitotic checkpoint (Mitotic arrest deficient or MAD) proteins. The mitotic spindle checkpoint monitors proper attachment of the bipolar spindle to the kinetochores of aligned sister chromatids and causes a cell cycle arrest in prometaphase when failures occur. Multiple components of the mitotic spindle checkpoint have been identified in yeast and higher eukaryotes. In S. cerevisiae, the existence of a Mad1-dependent complex containing Mad2, Mad3, Bub3 and Cdc20 has been demonstrated. 45455 pfam05558: DREPP plasma membrane polypeptide. This family contains several plant plasma membrane proteins termed DREPPs as they are developmentally regulated plasma membrane polypeptides. 45456 pfam05559: Protein of unknown function (DUF763). This family consists of several uncharacterised bacterial and archaeal proteins of unknown function. 45457 pfam05560: Bacillus thuringiensis P21 molecular chaperone protein. This family contains several Bacillus thuringiensis P21 proteins. 
These proteins are thought to be molecular chaperones and have mosquitocidal properties. 45458 pfam05561: Borrelia burgdorferi protein of unknown function (DUF764). This family consists of proteins of unknown function from Borrelia burgdorferi (Lyme disease spirochete). 45459 pfam05562: Cold acclimation protein WCOR413. This family consists of several WCOR413-like plant cold acclimation proteins. 45460 pfam05563: Salmonella plasmid virulence protein SpvD. This family consists of several SpvD plasmid virulence proteins from different Salmonella species. 45461 pfam05564: Dormancy/auxin associated protein. This family contains several plant dormancy-associated and auxin-repressed proteins, the function of which is poorly understood. 45462 pfam05565: Siphovirus Gp157. This family contains both viral and bacterial proteins which are related to the Gp157 protein of the Streptococcus thermophilus SFi bacteriophages. It is thought that bacteria possessing the gene coding for this protein have an increased resistance to the bacteriophage. 45463 pfam05566: Orthopoxvirus interleukin 18 binding protein. Interleukin-18 (IL-18) is a proinflammatory cytokine that plays a key role in the activation of natural killer and T helper 1 cell responses principally by inducing interferon-gamma (IFN-gamma). Several poxvirus genes encode proteins with sequence similarity to IL-18BPs. It has been shown that vaccinia, ectromelia and cowpox viruses secrete from infected cells a soluble IL-18BP (vIL-18BP) that may modulate the host antiviral response. The expression of vIL-18BPs by distinct poxvirus genera that cause local or general viral dissemination, or persistent or acute infections in the host, emphasises the importance of IL-18 in response to viral infections. 45464 pfam05567: Neisseria PilC protein. This family consists of several PilC protein sequences from Neisseria gonorrhoeae and N. meningitidis. 
PilC is a phase-variable protein associated with pilus-mediated adherence of pathogenic Neisseria to target cells. 45465 pfam05568: African swine fever virus J13L protein. This family consists of several African swine fever virus J13L proteins. 45466 pfam05569: BlaR1 peptidase M56. Production of beta-Lactamase and penicillin-binding protein 2a (which mediate staphylococcal resistance to beta-lactam antibiotics) is regulated by a signal-transducing integral membrane protein and a transcriptional repressor. The signal transducer is a fusion protein with penicillin-binding and zinc metalloprotease domains. The signal for protein expression is transmitted by site-specific proteolytic cleavage of both the transducer, which auto-activates, and the repressor, which is inactivated, unblocking gene transcription. Homologues to this peptidase domain, which corresponds to Merops family M56, are also found in a number of other bacterial genome sequences. 45467 pfam05570: Circovirus protein of unknown function (DUF765). This family consists of several short (27-30aa) porcine and bovine circovirus ORF6 proteins of unknown function. 45468 pfam05571: Protein of unknown function (DUF766). This family consists of several eukaryotic proteins of unknown function. 45469 pfam05572: Pregnancy-associated plasma protein-A. Pregnancy-associated plasma protein A (PAPP-A) is a metallo-protease belonging to Merops family M43. It cleaves insulin-like growth factor (IGF) binding protein-4 (IGFBP-4), causing a dramatic reduction in its affinity for IGF-I and -II. Through this mechanism, PAPP-A is a regulator of IGF bioactivity in several systems, including the human ovary and the cardiovascular system. 45470 pfam05573: NosL. NosL is one of the accessory proteins of the nos (nitrous oxide reductase) gene cluster. NosL is a monomeric protein of 18,540 MW that specifically and stoichiometrically binds Cu(I). 
The copper ion in NosL is ligated by a Cys residue, and one Met and one His are thought to serve as the other ligands. It is possible that NosL is a copper chaperone involved in metallocenter assembly. 45471 pfam05574: Zincin metallopeptidase M47. The zincins are a superfamily of structurally-related zinc-binding metallopeptidases that play a major role in a wide range of biological processes including pattern formation, growth factor activation and extracellular matrix synthesis and degradation. 45472 pfam05575: Vibrio cholerae RfbT protein. This family consists of several RfbT proteins from Vibrio cholerae. It has been found that genetic alteration of the rfbT gene is responsible for serotype conversion of Vibrio cholerae O1 and determines the difference between the Ogawa and Inaba serotypes, in that the presence of rfbT is sufficient for Inaba-to-Ogawa serotype conversion. 45473 pfam05576: PS-10 peptidase S37. These serine proteases have been found in Streptomyces species. 45474 pfam05577: Serine carboxypeptidase S28. These serine proteases include several eukaryotic enzymes such as lysosomal Pro-X carboxypeptidase, dipeptidyl-peptidase II, and thymus-specific serine peptidase. 45475 pfam05578: Pestivirus NS3 polyprotein peptidase S31. These serine peptidases are involved in processing of the flavivirus polyprotein. 45476 pfam05579: Equine arteritis virus serine endopeptidase S32. Serine peptidases involved in processing nidovirus polyprotein. 45477 pfam05580: SpoIVB peptidase S55. The protein SpoIVB plays a key role in signaling in the final sigma-K checkpoint of Bacillus subtilis. 45478 pfam05581: Putative chymotrypsin-like protease PrtB. Interestingly these peptidases are found fused to methyl-accepting sensor proteins in several Gram-negative bacteria. 45479 pfam05582: YabG peptidase U57. YabG is a protease involved in the proteolysis and maturation of SpoIVA and YrbA proteins, conserved with the cortex and/or coat assembly by Bacillus subtilis. 
45480 pfam05583: Albicidin resistance domain. This region is found in albicidin resistance proteins. Its boundaries were determined by its existence as a tandem repeat in Burkholderia pseudomallei protein BPSL2084. 45481 pfam05584: Sulfolobus plasmid regulatory protein. This family consists of several plasmid regulatory proteins from the extreme thermophilic and acidophilic archaea Sulfolobus. 45482 pfam05585: Tas retrotransposon peptidase A16. These peptidases of unknown function are found exclusively in nematodes. 45483 pfam05586: Anthrax receptor C-terminus region. This region is found in the putatively cytoplasmic C-terminus of the anthrax receptor. 45484 pfam05587: Anthrax receptor extracellular domain. This region is found in the putatively extracellular N-terminal half of the anthrax receptor. It is probably part of the Ig superfamily and most closely related to pfam01833 (personal obs: C Yeats). 45485 pfam05588: Clostridium botulinum HA-17 protein. This family consists of several Clostridium botulinum hemagglutinin (HA) subcomponents. Clostridium botulinum type D strain 4947 produces two different sizes of progenitor toxins (M and L) as intact forms without proteolytic processing. The M toxin is composed of neurotoxin (NT) and nontoxic-nonhemagglutinin (NTNHA), whereas the L toxin is composed of the M toxin and hemagglutinin (HA) subcomponents (HA-70, HA-17, and HA-33). 45486 pfam05589: Protein of unknown function (DUF768). This family consists of several uncharacterised hypothetical proteins from Rhizobium loti. 45487 pfam05590: Xylella fastidiosa protein of unknown function (DUF769). This family consists of several uncharacterised hypothetical proteins of unknown function from Xylella fastidiosa, the organism that causes Pierce's disease in plants. 45488 pfam05591: Protein of unknown function (DUF770). This family consists of several proteins of unknown function from various bacterial species. 45489 pfam05592: Bacterial alpha-L-rhamnosidase. 
This family consists of bacterial rhamnosidase A and B enzymes. L-Rhamnose is abundant in biomass as a common constituent of glycolipids and glycosides, such as plant pigments, pectic polysaccharides, gums or biosurfactants. Some rhamnosides are important bioactive compounds. For example, terpenyl glycosides, the glycosidic precursor of aromatic terpenoids, act as important flavouring substances in grapes. Other rhamnosides act as cytotoxic rhamnosylated terpenoids, as signal substances in plants or play a role in the antigenicity of pathogenic bacteria. 45490 pfam05593: RHS Repeat. RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. 45491 pfam05594: Haemagluttinin repeat. This highly divergent repeat occurs in a number of proteins implicated in cell aggregation. The Pfam alignment probably contains three such repeats (personal obs: C Yeats). These are likely to have a beta-helical structure. 45492 pfam05595: Domain of unknown function (DUF771). Family of uncharacterised ORFs found in Bacteriophage and Lactococcus lactis. 45493 pfam05596: Taeniidae antigen. This family consists of several antigen proteins from Taenia and Echinococcus (tapeworm) species. 45494 pfam05597: Poly(hydroxyalcanoate) granule associated protein (phasin). Polyhydroxyalkanoates (PHAs) are storage polyesters synthesised by various bacteria as intracellular carbon and energy reserve material. PHAs are accumulated as water-insoluble inclusions within the cells. This family consists of the phasins PhaF and PhaI, which act as transcriptional regulators of PHA biosynthesis genes. PhaF has been proposed to repress expression of the phaC1 gene and the phaIF operon. 45495 pfam05598: Sulfolobus solfataricus protein of unknown function (DUF772). 
This family consists of several proteins from Sulfolobus solfataricus described as first ORF in transposon ISC1212. 45496 pfam05599: Deltaretrovirus Tax protein. This family consists of Rex/Tax proteins from human and simian T-cell leukaemia viruses. The exact function of these proteins is unknown. 45497 pfam05600: Protein of unknown function (DUF773). This family contains several eukaryotic sequences which are thought to be CDK5 activator-binding proteins; however, the function of this family is unknown. 45498 pfam05601: Protein of unknown function (DUF774). This family consists of several uncharacterised Actinomycete proteins of unknown function. 45499 pfam05602: Cleft lip and palate transmembrane protein 1 (CLPTM1). This family consists of several eukaryotic cleft lip and palate transmembrane protein 1 sequences. Cleft lip with or without cleft palate is a common birth defect that is genetically complex. The nonsyndromic forms have been studied genetically using linkage and candidate-gene association studies with only partial success in defining the loci responsible for orofacial clefting. CLPTM1 encodes a transmembrane protein and has strong homology to two Caenorhabditis elegans genes, suggesting that CLPTM1 may belong to a new gene family. This family also contains the human cisplatin resistance related protein CRR9p which is associated with CDDP-induced apoptosis. 45500 pfam05603: Protein of unknown function (DUF775). This family consists of several eukaryotic proteins of unknown function. 45501 pfam05604: Protein of unknown function (DUF776). This family consists of several highly related mouse and human proteins of unknown function. 45502 pfam05605: Drought induced 19 protein (Di19). This family consists of several drought induced 19 (Di19) like proteins. Di19 has been found to be strongly expressed in both the roots and leaves of Arabidopsis thaliana during progressive drought. The precise function of Di19 is unknown. 
45503 pfam05606: Borrelia burgdorferi protein of unknown function (DUF777). This family consists of several hypothetical proteins of unknown function from Borrelia burgdorferi (Lyme disease spirochete). 45504 pfam05607: Chlamydia inclusion membrane protein CT223. This family consists of several Chlamydia CT223 inclusion membrane proteins. 45505 pfam05608: Protein of unknown function (DUF778). This family consists of several eukaryotic proteins of unknown function. 45506 pfam05609: Lamina-associated polypeptide 1C (LAP1C). This family contains rat LAP1C proteins and several uncharacterised highly related sequences from both mice and humans. LAP1s (lamina-associated polypeptide 1s) are type 2 integral membrane proteins with a single membrane-spanning region of the inner nuclear membrane. LAP1s bind to both A- and B-type lamins and have a putative role in the membrane attachment and assembly of the nuclear lamina. 45507 pfam05610: Protein of unknown function (DUF779). This family consists of several bacterial proteins of unknown function. 45508 pfam05611: Caenorhabditis elegans protein of unknown function (DUF780). This family consists of several short C. elegans proteins of unknown function. 45509 pfam05612: Mouse protein of unknown function (DUF781). This family consists of uncharacterised mouse proteins of unknown function. 45510 pfam05613: Human herpesvirus U15 protein. 45511 pfam05614: Circovirus protein of unknown function (DUF782). This family consists of porcine and bovine circovirus proteins of unknown function. 45512 pfam05615: Protein of unknown function (DUF783). This family consists of several eukaryotic proteins of unknown function. 45513 pfam05616: Neisseria meningitidis TspB protein. This family consists of several Neisseria meningitidis TspB virulence factor proteins. 45514 pfam05617: Arabidopsis thaliana protein of unknown function (DUF784). This family consists of several proteins of unknown function found exclusively in Arabidopsis thaliana. 
45515 pfam05618: Protein of unknown function (DUF785). This family consists of several hypothetical proteins from different archaeal and bacterial species. 45516 pfam05619: Borrelia burgdorferi protein of unknown function (DUF787). This family consists of several hypothetical proteins of unknown function from Borrelia burgdorferi (Lyme disease spirochete). 45517 pfam05620: Protein of unknown function (DUF788). This family consists of several eukaryotic proteins of unknown function. 45518 pfam05621: Bacterial TniB protein. This family consists of several bacterial TniB NTP-binding proteins. TniB is a probable ATP-binding protein, which is involved in Tn5053 mercury resistance transposition. 45519 pfam05622: HOOK protein. This family consists of several HOOK1, 2 and 3 proteins from different eukaryotic organisms. The different members of the human gene family are HOOK1, HOOK2 and HOOK3. Different domains have been identified in the three human HOOK proteins, and it was demonstrated that the highly conserved NH2-domain mediates attachment to microtubules, whereas the central coiled-coil motif mediates homodimerisation and the more divergent C-terminal domains are involved in binding to specific organelles (organelle-binding domains). It has been demonstrated that endogenous HOOK3 binds to Golgi membranes, whereas both HOOK1 and HOOK2 are localised to discrete but unidentified cellular structures. In mice the Hook1 gene is predominantly expressed in the testis. Hook1 function is necessary for the correct positioning of microtubular structures within the haploid germ cell. Disruption of Hook1 function in mice causes abnormal sperm head shape and fragile attachment of the flagellum to the sperm head. 45520 pfam05623: Protein of unknown function (DUF789). This family consists of several plant proteins of unknown function. 45521 pfam05624: LISCH7. This family consists of mammalian LISCH7 protein homologues. LISCH7 is a liver-specific BHLH-ZIP transcription factor. 
45522 pfam05625: PAXNEB protein. PAXNEB or PAX6 neighbour is found in several eukaryotic organisms. The function of this protein is unknown. 45523 pfam05626: Protein of unknown function (DUF790). This family consists of several hypothetical archaeal proteins of unknown function. 45524 pfam05627: Nitrate-induced NOI protein. This family consists of several plant nitrate induced or NOI proteins. 45525 pfam05628: Borrelia membrane protein P13. This family consists of P13 proteins from Borrelia species. P13 is a 13 kDa integral membrane protein which is post-translationally processed at both ends and modified by an unknown mechanism. 45526 pfam05629: Nanovirus component 8 (C8) protein. This family consists of a group of 17.4 kDa nanovirus proteins which are highly related to the faba bean necrotic yellows virus component 8 protein whose function is unknown. 45527 pfam05630: Necrosis inducing protein (NPP1). This family consists of several NPP1 like necrosis inducing proteins from oomycetes, fungi and bacteria. Infiltration of NPP1 into leaves of Arabidopsis thaliana plants results in transcript accumulation of pathogenesis-related (PR) genes, production of ROS and ethylene, callose apposition, and HR-like cell death. 45528 pfam05631: Protein of unknown function (DUF791). This family consists of several eukaryotic proteins of unknown function. 45529 pfam05632: Borrelia burgdorferi protein of unknown function (DUF792). This family consists of several hypothetical proteins from the Lyme disease spirochete Borrelia burgdorferi. 45530 pfam05633: Protein of unknown function (DUF793). This family consists of several plant proteins of unknown function. 45531 pfam05634: Arabidopsis thaliana protein of unknown function (DUF794). This family consists of several proteins of unknown function from Arabidopsis thaliana. 45532 pfam05635: S23 ribosomal protein. This family consists of bacterial 23S rRNA proteins. 45533 pfam05636: Protein of unknown function (DUF795). 
This family consists of several bacterial proteins of unknown function. 45534 pfam05637: galactosyl transferase GMA12/MNN10 family. This family contains a number of glycosyltransferase enzymes that contain a DXD motif. This family includes a number of C. elegans homologues where the DXD is replaced by DXH. Some members of this family are included in glycosyltransferase family 34. 45535 pfam05638: Protein of unknown function (DUF796). This family consists of several bacterial proteins of unknown function. 45536 pfam05639: Protein of unknown function (DUF797). This family consists of several short bacterial proteins of unknown function. 45537 pfam05640: Protein of unknown function (DUF798). This family consists of several eukaryotic proteins of unknown function. 45538 pfam05641: Agenet domain. This domain is related to the TUDOR domain pfam00567. The function of the agenet domain is unknown. This family currently only matches one of the two Agenet domains in the FMR proteins. 45539 pfam05642: Sporozoite P67 surface antigen. This family consists of several Theileria P67 surface antigens. A stage specific surface antigen of Theileria parva, p67, is the basis for the development of an anti-sporozoite vaccine for the control of East Coast fever (ECF) in cattle. The antigen has been shown to contain five distinct linear peptide sequences recognised by sporozoite-neutralising murine monoclonal antibodies. 45540 pfam05643: Putative bacterial lipoprotein (DUF799). This family consists of several bacterial proteins of unknown function. Some of the family members are described as putative lipoproteins. 45541 pfam05644: Protein of unknown function (DUF800). This family consists of several eukaryotic proteins of unknown function. 45542 pfam05645: RNA polymerase III subunit RPC82. This family consists of several DNA-directed RNA polymerase III polypeptides which are related to the Saccharomyces cerevisiae RPC82 protein. 
RNA polymerase C (III) promotes the transcription of tRNA and 5S RNA genes. In Saccharomyces cerevisiae, the enzyme is composed of 15 subunits, ranging from 160 to about 10 kDa. 45543 pfam05646: Protein of unknown function (DUF786). This family consists of several eukaryotic proteins of unknown function. 45544 pfam05647: C. elegans protein of unknown function (DUF801). This family consists of a series of 29-residue-long repeats found in a single C. elegans protein. The functions of both the repeat and the whole sequence are unknown. 45545 pfam05648: Peroxisomal biogenesis factor 11 (PEX11). This family consists of several peroxisomal biogenesis factor 11 (PEX11) proteins from several eukaryotic species. The PEX11 peroxisomal membrane proteins promote peroxisome division in multiple eukaryotes. 45546 pfam05649: Peptidase family M13. M13 peptidases are well-studied proteases found in a wide range of organisms including mammals and bacteria. In mammals they participate in processes such as cardiovascular development, blood-pressure regulation, nervous control of respiration, and regulation of the function of neuropeptides in the central nervous system. In bacteria they may be used for digestion of milk. 45547 pfam05650: Domain of unknown function (DUF802). This region is found as two or more repeats in a small number of hypothetical proteins. 45548 pfam05651: Putative sugar diacid recognition. This region is found in several proteins characterised as carbohydrate diacid regulators. An HTH DNA-binding motif is found at the C-terminus of these proteins, suggesting that this region includes the sugar recognition region. 45549 pfam05652: Scavenger mRNA decapping enzyme (DcpS). This family consists of several scavenger mRNA decapping enzymes (DcpS). DcpS is a scavenger pyrophosphatase that hydrolyses the residual cap structure following 3' to 5' decay of an mRNA. 
The association of DcpS with 3' to 5' exonuclease exosome components suggests that these two activities are linked and there is a coupled exonucleolytic decay-dependent decapping pathway. The family contains a histidine triad (HIT) sequence with three histidines separated by hydrophobic residues. The central histidine within the DcpS HIT motif is critical for decapping activity and defines the HIT motif as a new mRNA decapping domain, making DcpS the first member of the HIT family of proteins with a defined biological function. This family is related to pfam01230. 45550 pfam05653: Protein of unknown function (DUF803). This family consists of several eukaryotic proteins of unknown function. 45551 pfam05655: Pseudomonas avirulence D protein (AvrD). This family consists of several avirulence D (AvrD) proteins primarily found in Pseudomonas syringae. 45552 pfam05656: Protein of unknown function (DUF805). This family consists of several bacterial proteins of unknown function. 45553 pfam05657: Protein of unknown function (DUF806). This family consists of several Siphovirus and Lactococcus proteins of unknown function. The viral sequences are thought to be tail component proteins. 45554 pfam05658: Hep_Hag. This seven residue repeat makes up the majority sequence of a family of bacterial haemagglutinins and invasins. The representative alignment contains four repeats. 45555 pfam05659: Arabidopsis broad-spectrum mildew resistance protein RPW8. This family consists of several broad-spectrum mildew resistance proteins from Arabidopsis thaliana. Plant disease resistance (R) genes control the recognition of specific pathogens and activate subsequent defence responses. The Arabidopsis thaliana locus RESISTANCE TO POWDERY MILDEW 8 (RPW8) contains two naturally polymorphic, dominant R genes, RPW8.1 and RPW8.2, which individually control resistance to a broad range of powdery mildew pathogens. 
They induce localised, salicylic acid-dependent defences similar to those induced by R genes that control specific resistance. Apparently, broad-spectrum resistance mediated by RPW8 uses the same mechanisms as specific resistance. 45556 pfam05660: Coxiella burnetii protein of unknown function (DUF807). This family consists of several proteins of unknown function from Coxiella burnetii (the causative agent of a zoonotic disease called Q fever). 45557 pfam05661: Protein of unknown function (DUF808). This family consists of several bacterial proteins of unknown function. 45558 pfam05662: Haemagglutinin. This short motif is found in invasins and haemagglutinins, normally associated with pfam05658. 45559 pfam05663: Protein of unknown function (DUF809). This family consists of several proteins of unknown function from Raphanus sativus (radish) and Brassica napus (rape). 45560 pfam05664: Protein of unknown function (DUF810). This family consists of several plant proteins of unknown function. 45561 pfam05665: Domain of unknown function (DUF811). 45562 pfam05666: Fels-1 Prophage Protein-like. 45563 pfam05667: Protein of unknown function (DUF812). This family consists of several eukaryotic proteins of unknown function. 45564 pfam05668: Arabidopsis thaliana protein of unknown function (DUF813). This family consists of several uncharacterised proteins from Arabidopsis thaliana. 45565 pfam05669: SOH1. The family consists of Saccharomyces cerevisiae SOH1 homologues. SOH1 is responsible for the repression of temperature sensitive growth of the HPR1 mutant and has been found to be a component of the RNA polymerase II transcription complex. SOH1 interacts with factors involved not only in DNA repair but also in transcription. Thus, the SOH1 protein may serve to couple these two processes. 45566 pfam05670: Domain of unknown function (DUF814). This domain occurs in proteins that have been annotated as Fibronectin/fibrinogen binding protein by similarity. 
This annotation comes from a sequence, where the N-terminal region is involved in this activity. Hence the activity of this C-terminal domain is unknown. This domain contains a conserved motif D/E-X-W/Y-X-H that may be functionally important. 45567 pfam05671: GETHR pentapeptide repeat (5 copies). This pentapeptide repeat is found mainly in C. elegans. The most conserved amino acid at each position leads to its name GETHR. The family also includes a divergent repeat in a microneme protein. The function of this repeat is unknown. 45568 pfam05672: E-MAP-115 family. The organisation of microtubules varies with the cell type and is presumably controlled by tissue-specific microtubule-associated proteins (MAPs). The 115-kDa epithelial MAP (E-MAP-115) has been identified as a microtubule-stabilising protein predominantly expressed in cell lines of epithelial origin. The binding of this microtubule associated protein is nucleotide independent. 45569 pfam05673: Protein of unknown function (DUF815). This family consists of several bacterial proteins of unknown function. 45570 pfam05674: Baculovirus protein of unknown function (DUF816). This family includes proteins that are about 200 amino acids in length. The proteins are all from baculoviruses. This family includes ORF107 from Orgyia pseudotsugata multicapsid polyhedrosis virus (OpMNPV) and a variety of other numbered ORF proteins, such as ORF52, ORF140. The function of these proteins is unknown. 45571 pfam05675: Protein of unknown function (DUF817). This family consists of several bacterial proteins of unknown function. 45572 pfam05676: NADH-ubiquinone oxidoreductase B18 subunit (NDUFB7). This family consists of several NADH-ubiquinone oxidoreductase B18 subunit proteins from different eukaryotic organisms. 
Oxidative phosphorylation is the well-characterised process in which ATP, the principal carrier of chemical energy of individual cells, is produced due to a mitochondrial proton gradient formed by the transfer of electrons from NADH and FADH2 to molecular oxygen. The oxidative phosphorylation (OXPHOS) system is located in the mitochondrial inner membrane and consists of five multi-subunit enzyme complexes and two small electron carriers: coenzyme Q10 and cytochrome C. At least 70 structural proteins involved in the formation of the whole OXPHOS system are encoded by nuclear genes, whereas 13 structural proteins are encoded by the mitochondrial genome. Deficiency of NADH ubiquinone oxidoreductase, the first enzyme complex of the mitochondrial respiratory chain, is one of the most frequent causes of human mitochondrial encephalomyopathies. 45573 pfam05677: Chlamydia CHLPS protein (DUF818). This family consists of several Chlamydia CHLPS proteins, the function of which is unknown. 45574 pfam05678: VQ motif. This short motif is found in a variety of plant proteins. These proteins vary greatly in length and are mostly composed of low complexity regions. They all conserve a short motif FXhVQChTG, where X is any amino acid and h is a hydrophobic amino acid. The function of this motif is uncertain; however, one protein in this family has been found to bind the SigA sigma factor. It would seem plausible that this motif is needed for this activity and that this whole family might be involved in modulating plastid sigma factors (Bateman A pers. obs.). 45575 pfam05679: Chondroitin N-acetylgalactosaminyltransferase. 45576 pfam05680: ATP synthase E chain. This family consists of several ATP synthase E chain sequences which are components of the CF(0) subunit. 45577 pfam05681: Fumarate hydratase (Fumerase). This family consists of several bacterial fumarate hydratase proteins FumA and FumB. Fumarase, or fumarate hydratase (EC 4.2.1.2), is a component of the citric acid cycle. 
In facultative anaerobes such as Escherichia coli, fumarase also engages in the reductive pathway from oxaloacetate to succinate during anaerobic growth. Three fumarases, FumA, FumB, and FumC, have been reported in E. coli. fumA and fumB genes are homologous and encode products of identical sizes which form thermolabile dimers of Mr 120,000. FumA and FumB are class I enzymes and are members of the iron-dependent hydrolases, which include aconitase and malate hydratase. The active FumA contains a 4Fe-4S centre, and it can be inactivated upon oxidation to give a 3Fe-4S centre. 45578 pfam05682: Phosphorylase kinase alpha/beta. This family consists of several eukaryotic phosphorylase kinase alpha and beta subunits. Phosphorylase kinase (PHK) is a regulatory enzyme in glycogen metabolism. Mutations in the gene encoding the alpha subunit of PHK (PHKA2) have been shown to be responsible for X-linked liver glycogenosis (XLG). XLG, a frequent type of glycogen storage disease, is characterised by hepatomegaly and growth retardation. 45579 pfam05683: Fumarase C-terminus. This family consists of the C terminal region of several bacterial fumarate hydratase proteins (FumA and FumB). Fumarase, or fumarate hydratase (EC 4.2.1.2), is a component of the citric acid cycle. In facultative anaerobes such as Escherichia coli, fumarase also engages in the reductive pathway from oxaloacetate to succinate during anaerobic growth. 45580 pfam05684: Protein of unknown function (DUF819). This family contains proteins of unknown function from archaeal, bacterial and plant species. 45581 pfam05685: Protein of unknown function (DUF820). This family consists of a number of hypothetical proteins from the Anabaena and Synechocystis cyanobacterial species. 45582 pfam05686: Arabidopsis thaliana protein of unknown function (DUF821). This family consists of a group of Arabidopsis thaliana proteins with no known function. 45583 pfam05687: Plant protein of unknown function (DUF822). 
This family consists of the N terminal regions of several plant proteins of unknown function. 45584 pfam05688: Salmonella repeat of unknown function (DUF824). This family consists of several repeated sequences of around 45 residues. 45585 pfam05689: Salmonella repeat of unknown function (DUF823). This family consists of a series of repeated sequences (of around 180 residues) which are found in Salmonella typhimurium and Salmonella typhi. Sequences from this family are almost always found with pfam05688. 45586 pfam05690: Thiazole biosynthesis protein ThiG. This family consists of several bacterial thiazole biosynthesis protein G sequences. ThiG, together with ThiF and ThiH, is proposed to be involved in the synthesis of 4-methyl-5-(b-hydroxyethyl)thiazole (THZ) which is an intermediate in the thiazole production pathway. 45587 pfam05691: Raffinose synthase or seed imbibition protein Sip1. This family consists of several raffinose synthase proteins, also known as seed imbibition (Sip1) proteins. Raffinose (O-alpha-D-galactopyranosyl-(1-->6)-O-alpha-D-glucopyranosyl-(1<-->2)-O-beta-D-fructofuranoside) is a widespread oligosaccharide in plant seeds and other tissues. Raffinose synthase (EC 2.4.1.82) is the key enzyme that channels sucrose into the raffinose oligosaccharide pathway. 45588 pfam05692: Mycoplasma haemagglutinin. This family consists of several haemagglutinin sequences from Mycoplasma synoviae and Mycoplasma gallisepticum. The major plasma membrane proteins, pMGAs, of Mycoplasma gallisepticum are cell adhesin (hemagglutinin) molecules. It has been shown that the genetic determinants that code for the haemagglutinins are organised into a large family of genes and that only one of these genes is predominately expressed in any given strain. 45589 pfam05693: Glycogen synthase. This family consists of the eukaryotic glycogen synthase proteins GYS1, GYS2 and GYS3. 
Glycogen synthase (GS) is the enzyme responsible for the synthesis of alpha-1,4-linked glucose chains in glycogen. It is the rate limiting enzyme in the synthesis of the polysaccharide, and its activity is highly regulated through phosphorylation at multiple sites and also by allosteric effectors, mainly glucose 6-phosphate (G6P). 45590 pfam05694: 56kDa selenium binding protein (SBP56). This family consists of several eukaryotic selenium binding proteins as well as three sequences from archaea. The exact function of this protein is unknown although it is thought that SBP56 participates in late stages of intra-Golgi protein transport. The Lotus japonicus homologue of SBP56, LjSBP, is thought to have more than one physiological role and can be implicated in controlling the oxidation/reduction status of target proteins, in vesicular Golgi transport. 45591 pfam05695: Plant protein of unknown function (DUF825). This family consists of several plant proteins greater than 1000 residues in length. The function of this family is unknown. 45592 pfam05696: Protein of unknown function (DUF826). This family consists of several enterobacterial and siphoviral sequences of unknown function. 45593 pfam05697: Bacterial trigger factor protein (TF). In the E. coli cytosol, a fraction of the newly synthesised proteins requires the activity of molecular chaperones for folding to the native state. The major chaperones implicated in this folding process are the ribosome-associated Trigger Factor (TF), and the DnaK and GroEL chaperones with their respective co-chaperones. Trigger Factor is an ATP-independent chaperone and displays chaperone and peptidyl-prolyl-cis-trans-isomerase (PPIase) activities in vitro. It is composed of at least three domains, an N-terminal domain which mediates association with the large ribosomal subunit, a central substrate binding and PPIase domain with homology to FKBP proteins, and a C-terminal domain of unknown function. 
The positioning of TF at the peptide exit channel, together with its ability to interact with nascent chains as short as 57 residues renders TF a prime candidate for being the first chaperone that binds to the nascent polypeptide chains. This family represents the N-terminal region of the protein. 45594 pfam05698: Bacterial trigger factor protein (TF) C-terminus. In the E. coli cytosol, a fraction of the newly synthesised proteins requires the activity of molecular chaperones for folding to the native state. The major chaperones implicated in this folding process are the ribosome-associated Trigger Factor (TF), and the DnaK and GroEL chaperones with their respective co-chaperones. Trigger Factor is an ATP-independent chaperone and displays chaperone and peptidyl-prolyl-cis-trans-isomerase (PPIase) activities in vitro. It is composed of at least three domains, an N-terminal domain which mediates association with the large ribosomal subunit, a central substrate binding and PPIase domain with homology to FKBP proteins, and a C-terminal domain of unknown function. The positioning of TF at the peptide exit channel, together with its ability to interact with nascent chains as short as 57 residues renders TF a prime candidate for being the first chaperone that binds to the nascent polypeptide chains. This family represents the C-terminal region of the protein. 45595 pfam05699: hAT family dimerisation domain. This dimerisation domain is found at the C terminus of the transposases of elements belonging to the Activator superfamily (hAT element superfamily). The isolated dimerisation domain forms extremely stable dimers in vitro. 45596 pfam05700: Breast carcinoma amplified sequence 2 (BCAS2). This family consists of several eukaryotic sequences of unknown function. The mammalian members of this family are annotated as breast carcinoma amplified sequence 2 (BCAS2) proteins. BCAS2 is a putative spliceosome associated protein. 
45597 pfam05701: Plant protein of unknown function (DUF827). This family consists of several plant proteins of unknown function. Several sequences in this family are described as being "myosin heavy chain-like". 45598 pfam05702: Herpesvirus UL49.5 envelope/tegument protein. UL49.5 protein consists of 98 amino acids with a calculated molecular mass of 10,155 Da. It contains putative signal peptide and transmembrane domains but lacks a consensus sequence for N glycosylation. UL49.5 protein is an O-glycosylated structural component of the viral envelope. 45599 pfam05703: Plant protein of unknown function (DUF828). This family consists of several plant proteins of unknown function. 45600 pfam05704: Capsular polysaccharide synthesis protein. This family consists of several capsular polysaccharide proteins. Capsular polysaccharide (CPS) is a major virulence factor in Streptococcus pneumoniae. 45601 pfam05705: Eukaryotic protein of unknown function (DUF829). This family consists of several uncharacterised eukaryotic proteins. 45602 pfam05706: Cyclin-dependent kinase inhibitor 3 (CDKN3). This family consists of cyclin-dependent kinase inhibitor 3 or kinase associated phosphatase proteins from several mammalian species. The cyclin-dependent kinase (Cdk)-associated protein phosphatase (KAP) is a human dual specificity protein phosphatase that dephosphorylates Cdk2 on threonine 160 in a cyclin-dependent manner. 45603 pfam05707: Zonular occludens toxin (Zot). This family consists of bacterial and viral proteins which are very similar to the Zonular occludens toxin (Zot). Zot is elaborated by bacteriophages present in toxigenic strains of Vibrio cholerae. Zot is a single polypeptide chain of 44.8 kDa, with the ability to reversibly alter intestinal epithelial tight junctions, allowing the passage of macromolecules through mucosal barriers. 45604 pfam05708: Orthopoxvirus protein of unknown function (DUF830). 
This family consists of several Orthopoxvirus proteins of unknown function. 45605 pfam05709: Siphovirus tail component protein. This family consists of several Siphovirus tail component proteins as well as some bacterial proteins of unknown function. 45606 pfam05710: Coiled coil. This region is found in a group of Dictyostelium discoideum proteins. It is likely to form a coiled-coil. Some of the proteins are regulated by cyclic AMP and are expressed late in development. 45607 pfam05711: Macrocin-O-methyltransferase (TylF). This family consists of bacterial macrocin O-methyltransferase (TylF) proteins. TylF is responsible for the methylation of macrocin to produce tylosin. Tylosin is a macrolide antibiotic used in veterinary medicine to treat infections caused by Gram-positive bacteria and as an animal growth promoter in the swine industry. It is produced by several Streptomyces species. As with other macrolides, the antibiotic activity of tylosin is due to the inhibition of protein biosynthesis by a mechanism that involves the binding of tylosin to the ribosome, preventing the formation of the mRNA-aminoacyl-tRNA-ribosome complex. 45608 pfam05712: MRG. This family consists of three different eukaryotic proteins (mortality factor 4 (MORF4/MRG15), male-specific lethal 3 (MSL-3) and ESA1-associated factor 3 (EAF3)). It is thought that the MRG family is involved in transcriptional regulation via histone acetylation. 45609 pfam05713: Bacterial mobilisation protein (MobC). This family consists of several bacterial MobC-like, mobilisation proteins. MobC proteins belong to the group of relaxases. Together with MobA and MobB they bind to a single cis-active site of a mobilising plasmid, the origin of transfer (oriT) region. The absence of MobC has several different effects on oriT DNA. Site- and strand-specific nicking by MobA protein is severely reduced, accounting for the lower frequency of mobilisation. 
The localised DNA strand separation required for this nicking is less affected, but becomes more sensitive to the level of active DNA gyrase in the cell. In addition, strand separation is not efficiently extended through the region containing the nick site. These effects suggest a model in which MobC acts as a molecular wedge for the relaxosome-induced melting of oriT DNA. The effect of MobC on strand separation may be partially complemented by the helical distortion induced by supercoiling. However, MobC extends the melted region through the nick site, thus providing the single-stranded substrate required for cleavage by MobA. 45610 pfam05714: Borrelia burgdorferi virulent strain associated lipoprotein. This family consists of several virulent strain associated lipoproteins from the Lyme disease spirochete Borrelia burgdorferi. 45611 pfam05715: Piccolo Zn-finger. This (predicted) Zinc finger is found in the bassoon and piccolo proteins. There are eight conserved cysteines, suggesting that it coordinates two zinc ligands. 45612 pfam05716: A-kinase anchor protein 110 kDa (AKAP 110). This family consists of several mammalian protein kinase A anchoring protein 3 (PRKA3) or A-kinase anchor protein 110 kDa (AKAP 110) sequences. Agents that increase intracellular cAMP are potent stimulators of sperm motility. Anchoring inhibitor peptides, designed to disrupt the interaction of the cAMP-dependent protein kinase A (PKA) with A kinase-anchoring proteins (AKAPs), are potent inhibitors of sperm motility. PKA anchoring is a key biochemical mechanism controlling motility. AKAP110 shares compartments with both RI and RII isoforms of PKA and may function as a regulator of both motility- and head-associated functions such as capacitation and the acrosome reaction. 45613 pfam05717: IS66 Orf2 like protein. This protein is found in insertion sequences related to IS66. The function of these proteins is uncertain, but they are probably essential for transposition. 
45614 pfam05718: Poxvirus intermediate transcription factor. This family consists of several highly related Poxvirus sequences which are thought to be intermediate transcription factors. 45615 pfam05719: Golgi phosphoprotein 3 (GPP34). This family consists of several eukaryotic GPP34 like proteins. GPP34 localises to the Golgi complex and is conserved from yeast to humans. The cytosolically exposed location of GPP34 predicts a role for a novel coat protein in Golgi trafficking. 45616 pfam05720: Cell-cell adhesion domain. This family is based on a group of Dictyostelium discoideum proteins that are essential in early development. Some members are known to be located on the cell surface and mediate cell-cell adhesion. 45617 pfam05721: Phytanoyl-CoA dioxygenase (PhyH). This family is made up of several eukaryotic phytanoyl-CoA dioxygenase (PhyH) proteins as well as a number of bacterial deoxygenases. PhyH is a peroxisomal enzyme catalysing the first step of phytanic acid alpha-oxidation. PhyH deficiency causes Refsum's disease (RD) which is an inherited neurological syndrome biochemically characterised by the accumulation of phytanic acid in plasma and tissues. 45618 pfam05722: Ustilago B locus mating-type protein. This family consists of several Ustilago mating-type proteins. The b locus of the phytopathogenic fungus Ustilago maydis encodes a multiallelic recognition function that controls the ability of the fungus to form a dikaryon and complete the sexual stage of the life cycle. The b locus has at least 25 alleles and any combination of two different alleles, brought together by mating between haploid cells, allows the fungus to cause disease and undergo sexual development within the plant. 45619 pfam05724: Thiopurine S-methyltransferase (TPMT). This family consists of thiopurine S-methyltransferase proteins from both eukaryotes and prokaryotes. 
Thiopurine S-methyltransferase (TPMT) is a cytosolic enzyme that catalyses S-methylation of aromatic and heterocyclic sulfhydryl compounds, including anticancer and immunosuppressive thiopurines. 45620 pfam05725: FNIP Repeat. This repeat is approximately 22 residues long and is only found in Dictyostelium discoideum. It appears to be related to pfam00560 (personal obs:C Yeats). The alignment consists of two tandem repeats. It is termed the FNIP repeat after the pattern of conserved residues. 45621 pfam05726: Pirin C-terminal region. This region is found in the C-terminal half of the Pirin protein. 45622 pfam05727: Uncharacterised protein family (UPF0228). This small family of proteins is currently restricted to Methanosarcina species. Members of this family are about 200 residues in length, except for a member with two copies of this region. Although the function of this region is unknown, the pattern of conservation, including multiple conserved aspartate and glutamate residues, suggests that this may be an enzyme (Bateman A. pers. obs.). The most conserved motif in these proteins is NEL/MEXNE/D, where X can be any amino acid, which is found at the C-terminus of these proteins. 45623 pfam05728: Uncharacterised protein family (UPF0227). Despite being classed as uncharacterised proteins, the members of this family are almost certainly enzymes that are distantly related to pfam00561. 45624 pfam05729: NACHT domain. This NTPase domain is found in apoptosis proteins as well as those involved in MHC transcription activation. This family is closely related to pfam00931. 45625 pfam05730: CFEM domain. This fungal specific cysteine rich domain is found in some proteins with proposed roles in fungal pathogenesis. 45626 pfam05731: TROVE domain. This presumed domain is found in TEP1 and Ro60 proteins, which are RNA-binding components of Telomerase, Ro and Vault RNPs. This domain has been named TROVE (after Telomerase, Ro and Vault). This domain is probably RNA-binding. 
45627 pfam05732: Firmicute plasmid replication protein (RepL). This family consists of Firmicute RepL proteins which are involved in plasmid replication. 45628 pfam05733: Tenuivirus nucleocapsid protein. This family consists of several Tenuivirus nucleocapsid proteins. 45629 pfam05734: Herpesvirus protein of unknown function (DUF832). This family consists of several herpesvirus proteins of unknown function. 45630 pfam05735: Thrombospondin C-terminal region. This region is found at the C-terminus of thrombospondin and related proteins. 45631 pfam05736: OmpF membrane domain. This domain represents the presumed membrane spanning region of the OmpF proteins. This region is involved in channel formation and is thought to form an 8-stranded beta-barrel. 45632 pfam05737: Collagen binding domain. The domain fold is a jelly-roll, composed of two antiparallel beta-sheets and two short alpha-helices. A groove on beta-sheet I exhibited the best surface complementarity to the collagen. This site partially overlaps with the peptide sequence previously shown to be critical for collagen binding. Recombinant proteins containing single amino acid mutations designed to disrupt the surface of the putative binding site exhibited significantly lower affinities for collagen. 45633 pfam05738: Cna protein B-type domain. This domain is found in Staphylococcus aureus collagen-binding surface protein. However, this region does not mediate collagen binding; the pfam05737 region carries out that function. The structure of the repetitive B-region has been solved and forms a beta sandwich structure. It is thought that this region forms a stalk in Staphylococcus aureus collagen-binding protein that presents the ligand binding domain away from the bacterial cell surface. 45634 pfam05739: SNARE domain. 
Most if not all vesicular membrane fusion events in eukaryotic cells are believed to be mediated by a conserved fusion machinery, the SNARE [soluble N-ethylmaleimide-sensitive factor (NSF) attachment protein (SNAP) receptors] machinery. The SNARE domain is thought to act as a protein-protein interaction module in the assembly of a SNARE protein complex. 45635 pfam05740: Rotavirus RNA polymerase (VP1). This family consists of several Rotavirus VP1 proteins. The minor core protein, VP1, is the viral RNA-dependent RNA polymerase and functions as both the viral transcriptase and replicase. 45636 pfam05741: Nanos RNA binding domain. This family consists of several conserved novel zinc finger domains found in the eukaryotic proteins Nanos and Xcat-2. In Drosophila melanogaster, Nanos functions as a localised determinant of posterior pattern. Nanos RNA is localised to the posterior pole of the maturing egg cell and encodes a protein that emanates from this localised source. Nanos acts as a translational repressor and thereby establishes a gradient of the morphogen Hunchback. Xcat-2 is found in the vegetal cortical region and is inherited by the vegetal blastomeres during development, and is degraded very early in development. The localised and maternally restricted expression of Xcat-2 RNA suggests a role for its protein in setting up regional differences in gene expression that occur early in development. 45637 pfam05742: Protein of unknown function (DUF833). This family is found in eukaryotes, prokaryotes and viruses and has no known function. One member has been found to be expressed during early embryogenesis in mice. 45638 pfam05743: Tumour susceptibility gene 101 protein (TSG101). This family consists of the eukaryotic tumour susceptibility gene 101 protein (TSG101). Altered transcripts of this gene have been detected in sporadic breast cancers and many other human malignancies. 
However, the involvement of this gene in neoplastic transformation and tumourigenesis is still elusive. TSG101 is required for normal cell function in embryonic and adult tissues, but this gene is not a tumour suppressor for sporadic forms of breast cancer. 45639 pfam05744: Benyvirus P25 protein. This family consists of P25 proteins from the beet necrotic yellow vein viruses. 45640 pfam05745: Chlamydia 15 kDa cysteine-rich outer membrane protein (CRPA). This family consists of several Chlamydia 15 kDa cysteine-rich outer membrane proteins which are associated with differentiation of reticulate bodies (RBs) into elementary bodies (EBs). 45641 pfam05746: DALR anticodon binding domain. This all alpha helical domain is the anticodon binding domain in arginyl and glycyl tRNA synthetases. This domain is known as the DALR domain after characteristic conserved amino acids. 45642 pfam05747: Poxvirus N2L protein. This family consists of Poxvirus N2L proteins. N2L may be responsible for alpha amanitin resistance. 45643 pfam05748: Rubella membrane glycoprotein E1. Rubella virus (RV), the sole member of the genus Rubivirus within the family Togaviridae, is a small enveloped, positive strand RNA virus. The nucleocapsid consists of 40S genomic RNA and a single species of capsid protein which is enveloped within a host-derived lipid bilayer containing two viral glycoproteins, E1 (58 kDa) and E2 (42-46 kDa). In virus infected cells, RV matures by budding either at the plasma membrane, or at the internal membranes depending on the cell type and enters adjacent uninfected cells by a membrane fusion process in the endosome, directed by E1-E2 heterodimers. The heterodimer formation is crucial for E1 transport out of the endoplasmic reticulum to the Golgi and plasma membrane. In RV E1, a cysteine at position 82 is crucial for the E1-E2 heterodimer formation and cell surface expression of the two proteins. 
The E1 has been shown to be a type 1 membrane protein, rich in cysteine residues with extensive intramolecular disulfide bonds. 45644 pfam05749: Rubella membrane glycoprotein E2. Rubella virus (RV), the sole member of the genus Rubivirus within the family Togaviridae, is a small enveloped, positive strand RNA virus. The nucleocapsid consists of 40S genomic RNA and a single species of capsid protein which is enveloped within a host-derived lipid bilayer containing two viral glycoproteins, E1 (58 kDa) and E2 (42-46 kDa). In virus infected cells, RV matures by budding either at the plasma membrane, or at the internal membranes depending on the cell type and enters adjacent uninfected cells by a membrane fusion process in the endosome, directed by E1-E2 heterodimers. The heterodimer formation is crucial for E1 transport out of the endoplasmic reticulum to the Golgi and plasma membrane. In RV E1, a cysteine at position 82 is crucial for the E1-E2 heterodimer formation and cell surface expression of the two proteins. 45645 pfam05750: Rubella capsid protein. Rubella virus is an enveloped positive-strand RNA virus of the family Togaviridae. Virions are composed of three structural proteins: a capsid and two membrane-spanning glycoproteins, E2 and E1. During virus assembly, the capsid interacts with genomic RNA to form nucleocapsids. It has been discovered that capsid phosphorylation serves to negatively regulate binding of viral genomic RNA. This may delay the initiation of nucleocapsid assembly until sufficient amounts of virus glycoproteins accumulate at the budding site and/or prevent non-specific binding to cellular RNA when levels of genomic RNA are low. It follows that at a late stage in replication, the capsid may undergo dephosphorylation before nucleocapsid assembly occurs. 45646 pfam05751: FixH. This family consists of several Rhizobium FixH like proteins. 
It has been suggested that the four proteins FixG, FixH, FixI, and FixS may participate in a membrane-bound complex coupling the FixI cation pump with a redox process catalysed by FixG. 45647 pfam05752: Calicivirus minor structural protein. This family consists of minor structural proteins largely from human calicivirus isolates. Human calicivirus causes gastroenteritis. The function of this family is unknown. 45648 pfam05753: Translocon-associated protein beta (TRAPB). This family consists of several eukaryotic translocon-associated protein beta (TRAPB) or signal sequence receptor beta subunit (SSR-beta) proteins. The normal translocation of nascent polypeptides into the lumen of the endoplasmic reticulum (ER) is thought to be aided in part by a translocon-associated protein (TRAP) complex consisting of 4 protein subunits. The association of mature proteins with the ER and Golgi, or other intracellular locales, such as lysosomes, depends on the initial targeting of the nascent polypeptide to the ER membrane. A similar scenario must also exist for proteins destined for secretion. 45649 pfam05754: Domain of unknown function (DUF834). This short presumed domain is found in a large number of hypothetical plant proteins. The domain is quite rich in conserved glycine residues. It occurs in some putative transposons but currently has no known function. 45650 pfam05755: Rubber elongation factor protein (REF). This family consists of the highly related rubber elongation factor (REF), small rubber particle protein (SRPP) and stress-related protein (SRP) sequences. REF and SRPP are released from the rubber particle membrane into the cytosol during osmotic lysis of the sedimentable organelles (lutoids). The exact function of this family is unknown. 45651 pfam05756: S-antigen protein. S-antigens are heat stable proteins that are found in the blood of individuals infected with malaria. 45652 pfam05757: Oxygen evolving enhancer protein 3 (PsbQ). 
This family consists of the plant specific oxygen evolving enhancer protein 3 (PsbQ). Photosystem II (PSII) is a pigment-protein complex, which consists of at least 25 different protein subunits, at present denoted PsbA-Z according to the genes that encode them. PsbQ plays an important role in the lumenal oxygen-evolving activity of PSII from higher plants and green algae. 45653 pfam05758: Ycf1. The chloroplast genomes of most higher plants contain two giant open reading frames designated ycf1 and ycf2. Although the function of Ycf1 is unknown, it is known to be an essential gene. 45654 pfam05759: Orthopoxvirus C1 protein. This family consists of several sequences which are highly related to the C1 protein of the Vaccinia virus. 45655 pfam05760: Immediate early response protein (IER). This family consists of several eukaryotic immediate early response (IER) 2 and 5 proteins. The role of IER5 is unclear although it plays an important role in mediating the cellular response to mitogenic signals. Again, little is known about the function of IER2 although it is thought to play a role in mediating the cellular responses to a variety of extracellular signals. 45656 pfam05761: 5' nucleotidase family. This family of eukaryotic proteins includes 5' nucleotidase enzymes, such as purine 5'-nucleotidase EC:3.1.3.5. 45657 pfam05762: VWA domain containing CoxE-like protein. This family is annotated by SMART as containing a VWA type domain. The exact function of this family is unknown. It is found as part of a CO oxidising (Cox) system operon in several bacteria. 45658 pfam05763: Protein of unknown function (DUF835). The members of this archaebacterial protein family are around 250-300 amino acid residues in length. The function of these proteins is not known. 45659 pfam05764: YL1 nuclear protein. The proteins in this family are designated YL1. These proteins have been shown to be DNA-binding and may be transcription factors. 45660 pfam05765: Tegument protein. 
Herpesvirus virions share a characteristic architecture in which the double-stranded DNA genome is surrounded by an icosahedral protein capsid, a thick tegument layer, and a lipid bilayer envelope. This large tegument protein is found in a variety of herpesviruses. 45661 pfam05766: Bacteriophage Lambda NinG protein. NinG or Rap is involved in recombination. Rap (recombination adept with plasmid) increases lambda-by-plasmid recombination catalysed by Escherichia coli's RecBCD pathway. 45662 pfam05767: Poxvirus virion envelope protein A14. This family consists of several Poxvirus virion envelope protein A14 like sequences. A14 is a component of the virion membrane and has been found to be an H1 phosphatase substrate in vivo and in vitro. A14 is hyperphosphorylated on serine residues in the absence of H1 expression. 45663 pfam05768: Protein of unknown function (DUF836). This family consists of several Poxvirus proteins of unknown function as well as one related bacterial sequence from Leuconostoc mesenteroides which is annotated as a MesC protein. MesC is a protein of unknown function which forms part of the mesentericin operon. 45664 pfam05769: Protein of unknown function (DUF837). This family consists of several eukaryotic proteins of unknown function. One of the family members is a circulating cathodic antigen (CCA) found in Schistosoma mansoni (blood fluke). 45665 pfam05770: Inositol 1, 3, 4-trisphosphate 5/6-kinase. This family consists of several inositol 1, 3, 4-trisphosphate 5/6-kinase proteins. Inositol 1,3,4-trisphosphate is at a branch point in inositol phosphate metabolism. It is dephosphorylated by specific phosphatases to either inositol 3,4-bisphosphate or inositol 1,3-bisphosphate. Alternatively, it is phosphorylated to inositol 1,3,4,6-tetrakisphosphate or inositol 1,3,4,5-tetrakisphosphate by inositol trisphosphate 5/6-kinase. 45666 pfam05771: Poxvirus A31 protein. 45667 pfam05772: NinB protein. 
The ninR region of phage lambda contains two recombination genes, orf (ninB) and rap (ninG), that have roles when the RecF and RecBCD recombination pathways of E. coli, respectively, operate on phage lambda. 45668 pfam05773: RWD domain. This domain was identified in WD40 repeat proteins and Ring finger domain proteins. The function of this domain is unknown. 45669 pfam05774: Herpesvirus helicase-primase complex component. This family consists of several helicase-primase complex components from the Gammaherpesviruses. 45670 pfam05775: Enterobacteria AfaD invasin protein. This family consists of several AfaD and related proteins from Escherichia coli and Salmonella bacteria. The afa gene clusters encode an afimbrial adhesive sheath produced by Escherichia coli. The adhesive sheath is composed of two proteins, AfaD and AfaE, which are independently exposed at the bacterial cell surface. AfaE is required for bacterial adhesion to HeLa cells and AfaD for the uptake of adherent bacteria into these cells. 45671 pfam05776: Papillomavirus E5A protein. Human papillomaviruses (HPVs) are epitheliotropic viruses, and their life cycle is intimately linked to the stratification and differentiation state of the host epithelial tissues. The kinetics of E5a protein expression during the complete viral life cycle has been studied and the highest level was found to be coincidental with the onset of virion morphogenesis. 45672 pfam05777: Drosophila accessory gland-specific peptide 26Ab (Acp26Ab). This family consists of accessory gland-specific 26Ab peptides or male accessory gland secretory protein 355B from different Drosophila species. Drosophila males, like males of most other insects, transfer a group of specific proteins (Acp26Ab and Acp26Aa in Drosophila) to the females during mating. These proteins are produced primarily in the accessory gland and are likely to influence the female's reproduction. 45673 pfam05778: Apolipoprotein CIII (Apo-CIII). 
This family consists of several mammalian apolipoprotein CIII (Apo-CIII) sequences. Apolipoprotein C-III is a 79-residue glycoprotein. It is synthesised in the intestine and liver as part of the very low density lipoprotein (VLDL) and the high density lipoprotein (HDL) particles. Owing to its positive correlation with plasma triglyceride (Tg) levels, Apo-CIII is suggested to play a role in Tg metabolism and is therefore of interest regarding atherosclerosis. However, unlike other apolipoproteins such as Apo-AI, Apo E or CII for which many naturally occurring mutations are known, the structure-function relationships of apo C-III remain a subject of debate. One possibility is that apo C-III inhibits lipoprotein lipase (LPL) activity, as shown by in vitro experiments. Another suggestion is that elevated levels of Apo-CIII displace other apolipoproteins at the lipoprotein surface, modifying their clearance from plasma. 45674 pfam05779: Bacterial protein of unknown function (DUF838). This family consists of a number of conserved proteins of unknown function from several bacterial species. 45675 pfam05780: Coronavirus nonstructural protein 4. This family consists of several nonstructural protein 4 (NS4) sequences or putative small membrane protein. 45676 pfam05781: MRVI1 protein. This family consists of mammalian MRVI1 proteins which are related to the lymphoid-restricted membrane protein (JAW1) and the IP3 receptor associated cGMP kinase substrates A and B (IRAGA and IRAGB). The function of MRVI1 is unknown, although mutations in the Mrvi1 gene induce myeloid leukaemia by altering the expression of a gene important for myeloid cell growth and/or differentiation, so it has been speculated that Mrvi1 is a tumour suppressor gene. IRAG is very similar in sequence to MRVI1 and is an essential NO/cGKI-dependent regulator of IP3-induced calcium release. 
Activation of cGKI decreases IP3-stimulated elevations in intracellular calcium, induces smooth muscle relaxation and contributes to the antiproliferative and pro-apoptotic effects of NO/cGMP. Jaw1 is a member of a class of proteins with COOH-terminal hydrophobic membrane anchors and is structurally similar to proteins involved in vesicle targeting and fusion. This suggests that the function and/or the structure of the ER in lymphocytes may be modified by lymphoid-restricted resident ER proteins. 45677 pfam05782: Extracellular matrix protein 1 (ECM1). This family consists of several eukaryotic extracellular matrix protein 1 (ECM1) sequences. ECM1 has been shown to regulate endochondral bone formation, stimulate the proliferation of endothelial cells and induce angiogenesis. Mutations in the ECM1 gene can cause lipoid proteinosis, a disorder which causes generalised thickening of skin, mucosae and certain viscera. Classical features include beaded eyelid papules and laryngeal infiltration leading to hoarseness. 45678 pfam05783: Dynein light intermediate chain (DLIC). This family consists of several eukaryotic dynein light intermediate chain proteins. The light intermediate chains (LICs) of cytoplasmic dynein consist of multiple isoforms, which undergo post-translational modification to produce a large number of species. DLIC1 is known to be involved in assembly, organisation, and function of centrosomes and mitotic spindles when bound to pericentrin. DLIC2 is a subunit of cytoplasmic dynein 2 that may play a role in maintaining Golgi organisation by binding cytoplasmic dynein 2 to its Golgi-associated cargo. 45679 pfam05784: Betaherpesvirus UL82/83 protein N terminus. This family represents the N terminal region of the Betaherpesvirus UL82 and UL83 proteins. As viruses are reliant upon their host cell to serve as proper environments for their replication, many have evolved mechanisms to alter intracellular conditions to suit their own needs. 
Human cytomegalovirus induces quiescent cells to enter the cell cycle and then arrests them in late G(1), before they enter the S phase, a cell cycle compartment that is presumably favourable for viral replication. The protein product of the human cytomegalovirus UL82 gene, pp71, can accelerate the movement of cells through the G(1) phase of the cell cycle. This activity would help infected cells reach the late G(1) arrest point sooner and thus may stimulate the infectious cycle. pp71 also induces DNA synthesis in quiescent cells, but a pp71 mutant protein that is unable to induce quiescent cells to enter the cell cycle still retains the ability to accelerate the G(1) phase. Thus, the mechanism through which pp71 accelerates G(1) cell cycle progression appears to be distinct from the one that it employs to induce quiescent cells to exit G(0) and subsequently enter the S phase. 45680 pfam05785: Cytotoxic necrotizing factor. This family consists of several bacterial cytotoxic necrotizing factor proteins as well as related dermonecrotic toxin (DNT) from Bordetella species. Cytotoxic necrotizing factor 1 (CNF1) causes necrosis of rabbit skin and re-organisation of the actin cytoskeleton in cultured cells. Bordetella dermonecrotic toxin (DNT) stimulates the assembly of actin stress fibres and focal adhesions by deamidating or polyaminating Gln63 of the small GTPase Rho. DNT is an A-B toxin which is composed of an N-terminal receptor-binding (B) domain and a C-terminal enzymatically active (A) domain. 45681 pfam05786: Barren protein. This family consists of several Barren protein homologues from several eukaryotic organisms. In Drosophila Barren (barr) is required for sister-chromatid segregation in mitosis. barr encodes a novel protein that is present in proliferating cells and has homologues in yeast and human. Mitotic defects in barr embryos become apparent during cycle 16, resulting in a loss of PNS and CNS neurons. 
Centromeres move apart at the metaphase-anaphase transition and Cyclin B is degraded, but sister chromatids remain connected, resulting in chromatin bridging. Barren protein localises to chromatin throughout mitosis. Colocalisation and biochemical experiments indicate that Barren associates with Topoisomerase II throughout mitosis and alters the activity of Topoisomerase II. It has been suggested that this association is required for proper chromosomal segregation by facilitating the decatenation of chromatids at anaphase. 45682 pfam05787: Bacterial protein of unknown function (DUF839). This family consists of several bacterial proteins of unknown function. 45683 pfam05788: Orbivirus RNA-dependent RNA polymerase (VP1). This family consists of the RNA-dependent RNA polymerase protein VP1 from the Orbiviruses. VP1 may have both enzymatic and structural roles in the virus life cycle. 45684 pfam05789: Baculovirus VP1054 protein. This family consists of several VP1054 proteins from the Baculoviruses. VP1054 is a virus structural protein required for nucleocapsid assembly. 45685 pfam05790: T-cell surface antigen CD2 protein. This family consists of several mammalian T-cell surface antigen CD2 proteins as well as homologous African swine fever virus sequences. CD2 mediates T cell adhesion via its ectodomain and signal transduction utilising its 117-amino acid cytoplasmic tail. The structural and functional similarities of the African swine fever virus (ASFV) LMW8-DR to CD2, a protein that is involved in cell-cell adhesion and immune response modulation, suggest a possible role in the pathogenesis of ASFV infection. 45686 pfam05791: Bacillus haemolytic enterotoxin (HBL). This family consists of several Bacillus haemolytic enterotoxins (HblC, HblD, HblA, NheA, and NheB) which can cause food poisoning in humans. 45687 pfam05792: Candida agglutinin-like protein (ALS). This family consists of several agglutinin-like proteins from different Candida species. 
ALS genes of Candida albicans encode a family of cell-surface glycoproteins with a three-domain structure. Each Als protein has a relatively conserved N-terminal domain, a central domain consisting of a tandemly repeated motif, and a serine-threonine-rich C-terminal domain that is relatively variable across the family. The ALS family exhibits several types of variability that indicate the importance of considering strain and allelic differences when studying ALS genes and their encoded proteins. 45688 pfam05793: Transcription initiation factor IIF, alpha subunit (TFIIF-alpha). Transcription initiation factor IIF, alpha subunit (TFIIF-alpha) or RNA polymerase II-associating protein 74 (RAP74) is the large subunit of transcription factor IIF (TFIIF), which is essential for accurate initiation and stimulates elongation by RNA polymerase II. 45689 pfam05794: T-complex protein 11. This family consists of several eukaryotic T-complex protein 11 (Tcp11) related sequences. Tcp11 is only expressed in fertile adult mammalian testes and is thought to be important in sperm function and fertility. The family also contains the yeast Sok1 protein which is known to suppress cyclic AMP-dependent protein kinase mutants. 45690 pfam05795: Plasmodium vivax Vir protein. This family consists of several Vir proteins specific to Plasmodium vivax. The vir genes are present at about 600-1,000 copies per haploid genome and encode proteins that are immunovariant in natural infections, indicating that they may have a functional role in establishing chronic infection through antigenic variation. 45691 pfam05796: Chordopoxvirus protein G2. This family consists of several Chordopoxvirus isatin-beta-thiosemicarbazone dependent protein (protein G2) sequences. Inactivation of the gene coding for this protein renders the virus dependent upon isatin-beta-thiosemicarbazone (IBT) for growth. 45692 pfam05797: Yeast trans-acting factor (REP1/REP2). 
This family consists of the yeast trans-acting factor B and C (REP1 and 2) proteins. The yeast plasmid stability system consists of two plasmid-coded proteins, Rep1 and Rep2, and a cis-acting locus, STB. The Rep proteins show both self- and cross-interactions in vivo and in vitro, and bind to the STB DNA with assistance from host factor(s). Within the yeast nucleus, the Rep1 and Rep2 proteins tightly associate with STB-containing plasmids into well organised plasmid foci that form a cohesive unit in partitioning. It is generally accepted that the protein-protein and DNA-protein interactions engendered by the Rep-STB system are central to plasmid partitioning. Point mutations in Rep1 that knock out interaction with Rep2 or with STB simultaneously block the ability of these Rep1 variants to support plasmid stability. 45693 pfam05798: Bacteriophage FRD3 protein. This family consists of bacteriophage FRD3 proteins. 45694 pfam05799: Cytochrome c oxidase subunit Vc (COX5C). This family consists of plant cytochrome c oxidase subunit 5c proteins. 45695 pfam05800: Gas vesicle synthesis protein GvpO. This family consists of archaeal GvpO proteins which are required for gas vesicle synthesis. The family also contains two related sequences from Streptomyces coelicolor. 45696 pfam05801: Lagovirus protein of unknown function (DUF840). This family consists of several Lagovirus sequences of unknown function, largely from rabbit hemorrhagic disease virus. 45697 pfam05802: Enterobacterial EspB protein. EspB is a type-III-secreted pore-forming protein of enteropathogenic Escherichia coli (EPEC) which is essential for EPEC pathogenesis. EspB is also found in Citrobacter rodentium. 45698 pfam05803: Chordopoxvirus L2 protein. This family consists of several Chordopoxvirus L2 proteins. 45699 pfam05804: Kinesin-associated protein (KAP). This family consists of several eukaryotic kinesin-associated (KAP) proteins. 
Kinesins are intracellular multimeric transport motor proteins that move cellular cargo on microtubule tracks. It has been shown that the sea urchin KRP85/95 holoenzyme associates with a KAP115 non-motor protein, forming a heterotrimeric complex in vitro, called Kinesin-II. 45700 pfam05805: L6 membrane protein. This family consists of several eukaryotic L6 membrane proteins. L6, IL-TMP, and TM4SF5 are cell surface proteins predicted to have four transmembrane domains. Previous sequence analysis led to their assignment as members of the tetraspanin superfamily; it has now been found that they are not significantly related to genuine tetraspanins, but instead constitute their own L6 family. Several members of this family have been implicated in human cancer. 45701 pfam05806: Noggin. This family consists of the eukaryotic Noggin proteins. Noggin is a glycoprotein that binds bone morphogenetic proteins (BMPs) selectively and, when added to osteoblasts, it opposes the effects of BMPs. It has been found that noggin arrests the differentiation of stromal cells, preventing cellular maturation. 45702 pfam05807: Mycoplasma arthritidis MAA2 repeat. This family consists of a series of repeated 73 residue sequences from the Mycoplasma arthritidis MAA2 variable surface protein. MAA2 is implicated in cytoadherence and virulence and has been shown to exhibit both size and phase variability. 45703 pfam05808: Podoplanin. This family consists of several mammalian podoplanin like proteins which are thought to control specifically the unique shape of podocytes. 45704 pfam05809: Eukaryotic protein of unknown function (DUF841). This family consists of several eukaryotic proteins with no known function. 45705 pfam05810: NinF protein. This family consists of several bacteriophage NinF proteins as well as related sequences from E. coli. 45706 pfam05811: Eukaryotic protein of unknown function (DUF842). 
This family consists of a number of conserved eukaryotic proteins of unknown function. 45707 pfam05812: Herpesvirus BLRF2 protein. This family consists of several Herpesvirus BLRF2 proteins. The family also contains the C terminal region of hypothetical human and mouse sequences, which align with the N terminus of the viral sequences. 45708 pfam05813: Orthopoxvirus F7 protein. 45709 pfam05814: Baculovirus protein of unknown function (DUF843). This family consists of several Baculovirus proteins around 85 residues long with no known function. 45710 pfam05815: Baculovirus protein of unknown function (DUF844). This family consists of several Baculovirus sequences between 350 and 380 residues long. The family has no known function. 45711 pfam05816: Toxic anion resistance protein (TelA). This family consists of several prokaryotic TelA like proteins. TelA and KlA are associated with tellurite resistance and plasmid fertility inhibition. 45712 pfam05817: Ribophorin II (RPN2). This family consists of several eukaryotic Ribophorin II (RPN2) proteins. The mammalian oligosaccharyltransferase (OST) is a protein complex that effects the cotranslational N-glycosylation of newly synthesised polypeptides, and is composed of at least four rough ER-specific membrane proteins: ribophorins I and II (RI and RII), OST48, and Dad1. The mechanism(s) by which the subunits of this complex are retained in the ER are not well understood. 45713 pfam05818: Enterobacterial TraT complement resistance protein. The traT gene is one of the F factor transfer genes and encodes an outer membrane protein which is involved in interactions between Escherichia coli and its surroundings. 45714 pfam05819: NolX protein. This family consists of Rhizobium NolX and Xanthomonas HrpF proteins. The interaction between the plant pathogen Xanthomonas campestris pv. 
vesicatoria and its host plants is controlled by hrp genes (hypersensitive reaction and pathogenicity), which encode a type III protein secretion system. Among type III-secreted proteins are avirulence proteins, effectors involved in the induction of plant defence reactions. HrpF, which is dispensable for protein secretion but required for AvrBs3 recognition in planta, is thought to function as a translocator of effector proteins into the host cell. NolX, a soybean cultivar specificity protein, is secreted by a type III secretion system (TTSS) and shows homology to HrpF of the plant pathogen Xanthomonas campestris pv. vesicatoria. It is not known whether NolX functions at the bacterium-plant interface or acts inside the host cell. NolX is expressed in planta only during the early stages of nodule development. 45715 pfam05820: Baculovirus protein of unknown function (DUF845). This family consists of several highly related Baculovirus proteins of unknown function. 45716 pfam05821: NADH-ubiquinone oxidoreductase ASHI subunit (CI-ASHI or NDUFB8). This family consists of several eukaryotic NADH-ubiquinone oxidoreductase ASHI subunit (CI-ASHI) proteins. NADH:ubiquinone oxidoreductase (complex I) is an extremely complicated multiprotein complex located in the inner mitochondrial membrane. Its main function is the transport of electrons from NADH to ubiquinone, which is accompanied by translocation of protons from the mitochondrial matrix to the intermembrane space. Human complex I appears to consist of 41 subunits. 45717 pfam05822: Pyrimidine 5'-nucleotidase (UMPH-1). This family consists of several eukaryotic pyrimidine 5'-nucleotidase proteins. P5'N-1, also known as uridine monophosphate hydrolase-1 (UMPH-1), is a member of a large functional group of enzymes, characterised by the ability to dephosphorylate nucleic acids. P5'N-1 catalyses the dephosphorylation of pyrimidine nucleoside monophosphates to the corresponding nucleosides. 
Deficiencies in this protein's function can lead to several different disorders in humans. 45718 pfam05823: Nematode fatty acid retinoid binding protein (Gp-FAR-1). Parasitic nematodes produce at least two structurally novel classes of small helix-rich retinol- and fatty-acid-binding proteins that have no counterparts in their plant or animal hosts and thus represent potential targets for new nematicides. Gp-FAR-1 is a member of the nematode-specific fatty-acid- and retinol-binding (FAR) family of proteins but localises to the surface of the organism, placing it in a strategic position for interaction with the host. Gp-FAR-1 functions as a broad-spectrum retinol- and fatty-acid-binding protein, and it is thought to be involved in the evasion of primary host plant defence systems. 45719 pfam05824: Pro-melanin-concentrating hormone (Pro-MCH). This family consists of several mammalian pro-melanin-concentrating hormone (Pro-MCH) 1 and 2 proteins. Melanin-concentrating hormone (MCH) is a 19 amino acid cyclic peptide that was first isolated from the pituitary of teleost fish. It is produced from pro-MCH that encodes, in addition to MCH, NEI, and a putative peptide, NGE. In lower vertebrates, MCH acts to regulate skin colour by antagonising the melanin-dispersing actions of alpha-melanocyte stimulating hormone (alpha-MSH). In mammals, MCH serves as a neuropeptide and is found in many regions of the brain and especially the hypothalamus. It affects many types of behaviours such as appetite, sexual receptivity, aggression, and anxiety. MCH also stimulates the release of luteinising hormone. 45720 pfam05825: Beta-microseminoprotein (PSP-94). This family consists of the mammalian specific protein beta-microseminoprotein. Prostatic secretory protein of 94 amino acids (PSP94), also called beta-microseminoprotein, is a small, nonglycosylated protein, rich in cysteine residues. It was first isolated as a major protein from human seminal plasma. 
The exact function of this protein is unknown. 45721 pfam05826: Phospholipase A2. This family consists of several phospholipase A2 like proteins mostly from insects. 45722 pfam05827: Vacuolar ATP synthase subunit S1 (ATP6S1). This family consists of eukaryotic vacuolar ATP synthase subunit S1 proteins. 45723 pfam05829: Adenovirus late L2 mu core protein (Protein X). This family consists of several Adenovirus late L2 mu core protein or Protein X sequences. 45724 pfam05830: Nodulation protein Z (NodZ). The nodulation genes of Rhizobia are regulated by the nodD gene product in response to host-produced flavonoids and appear to encode enzymes involved in the production of a lipo-chitose signal molecule required for infection and nodule formation. NodZ is required for the addition of a 2-O-methylfucose residue to the terminal reducing N-acetylglucosamine of the nodulation signal. This substitution is essential for the biological activity of this molecule. Mutations in nodZ result in defective nodulation. nodZ represents a unique nodulation gene that is not under the control of NodD and yet is essential for the synthesis of an active nodulation signal. 45725 pfam05831: GAGE protein. This family consists of several GAGE and XAGE proteins which are found exclusively in humans. The function of this family is unknown although they have been implicated in human cancers. 45726 pfam05832: Eukaryotic protein of unknown function (DUF846). This family consists of several proteins of unknown function from a variety of eukaryotic organisms. 45727 pfam05833: Fibronectin-binding protein A N-terminus (FbpA). This family consists of the N-terminal region of the prokaryotic fibronectin-binding protein. Fibronectin binding is considered to be an important virulence factor in streptococcal infections. Fibronectin is a dimeric glycoprotein that is present in a soluble form in plasma and extracellular fluids; it is also present in a fibrillar form on cell surfaces. 
Both the soluble and cellular forms of fibronectin may be incorporated into the extracellular tissue matrix. While fibronectin has critical roles in eukaryotic cellular processes, such as adhesion, migration and differentiation, it is also a substrate for the attachment of bacteria. The binding of pathogenic Streptococcus pyogenes and Staphylococcus aureus to epithelial cells via fibronectin facilitates their internalisation and systemic spread within the host. 45728 pfam05834: Lycopene cyclase protein. This family consists of lycopene beta and epsilon cyclase proteins. Carotenoids with cyclic end groups are essential components of the photosynthetic membranes in all plants, algae, and cyanobacteria. These lipid-soluble compounds protect against photo-oxidation, harvest light for photosynthesis, and dissipate excess light energy absorbed by the antenna pigments. The cyclisation of lycopene (psi, psi-carotene) is a key branch point in the pathway of carotenoid biosynthesis. Two types of cyclic end groups are found in higher plant carotenoids: the beta and epsilon rings. Carotenoids with two beta rings are ubiquitous, and those with one beta and one epsilon ring are common; however, carotenoids with two epsilon rings are rare. 45729 pfam05835: Synaphin protein. This family consists of several eukaryotic synaphin 1 and 2 proteins. Synaphin/complexin is a cytosolic protein that preferentially binds to syntaxin within the SNARE complex. Synaphin promotes SNAREs to form precomplexes that oligomerise into higher order structures. A peptide from the central, syntaxin binding domain of synaphin competitively inhibits these two proteins from interacting and prevents SNARE complexes from oligomerising. It is thought that oligomerisation of SNARE complexes into a higher order structure creates a SNARE scaffold for efficient, regulated fusion of synaptic vesicles. 
Synaphin promotes neuronal exocytosis by promoting interaction between the complementary syntaxin and synaptobrevin transmembrane regions that reside in opposing membranes prior to fusion. 45730 pfam05836: Chorion protein S16. This family consists of several examples of the fruit fly specific chorion protein S16. The chorion genes of Drosophila are amplified in response to developmental signals in the follicle cells of the ovary. 45731 pfam05837: Centromere protein H (CENP-H). This family consists of several eukaryotic centromere protein H (CENP-H) sequences. The macromolecular centromere-kinetochore complex plays a critical role in sister chromatid separation, but its complete protein composition as well as its precise dynamic function during mitosis has not yet been clearly determined. CENP-H contains a coiled-coil structure and a nuclear localisation signal. CENP-H is specifically and constitutively localised in kinetochores throughout the cell cycle. CENP-H may play a role in kinetochore organisation and function throughout the cell cycle. 45732 pfam05838: Protein of unknown function (DUF847). This family consists of several hypothetical bacterial sequences as well as one viral sequence. The function of this family is unknown. 45733 pfam05839: Apc13p protein. The anaphase-promoting complex (APC) is a conserved multi-subunit ubiquitin ligase required for the degradation of key cell cycle regulators. Members of this family are components of the anaphase-promoting complex homologous to Apc13p. 45734 pfam05840: Bacteriophage replication gene A protein (GPA). This family consists of a group of bacteriophage replication gene A protein (GPA) like sequences from both viruses and bacteria. The members of this family are likely to be endonucleases. 45735 pfam05841: Apc15p protein. 
The anaphase-promoting complex (APC) is a conserved multi-subunit ubiquitin ligase required for the degradation of key cell cycle regulators. Members of this family are components of the anaphase-promoting complex homologous to Apc15p. 45736 pfam05842: Euplotes octocarinatus mating pheromone protein. This family consists of several mating pheromone proteins from Euplotes octocarinatus. Cells of the ten mating types of the ciliate Euplotes octocarinatus communicate by pheromones before they enter conjugation. The pheromones induce homotypic pairing when applied to mating types that do not secrete the same pheromone(s). Heterotypic pairs (i.e., those between cells of different mating types) are formed only when both mating types in a mixture secrete a pheromone that the other does not. The genetics of mating types is based on four codominant mating type alleles, each allele determining production of a different pheromone. The pheromones not only induce pair formation but also attract cells. 45737 pfam05843: Suppressor of forked protein (Suf). This family consists of several eukaryotic suppressor of forked (Suf) like proteins. The Drosophila melanogaster Suppressor of forked [Su(f)] protein shares homology with the yeast RNA14 protein and the 77-kDa subunit of human cleavage stimulation factor, which are proteins involved in mRNA 3' end formation. This suggests a role for Su(f) in mRNA 3' end formation in Drosophila. The su(f) gene produces three transcripts; two of them are polyadenylated at the end of the transcription unit, and one is a truncated transcript, polyadenylated in intron 4. It is thought that su(f) plays a role in the regulation of poly(A) site utilisation and that the GU-rich sequence plays an important role in this regulation. 45738 pfam05844: YopD protein. This family consists of several bacterial YopD like proteins. 
Virulent Yersinia species harbour a common plasmid that encodes essential virulence determinants (Yersinia outer proteins [Yops]), which are regulated by the extracellular stimuli Ca2+ and temperature. YopD is thought to be a possible transmembrane protein and contains an amphipathic alpha-helix in its carboxy terminus. 45739 pfam05845: Bacterial phosphonate metabolism protein (PhnH). This family consists of several bacterial PhnH sequences which are known to be involved in phosphonate metabolism. 45740 pfam05846: Chordopoxvirus A15 protein. This family consists of several Chordopoxvirus A15 like sequences. 45741 pfam05847: Nucleopolyhedrovirus late expression factor 3 (LEF-3). This family consists of Nucleopolyhedrovirus late expression factor 3 (LEF-3) sequences which are known to be ssDNA-binding proteins. Alkaline nuclease (AN) and LEF-3 may participate in homologous recombination of the baculovirus genome in a manner similar to that of exonuclease (Redalpha) and DNA-binding protein (Redbeta) of the Red-mediated homologous recombination system of bacteriophage lambda. 45742 pfam05848: Firmicute transcriptional repressor of class III stress genes (CtsR). This family consists of several Firmicute transcriptional repressor of class III stress genes (CtsR) proteins. CtsR of L. monocytogenes negatively regulates the clpC, clpP and clpE genes belonging to the CtsR regulon. 45743 pfam05849: Fibroin light chain (L-fibroin). This family consists of several moth fibroin light chain (L-fibroin) proteins. Fibroin of the silkworm, Bombyx mori, is secreted into the lumen of the posterior silk gland (PSG) from the surrounding PSG cells as a molecular complex consisting of a heavy (H)-chain of approximately 350 kDa, a light (L)-chain of 25 kDa and a P25 of about 27 kDa. The H- and L-chains are disulfide-linked but P25 is associated with the H-L complex by non-covalent force. 45744 pfam05851: Lentivirus virion infectivity factor (VIF). 
This family consists of several feline specific Lentivirus virion infectivity factor (VIF) proteins. VIF is essential for productive FIV infection of host target cells in vitro. 45745 pfam05852: Gammaherpesvirus protein of unknown function (DUF848). This family consists of several uncharacterised proteins from the Gammaherpesvirinae. 45746 pfam05853: Prokaryotic protein of unknown function (DUF849). This family consists of several hypothetical prokaryotic proteins with no known function. 45747 pfam05854: Non-histone chromosomal protein MC1. This family consists of archaeal chromosomal protein MC1 sequences which protect DNA against thermal denaturation. 45748 pfam05855: Lipooligosaccharide sialyltransferase (LST). This family consists of several bacterial lipooligosaccharide sialyltransferases similar to the Haemophilus ducreyi LST protein. Haemophilus ducreyi is the cause of the sexually transmitted disease chancroid and produces a lipooligosaccharide (LOS) containing a terminal sialyl N-acetyllactosamine trisaccharide. 45749 pfam05856: ARP2/3 complex 20 kDa subunit (ARPC4). This family consists of several eukaryotic ARP2/3 complex 20 kDa subunit (P20-ARC) proteins. The Arp2/3 protein complex has been implicated in the control of actin polymerisation in cells. The human complex consists of seven subunits which include the actin related proteins Arp2 and Arp3. It has been suggested that the complex promotes actin assembly in lamellipodia and may participate in lamellipodial protrusion. 45750 pfam05857: TraX protein. This family consists of several bacterial TraX proteins. TraX is responsible for the amino-terminal acetylation of F-pilin subunits. 45751 pfam05858: Bovine immunodeficiency virus surface envelope protein (ENV). The bovine lentivirus also known as the bovine immunodeficiency-like virus (BIV) has conserved and hypervariable regions in the surface envelope gene. 45752 pfam05859: Mis12 protein. 
Kinetochores are the chromosomal sites for spindle interaction and play a vital role in chromosome segregation. Fission yeast kinetochore protein Mis12 is required for correct spindle morphogenesis, determining metaphase spindle length. Thirty-five to sixty percent extension of metaphase spindle length takes place in Mis12 mutants. It has been shown that Mis12 might genetically interact with Mal2p. 45753 pfam05860: Haemagglutination activity domain. This domain is suggested to be a carbohydrate-dependent haemagglutination activity site. It is found in a range of haemagglutinins and haemolysins. 45754 pfam05861: Bacterial phosphonate metabolism protein (PhnI). This family consists of several Proteobacterial phosphonate metabolism protein (PhnI) sequences. Bacteria that use phosphonates as a phosphorus source must be able to break the stable carbon-phosphorus bond. In Escherichia coli phosphonates are broken down by a C-P lyase that has a broad substrate specificity. The genes for phosphonate uptake and degradation in E. coli are organised in an operon of 14 genes, named phnC to phnP. Three gene products (PhnC, PhnD and PhnE) comprise a binding protein-dependent phosphonate transporter, which also transports phosphate, phosphite, and certain phosphate esters such as phosphoserine; two gene products (PhnF and PhnO) may have a role in gene regulation; and nine gene products (PhnG, PhnH, PhnI, PhnJ, PhnK, PhnL, PhnM, PhnN, and PhnP) probably comprise a membrane-associated C-P lyase enzyme complex. 45755 pfam05862: Helicobacter pylori IceA2 protein. This family consists of several Helicobacter pylori specific IceA2 proteins. The function of this family is unknown. 45756 pfam05863: Eukaryotic protein of unknown function (DUF850). This family consists of several eukaryotic putative membrane proteins of unknown function. 45757 pfam05864: Chordopoxvirus DNA-directed RNA polymerase 7 kDa polypeptide (RPO7). 
This family consists of several Chordopoxvirus DNA-directed RNA polymerase 7 kDa polypeptide sequences. DNA-dependent RNA polymerase catalyses the transcription of DNA into RNA. 45758 pfam05865: Cypovirus polyhedrin protein. This family consists of several Cypovirus polyhedrin proteins. Polyhedrin is known to form a crystalline matrix (polyhedra) in infected insect cells. 45759 pfam05866: Endodeoxyribonuclease RusA. This family consists of several bacterial and phage Holliday junction resolvase (RusA) like proteins. The RusA protein of Escherichia coli is an endonuclease that can resolve Holliday intermediates and correct the defects in genetic recombination and DNA repair associated with inactivation of RuvAB or RuvC. 45760 pfam05867: Caenorhabditis elegans repeat of unknown function (DUF851). This family consists of several repeated 32 residue sequences which seem only to be found in a Caenorhabditis elegans protein. 45761 pfam05868: Rotavirus major outer capsid protein VP7. This family consists of several Rotavirus major outer capsid protein VP7 sequences. The rotavirus capsid is composed of three concentric protein layers. Proteins VP4 and VP7 comprise the outer layer. VP4 forms spikes and is the viral attachment protein. VP7 is a glycoprotein and the major constituent of the outer protein layer. 45762 pfam05869: DNA N-6-adenine-methyltransferase (Dam). This family consists of several bacterial and phage DNA N-6-adenine-methyltransferase (Dam) like sequences. 45763 pfam05870: Phenolic acid decarboxylase (PAD). This family consists of several bacterial phenolic acid decarboxylase proteins. Phenolic acids, also called substituted cinnamic acids, are important lignin-related aromatic acids and natural constituents of plant cell walls. These acids (particularly ferulic, p-coumaric, and caffeic acids) bind the complex lignin polymer to the hemicellulose and cellulose in plants. 
The Phenolic acid decarboxylase (PAD) gene (pad) is transcriptionally regulated by p-coumaric, ferulic, or caffeic acid; these three acids are the three substrates of PAD. 45764 pfam05871: Eukaryotic protein of unknown function (DUF852). This family consists of several eukaryotic proteins with no known function. The family contains a fragment match to a protein, which is thought to be a ubiquitin carrier protein. 45765 pfam05872: Bacterial protein of unknown function (DUF853). This family consists of several bacterial proteins of unknown function. One member is thought to be an ATPase. 45766 pfam05873: ATP synthase D chain, mitochondrial (ATP5H). This family consists of several ATP synthase D chain, mitochondrial (ATP5H) proteins. Subunit d has no extensive hydrophobic sequences, and is not apparently related to any subunit described in the simpler ATP synthases in bacteria and chloroplasts. 45767 pfam05874: Pheromone biosynthesis activating neuropeptide (PBAN). This family consists of several moth pheromone biosynthesis activating neuropeptide (PBAN) sequences. Female moths produce and release species specific sex pheromones to attract males for mating. Pheromone biosynthesis is hormonally regulated by the Pheromone Biosynthesis Activating Neuropeptide (PBAN) which is biosynthesised in the subesophageal ganglion (SOG). 45768 pfam05875: Alkaline phytoceramidase (aPHC). This family consists of several eukaryotic Alkaline phytoceramidase (aPHC) sequences. Ceramidases are enzymes involved in regulating cellular levels of ceramides, sphingoid bases, and their phosphates. Alkaline phytoceramidase (aPHC) is responsible for the hydrolysis of phytoceramide. 45769 pfam05876: Phage terminase large subunit (GpA). This family consists of several phage terminase large subunit proteins as well as related sequences from several bacterial species. 
The DNA packaging enzyme of bacteriophage lambda, terminase, is a heteromultimer composed of a small subunit, gpNu1, and a large subunit, gpA, products of the Nu1 and A genes, respectively. Terminase is involved in the site-specific binding and cutting of the DNA in the initial stages of packaging. It is now known that gpA is actively involved in late stages of packaging, including DNA translocation, and that this enzyme contains separate functional domains for its early and late packaging activities. 45770 pfam05877: Caenorhabditis elegans repeat of unknown function (DUF854). This family consists of a number of 36 residue repeated sequences from a protein with unknown function. 45771 pfam05878: Phytoreovirus nonstructural protein Pns9/Pns10. This family consists of the Phytoreovirus nonstructural proteins Pns9 and Pns10. The function of this family is unknown. 45772 pfam05879: Root hair defective 3 GTP-binding protein (RHD3). This family consists of several eukaryotic root hair defective 3 like GTP-binding proteins. It has been speculated that the RHD3 protein is a member of a novel class of GTP-binding proteins that is widespread in eukaryotes and required for regulated cell enlargement. The family also contains the homologous yeast synthetic enhancement of YOP1 (SEY1) protein which is involved in membrane trafficking. 45773 pfam05880: Fijivirus 64 kDa capsid protein. This family consists of several Fijivirus 64 kDa capsid proteins. 45774 pfam05881: 2',3'-cyclic nucleotide 3'-phosphodiesterase (CNP or CNPase). This family consists of the eukaryotic protein 2',3'-cyclic nucleotide 3'-phosphodiesterase (CNP). 2',3'-cyclic nucleotide 3'-phosphodiesterase (CNP) is one of the earliest myelin-related proteins expressed in differentiating oligodendrocytes and Schwann cells. CNP is abundant in the central nervous system and in oligodendrocytes. This protein is also found in mammalian photoreceptor cells, testis and lymphocytes. 
Although the biological function of CNP is unknown, it is thought to play a significant role in the formation of the myelin sheath, where it comprises 4% of total protein. CNP selectively cleaves 2',3'-cyclic nucleotides to produce 2'-nucleotides in vitro. Although physiologically relevant substrates with 2',3'-cyclic termini are still unknown, numerous cyclic phosphate containing RNAs occur transiently within eukaryotic cells. Other known protein families capable of hydrolysing 2',3'-cyclic nucleotides include tRNA ligases and plant cyclic phosphodiesterases. The catalytic domains from all these proteins contain two tetra-peptide motifs H-X-T/S-X, where X is usually a hydrophobic residue. Mutation of either histidine in CNP abolishes enzymatic activity. 45775 pfam05882: ACN9 family. Mutants of the yeast ACN9 gene have two- to fourfold elevated levels of enzymes of the glyoxylate cycle, gluconeogenesis, and acetyl-CoA metabolism. The ACN9 protein was localised to the mitochondrial intermembrane space. 45776 pfam05883: Baculovirus protein of unknown function (DUF855). This family consists of several Baculovirus proteins of around 130 residues in length. The function of this family is unknown. 45777 pfam05884: Caenorhabditis elegans protein of unknown function (DUF856). This family consists of several Caenorhabditis elegans specific proteins of unknown function. 45778 pfam05885: Domain of unknown function (DUF857). This family consists of several domains of unknown function which are commonly found in multiple copies in the pfam00246 family. 45779 pfam05886: Orthopoxvirus F8 protein. This family consists of several Orthopoxvirus F8 proteins. The function of this family is unknown. 45780 pfam05887: Procyclic acidic repetitive protein (PARP). This family consists of several Trypanosoma brucei procyclic acidic repetitive protein (PARP) like sequences. 
The procyclic acidic repetitive protein (parp) genes of Trypanosoma brucei encode a small family of abundant surface proteins whose expression is restricted to the procyclic form of the parasite. They are found at two unlinked loci, parpA and parpB; transcription of both loci is developmentally regulated. 45781 pfam05888: Totivirus RNA-dependent RNA polymerase. This family consists of several RNA-dependent RNA polymerase proteins from the Totiviruses. 45782 pfam05889: Soluble liver antigen/liver pancreas antigen (SLA/LP autoantigen). This family consists of several eukaryotic and archaeal proteins which are related to the human soluble liver antigen/liver pancreas antigen (SLA/LP autoantigen). Autoantibodies are a hallmark of autoimmune hepatitis, but most are not disease specific. Autoantibodies to soluble liver antigen (SLA) and to liver and pancreas antigen (LP) have been described as disease specific, occurring in about 30% of all patients with autoimmune hepatitis. The function of SLA/LP is unknown; however, it has been suggested that the protein may function as a serine hydroxymethyltransferase and may be an important enzyme in the thus far poorly understood selenocysteine pathway. Some archaeal sequences are annotated as being pyridoxal phosphate-dependent enzymes. 45783 pfam05890: Eukaryotic rRNA processing protein EBP2. This family consists of several Eukaryotic rRNA processing protein EBP2 sequences. Ebp2p is required for the maturation of 25S rRNA and 60S subunit assembly. Ebp2p may be one of the target proteins of Rrs1p for executing the signal to regulate ribosome biogenesis. 45784 pfam05891: Eukaryotic protein of unknown function (DUF858). This family consists of several eukaryotic proteins of unknown function. 45785 pfam05892: Trichovirus coat protein. This family consists of several coat proteins which are specific to the ssRNA positive-strand, no DNA stage viruses such as the Trichovirus and Vitivirus. 45786 pfam05893: Acyl-CoA reductase (LuxC). 
This family consists of several bacterial Acyl-CoA reductase (LuxC) proteins. The channelling of fatty acids into the fatty aldehyde substrate for the bacterial bioluminescence reaction is catalysed by a fatty acid reductase multienzyme complex, which channels fatty acids through the thioesterase (LuxD), synthetase (LuxE) and reductase (LuxC) components. 45787 pfam05894: Podovirus DNA encapsidation protein (Gp16). This family consists of several DNA encapsidation protein (Gp16) sequences from the phi-29-like viruses. Gene product 16 catalyses the in vivo and in vitro genome-encapsidation reaction. 45788 pfam05895: Siphovirus protein of unknown function (DUF859). This family consists of several uncharacterised proteins from the Siphoviruses as well as one bacterial sequence. Some of the members of this family are described as putative minor structural proteins. 45789 pfam05896: Na(+)-translocating NADH-quinone reductase subunit A (NQRA). This family consists of several bacterial Na(+)-translocating NADH-quinone reductase subunit A (NQRA) proteins. The Na(+)-translocating NADH:ubiquinone oxidoreductase (Na(+)-NQR) generates an electrochemical Na(+) potential driven by aerobic respiration. 45790 pfam05897: Lycopene cyclase (CrtY). This family consists of several bacterial Lycopene cyclase (CrtY) proteins. Lycopene cyclase is a key enzyme which converts the acyclic carotenoid lycopene into the cyclic carotenoid beta-carotene. 45791 pfam05898: Plant specific of unknown function (DUF860). This family consists of several plant proteins of unknown function. 45792 pfam05899: Protein of unknown function (DUF861). This family consists of several proteins which seem to be specific to plants and bacteria. The function of this family is unknown. 45793 pfam05900: Gammaherpesvirus BFRF1 protein. This family consists of several Epstein-Barr virus BFRF1 like proteins from the Gammaherpesviruses. 
BFRF1 belongs to the lytic proteins, since its expression is achieved following activation of the EBV replication cycle. Furthermore, it can be classified as an early protein, given the fact that it is only partially inhibited by treatment with PAA and ACV (with the BFRF1 gene behaving like BALF5, a known early gene) and that it is present in Raji cells which harbour a defective EBV strain that does not allow expression of the late lytic genes. 45794 pfam05901: Excalibur domain. Extracellular Ca2+-dependent nuclease YokF from Bacillus subtilis and several other surface-exposed proteins from diverse bacteria are encoded in the genomes in two paralogous forms that differ by a ~45 amino acid fragment, which comprises a novel conserved domain. Sequence analysis of this domain revealed a conserved DxDxDGxxCE motif, which is strikingly similar to the Ca2+-binding loop of the calmodulin-like EF-hand domains, suggesting an evolutionary relationship between them. Functions of many of the other proteins in which the novel domain, named Excalibur (extracellular calcium-binding region), is found, as well as a structural model of its conserved motif are consistent with the notion that the Excalibur domain binds calcium. This domain is but one more example of the diversity of structural contexts surrounding the EF-hand-like calcium-binding loop in bacteria. This loop is thus more widespread than hitherto recognised and the evolution of EF-hand-like domains is probably more complex than previously appreciated. 45795 pfam05902: 4.1 protein C-terminal domain (CTD). At the C-terminus of all known 4.1 proteins is a sequence domain unique to these proteins, known as the C-terminal domain (CTD). Mammalian CTDs are associated with a growing number of protein-protein interactions, although such activities have yet to be associated with invertebrate CTDs. Mammalian CTDs are generally defined by sequence alignment as encoded by exons 18-21. 
Comparison of known vertebrate 4.1 proteins with invertebrate 4.1 proteins indicates that mammalian 4.1 exon 19 represents a vertebrate adaptation that extends the sequence of the CTD with a Ser/Thr-rich sequence. The CTD was first described as a 22/24-kDa domain by chymotryptic digestion of erythrocyte 4.1 (4.1R). CTD is thought to represent an independent folding structure which has gained function since the divergence of vertebrates from invertebrates. 45796 pfam05903: Eukaryotic protein of unknown function (DUF862). This family consists of the N terminal portion of several eukaryotic sequences and is found in both animals and plants. The function of this family is unknown. 45797 pfam05904: Plant protein of unknown function (DUF863). This family consists of a number of hypothetical proteins from Arabidopsis thaliana and Oryza sativa. The function of this family is unknown. 45798 pfam05905: Helicobacter pylori protein of unknown function (DUF864). This family consists exclusively of Helicobacter pylori proteins of unknown function. 45799 pfam05906: Herpesvirus-7 repeat of unknown function (DUF865). This family consists of a series of 12 repeats of 35 amino acids in length which are found exclusively in Herpesvirus-7. The function of this family is unknown. 45800 pfam05907: Eukaryotic protein of unknown function (DUF866). This family consists of a number of hypothetical eukaryotic proteins of unknown function with an average length of around 165 residues. 45801 pfam05908: Protein of unknown function (DUF867). This family consists of a number of bacterial and phage proteins with no known function and is present in Bacillus species and the Lambda-like viruses. 45802 pfam05909: IWS1 C-terminus. This family consists of the C-terminal region of a number of eukaryotic hypothetical proteins which are homologous to the Saccharomyces cerevisiae protein IWS1. IWS1 is known to be a Pol II transcription elongation factor and interacts with Spt6 and Spt5. 
45803 pfam05910: Plant protein of unknown function (DUF868). This family consists of several hypothetical proteins from Arabidopsis thaliana and Oryza sativa. The function of this family is unknown. 45804 pfam05911: Plant protein of unknown function (DUF869). This family consists of a number of sequences found in Arabidopsis thaliana, Oryza sativa and Lycopersicon esculentum (Tomato). The function of this family is unknown. 45805 pfam05912: Caenorhabditis elegans protein of unknown function (DUF870). This family consists of a number of hypothetical proteins which seem to be specific to Caenorhabditis elegans. The function of this family is unknown. 45806 pfam05913: Bacterial protein of unknown function (DUF871). This family consists of several conserved hypothetical proteins from bacteria and archaea. The function of this family is unknown. 45807 pfam05914: RIB43A. This family consists of several RIB43A-like eukaryotic proteins. Ciliary and flagellar microtubules contain a specialised set of protofilaments, termed ribbons, that are composed of tubulin and several associated proteins. RIB43A was first characterised in the unicellular biflagellate, Chlamydomonas reinhardtii although highly related sequences are present in several higher eukaryotes including humans. The function of this protein is unknown although the structure of RIB43A and its association with the specialised protofilament ribbons and with basal bodies is relevant to the proposed role of ribbons in forming and stabilising doublet and triplet microtubules and in organising their three-dimensional structure. Human RIB43A homologues could represent a structural requirement in centriole replication in dividing cells. 45808 pfam05915: Eukaryotic protein of unknown function (DUF872). This family consists of several uncharacterised eukaryotic proteins. The function of this family is unknown. 45809 pfam05916: Synthetic lethal mutants of dpb11-1 five. 
The GINS complex is essential for initiation of DNA replication in Xenopus egg extracts. This 100 kD stable complex includes Sld5, Psf1, Psf2, and Psf3. Homologues of these components are found also in yeasts and in humans. 45810 pfam05917: Helicobacter pylori protein of unknown function (DUF874). This family consists of several hypothetical proteins specific to Helicobacter pylori. The function of this family is unknown. 45811 pfam05918: Apoptosis inhibitory protein 5 (API5). This family consists of apoptosis inhibitory protein 5 (API5) sequences from several organisms. Apoptosis or programmed cell death is a physiological form of cell death that occurs in embryonic development and organ formation. It is characterised by biochemical and morphological changes such as DNA fragmentation and cell volume shrinkage. API5 is an anti apoptosis gene located in human chromosome 11, whose expression prevents the programmed cell death that occurs upon the deprivation of growth factors. 45812 pfam05919: Mitovirus RNA-dependent RNA polymerase. This family consists of several Mitovirus RNA-dependent RNA polymerase proteins. The family also contains fragment matches in the mitochondria of Arabidopsis thaliana. 45813 pfam05920: Coprinus cinereus mating-type protein. This family consists of several mating-type alpha and beta proteins from Coprinus cinereus (Inky cap fungus) as well as a related sequence from Schizophyllum commune (Bracket fungus). The A mating type locus of the fungus Coprinus cinereus is a complex, multigenic locus which regulates compatibility and subsequent sexual development. 45814 pfam05921: Actinomycete protein of unknown function (DUF875). This family consists of a series of hypothetical bacterial proteins of around 150 residues in length. The family contains members from Mycobacterium tuberculosis, Mycobacterium leprae and Streptomyces coelicolor and its function is unknown. 45815 pfam05922: Subtilisin N-terminal Region. 
This family is found at the N-terminus of a number of subtilisins. It is cleaved prior to activation of the enzyme. 45816 pfam05923: APC cysteine-rich region. This short region is found repeated in the mid region of the adenomatous polyposis proteins (APCs). In the human protein many cancer-linked SNPs are found near the first three occurrences of the motif. These repeats bind beta-catenin. 45817 pfam05924: SAMP Motif. This short region is found repeated in the mid region of the adenomatous polyposis proteins (APCs). This motif binds axin. 45818 pfam05925: Enterobacterial virulence protein IpgD. This family consists of several enterobacterial IpgD like virulence factor proteins. In the Gram-negative pathogen Shigella flexneri, the virulence factor IpgD is translocated directly into eukaryotic cells and acts as a potent inositol 4-phosphatase that specifically dephosphorylates phosphatidylinositol 4,5-bisphosphate [PtdIns(4,5)P(2)] into phosphatidylinositol 5-monophosphate [PtdIns(5)P] that then accumulates. Transformation of PtdIns(4,5)P(2) into PtdIns(5)P by IpgD is responsible for dramatic morphological changes of the host cell, leading to a decrease in membrane tether force associated with membrane blebbing and actin filament remodelling. 45819 pfam05926: Phage head completion protein (GPL). This family consists of several phage head completion proteins (GPL) as well as related bacterial sequences. Members of this family allow the completion of filled heads by rendering newly packaged DNA in the heads resistant to DNase. The protein is thought to bind to DNA filled capsids. 45820 pfam05927: Penaeidin. This family consists of several isoforms of the penaeidin protein which is specific to shrimps. Penaeidins, a unique family of antimicrobial peptides (AMPs) with both proline and cysteine-rich domains, were initially identified in the hemolymph of the Pacific white shrimp, Litopenaeus vannamei. 45821 pfam05928: Zea mays MURB-like protein (MuDR). 
This family consists of several Zea mays specific MURB-like proteins. The transposition of Mu elements underlying Mutator activity in maize requires a transcriptionally active MuDR element. Despite variation in MuDR copy number and RNA levels in Mutator lines, transposition events are consistently late in plant development, and Mu excision frequencies are similar. 45822 pfam05929: Phage capsid scaffolding protein (GPO). This family consists of several bacteriophage capsid scaffolding proteins (GPO) and some related bacterial sequences. GPO is thought to function in both the assembly of proheads and the cleavage of GPN. 45823 pfam05930: Prophage CP4-57 regulatory protein (AlpA). This family consists of several short bacterial and phage proteins which are related to the E. coli protein AlpA. AlpA suppresses two phenotypes of a delta lon protease mutant, overproduction of capsular polysaccharide and sensitivity to UV light. Several of the sequences in this family are thought to be DNA-binding proteins. 45824 pfam05931: Staphylococcal AgrD protein. This family consists of several AgrD proteins from many Staphylococcus species. The agr locus was initially described in Staphylococcus aureus as an element controlling the production of exoproteins implicated in virulence. Its pattern of action has been shown to be complex, upregulating certain extracellular toxins and enzymes expressed post-exponentially and repressing some exponential-phase surface components. AgrD encodes the precursor of the autoinducing peptide (AIP). The AIP derived from AgrD by the action of AgrB interacts with AgrC in the membrane to activate AgrA, which upregulates transcription both from promoter P2, amplifying the response, and from P3, initiating the production of a novel effector: RNAIII. In S. 
aureus, delta-hemolysin is the only translation product of RNAIII and is not involved in the regulatory functions of the transcript, which is therefore the primary agent for modulating the expression of other operons controlled by agr. 45825 pfam05932: Tir chaperone protein (CesT). This family consists of a number of bacterial sequences which are highly similar to the Tir chaperone protein in E. coli. In many Gram-negative bacteria, a key indicator of pathogenic potential is the possession of a specialised type III secretion system, which is utilised to deliver virulence effector proteins directly into the host cell cytosol. Many of the proteins secreted from such systems require small cytosolic chaperones to maintain the secreted substrates in a secretion-competent state. CesT serves a chaperone function for the enteropathogenic Escherichia coli (EPEC) translocated intimin receptor (Tir) protein, which confers upon EPEC the ability to alter host cell morphology following intimate bacterial attachment. 45826 pfam05933: Fungal ATP synthase protein 8 (A6L). This family consists of fungus-specific ATP synthase protein 8 (EC:3.6.3.14). The family may be related to the ATP synthase protein 8 found in other eukaryotes (pfam00895). 45827 pfam05934: Mid-1-related chloride channel (MCLC). This family consists of several mid-1-related chloride channels. Mid-1-related chloride channel (MCLC) proteins function as chloride channels when incorporated into a planar lipid bilayer. 45828 pfam05935: Arylsulfotransferase (ASST). This family consists of several bacterial arylsulfotransferase proteins. Arylsulfotransferase (ASST) transfers a sulfate group from phenolic sulfate esters to a phenolic acceptor substrate. 45829 pfam05936: Bacterial protein of unknown function (DUF876). This family consists of a series of hypothetical bacterial sequences of unknown function. 45830 pfam05937: EB-1 Binding Domain. 
This region at the C-terminus of the APC proteins binds the microtubule-associating protein EB-1. At the C-terminus of the alignment is also a pfam00595 binding domain. A short motif in the middle of the region appears to be found in the APC2 proteins. 45831 pfam05938: Plant self-incompatibility protein S1. This family consists of a series of plant proteins which are related to the Papaver rhoeas S1 self-incompatibility protein. Self-incompatibility (SI) is the single most important outbreeding device found in angiosperms and is a mechanism that regulates the acceptance or rejection of pollen. S1 is known to exhibit specific pollen-inhibitory properties. 45832 pfam05939: Phage minor tail protein. This family consists of a series of phage minor tail proteins and related sequences from several bacterial species. 45833 pfam05940: NnrS protein. This family consists of several bacterial NnrS-like proteins. NnrS is a putative heme-Cu protein and a member of the short-chain dehydrogenase family. Expression of nnrS is dependent on the transcriptional regulator NnrR, which also regulates expression of genes required for the reduction of nitrite to nitrous oxide, including nirK and nor. NnrS is a haem- and copper-containing membrane protein. Genes encoding putative orthologues of NnrS are sometimes but not always found in bacteria encoding nitrite and/or nitric oxide reductase. 45834 pfam05941: Chordopoxvirus A20R protein. This family consists of several Chordopoxvirus A20R proteins. The A20R protein is required for DNA replication, is associated with the processive form of the viral DNA polymerase, and directly interacts with the viral proteins encoded by the D4R, D5R, and H5R open reading frames. A20R may contribute to the assembly or stability of the multiprotein DNA replication complex. 45835 pfam05942: Archaeal PaREP1 protein. This family consists of several archaeal PaREP1 proteins. The function of this family is unknown. 
45836 pfam05943: Protein of unknown function (DUF877). This family consists of a number of uncharacterised bacterial proteins. The function of this family is unknown. 45837 pfam05944: Phage small terminase subunit. This family consists of several phage small terminase subunit proteins as well as some related bacterial sequences. 45838 pfam05946: Toxin-coregulated pilus subunit TcpA. This family consists of toxin-coregulated pilus subunit (TcpA) proteins from Vibrio cholerae and related sequences. The major virulence factors of toxigenic Vibrio cholerae are cholera toxin (CT), which is encoded by a lysogenic bacteriophage (CTXPhi), and toxin-coregulated pilus (TCP), an essential colonisation factor which is also the receptor for CTXPhi. The genes for the biosynthesis of TCP are part of a larger genetic element known as the TCP pathogenicity island. 45839 pfam05947: Bacterial protein of unknown function (DUF879). This family consists of several hypothetical bacterial proteins of unknown function. 45840 pfam05948: Protein of unknown function (DUF880). This family consists of a number of hypothetical bacterial and plant proteins. The family also contains the C terminal region of a Cysteinyl-tRNA synthetase from Staphylococcus epidermidis. The function of this family is unknown. 45841 pfam05949: Bacterial protein of unknown function (DUF881). This family consists of a series of hypothetical bacterial proteins. One of the family members from Bacillus subtilis is thought to be involved in cell division and sporulation. 45842 pfam05950: Orthopoxvirus A36R protein. This family consists of several Orthopoxvirus A36R proteins. The A36R protein is predicted to be a type Ib membrane protein. 45843 pfam05951: Bacterial protein of unknown function (DUF882). This family consists of a series of hypothetical bacterial proteins of unknown function. 45844 pfam05952: Bacillus competence pheromone ComX. 
Natural genetic competence in Bacillus subtilis is controlled by quorum-sensing (QS). The ComP-ComA two-component system detects the signalling molecule ComX, and this signal is transduced by a conserved phosphotransfer mechanism. ComX is synthesised as an inactive precursor and is then cleaved and modified by ComQ before export to the extracellular environment. 45845 pfam05953: Allatostatin. This family consists of allatostatins, bombystatins, helicostatins, cydiastatins and schistostatin from several insect species. Allatostatins (ASTs) of the Tyr/Phe-Xaa-Phe-Gly-Leu/Ile-NH2 family are a group of insect neuropeptides that inhibit juvenile hormone biosynthesis by the corpora allata. 45846 pfam05954: Phage late control gene D protein (GPD). This family consists of a number of phage late control gene D proteins and related bacterial sequences. 45847 pfam05955: Equine herpesvirus glycoprotein gp2. This family consists of a number of glycoprotein gp2 sequences from equine herpesviruses. 45848 pfam05956: APC basic domain. This region of the APC family of proteins is known as the basic domain. It contains a high proportion of positively charged amino acids and interacts with microtubules. 45849 pfam05957: Bacterial protein of unknown function (DUF883). This family consists of several hypothetical bacterial proteins of unknown function. 45850 pfam05958: tRNA (Uracil-5-)-methyltransferase. This family consists of (Uracil-5-)-methyltransferases EC:2.1.1.35 from bacteria, archaea and eukaryotes. A 5-methyluridine (m(5)U) residue at position 54 is a conserved feature of bacterial and eukaryotic tRNAs. The methylation of U54 is catalysed by the tRNA(m5U54)methyltransferase, which in Saccharomyces cerevisiae is encoded by the nonessential TRM2 gene. It is thought that tRNA modification enzymes might have a role in tRNA maturation not necessarily linked to their known catalytic activity. 45851 pfam05959: Nucleopolyhedrovirus protein of unknown function (DUF884). 
This family consists of several hypothetical Nucleopolyhedrovirus proteins of unknown function. 45852 pfam05960: Bacterial protein of unknown function (DUF885). This family consists of several hypothetical bacterial proteins, several of which are putative membrane proteins. 45853 pfam05961: Chordopoxvirus A13L protein. This family consists of A13L proteins from the Chordopoxviruses. A13L or p8 is one of the three most abundant membrane proteins of the intracellular mature Vaccinia virus. 45854 pfam05962: Bacterial protein of unknown function (DUF886). This family consists of several hypothetical bacterial proteins of unknown function. 45855 pfam05963: Cytomegalovirus US3 protein. US3 of human cytomegalovirus is an endoplasmic reticulum resident transmembrane glycoprotein that binds to major histocompatibility complex class I molecules and prevents their departure. The endoplasmic reticulum retention signal of the US3 protein is contained in the luminal domain of the protein. 45856 pfam05964: F/Y-rich N-terminus. This region is normally found in the trithorax/ALL1 family proteins. It is similar to SMART:SM00541. 45857 pfam05965: F/Y rich C-terminus. This region is normally found in the trithorax/ALL1 family proteins. It is similar to SMART:SM00542. 45858 pfam05966: Chordopoxvirus A33R protein. This family consists of several Chordopoxvirus A33R proteins. A33R plays a role in promoting Ab-resistant cell-to-cell spread of virus and interacts with A36R to incorporate the protein into the outer membrane of intracellular enveloped virions (IEV). 45859 pfam05967: Eukaryotic protein of unknown function (DUF887). This family consists of several conserved eukaryotic proteins of unknown function. 45860 pfam05968: Bacillus PapR protein. This family consists of the Bacillus species specific PapR protein. The papR gene belongs to the PlcR regulon and is located 70 bp downstream from plcR. It encodes a 48-amino-acid peptide. 
Disruption of the papR gene abolishes expression of the PlcR regulon, resulting in a large decrease in haemolysis and virulence in insect larvae. A processed form of PapR activates the PlcR regulon by allowing PlcR to bind to its DNA target. This activating mechanism is strain specific. 45861 pfam05969: Protein of unknown function (DUF888). This family consists of several short hypothetical plant and cyanobacterial proteins. In plants these proteins are localised to the chloroplast and are known as hypothetical chloroplast protein 12. This family is likely to play some role in photosynthesis. 45862 pfam05970: Eukaryotic protein of unknown function (DUF889). The majority of members in this family are from plant species although it does contain some Caenorhabditis elegans sequences. Most of the sequences in the family are described as hypothetical, however some are putative helicases. 45863 pfam05971: Protein of unknown function (DUF890). This family consists of several conserved hypothetical proteins from both eukaryotes and prokaryotes. The function of this family is unknown. 45864 pfam05972: APC 15 residue motif. This motif, known as the 15 aa repeat, is found in the APC protein family. They are involved in binding beta-catenin along with the pfam05923 repeats. Many human cancer mutations map to the region around these motifs, and may be involved in disrupting their binding of beta-catenin. 45865 pfam05973: Bacterial protein of unknown function (DUF891). This family consists of several hypothetical bacterial proteins of unknown function. 45866 pfam05974: Protein of unknown function (DUF892). This family consists of several hypothetical bacterial proteins of unknown function. 45867 pfam05975: Bacterial ABC transporter protein EcsB. This family consists of several bacterial ABC transporter proteins which are homologous to the EcsB protein of Bacillus subtilis. 
EcsB is thought to encode a hydrophobic protein with six membrane-spanning helices in a pattern found in other hydrophobic components of ABC transporters. 45868 pfam05976: Bacterial membrane protein of unknown function (DUF893). This family consists of several putative bacterial membrane proteins of unknown function. 45869 pfam05977: Bacterial protein of unknown function (DUF894). This family consists of several bacterial proteins, many of which are annotated as putative transmembrane transport proteins. 45870 pfam05978: Eukaryotic protein of unknown function (DUF895). This family consists of several hypothetical eukaryotic proteins of unknown function. 45871 pfam05979: Bacterial protein of unknown function (DUF896). This family consists of several short, hypothetical bacterial proteins of unknown function. 45872 pfam05980: Toxin 7. This family consists of several short spider neurotoxin proteins including many from the Funnel-web spider. 45873 pfam05981: CreA protein. This family consists of several bacterial CreA proteins, the function of which is unknown. 45874 pfam05982: Domain of unknown function (DUF897). Family of bacterial proteins with unknown function. 45875 pfam05983: MED7 protein. This family consists of several eukaryotic proteins which are homologues of the yeast MED7 protein. Activation of gene transcription in metazoans is a multistep process that is triggered by factors that recognise transcriptional enhancer sites in DNA. These factors work with co-activators such as MED7 to direct transcriptional initiation by the RNA polymerase II apparatus. 45876 pfam05984: Cytomegalovirus UL20A protein. This family consists of several Cytomegalovirus UL20A proteins. UL20A is thought to be a glycoprotein. 45877 pfam05985: Ethanolamine ammonia-lyase light chain (EutC). This family consists of several bacterial ethanolamine ammonia-lyase light chain (EutC) EC:4.3.1.7 sequences. 
Ethanolamine ammonia-lyase is a bacterial enzyme that catalyses the adenosylcobalamin-dependent conversion of certain vicinal amino alcohols to oxo compounds and ammonia. 45878 pfam05986: ADAM-TS Spacer 1. This family represents the Spacer-1 region from the ADAM-TS family of metalloproteinases. 45879 pfam05987: Bacterial protein of unknown function (DUF898). This family consists of several bacterial proteins of unknown function. Some of the family members are described as putative membrane proteins. 45880 pfam05988: Bacterial protein of unknown function (DUF899). This family consists of several uncharacterised bacterial proteins of unknown function. 45881 pfam05989: Chordopoxvirus A35R protein. This family consists of several Chordopoxvirus sequences homologous to the Vaccinia virus A35R protein. The function of this family is unknown. 45882 pfam05990: Protein of unknown function (DUF900). This family consists of several hypothetical proteins of unknown function mostly found in Rhizobium species. 45883 pfam05991: Protein of unknown function (DUF901). This family consists of several hypothetical bacterial proteins as well as some uncharacterised sequences from Arabidopsis thaliana. The function of this family is unknown. 45884 pfam05992: SbmA/BacA-like family. The Rhizobium meliloti bacA gene encodes a function that is essential for bacterial differentiation into bacteroids within plant cells in the symbiosis between R. meliloti and alfalfa. An Escherichia coli homolog of BacA, SbmA, is implicated in the uptake of microcins and bleomycin. This family is likely to be a subfamily of the ABC transporter family. 45885 pfam05993: Reovirus major virion structural protein Mu-1/Mu-1C (M2). This family consists of several Reovirus major virion structural protein Mu-1/Mu-1C (M2) sequences. This family is thought to play a role in host cell membrane penetration. 45886 pfam05994: Cytoplasmic Fragile-X interacting family. 
CYFIP1/2 (Cytoplasmic fragile X mental retardation interacting protein)-like proteins form a highly conserved protein family. The function of CYFIPs is unclear, but CYFIP interaction with fragile X mental retardation protein (FMRP) involves the domain of FMRP which also mediates homo- and heteromerization. 45887 pfam05995: Cysteine dioxygenase type I. Cysteine dioxygenase type I (EC:1.13.11.20) converts cysteine to cysteinesulphinic acid and is the rate-limiting step in sulphate production. 45888 pfam05996: Ferredoxin-dependent bilin reductase. This family consists of several different but closely related proteins which include phycocyanobilin:ferredoxin oxidoreductase EC:1.3.7.5 (PcyA), 15,16-dihydrobiliverdin:ferredoxin oxidoreductase EC:1.3.7.2 (PebA) and phycoerythrobilin:ferredoxin oxidoreductase EC:1.3.7.3 (PebB). Phytobilins are linear tetrapyrrole precursors of the light-harvesting prosthetic groups of the phytochrome photoreceptors of plants and the phycobiliprotein photosynthetic antennae of cyanobacteria, red algae, and cryptomonads. It is known that phytobilins are synthesised from heme via the intermediacy of biliverdin IX alpha (BV), which is reduced subsequently by ferredoxin-dependent bilin reductases with different double-bond specificities. 45889 pfam05997: Nucleolar protein, Nop52. Nop52 is believed to be involved in the generation of 28S rRNA. 45890 pfam05998: Helicobacter pylori FlgM protein. This family consists of several FlgM proteins from Helicobacter pylori. FlgM is an anti-sigma factor which along with FliA plays a central role in the regulation of flagellar biogenesis in H. pylori. 45891 pfam05999: Herpesvirus U5-like family. This family of Herpesvirus proteins includes U4, U5 and UL27. 45892 pfam06000: Ethanolamine utilisation protein EutL. This family consists of several bacterial ethanolamine utilisation protein (EutL) as well as propanediol utilisation protein (PduB) sequences. 
The eut operon of Salmonella typhimurium encodes proteins involved in the cobalamin-dependent degradation of ethanolamine. The exact function of EutL is unknown. 45893 pfam06001: Domain of Unknown Function (DUF902). 45894 pfam06002: Alpha-2,3-sialyltransferase (CST-I). This family consists of several alpha-2,3-sialyltransferase (CST-I) proteins largely found in Campylobacter jejuni. 45895 pfam06003: Survival motor neuron protein (SMN). This family consists of several eukaryotic survival motor neuron (SMN) proteins. The Survival of Motor Neurons (SMN) protein, the product of the spinal muscular atrophy-determining gene, is part of a large macromolecular complex (SMN complex) that functions in the assembly of spliceosomal small nuclear ribonucleoproteins (snRNPs). The SMN complex functions as a specificity factor essential for the efficient assembly of Sm proteins on U snRNAs and likely protects cells from illicit, and potentially deleterious, non-specific binding of Sm proteins to RNAs. 45896 pfam06004: Bacterial protein of unknown function (DUF903). This family consists of several small bacterial proteins, several of which are classified as putative lipoproteins. The function of this family is unknown. 45897 pfam06005: Protein of unknown function (DUF904). This family consists of several bacterial and archaeal hypothetical proteins of unknown function. 45898 pfam06006: Bacterial protein of unknown function (DUF905). This family consists of several short hypothetical Enterobacteria proteins of unknown function. 45899 pfam06007: Phosphonate metabolism protein PhnJ. This family consists of several bacterial phosphonate metabolism (PhnJ) sequences. The exact role that PhnJ plays in phosphonate utilisation is unknown. 45900 pfam06008: Laminin Domain I. Coiled-coil structure. It has been suggested that the domains I and II from laminin A, B1 and B2 may come together to form a triple helical coiled-coil structure. 45901 pfam06009: Laminin Domain II. 
Coiled-coil structure. It has been suggested that the domains I and II from laminin A, B1 and B2 may come together to form a triple helical coiled-coil structure. 45902 pfam06010: Domain of Unknown Function (DUF906). 45903 pfam06011: Fungal protein of unknown function (DUF907). This family consists of several hypothetical fungal proteins of unknown function. 45904 pfam06012: Domain of Unknown Function (DUF908). 45905 pfam06013: Bacterial protein of unknown function (DUF909). This family consists of several short bacterial proteins of unknown function. 45906 pfam06014: Bacterial protein of unknown function (DUF910). This family consists of several short bacterial proteins of unknown function. 45907 pfam06015: Chordopoxvirus A30L protein. This family consists of several short Chordopoxvirus proteins which are homologous to the A30L protein of Vaccinia virus. The vaccinia virus A30L protein is required for the association of electron-dense, granular, proteinaceous material with the concave surfaces of crescent membranes, an early step in viral morphogenesis. A30L is known to interact with the G7L protein and it has been shown that the stability of each is dependent on its association with the other. 45908 pfam06016: Reovirus core-spike protein lambda-2 (L2). This family consists of several Reovirus core-spike protein lambda-2 (L2) sequences. The reovirus L2 genome segment encodes the core spike protein lambda-2, which mediates enzymatic reactions in 5' capping of the viral plus-strand transcripts. 45909 pfam06017: Myosin tail. 45910 pfam06018: GTP-sensing transcriptional pleiotropic repressor CodY. This family consists of several bacterial GTP-sensing transcriptional pleiotropic repressor CodY proteins. CodY has been found to repress the dipeptide transport operon (dpp) of Bacillus subtilis in nutrient-rich conditions. The CodY protein also has a repressor effect on many genes in Lactococcus lactis during growth in milk. 45911 pfam06019: Phage GP30.8 protein. 
This family consists of several GP30.8 proteins from the T4-like phages. The function of this family is unknown. 45912 pfam06020: Drosophila roughex protein. This family consists of several roughex (RUX) proteins specific to Drosophila species. Roughex can influence the intracellular distribution of cyclin A and is therefore defined as a distinct and specialised cell cycle inhibitor for cyclin A-dependent kinase activity. Rux is thought to regulate the metaphase to anaphase transition during development. 45913 pfam06021: Aralkyl acyl-CoA:amino acid N-acyltransferase. This family consists of several mammalian specific aralkyl acyl-CoA:amino acid N-acyltransferase (glycine N-acyltransferase) proteins EC:2.3.1.13. 45914 pfam06022: Plasmodium variant antigen protein Cir/Yir/Bir. This family consists of several Cir, Yir and Bir proteins from the Plasmodium species P. chabaudi, P. yoelii and P. berghei. 45915 pfam06023: Archaeal protein of unknown function (DUF911). This family consists of several archaeal proteins of unknown function. 45916 pfam06024: Nucleopolyhedrovirus protein of unknown function (DUF912). This family consists of several Nucleopolyhedrovirus proteins of unknown function. 45917 pfam06025: Domain of Unknown Function (DUF913). 45918 pfam06026: Ribose 5-phosphate isomerase A (phosphoriboisomerase A). This family consists of several ribose 5-phosphate isomerase A or phosphoriboisomerase A (EC:5.3.1.6) from bacteria, eukaryotes and archaea. 45919 pfam06027: Eukaryotic protein of unknown function (DUF914). This family consists of several hypothetical proteins of unknown function. Some of the sequences in this family are annotated as being putative membrane proteins. 45920 pfam06028: Bacterial protein of unknown function (DUF915). This family consists of several bacterial proteins of unknown function. 45921 pfam06029: AlkA N-terminal domain. 45922 pfam06030: Bacterial protein of unknown function (DUF916). 
This family consists of several hypothetical bacterial proteins of unknown function. 45923 pfam06031: SERTA motif. This family consists of a novel motif designated as SERTA (for SEI-1, RBT1, and TARA), corresponding to the largest conserved region among TRIP-Br proteins. The function of this motif is uncertain, but the CDK4-interacting segment of p34SEI-1 (amino acid residues 44-161) includes most of the SERTA motif. 45924 pfam06032: Protein of unknown function (DUF917). This family consists of hypothetical bacterial and archaeal proteins of unknown function. 45925 pfam06033: Nucleopolyhedrovirus protein of unknown function (DUF918). This family consists of several Nucleopolyhedrovirus proteins with no known function. 45926 pfam06034: Nucleopolyhedrovirus protein of unknown function (DUF919). This family consists of several short Nucleopolyhedrovirus proteins of unknown function. 45927 pfam06035: Bacterial protein of unknown function (DUF920). This family consists of several hypothetical bacterial proteins of unknown function. 45928 pfam06036: Streptomyces protein of unknown function (DUF921). This family consists of several putative regulatory proteins from Streptomyces coelicolor and Streptomyces griseus. One of the sequences in this family is thought to be involved in sporulation. 45929 pfam06037: Bacterial protein of unknown function (DUF922). This family consists of several hypothetical bacterial proteins of unknown function. 45930 pfam06038: Protein of unknown function (DUF923). This family consists of several bacterial proteins of unknown function as well as two uncharacterised Arabidopsis thaliana sequences. 45931 pfam06039: Malate:quinone oxidoreductase (Mqo). This family consists of several bacterial Malate:quinone oxidoreductase (Mqo) proteins (EC:1.1.99.16). Mqo takes part in the citric acid cycle. It oxidises L-malate to oxaloacetate and donates electrons to ubiquinone-1 and other artificial acceptors or, via the electron transfer chain, to oxygen. 
NAD is not an acceptor and the natural direct acceptor for the enzyme is most likely a quinone. The enzyme is therefore called malate:quinone oxidoreductase, abbreviated to Mqo. Mqo is a peripheral membrane protein and can be released from the membrane by addition of chelators. 45932 pfam06040: Adenovirus E3 protein. This family consists of several Adenovirus E3 proteins. The E3 protein does not seem to be essential for virus replication in cultured cells, suggesting that the protein may function in virus-host interactions. 45933 pfam06041: Bacterial protein of unknown function (DUF924). This family consists of several hypothetical bacterial proteins of unknown function. 45934 pfam06042: Bacterial protein of unknown function (DUF925). This family consists of several hypothetical bacterial proteins of unknown function. 45935 pfam06043: Reovirus P9-like family. 45936 pfam06044: Dam-replacing family. Dam-replacing protein (DRP) is a restriction endonuclease that is flanked by pseudo-transposable small repeat elements. The replacement of Dam-methylase by DRP allows phase variation through slippage-like mechanisms in several pathogenic isolates of Neisseria meningitidis. 45937 pfam06045: Rhamnogalacturonate lyase family. Rhamnogalacturonate lyase (EC:4.2.2.-) degrades the rhamnogalacturonan I (RG-I) backbone of pectin. This family contains mainly members from plants, but also contains the plant pathogen Erwinia chrysanthemi. 45938 pfam06046: Exocyst complex component Sec6. Sec6 is a component of the multiprotein exocyst complex. Sec6 interacts with Sec8, Sec10 and Exo70. These exocyst proteins localise to regions of active exocytosis (at the growing ends of interphase cells and in the medial region of cells undergoing cytokinesis) in an F-actin-dependent and exocytosis-independent manner. 45939 pfam06047: Domain of Unknown Function (DUF926). Family of eukaryotic proteins with undetermined function. 45940 pfam06048: Domain of unknown function (DUF927). 
Family of bacterial proteins of unknown function. 45941 pfam06049: Coagulation Factor V LSPD Repeat. These repeats are found in coagulation factor V (five). The name LSPD derives from the conserved residues in the middle of the repeat. They occur in the B domain, which is cleaved prior to activation of the protein. It has been suggested that domain B brings domains A and C together for activation. 45942 pfam06050: 2-hydroxyglutaryl-CoA dehydratase, D-component. Degradation of glutamate via the hydroxyglutarate pathway involves the syn-elimination of water from 2-hydroxyglutaryl-CoA. This anaerobic process is catalysed by 2-hydroxyglutaryl-CoA dehydratase, an enzyme with two components (A and D) that reversibly associate during reaction cycles. This component contains one non-reducible [4Fe-4S]2+ cluster and a reduced riboflavin 5'-monophosphate. 45943 pfam06051: Domain of Unknown Function (DUF928). Family of uncharacterised bacterial proteins. 45944 pfam06052: 3-hydroxyanthranilic acid dioxygenase. In eukaryotes 3-hydroxyanthranilic acid dioxygenase (EC:1.13.11.6) is part of the kynurenine pathway for the degradation of tryptophan and the biosynthesis of nicotinic acid. The prokaryotic homolog is involved in the 2-nitrobenzoate degradation pathway. 45945 pfam06053: Domain of unknown function (DUF929). Family of proteins from the archaeon Sulfolobus, with undetermined function. 45946 pfam06054: Competence protein CoiA-like family. Many of the members of this family are described as transcription factors. CoiA falls within a competence-specific operon in Streptococcus. CoiA is an uncharacterised protein. 45947 pfam06055: Exopolysaccharide synthesis, ExoD. Among the bacterial genes required for nodule invasion are the exo genes. These genes are involved in the production of an extracellular polysaccharide. Mutations in exoD result in altered exopolysaccharide production and defects in nodule invasion. 45948 pfam06056: Putative ATPase subunit of terminase (gpP-like). 
This family of proteins are annotated as ATPase subunits of phage terminase after. Terminases are viral proteins that are involved in packaging viral DNA into the capsid. 45949 pfam06057: Bacterial virulence protein (VirJ). This family consists of several bacterial VirJ virulence proteins. VirJ is thought to be involved in the type IV secretion system. It is thought that the substrate proteins localised to the periplasm may associate with the pilus in a manner that is mediated by VirJ, suggesting a two-step process for type IV secretion in Agrobacterium. 45950 pfam06058: Dcp1-like decapping family. An essential step in mRNA turnover is decapping. In yeast, two proteins have been identified that are essential for decapping, Dcp1 (this family) and Dcp2 (pfam05026). The precise role of these proteins in the decapping reaction has not been established. Evidence suggests that Dcp1 may enhance the function of Dcp2. 45951 pfam06059: Domain of Unknown Function (DUF930). Family of bacterial proteins with undetermined function. All bacteria in this family are from the Rhizobiales order. 45952 pfam06060: Pre-pro-megakaryocyte potentiating factor precursor (Mesothelin). This family consists of several mammalian pre-pro-megakaryocyte potentiating factor precursor (MPF) or mesothelin proteins. Mesothelin is a glycosylphosphatidylinositol-linked glycoprotein highly expressed in mesothelial cells, mesotheliomas, and ovarian cancer, but the biological function of the protein is not known. 45953 pfam06061: Baculoviridae ME53. ME53 is one of the major early-transcribed genes. The ME53 protein is reported to contain a putative zinc finger motif. 45954 pfam06062: Uncharacterised protein family (UPF0231). Family of uncharacterised Proteobacteria proteins. 45955 pfam06063: Domain of unknown function (DUF931). Family of transmembrane proteins with undetermined function. 45956 pfam06064: Host-nuclease inhibitor protein Gam. 
The Gam protein inhibits RecBCD nuclease and is found in both bacteria and bacteriophage. 45957 pfam06065: Cripto growth factor. This family consists of several eukaryotic cripto growth factor related proteins. Within a multicellular organism, communication between cells is essential during development to ensure proper execution of cell migration, cell fate decisions, and differentiation events. All vertebrates display a characteristic asymmetry of internal organs with the cardiac apex, stomach and spleen towards the left, and the liver and gall bladder on the right. Left-right (L-R) axis abnormalities or laterality defects are common in humans (1 in 8,500 live births). Cripto growth factor genes and Nodal signalling are responsible for left-right axis formation and this is conserved from fish to humans. 45958 pfam06066: SepZ. SepZ is a component of the type III secretion system used in bacteria. SepZ is a gene within the enterocyte effacement locus. SepZ mutants exhibit reduced invasion efficiency and lack of tyrosine phosphorylation of Hp90. 45959 pfam06067: Domain of unknown function (DUF932). Family of Proteobacteria proteins with unknown function. 45960 pfam06068: TIP49 C-terminus. This family consists of the C-terminal region of several eukaryotic and archaeal RuvB-like 1 (Pontin or TIP49a) and RuvB-like 2 (Reptin or TIP49b) proteins. The N-terminal domain contains the pfam00004 domain. In zebrafish, the liebeskummer (lik) mutation causes development of hyperplastic embryonic hearts. lik encodes Reptin, a component of a DNA-stimulated ATPase complex. Beta-catenin and Pontin, a DNA-stimulated ATPase that is often part of complexes with Reptin, are in the same genetic pathways. The Reptin/Pontin ratio serves to regulate heart growth during development, at least in part via the beta-catenin pathway. TBP-interacting protein 49 (TIP49) was originally identified as a TBP-binding protein, and two related proteins are encoded by individual genes, tip49a and b. 
Although the function of this gene family has not been elucidated, they are thought to play a critical role in nuclear events because they interact with various kinds of nuclear factors and have DNA helicase activities. TIP49a has been suggested to act as an autoantigen in some patients with autoimmune diseases. 45961 pfam06069: PerC transcriptional activator. PerC is a transcriptional activator of EaeA/BfpA expression in enteropathogenic bacteria. 45962 pfam06070: Herpesvirus large structural phosphoprotein UL32. The large phosphorylated protein (UL32-like) of herpes viruses is the polypeptide most frequently reactive in immuno-blotting analyses with antisera when compared with other viral proteins. 45963 pfam06071: Protein of unknown function (DUF933). This family consists of several GTP binding proteins and represents the C terminal domain of the constituent proteins. The family contains both prokaryotic and eukaryotic sequences which are highly conserved. The N terminal region of this family contains pfam01018. 45964 pfam06072: Alphaherpesvirus tegument protein US9. This family consists of several US9 and related proteins from the Alphaherpesviruses. The function of the US9 protein is unknown although in Bovine herpesvirus 5 Us9 is essential for the anterograde spread of the virus from the olfactory mucosa to the bulb. 45965 pfam06073: Bacterial protein of unknown function (DUF934). This family consists of several bacterial proteins of unknown function. One of the members of this family is thought to be an oxidoreductase. 45966 pfam06074: Protein of unknown function (DUF935). This family consists of several bacterial proteins of unknown function as well as the Bacteriophage Mu gp29 protein. 45967 pfam06075: Plant protein of unknown function (DUF936). This family consists of several hypothetical proteins from Arabidopsis thaliana and Oryza sativa. The function of this family is unknown. 45968 pfam06076: Orthopoxvirus F14 protein. 
This family consists of several short Orthopoxvirus F14 proteins. The function of this protein is unknown. 45969 pfam06077: LR8 protein. This family consists of several LR8-like proteins from humans, mice and rats. The function of the human LR8 protein is unknown although it is known to be strongly expressed in lung fibroblasts. 45970 pfam06078: Bacterial protein of unknown function (DUF937). This family consists of several hypothetical bacterial proteins of unknown function. 45971 pfam06079: Apyrase. This family consists of several eukaryotic apyrase proteins (EC:3.6.1.5). The salivary apyrases of blood-feeding arthropods are nucleotide hydrolysing enzymes implicated in the inhibition of host platelet aggregation through the hydrolysis of extracellular adenosine diphosphate. 45972 pfam06080: Protein of unknown function (DUF938). This family consists of several hypothetical proteins from both prokaryotes and eukaryotes. The function of this family is unknown. 45973 pfam06081: Bacterial protein of unknown function (DUF939). This family consists of several hypothetical bacterial proteins of unknown function. 45974 pfam06082: Bacterial putative lipoprotein (DUF940). This family consists of hypothetical bacterial proteins several of which are described as putative lipoproteins. 45975 pfam06083: Interleukin-17. IL-17 is a potent proinflammatory cytokine produced by activated memory T cells. The IL-17 family is thought to represent a distinct signaling system that appears to have been highly conserved across vertebrate evolution. 45976 pfam06084: Cytomegalovirus TRL10 protein. This family consists of several Cytomegalovirus TRL10 proteins. TRL10 represents a structural component of the virus particle and like the other HCMV envelope glycoproteins, is present in a disulfide-linked complex. 45977 pfam06085: Lipoprotein Rz1 precursor. This family consists of several bacteria and phage lipoprotein Rz1 precursors. 
Rz1 is a proline-rich lipoprotein from bacteriophage lambda which is known to have fusogenic properties. Rz1-induced liposome fusion is thought to be mediated primarily by the generation of local perturbation in the bilayer lipid membrane and to a lesser extent by electrostatic forces. 45978 pfam06086: Orthopoxvirus A26L/A30L protein. This family consists of several Orthopoxvirus A26L and A30L proteins. The Vaccinia A30L gene is regulated by a late promoter and encodes a protein of approximately 9 kDa. It is thought that the A30L protein is needed for vaccinia virus morphogenesis, specifically the association of the dense viroplasm with viral membranes. 45979 pfam06087: Tyrosyl-DNA phosphodiesterase. Covalent intermediates between topoisomerase I and DNA can become dead-end complexes that lead to cell death. Tyrosyl-DNA phosphodiesterase can hydrolyse the bond between topoisomerase I and DNA. 45980 pfam06088: Nucleopolyhedrovirus telokin-like protein-20 (TLP20). This family consists of several Nucleopolyhedrovirus telokin-like protein-20 (TLP20) sequences. The function of this family is unknown but TLP20 is known to share some antigenic similarities with the smooth muscle protein telokin, although the amino acid sequence shows no homology to telokin. 45981 pfam06089: L-asparaginase II. This family consists of several bacterial L-asparaginase II proteins. L-asparaginase (EC:3.5.1.1) catalyses the hydrolysis of L-asparagine to L-aspartate and ammonium. Rhizobium etli possesses two asparaginases: asparaginase I, which is thermostable and constitutive, and asparaginase II, which is thermolabile, induced by asparagine and repressed by the carbon source. 45982 pfam06090: Domain of unknown function (DUF941). Family of eukaryotic proteins with unknown function. 45983 pfam06091: Bacterial protein of unknown function (DUF942). This family consists of several hypothetical bacterial proteins of unknown function. 
45984 pfam06092: Enterobacterial putative membrane protein (DUF943). This family consists of several hypothetical putative membrane proteins from Escherichia coli, Yersinia pestis and Salmonella typhi. 45985 pfam06093: Transcription initiation protein Spt4. This family consists of several eukaryotic transcription initiation Spt4 proteins. Three transcription-elongation factors Spt4, Spt5, and Spt6 are conserved among eukaryotes and are essential for transcription via the modulation of chromatin structure. Spt4 and Spt5 are tightly associated in a complex, while the physical association of the Spt4-Spt5 complex with Spt6 is considerably weaker. It has been demonstrated that Spt4, Spt5, and Spt6 play roles in transcription elongation in both yeast and humans including a role in activation by Tat. It is known that Spt4, Spt5, and Spt6 are general transcription-elongation factors, controlling transcription both positively and negatively in important regulatory and developmental roles. 45986 pfam06094: AIG2-like family. AIG2 is an Arabidopsis protein that exhibits RPS2- and avrRpt2-dependent induction early after infection with Pseudomonas syringae pv maculicola strain ES4326 carrying avrRpt2. 45987 pfam06096: Baculoviridae 8.2 KDa protein. Family of proteins from various Baculoviruses with undetermined function. 45988 pfam06097: Bacterial protein of unknown function (DUF945). This family consists of several hypothetical bacterial proteins of unknown function. 45989 pfam06098: Radial spoke protein 3. This family consists of several radial spoke protein 3 (RSP3) sequences. Eukaryotic cilia and flagella present in diverse types of cells perform motile, sensory, and developmental functions in organisms from protists to humans. They are centred by precisely organised, microtubule-based structures, the axonemes. The axoneme consists of two central singlet microtubules, called the central pair, and nine outer doublet microtubules. 
These structures are well-conserved during evolution. The outer doublet microtubules, each composed of A and B sub-fibres, are connected to each other by nexin links, while the central pair is held at the centre of the axoneme by radial spokes. The radial spokes are T-shaped structures extending from the A-tubule of each outer doublet microtubule to the centre of the axoneme. Radial spoke protein 3 (RSP3) is present at the proximal end of the spoke stalk and helps in anchoring the radial spoke to the outer doublet. It is thought that radial spokes regulate the activity of inner arm dynein through protein phosphorylation and dephosphorylation. 45990 pfam06099: Phenol hydroxylase subunit. This family consists of several bacterial phenol hydroxylase subunit proteins which are part of a multicomponent phenol hydroxylase. Some bacteria can utilise phenol or some of its methylated derivatives as their sole source of carbon and energy. The first step in this process is the conversion of phenol into catechol. Catechol is then further metabolised via the meta-cleavage pathway into TCA cycle intermediates. 45991 pfam06100: Streptococcal 67 kDa myosin-cross-reactive antigen like family. Members of this family are thought to have structural features in common with the beta chain of the class II antigens, as well as myosin, and may play an important role in pathogenesis. 45992 pfam06101: Plant protein of unknown function (DUF946). This family consists of several hypothetical proteins from Arabidopsis thaliana and Oryza sativa. The function of this family is unknown. 45993 pfam06102: Domain of unknown function (DUF947). Family of eukaryotic proteins with unknown function. 45994 pfam06103: Bacterial protein of unknown function (DUF948). This family consists of bacterial sequences several of which are thought to be general stress proteins. 45995 pfam06104: Bacterial protein of unknown function (DUF949). 
This family consists of several hypothetical bacterial proteins of unknown function. 45996 pfam06105: Aph-1 protein. This family consists of several eukaryotic Aph-1 proteins. Gamma-secretase catalyses the intramembrane proteolysis of Notch, beta-amyloid precursor protein, and other substrates as part of a new signaling paradigm and as a key step in the pathogenesis of Alzheimer's disease. It is thought that the presenilin heterodimer comprises the catalytic site and that a highly glycosylated form of nicastrin associates with it. Aph-1 and Pen-2, two membrane proteins genetically linked to gamma-secretase, associate directly with presenilin and nicastrin in the active protease complex. Co-expression of all four proteins leads to marked increases in presenilin heterodimers, full glycosylation of nicastrin, and enhanced gamma-secretase activity. 45997 pfam06106: Staphylococcus protein of unknown function (DUF950). This family consists of several hypothetical proteins from different Staphylococcus species. The function of this family is unknown. 45998 pfam06107: Bacterial protein of unknown function (DUF951). This family consists of several short hypothetical bacterial proteins of unknown function. 45999 pfam06108: Protein of unknown function (DUF952). This family consists of several hypothetical bacterial and plant proteins of unknown function. 46000 pfam06109: Haemolysin E (HlyE). This family consists of several enterobacterial haemolysin (HlyE) proteins. Hemolysin E (HlyE) is a novel pore-forming toxin of Escherichia coli, Salmonella typhi, and Shigella flexneri. HlyE is unrelated to the well characterised pore-forming E. coli hemolysins of the RTX family, haemolysin A (HlyA), and the enterohaemolysin encoded by the plasmid borne ehxA gene of E. coli O157. However, it is evident that expression of HlyE in the absence of the RTX toxins is sufficient to give a hemolytic phenotype in E. coli. HlyE is a protein of 34 kDa that is expressed during anaerobic growth of E. 
coli. Anaerobic expression is controlled by the transcription factor, FNR, such that, upon ingestion and entry into the anaerobic mammalian intestine, HlyE is produced and may then contribute to the colonisation of the host. 46001 pfam06110: Eukaryotic protein of unknown function (DUF953). This family consists of several hypothetical eukaryotic proteins of unknown function. 46002 pfam06111: Bacterial protein of unknown function (DUF954). This family consists of several hypothetical proteins of unknown function. 46003 pfam06112: Gammaherpesvirus capsid protein. This family consists of several Gammaherpesvirus capsid proteins. The exact function of this family is unknown. 46004 pfam06113: Brain and reproductive organ-expressed protein (BRE). This family consists of several eukaryotic brain and reproductive organ-expressed (BRE) proteins. BRE is a putative stress-modulating gene, found able to down-regulate TNF-alpha-induced NF-kappaB activation upon overexpression. A total of six isoforms are produced by alternative splicing predominantly at either end of the gene. Compared to normal cells, immortalised human cell lines uniformly express higher levels of BRE. Peripheral blood monocytes respond to LPS by down-regulating the expression of all the BRE isoforms. It is thought that the function of BRE and its isoforms is to regulate peroxisomal activities. 46005 pfam06114: Domain of unknown function (DUF955). Family of bacterial and viral proteins with undetermined function. A conserved H-E-X-X-H motif is suggestive of a catalytic active site and shows similarity to pfam01435. 46006 pfam06115: Domain of unknown function (DUF956). Family of bacterial sequences with undetermined function. 46007 pfam06116: Transcriptional activator RinB. This family consists of several Staphylococcus aureus bacteriophage RinB proteins and related sequences from their host. 
The int gene of staphylococcal bacteriophage phi 11 is the only viral gene responsible for the integrative recombination of phi 11. rinA and rinB are both required to activate expression of the int gene. 46008 pfam06117: Enterobacterial protein of unknown function (DUF957). This family consists of several hypothetical proteins from Escherichia coli, Salmonella typhi, Shigella flexneri and Proteus vulgaris. The function of this family is unknown. 46009 pfam06118: Archaeal PaREP8 protein. This family consists of a number of archaeal specific PaREP8 proteins. The function of this protein is unknown. 46010 pfam06119: Domain of Unknown Function (DUF958). 46011 pfam06120: Tail length tape measure protein. This family consists of the tail length tape measure protein from bacteriophage HK97 and related sequences from Escherichia coli O157:H7. 46012 pfam06121: Domain of Unknown Function (DUF959). This N-terminal domain is not expressed in the 'Short' isoform of Collagen A. 46013 pfam06122: TraH protein. This family consists of several bacterial TraH proteins which are involved in pilus assembly. 46014 pfam06123: Inner membrane protein CreD. This family consists of several bacterial CreD or Cet inner membrane proteins. Dominant mutations of the cet gene of Escherichia coli result in tolerance to colicin E2 and increased amounts of an inner membrane protein with an Mr of 42,000. The cet gene is shown to be in the same operon as the phoM gene, which is required in a phoR background for expression of the structural gene for alkaline phosphatase, phoA. Although the Cet protein is not required for phoA expression, it has been suggested that the Cet protein has an enhancing effect on the transcription of phoA. 46015 pfam06124: Staphylococcal protein of unknown function (DUF960). This family consists of several hypothetical proteins from several species of Staphylococcus. The function of this family is unknown. 46016 pfam06125: Bacterial protein of unknown function (DUF961). 
This family consists of several hypothetical bacterial proteins of unknown function. 46017 pfam06126: Herpesvirus Latent membrane protein 2. Family of Kaposi's sarcoma-associated herpesvirus (HHV8) latent membrane protein. 46018 pfam06127: Protein of unknown function (DUF962). This family consists of several eukaryotic and prokaryotic proteins of unknown function. A yeast hypothetical protein has been found to be non-essential for cell growth. 46019 pfam06128: Shigella flexneri OspC protein. This family consists of the Shigella flexneri specific protein OspC. The function of this family is unknown but it is thought that Osp proteins may be involved in postinvasion events related to virulence. Since bacterial pathogens adapt to multiple environments during the course of infecting a host, it has been proposed that Shigella evolved a mechanism to take advantage of a unique intracellular cue, which is mediated through MxiE, to express proteins when the organism reaches the eukaryotic cytosol. 46020 pfam06129: Chordopoxvirus G3 protein. This family consists of several Chordopoxvirus specific G3 proteins. The function of this family is unknown. 46021 pfam06130: Propanediol utilisation protein PduL. This family consists of several bacterial propanediol utilisation protein (PduL) sequences. The exact role of this protein in propanediol utilisation is unknown. 46022 pfam06131: Schizosaccharomyces pombe repeat of unknown function (DUF963). This family consists of a series of repeated sequences from one hypothetical protein found in Schizosaccharomyces pombe. The function of this family is unknown. 46023 pfam06132: Ethanolamine utilisation protein EutS. This family consists of several bacterial EutS ethanolamine utilisation proteins. The eut operon of Salmonella typhimurium encodes proteins involved in the cobalamin-dependent degradation of ethanolamine. The exact function of EutS is unknown. 46024 pfam06133: Protein of unknown function (DUF964). 
This family consists of several relatively short bacterial and archaeal hypothetical sequences. The function of this family is unknown. 46025 pfam06134: L-rhamnose isomerase (RhaA). This family consists of several bacterial L-rhamnose isomerase proteins (EC:5.3.1.14). 46026 pfam06135: Bacterial protein of unknown function (DUF965). This family consists of several hypothetical bacterial proteins. The function of the family is unknown. 46027 pfam06136: Domain of unknown function (DUF966). Family of plant proteins with unknown function. 46028 pfam06137: Transcription elongation factor A, SII-related family. The function of this family is unclear, but some members are described as transcription elongation factor A, SII-like proteins. 46029 pfam06138: Chordopoxvirus E11 protein. This family consists of several Chordopoxvirus E11 proteins. The E11 gene of vaccinia virus encodes a 15-kDa polypeptide. Mutations in the E11 gene make the virus temperature-sensitive, either because virus infectivity requires a threshold level of active E11 protein or because E11 function is conditionally essential. 46030 pfam06139: BphX-like. Family of bacterial proteins located in the phenyl dioxygenase (bph) operon. The function of this family is unknown. 46031 pfam06140: Interferon-induced 6-16 family. 46032 pfam06141: Phage minor tail protein U. Tail fibre component U of bacteriophage. 46033 pfam06142: Protein of unknown function (DUF967). Family of proteins with unknown function. 46034 pfam06143: Baculovirus 11 kDa family. Family of uncharacterised Baculovirus proteins that are all about 11 kDa in size. 46035 pfam06144: DNA polymerase III, delta subunit. DNA polymerase III, delta subunit (EC 2.7.7.7) is required, along with the delta' subunit, for the assembly of the processivity factor beta(2) onto primed DNA in the DNA polymerase III holoenzyme-catalysed reaction. The delta subunit is also known as HolA. 46036 pfam06145: Coronavirus nonstructural protein NS1. 
Bovine coronavirus NS1 encodes a 4.9 kDa protein. 46037 pfam06146: Phosphate-starvation-inducible E. Phosphate-starvation-inducible E (PsiE) expression is under direct positive and negative control by PhoB and cAMP-CRP, respectively. The function of PsiE remains to be determined. 46038 pfam06147: Protein of unknown function (DUF968). Family of uncharacterised prophage proteins that are also found in bacteria and humans. 46039 pfam06148: COG (conserved oligomeric Golgi) complex component, COG2. The COG complex comprises eight proteins, COG1-8. The COG complex plays critical roles in Golgi structure and function. 46040 pfam06149: Protein of unknown function (DUF969). Family of uncharacterised bacterial membrane proteins. 46041 pfam06150: ChaB. This family of proteins contains a conserved 60 residue region. This protein is known as ChaB in E. coli and is found next to ChaA which is a cation transporter protein. ChaB may regulate ChaA function in some way. 46042 pfam06151: Trehalose receptor. In Drosophila, taste is perceived by gustatory neurons located in sensilla distributed on several different appendages throughout the body of the animal. This family represents the taste receptor sensitive to trehalose. 46043 pfam06152: Phage minor capsid protein 2. Family of related phage minor capsid proteins. 46044 pfam06153: Protein of unknown function (DUF970). Family of uncharacterised bacterial proteins. 46045 pfam06154: YagB/YeeU/YfjZ family. This family of proteins includes three proteins from E. coli: YagB, YeeU and YfjZ. The function of these proteins is unknown. They are about 120 amino acids in length. 46046 pfam06155: Protein of unknown function (DUF971). This family consists of several short bacterial proteins and one sequence from Oryza sativa. The function of this family is unknown. 46047 pfam06156: Protein of unknown function (DUF972). This family consists of several hypothetical bacterial and one Caenorhabditis elegans sequence. 
The function of this family is unknown. 46048 pfam06157: Protein of unknown function (DUF973). This family consists of several hypothetical archaeal proteins of unknown function. 46049 pfam06158: Phage tail protein E. Family of small phage tail proteins, referred to as protein E. 46050 pfam06159: Protein of unknown function (DUF974). Family of uncharacterised eukaryotic proteins. 46051 pfam06160: Septation ring formation regulator, EzrA. During the bacterial cell cycle, the tubulin-like cell-division protein FtsZ polymerises into a ring structure that establishes the location of the nascent division site. EzrA modulates the frequency and position of FtsZ ring formation. 46052 pfam06161: Protein of unknown function (DUF975). Family of uncharacterised bacterial proteins. 46053 pfam06162: Caenorhabditis elegans protein of unknown function (DUF976). This family consists of several hypothetical Caenorhabditis elegans proteins of unknown function. 46054 pfam06163: Bacterial protein of unknown function (DUF977). This family consists of several hypothetical bacterial proteins from Escherichia coli and Salmonella typhi. The function of this family is unknown. 46055 pfam06164: Bacterial protein of unknown function (DUF978). This family consists of several hypothetical bacterial proteins of unknown function. 46056 pfam06165: Glycosyltransferase family 36. The glycosyltransferase family 36 includes cellobiose phosphorylase (EC:2.4.1.20), cellodextrin phosphorylase (EC:2.4.1.49), and chitobiose phosphorylase (EC:2.4.1.-). Many members of this family contain two copies of this domain. 46057 pfam06166: Protein of unknown function (DUF979). This family consists of several putative bacterial membrane proteins. The function of this family is unclear. 46058 pfam06167: Protein of unknown function (DUF980). Family of uncharacterised bacterial sequences. 46059 pfam06168: Protein of unknown function (DUF981). Family of uncharacterised proteins found in bacteria and archaea. 
46060 pfam06169: Protein of unknown function (DUF982). This family consists of several hypothetical proteins from Rhizobium meliloti, Rhizobium loti and Agrobacterium tumefaciens. The function of this family is unknown. 46061 pfam06170: Protein of unknown function (DUF983). This family consists of several bacterial proteins of unknown function. 46062 pfam06171: Protein of unknown function (DUF984). Family of bacterial proteins with unknown function. 46063 pfam06172: Protein of unknown function (DUF985). Family of uncharacterised proteins found in bacteria and eukaryotes. 46064 pfam06173: Protein of unknown function (DUF986). This family consists of several bacterial putative membrane proteins of unknown function. 46065 pfam06174: Protein of unknown function (DUF987). Family of bacterial proteins that are related to the hypothetical protein yeeT. 46066 pfam06175: tRNA-(MS[2]IO[6]A)-hydroxylase (MiaE). This family consists of several bacterial tRNA-(MS[2]IO[6]A)-hydroxylase (MiaE) proteins. The modified nucleoside 2-methylthio-N-6-isopentenyl adenosine (ms2i6A) is present at position 37 (3' of the anticodon) of tRNAs that read codons beginning with U except tRNA(I,V Ser) in Escherichia coli. In Salmonella typhimurium, 2-methylthio-cis-ribozeatin (ms2io6A) is found in tRNA, probably in the species corresponding to those that have ms2i6A in E. coli. The miaE gene is absent in E. coli, a finding consistent with the absence of the hydroxylated derivative of ms2i6A in this species. 46067 pfam06176: Lipopolysaccharide core biosynthesis protein (WaaY). This family consists of several bacterial lipopolysaccharide core biosynthesis proteins (WaaY or RfaY). The waaY, waaQ, and waaP genes are located in the central operon of the waa (formerly rfa) locus on the chromosome of Escherichia coli. This locus contains genes whose products are involved in the assembly of the core region of the lipopolysaccharide molecule. WaaY is the enzyme that phosphorylates HepII in this system. 
46068 pfam06177: Protein of unknown function (DUF988). This family consists of several hypothetical bacterial proteins of unknown function. 46069 pfam06178: Oligogalacturonate-specific porin protein (KdgM). This family consists of several bacterial proteins which are homologous to the oligogalacturonate-specific porin protein KdgM from Erwinia chrysanthemi. The phytopathogenic Gram-negative bacterium Erwinia chrysanthemi secretes pectinases, which are able to degrade the pectic polymers of plant cell walls, and uses the degradation products as a carbon source for growth. KdgM is a major outer membrane protein, whose synthesis is strongly induced in the presence of pectic derivatives. KdgM behaves like a voltage-dependent porin that is slightly selective for anions and that exhibits fast block in the presence of trigalacturonate. In contrast to most porins, KdgM seems to be monomeric. 46070 pfam06179: Surfeit locus protein 5. This family consists of several eukaryotic Surfeit locus protein 5 (SURF5) sequences. The human Surfeit locus has been mapped on chromosome 9q34.1. The locus includes six tightly clustered housekeeping genes (Surf1-6), and the gene organisation is similar in the human, mouse and chicken Surfeit loci. The exact function of this family is unknown. 46071 pfam06180: Cobalt chelatase (CbiK). This family consists of several bacterial cobalt chelatase (CbiK) proteins (EC:4.99.1.-). 46072 pfam06181: Protein of unknown function (DUF989). This family consists of several hypothetical bacterial proteins of unknown function. 46073 pfam06182: Protein of unknown function (DUF990). This family consists of a number of hypothetical bacterial proteins of unknown function. 46074 pfam06183: DinI-like family. This family of short proteins includes DNA-damage-inducible protein I (DinI) and related proteins. 
The SOS response, a set of cellular phenomena exhibited by eubacteria, is initiated by various causes that include DNA damage-induced replication arrest, and is positively regulated by the co-protease activity of RecA. Escherichia coli DinI, a LexA-regulated SOS gene product, shuts off the initiation of the SOS response when overexpressed in vivo. Biochemical and genetic studies indicated that DinI physically interacts with RecA to inhibit its co-protease activity. The structure of DinI is known. 46075 pfam06184: Potexvirus coat protein. This family consists of several Potexvirus coat proteins. 46076 pfam06185: Protein of unknown function (DUF991). This family consists of several bacterial YecM proteins of unknown function. 46077 pfam06186: Protein of unknown function (DUF992). This family consists of several hypothetical bacterial proteins of unknown function. 46078 pfam06187: Protein of unknown function (DUF993). This family consists of several hypothetical bacterial proteins of unknown function. 46079 pfam06188: HrpE protein. This family consists of several bacterial HrpE proteins. The exact function of this family is unknown but it is thought that HrpE is involved in the secretion of HrpZ (harpinPss). 46080 pfam06189: 5'-nucleotidase. This family consists of both eukaryotic and prokaryotic 5'-nucleotidase sequences (EC:3.1.3.5). 46081 pfam06190: Protein of unknown function (DUF994). This family consists of several bacterial and phage proteins of unknown function. 46082 pfam06191: Protein of unknown function (DUF995). Family of uncharacterised Proteobacteria proteins. 46083 pfam06192: Cytoplasmic chaperone TorD. This family consists of several bacterial TorD proteins. Like many prokaryotic molybdoenzymes, the TMAO reductase (TorA) of Escherichia coli requires the insertion of a bis(molybdopterin guanine dinucleotide) molybdenum (bis(MGD)Mo) cofactor in its catalytic site to be active and translocated to the periplasm. 
The TorD chaperone increases apoTorA activation up to four-fold, allowing maturation of most of the apoprotein. Therefore TorD is involved in the first step of TorA maturation to make it competent to receive the cofactor. 46084 pfam06193: Orthopoxvirus A5L protein. This family consists of several Orthopoxvirus A5L proteins. The vaccinia virus WR A5L open reading frame (corresponding to open reading frame A4L in vaccinia virus Copenhagen) encodes an immunodominant late protein found in the core of the vaccinia virion. The A5 protein appears to be required for the immature virion to form the brick-shaped intracellular mature virion. 46085 pfam06194: Phage Conserved Open Reading Frame 51. Family of conserved bacteriophage open reading frames. 46086 pfam06195: Protein of unknown function (DUF996). Family of uncharacterised bacterial and archaeal proteins. 46087 pfam06196: Protein of unknown function (DUF997). Family of predicted bacterial membrane proteins with unknown function. 46088 pfam06197: Protein of unknown function (DUF998). Family of conserved archaeal proteins. 46089 pfam06198: Protein of unknown function (DUF999). Family of conserved Schizosaccharomyces pombe proteins with unknown function. 46090 pfam06199: Phage major tail protein 2. 46091 pfam06200: ZIM motif. This short motif is found in a variety of plant transcription factors that contain GATA domains as well as other motifs. The most conserved amino acids form the pattern TIFF/YXG. This motif may be involved in binding DNA (Bateman A pers. obs.). 46092 pfam06201: Domain of Unknown Function (DUF1000). 46093 pfam06202: Amylo-alpha-1,6-glucosidase. This family includes human glycogen debranching enzyme, which contains a number of distinct catalytic activities. It has been shown for the yeast homologue that mutations in this region disrupt the enzyme's amylo-alpha-1,6-glucosidase activity (EC:3.2.1.33). 46094 pfam06203: CCT motif. This short motif is found in a number of plant proteins. 
It is rich in basic amino acids and has been called a CCT motif after Co, Col and Toc1. The CCT motif is about 45 amino acids long and contains a putative nuclear localisation signal within the second half of the CCT motif. Toc1 mutants have been identified in this region. 46095 pfam06204: Putative carbohydrate binding domain. 46096 pfam06205: Glycosyltransferase 36 associated family. 46097 pfam06206: Protein of unknown function (DUF1001). This family consists of proteins of unknown function. These proteins are around 200 amino acids in length. The proteins contain a conserved motif PYR in the amino terminal half of the protein that may be functionally important. The species distribution of the family is interesting. So far it is restricted to cyanobacteria, cryptomonads and plants. This suggests that this protein may be involved in some aspect of a photosynthetic lifestyle (Bateman A pers. obs.). 46098 pfam06207: Protein of unknown function (DUF1002). This protein family has no known function. Its members are about 300 amino acids in length. It has so far been detected in Firmicute bacteria and some archaebacteria. 46099 pfam06208: Borna disease virus G protein. This family consists of Borna disease virus G glycoprotein sequences. Borna disease virus (BDV) infection produces a variety of clinical diseases, from behavioural illnesses to classical fatal encephalitis. G protein is important for viral entry into the host cell. 46100 pfam06209: Cofactor of BRCA1 (COBRA1). This family consists of several cofactor of BRCA1 (COBRA1) like proteins. It is thought that COBRA1 along with BRCA1 is involved in chromatin unfolding. COBRA1 is recruited to the chromosome site by the first BRCT repeat of BRCA1, and is itself sufficient to induce chromatin unfolding. BRCA1 mutations that enhance chromatin unfolding also increase its affinity for, and recruitment of, COBRA1. 
It is thought that reorganisation of higher levels of chromatin structure is an important regulated step in BRCA1-mediated nuclear functions. 46101 pfam06210: Protein of unknown function (DUF1003). This family consists of several hypothetical bacterial proteins of unknown function. 46102 pfam06211: BMP and activin membrane-bound inhibitor (BAMBI). This family consists of several eukaryotic BMP and activin membrane-bound inhibitor (BAMBI) proteins. Members of the transforming growth factor-beta (TGF-beta) superfamily, including TGF-beta, bone morphogenetic proteins (BMPs), activins and nodals, are vital for regulating growth and differentiation. BAMBI is related to TGF-beta-family type I receptors but lacks an intracellular kinase domain. BAMBI is co-expressed with the ventralising morphogen BMP4 during Xenopus embryogenesis and requires BMP signalling for its expression. The protein stably associates with TGF-beta-family receptors and inhibits BMP and activin as well as TGF-beta signalling. 46103 pfam06212: GRIM-19 protein. This family consists of several eukaryotic gene associated with retinoic-interferon-induced mortality 19 (GRIM-19) proteins. GRIM-19 was reported to encode a small protein that is primarily distributed in the nucleus and is able to promote cell death induced by IFN-beta and RA. A bovine homologue of GRIM-19 was co-purified with mitochondrial NADH:ubiquinone oxidoreductase (complex I) in bovine heart. Therefore, its exact cellular localisation and function are unclear. It has now been discovered that GRIM-19 is a specific interacting protein which negatively regulates Stat3 activity. 46104 pfam06213: Cobalamin biosynthesis protein CobT. This family consists of several bacterial cobalamin biosynthesis (CobT) proteins. CobT is involved in the transformation of precorrin-3 into cobyrinic acid. 46105 pfam06214: Signaling lymphocytic activation molecule (SLAM) protein. 
This family consists of several mammalian signaling lymphocytic activation molecule (SLAM) proteins. Optimal T cell activation and expansion require engagement of the TCR plus co-stimulatory signals delivered through accessory molecules. SLAM, a 70-kDa co-stimulatory molecule belonging to the Ig superfamily, is defined as a human cell surface molecule that mediates CD28-independent proliferation of human T cells and IFN-gamma production by human Th1 and Th2 clones. SLAM has also been recognised as a receptor for measles virus. 46106 pfam06215: Infectious salmon anaemia virus haemagglutinin. This family consists of several infectious salmon anaemia virus haemagglutinin proteins. Infectious salmon anaemia virus (ISAV), an orthomyxovirus-like virus, is an important fish pathogen in marine aquaculture. 46107 pfam06216: Rice tungro bacilliform virus P46 protein. This family consists of several Rice tungro bacilliform virus P46 proteins. The function of this family is unknown. 46108 pfam06217: GAGA binding protein-like family. This family includes gbp, a protein from soybean that binds to GAGA element dinucleotide repeat DNA. It seems likely that this domain mediates DNA binding. This putative domain contains several conserved cysteines and a histidine, suggesting this may be a zinc-binding DNA interaction domain. 46109 pfam06218: Nitrogen permease regulator 2. This family of regulators is involved in post-translational control of nitrogen permease. 46110 pfam06219: Protein of unknown function (DUF1005). Family of plant proteins with undetermined function. 46111 pfam06220: U1 zinc finger. This family consists of several U1 small nuclear ribonucleoprotein C (U1-C) proteins. The U1 small nuclear ribonucleoprotein (U1 snRNP) binds to the pre-mRNA 5' splice site (ss) at early stages of spliceosome assembly. Recruitment of U1 to a class of weak 5' ss is promoted by binding of the protein TIA-1 to uridine-rich sequences immediately downstream from the 5' ss. 
Binding of TIA-1 in the vicinity of a 5' ss helps to stabilise U1 snRNP recruitment, at least in part, via a direct interaction with U1-C, thus providing one molecular mechanism for the function of this splicing regulator. This domain is probably zinc-binding. It is found in multiple copies in some members of the family. 46112 pfam06221: Putative zinc finger motif, C2HC5-type. This zinc finger appears to be common in activating signal cointegrator 1/thyroid receptor interacting protein 4. 46113 pfam06222: Phage tail assembly chaperone. 46114 pfam06223: Minor tail protein T. Minor tail protein T is located at the distal end and is involved in the assembly of the initiator complex for tail polymerisation. 46115 pfam06224: Protein of unknown function (DUF1006). Family of conserved bacterial proteins with unknown function. 46116 pfam06225: Poxvirus A46 family. The camelpox virus A46 homolog is described as a toll-like receptor inhibitor. 46117 pfam06226: Protein of unknown function (DUF1007). Family of conserved bacterial proteins with unknown function. 46118 pfam06227: Orthopoxvirus N1 protein. This family consists of several Orthopoxvirus N1 proteins. The function of this family is unknown. 46119 pfam06228: Protein of unknown function (DUF1008). This family consists of several hypothetical bacterial proteins of unknown function. 46120 pfam06229: FRG1-like family. The human FRG1 gene maps to human chromosome 4q35 and has been identified as a candidate for facioscapulohumeral muscular dystrophy. Currently, the function of FRG1 is unknown. 46121 pfam06230: Protein of unknown function (DUF1009). Family of uncharacterised bacterial proteins. 46122 pfam06231: Protein of unknown function (DUF1010). Family of plasmid encoded proteins with unknown function. 46123 pfam06232: Embryo-specific protein 3, (ATS3). Family of plant seed-specific proteins. 46124 pfam06233: Usg-like family. Family of bacterial proteins, referred to as Usg. 
Usg is found in the same operon as trpF, trpB, and trpA and is expressed in a coupled transcription-translation system. 46125 pfam06234: Toluene-4-monooxygenase system protein B (TmoB). This family consists of several Toluene-4-monooxygenase system protein B (TmoB) sequences. Pseudomonas mendocina KR1 metabolises toluene as a carbon source. The initial step of the pathway is hydroxylation of toluene to form p-cresol by a multicomponent toluene-4-monooxygenase (T4MO) system. 46126 pfam06235: NADH dehydrogenase subunit 4L (NAD4L). This family consists of NADH dehydrogenase subunit 4L (NAD4L) proteins from the mitochondria of several parasitic flatworms. 46127 pfam06236: Tyrosinase co-factor MelC1. This family consists of several tyrosinase co-factor MELC1 proteins from a number of Streptomyces species. The melanin operon (melC) of Streptomyces antibioticus contains two genes, melC1 and melC2 (apotyrosinase). It is thought that MelC1 forms a transient binary complex with the downstream apotyrosinase MelC2 to facilitate the incorporation of copper ion and the secretion of tyrosinase indicating that MelC1 is a chaperone for the apotyrosinase MelC2. 46128 pfam06237: Protein of unknown function (DUF1011). Family of uncharacterised eukaryotic proteins. 46129 pfam06238: Borrelia burgdorferi BBR25 lipoprotein. This family consists of a number of lipoproteins from the Lyme disease spirochete Borrelia burgdorferi. 46130 pfam06239: ECSIT (evolutionarily conserved signaling intermediate in Toll pathways). Activation of NF-kappaB as a consequence of signaling through the Toll and IL-1 receptors is a major element of innate immune responses. ECSIT plays an important role in signalling to NF-kappaB, functioning as the intermediate in the signaling pathways between TRAF-6 and MEKK-1. 46131 pfam06240: Carbon monoxide dehydrogenase subunit G (CoxG). The CO dehydrogenase structural genes coxMSL are flanked by nine accessory genes arranged as the cox gene cluster. 
The cox genes are specifically and coordinately transcribed under chemolithoautotrophic conditions in the presence of CO as carbon and energy source. 46132 pfam06241: Protein of unknown function (DUF1012). Family of uncharacterised proteins found in both eukaryotes and bacteria. 46133 pfam06242: Protein of unknown function (DUF1013). Family of uncharacterised proteins found in Proteobacteria. 46134 pfam06243: Phenylacetic acid degradation B. Phenylacetic acid degradation protein B (PaaB) is thought to be part of a multicomponent oxygenase involved in phenylacetyl-CoA hydroxylation. 46135 pfam06244: Protein of unknown function (DUF1014). This family consists of several hypothetical eukaryotic proteins of unknown function. 46136 pfam06245: Protein of unknown function (DUF1015). Family of proteins with unknown function found in archaea and bacteria. 46137 pfam06246: Isy1-like splicing family. Isy1 protein is important in the optimisation of splicing. 46138 pfam06247: Plasmodium ookinete surface protein Pvs28. This family consists of ookinete surface proteins (Pvs28) from several species of Plasmodium. Pvs25 and Pvs28 are expressed on the surface of ookinetes. These proteins are potential vaccine candidates and induce antibodies that block the infectivity of Plasmodium vivax in immunised animals. 46139 pfam06248: Centromere/kinetochore Zw10. Zw10 and rough deal proteins are both required for correct metaphase check-pointing during mitosis. These proteins bind to the centromere/kinetochore. 46140 pfam06249: Ethanolamine utilisation protein EutQ. The eut operon of Salmonella typhimurium encodes proteins involved in the cobalamin-dependent degradation of ethanolamine. The role of EutQ in this process is unclear. 46141 pfam06250: Protein of unknown function (DUF1016). Family of uncharacterised proteins found in viruses, archaea and bacteria. 46142 pfam06251: Protein of unknown function (DUF1017). Family of uncharacterised proteins from Proteobacteria. 
46143 pfam06252: Protein of unknown function (DUF1018). This family consists of several bacterial and phage proteins of unknown function. 46144 pfam06253: Trimethylamine methyltransferase (MTTB). This family consists of several trimethylamine methyltransferase (MTTB) (EC:2.1.1.-) proteins from numerous Rhizobium and Methanosarcina species. 46145 pfam06254: Protein of unknown function (DUF1019). Family of uncharacterised proteins found in Proteobacteria. 46146 pfam06255: Protein of unknown function (DUF1020). This family consists of several MafB proteins from Neisseria meningitidis and Neisseria gonorrhoeae. The function of this family is unknown. 46147 pfam06256: Nucleopolyhedrovirus LEF-12 protein. This family consists of several Nucleopolyhedrovirus late expression factor-12 (LEF-12) proteins. The function of this family is unknown. 46148 pfam06257: Protein of unknown function (DUF1021). This family consists of several hypothetical bacterial proteins of unknown function. 46149 pfam06258: Protein of unknown function (DUF1022). This family consists of several hypothetical eukaryotic and prokaryotic proteins. The function of this family is unknown. 46150 pfam06259: Domain of unknown function (DUF1023). Family of uncharacterised proteins found in Actinobacteria. 46151 pfam06260: Protein of unknown function (DUF1024). This family consists of several hypothetical Staphylococcus aureus and Staphylococcus aureus phage phi proteins. The function of this family is unknown. 46152 pfam06261: Actinobacillus actinomycetemcomitans leukotoxin activator LktC. This family consists of several Actinobacillus actinomycetemcomitans leukotoxin activator (LktC) proteins. Actinobacillus actinomycetemcomitans is a Gram-negative bacterium that has been implicated in the etiology of several forms of periodontitis, especially localised juvenile periodontitis. LktC along with LktB and LktD are thought to be required for activation and localisation of the leukotoxin. 
46153 pfam06262: Domain of unknown function (DUF1025). Family of bacterial proteins with undetermined function. 46154 pfam06263: Bacterial FdrA protein. This family consists of several bacterial FdrA proteins. FdrA is known to play a role in the suppression of dominant negative FtsH proteins. 46155 pfam06264: Protein of unknown function (DUF1026). This family consists of several uncharacterised bacterial and phage proteins. The function of this family is unknown. 46156 pfam06265: Protein of unknown function (DUF1027). This family consists of several hypothetical bacterial proteins of unknown function. 46157 pfam06266: HrpF protein. The species Pseudomonas syringae encompasses plant pathogens with differing host specificities and corresponding pathovar designations. P. syringae requires the Hrp (type III protein secretion) system, encoded by a 25-kb cluster of hrp and hrc genes, in order to elicit the hypersensitive response (HR) in nonhosts or to be pathogenic in hosts. The exact function of HrpF is unknown but the protein is needed for pathogenicity. 46158 pfam06267: Family of unknown function (DUF1028). Family of bacterial and archaeal proteins with unknown function. 46159 pfam06268: Fascin protein. This family consists of several eukaryotic fascin or singed proteins. The fascins are a structurally unique and evolutionarily conserved group of actin cross-linking proteins. Fascins function in the organisation of two major forms of actin-based structures: dynamic, cortical cell protrusions and cytoplasmic microfilament bundles. The cortical structures, which include filopodia, spikes, lamellipodial ribs, oocyte microvilli and the dendrites of dendritic cells, have roles in cell-matrix adhesion, cell interactions and cell migration, whereas the cytoplasmic actin bundles appear to participate in cell architecture. 46160 pfam06269: Protein of unknown function (DUF1029). This family consists of several short Chordopoxvirus proteins of unknown function. 
46161 pfam06270: Protein of unknown function (DUF1030). This family consists of several short Circovirus proteins of unknown function. 46162 pfam06271: RDD family. This family of proteins contains three highly conserved amino acids: one arginine and two aspartates, hence the name RDD family. This region contains two predicted transmembrane regions. The arginine occurs at the N terminus of the first helix and the first aspartate occurs in the middle of this helix. The molecular function of this region is unknown. However, this region may be involved in transport of an as yet unknown set of ligands (Bateman A pers. obs.). 46163 pfam06272: Eukaryotic translation initiation factor 3 subunit 11 (eIF-3 p25). This family consists of several eukaryotic translation initiation factor 3 subunit 11 (eIF-3 p25) proteins. Eukaryotic initiation factor 3 (eIF3) is a multisubunit complex that is required for binding of mRNA to 40 S ribosomal subunits, stabilisation of ternary complex binding to 40 S subunits, and dissociation of 40 and 60 S subunits. 46164 pfam06273: Plant specific eukaryotic initiation factor 4B. This family consists of several plant specific eukaryotic initiation factor 4B proteins. 46165 pfam06274: Bacteriophage Mu tail sheath protein (GpL). This family consists of several bacteriophage Mu-like tail sheath (GpL) proteins as well as several related hypothetical bacterial proteins. 46166 pfam06275: Protein of unknown function (DUF1031). This family consists of several Lactococcus lactis bacteriophage and Lactococcus lactis proteins of unknown function. 46167 pfam06276: Ferric iron reductase protein FhuF. This family consists of several bacterial ferric iron reductase protein (FhuF) sequences. FhuF is involved in the reduction of ferric iron in cytoplasmic ferrioxamine B. 46168 pfam06277: Ethanolamine utilisation protein EutA. This family consists of several bacterial EutA ethanolamine utilisation proteins. 
The EutA protein is thought to protect the lyase (EutBC) from inhibition by CNB12. 46169 pfam06278: Protein of unknown function (DUF1032). This family consists of several conserved eukaryotic proteins of unknown function. 46170 pfam06279: Protein of unknown function (DUF1033). This family consists of several hypothetical bacterial proteins. Many of the sequences in this family are annotated as putative DNA binding proteins but the function of this family is unknown. 46171 pfam06280: Domain of Unknown Function (DUF1034). This family consists of several domains of unknown function which are present in several bacterial and plant peptidases. This domain is found in conjunction with pfam00082, pfam02225 and is often found with pfam00746. 46172 pfam06281: Protein of unknown function (DUF1035). This family consists of several Sulfolobus and Sulfolobus virus proteins of unknown function. 46173 pfam06282: Protein of unknown function (DUF1036). This family consists of several hypothetical bacterial proteins of unknown function. 46174 pfam06283: Protein of unknown function (DUF1037). This family consists of several bacterial ThuA like proteins. The function of the family is unknown. 46175 pfam06284: Cytomegalovirus UL84 protein. This family consists of several Cytomegalovirus UL84 proteins. The open reading frame UL84 of human cytomegalovirus encodes a multifunctional regulatory protein which is required for viral DNA replication and binds with high affinity to the immediate-early transactivator IE2-p86. 46176 pfam06285: Protein of unknown function (DUF1038). This family consists of several Orthopoxvirus proteins of unknown function. 46177 pfam06286: Coleoptericin. This family consists of several insect Coleoptericin, Acaloleptin, Holotricin and Rhinocerosin proteins which are all known to be antibacterial proteins. 46178 pfam06287: Protein of unknown function (DUF1039). 
This family consists of several hypothetical bacterial proteins from Escherichia coli and Citrobacter rodentium. The function of this family is unknown. 46179 pfam06288: Protein of unknown function (DUF1040). This family consists of several bacterial YihD proteins of unknown function. 46180 pfam06289: Flagellar protein (FlbD). This family consists of several bacterial FlbD flagellar proteins. The exact function of this family is unknown. 46181 pfam06290: Plasmid SOS inhibition protein (PsiB). This family consists of several plasmid SOS inhibition protein (PsiB) sequences. 46182 pfam06291: Bor protein. This family consists of several Bacteriophage lambda Bor and Escherichia coli Iss proteins. Expression of bor significantly increases the survival of the Escherichia coli host cell in animal serum. This property is a well known bacterial virulence determinant; indeed, bor and its adjacent sequences are highly homologous to the iss serum resistance locus of the plasmid ColV2-K94, which confers virulence in animals. It has been suggested that lysogeny may generally have a role in bacterial survival in animal hosts, and perhaps in pathogenesis. 46183 pfam06292: Domain of Unknown Function (DUF1041). This family consists of several eukaryotic domains of unknown function. Members of this family are often found in tandem repeats and co-occur with pfam00168, pfam00130 and pfam00169 domains. 46184 pfam06293: Lipopolysaccharide kinase (Kdo/WaaP) family. These lipopolysaccharide kinases are related to protein kinases pfam00069. This family includes the waaP (rfaP) gene product, which is required for the addition of phosphate to O-4 of the first heptose residue of the lipopolysaccharide (LPS) inner core region. It has previously been shown that WaaP is necessary for resistance to hydrophobic and polycationic antimicrobials in E. coli and that it is required for virulence in invasive strains of S. enterica. 46185 pfam06294: Domain of Unknown Function (DUF1042). 
46186 pfam06295: Protein of unknown function (DUF1043). This family consists of several hypothetical bacterial proteins of unknown function. 46187 pfam06296: Protein of unknown function (DUF1044). This family consists of several hypothetical bacterial proteins of unknown function. 46188 pfam06297: PET Domain. This domain is suggested to be involved in protein-protein interactions. The family is found in conjunction with pfam00412. 46189 pfam06298: Photosystem II protein Y (PsbY). This family consists of several bacterial and plant photosystem II protein Y (PsbY) sequences. PsbY is a manganese-binding protein that has an L-arginine metabolising enzyme activity. 46190 pfam06299: Protein of unknown function (DUF1045). This family consists of several hypothetical proteins from Agrobacterium, Rhizobium and Brucella species. The function of this family is unknown. 46191 pfam06300: Tsp45I type II restriction enzyme. This family consists of several type II restriction enzymes. 46192 pfam06301: Bacteriophage lambda Kil protein. This family consists of several Bacteriophage lambda Kil protein like sequences from both phages and bacteria. Induction of a lambda prophage causes the death of the host cell even in the absence of phage replication and lytic functions due to expression of the lambda kil gene. 46193 pfam06302: Protein of unknown function (DUF1046). This family consists of several highly related Orthopoxvirus proteins of unknown function. 46194 pfam06303: Protein of unknown function (DUF1047). This family consists of several uncharacterised bacterial proteins of unknown function. 46195 pfam06304: Protein of unknown function (DUF1048). This family consists of several hypothetical bacterial proteins of unknown function. 46196 pfam06305: Protein of unknown function (DUF1049). This family consists of several hypothetical bacterial proteins of unknown function. 46197 pfam06306: Beta-1,4-N-acetylgalactosaminyltransferase (CgtA). 
This family consists of several beta-1,4-N-acetylgalactosaminyltransferase proteins from Campylobacter jejuni. 46198 pfam06307: Herpesvirus IR6 protein. This family consists of several Herpesvirus IR6 proteins. The equine herpesvirus 1 (EHV-1) IR6 protein forms typical rod-like structures in infected cells, influences virus growth at elevated temperatures, and determines the virulence of EHV-1 Rac strains. 46199 pfam06308: 23S rRNA methylase leader peptide (ErmC). This family consists of several very short bacterial 23S rRNA methylase leader peptide (ErmC) sequences. ermC confers resistance to macrolide-lincosamide-streptogramin B antibiotics by specifying a ribosomal RNA methylase, which results in decreased ribosomal affinity for these antibiotics. ermC expression is induced by exposure to erythromycin. 46200 pfam06309: Torsin. This family consists of several eukaryotic torsin proteins. Torsion dystonia is an autosomal dominant movement disorder characterised by involuntary, repetitive muscle contractions and twisted postures. The most severe early-onset form of dystonia has been linked to mutations in the human DYT1 (TOR1A) gene encoding a protein termed torsinA. While causative genetic alterations have been identified, the function of torsin proteins and the molecular mechanism underlying dystonia remain unknown. Phylogenetic analysis of the torsin protein family indicates these proteins share distant sequence similarity with the large and diverse family of (pfam00004) proteins. It has been suggested that torsins play a role in effectively managing protein folding and that possible breakdown in a neuroprotective mechanism that is, in part, mediated by torsins may be responsible for the neuronal dysfunction associated with dystonia. 46201 pfam06311: NUMB phenylalanine-rich region. This domain is in the Numb family of proteins. 46202 pfam06312: Neurexophilin. This family consists of mammalian neurexophilin proteins. 
Mammalian brains contain four different neurexophilin proteins. Neurexophilins form a family of related glycoproteins that are proteolytically processed after synthesis and bind to alpha-neurexins. The structure and characteristics of neurexophilins indicate that they function as neuropeptides that may signal via alpha-neurexins. 46203 pfam06313: Drosophila ACP53EA protein. This family consists of several Drosophila ACP53EA accessory gland (seminal) proteins. 46204 pfam06314: Acetoacetate decarboxylase (ADC). This family consists of several acetoacetate decarboxylase (ADC) proteins (EC:4.1.1.4). 46205 pfam06315: Isocitrate dehydrogenase kinase/phosphatase (AceK). This family consists of several bacterial isocitrate dehydrogenase kinase/phosphatase (AceK) proteins (EC:2.7.1.116). 46206 pfam06316: Enterobacterial Ail/Lom protein. This family consists of several bacterial and phage Ail/Lom-like proteins. The Yersinia enterocolitica Ail protein is a known virulence factor. Proteins in this family are predicted to consist of eight transmembrane beta-sheets and four cell surface-exposed loops. It is thought that Ail directly promotes invasion and loop 2 contains an active site, perhaps a receptor-binding domain. The phage lom gene is expressed during lysogeny and encodes a host-cell envelope protein. Lom is found in the bacterial outer membrane, and is homologous to virulence proteins of two other enterobacterial genera. It has been suggested that lysogeny may generally have a role in bacterial survival in animal hosts, and perhaps in pathogenesis. 46207 pfam06317: Arenavirus RNA polymerase. This family consists of several Arenavirus RNA polymerase proteins (EC:2.7.7.48). 46208 pfam06318: Protein of unknown function (DUF1051). This family consists of several Tobravirus proteins of unknown function. 46209 pfam06319: Protein of unknown function (DUF1052). This family consists of several bacterial proteins of unknown function. 
46210 pfam06320: GCN5-like protein 1 (GCN5L1). This family consists of several eukaryotic GCN5-like protein 1 (GCN5L1) sequences. The function of this family is unknown. 46211 pfam06321: Porphyromonas gingivalis major fimbrial subunit protein (FimA). This family consists of several Porphyromonas gingivalis major fimbrial subunit protein (FimA) sequences. Fimbriae of Porphyromonas gingivalis, a periodontopathogen, play an important role in its adhesion to and invasion of host cells. The fimA genes encoding fimbrillin (FimA), a subunit protein of fimbriae, have been classified into five types, types I to V, based on nucleotide sequences. It has been found that type II FimA can bind to epithelial cells most efficiently through specific host receptors. 46212 pfam06322: Phage NinH protein. This family consists of several phage NinH proteins. The function of this family is unknown. 46213 pfam06323: Phage antitermination protein Q. This family consists of several phage antitermination protein Q and related bacterial sequences. Phage 82 gene Q encodes a phage-specific positive regulator of late gene expression, thought, by analogy to the corresponding gene of phage lambda, to be a transcription antiterminator. 46214 pfam06324: Pigment-dispersing hormone (PDH). This family consists of several eukaryotic pigment-dispersing hormone (PDH) proteins. The pigment-dispersing hormone (PDH) is produced in the eyestalks of Crustacea where it induces light-adapting movements of pigment in the compound eye and regulates the pigment dispersion in the chromatophores. 46215 pfam06325: Ribosomal protein L11 methyltransferase (PrmA). This family consists of several Ribosomal protein L11 methyltransferase (EC:2.1.1.-) sequences. 46216 pfam06326: Vesiculovirus matrix protein. This family consists of several Vesiculovirus matrix proteins. 
The matrix (M) protein of vesicular stomatitis virus (VSV) expressed in the absence of other viral components causes many of the cytopathic effects of VSV, including an inhibition of host gene expression and the induction of cell rounding. It has been shown that M protein also induces apoptosis in the absence of other viral components. It is thought that the activation of apoptotic pathways causes the inhibition of host gene expression and cell rounding by M protein. 46217 pfam06327: Domain of Unknown Function (DUF1053). This domain is found in Adenylate cyclases. 46218 pfam06328: Ig-like C2-type domain. This domain is a ligand-binding immunoglobulin-like domain. The two cysteine residues form a disulphide bridge. 46219 pfam06329: Fungal ornithine decarboxylase antizyme. This family consists of several fungal ornithine decarboxylase antizyme proteins. The polyamine biosynthetic enzyme ornithine decarboxylase (ODC) is degraded by the 26 S proteasome via a ubiquitin-independent pathway. Its degradation is greatly accelerated by association with the polyamine-induced regulatory protein antizyme 1 (AZ1). This family is specific to fungal species but is related to the pfam02100 family. 46220 pfam06330: Trichodiene synthase (TRI5). This family consists of several fungal trichodiene synthase proteins (EC:4.2.3.6). TRI5 encodes the enzyme trichodiene synthase, which has been shown to catalyse the first step in the trichothecene pathways of Fusarium and Trichothecium species. 46221 pfam06331: REX1 DNA Repair. REX1 is required for DNA repair in yeast, and has homologues in other Eukaryotes. 46222 pfam06332: RNA-directed polymerase. The RNA polymerase domain of the Nidovirales, which include the Coronaviruses. 46223 pfam06333: TRAP240. Members of this family have been shown to be involved in transcriptional repression via the Mediator complex. 46224 pfam06334: Orthopoxvirus A47 protein. This family consists of several Orthopoxvirus A47 proteins. 
The function of this family is unknown. 46225 pfam06335: Protein of unknown function (DUF1054). This family consists of several hypothetical bacterial proteins of unknown function. 46226 pfam06336: Coronavirus 5a protein. This family consists of several Coronavirus 5a proteins. The function of this family is unknown. 46227 pfam06337: Domain of Unknown Function (DUF1055). This region is found in Ubiquitin-specific proteases. 46228 pfam06338: ComK protein. This family consists of several bacterial ComK proteins. The ComK protein of Bacillus subtilis positively regulates the transcription of several late competence genes as well as comK itself. It has been found that ClpX plays an important role in the regulation of ComK at the post-transcriptional level. 46229 pfam06339: Ectoine synthase. This family consists of several bacterial ectoine synthase proteins. The ectABC genes encode the diaminobutyric acid acetyltransferase (EctA), the diaminobutyric acid aminotransferase (EctB), and the ectoine synthase (EctC). Together these proteins constitute the ectoine biosynthetic pathway. 46230 pfam06340: Vibrio cholerae toxin co-regulated pilus biosynthesis protein F (TcpF). This family consists of several Vibrio cholerae toxin co-regulated pilus biosynthesis protein F (TcpF) sequences. TcpF is known to be a secreted virulence protein but its exact function is unknown. 46231 pfam06341: Protein of unknown function (DUF1056). This family consists of several putative head-tail joining bacteriophage proteins. 46232 pfam06342: Protein of unknown function (DUF1057). This family consists of several Caenorhabditis elegans specific proteins of unknown function. 46233 pfam06343: Cys_protease 3C-related family. This family is related to pfam00548. 46234 pfam06344: Parechovirus Genome-linked protein. This family is of the Parechovirus genome-linked protein Vpg type P3B. 46235 pfam06345: DRF Autoregulatory Domain. This motif is found in Diaphanous-related formins. 
It binds the N-terminal GTPase-binding domain; this link is broken when GTP-bound Rho binds to the GBD and activates the protein. The addition of DAD to mammalian cells induces actin filament formation, stabilises microtubules, and activates serum-response mediated transcription. 46236 pfam06346: Formin Homology Region 1. This region is found in some of the Diaphanous related formins (Drfs). It consists of low complexity repeats of around 12 residues. 46237 pfam06347: Protein of unknown function (DUF1058). This family consists of several hypothetical bacterial proteins of unknown function. 46238 pfam06348: Protein of unknown function (DUF1059). This family consists of several short hypothetical archaeal proteins of unknown function. 46239 pfam06349: Domain of Unknown Function (DUF1060). This region is found in the coronavirus polyprotein. 46240 pfam06350: Hormone-sensitive lipase (HSL) N-terminus. This family consists of several mammalian hormone-sensitive lipase (HSL) proteins (EC:3.1.1.-). Hormone-sensitive lipase, a key enzyme in fatty acid mobilisation, overall energy homeostasis, and possibly steroidogenesis, is acutely controlled through reversible phosphorylation by catecholamines and insulin. 46241 pfam06351: Allene oxide cyclase. This family consists of several plant specific allene oxide cyclase proteins (EC:5.3.99.6). The allene oxide cyclase (AOC)-catalysed step in jasmonate (JA) biosynthesis is important in the wound response of tomato. 46242 pfam06352: Protein of unknown function (DUF1061). This family consists of several hypothetical bacterial proteins of unknown function. 46243 pfam06353: Protein of unknown function (DUF1062). This family consists of several hypothetical bacterial proteins of unknown function. 46244 pfam06354: Protein of unknown function (DUF1063). This family consists of several hypothetical proteins of unknown function. 46245 pfam06355: Aegerolysin. 
This family consists of several bacterial and eukaryotic Aegerolysin-like proteins. It has been found that aegerolysin and ostreolysin are expressed during formation of primordia and fruiting bodies. It has been suggested that these haemolysins play an important role in the initial phase of fungal fruiting. The bacterial members of this family are expressed during sporulation. 46246 pfam06356: Protein of unknown function (DUF1064). This family consists of several phage and bacterial proteins of unknown function. 46247 pfam06357: Omega-atracotoxin. This family consists of several Hadronyche versuta (Blue mountains funnel-web spider) specific omega-atracotoxin proteins. Omega-Atracotoxin-Hv1a is an insect-specific neurotoxin whose phylogenetic specificity derives from its ability to antagonise insect, but not vertebrate, voltage-gated calcium channels. Two spatially proximal residues, Asn(27) and Arg(35), form a contiguous molecular surface that is essential for toxin activity. It has been proposed that this surface of the beta-hairpin is a key site for interaction of the toxin with insect calcium channels. 46248 pfam06358: Protein of unknown function (DUF1065). This family consists of several Benyvirus proteins of unknown function. 46249 pfam06359: Protein of unknown function (DUF1066). This family consists of several Mycobacterium ESAT-6 like proteins of unknown function. 46250 pfam06360: Euplotes raikovi mating pheromone. This family consists of several Euplotes raikovi mating pheromone proteins. Diffusible polypeptide pheromones, which distinguish otherwise morphologically identical vegetative cell types from one another, are produced by some species of ciliates. In the marine sand-dwelling protozoan ciliate Euplotes raikovi, pheromone molecules promote the vegetative reproduction (mitogenic proliferation or growth) of the same cells from which they originate. 
As, understandably, such autocrine pheromone activity is primary to that of targeting and inducing a foreign cell to mate (paracrine functions), this finding provides an example of how the original function of a molecule can be obscured during evolution by the acquisition of a new one. 46251 pfam06361: Rice tungro bacilliform virus P12 protein. This family consists of several Rice tungro bacilliform virus P12 proteins. The function of this family is unknown. 46252 pfam06362: Protein of unknown function (DUF1067). This family consists of several hypothetical Mycobacterium leprae specific proteins. The function of this family is unknown. 46253 pfam06363: Picornaviridae P3A protein. This family consists of the P3A protein of picornaviridae. P3A has been identified as a genome-linked protein (VPg) and is involved in replication. 46254 pfam06364: Protein of unknown function (DUF1068). This family consists of several hypothetical plant proteins from Arabidopsis thaliana and Oryza sativa. The function of this family is unknown. 46255 pfam06365: CD34 antigen protein. This family consists of several mammalian CD34 antigen proteins. The CD34 antigen is a human leukocyte membrane protein expressed specifically by lymphohematopoietic progenitor cells. CD34 is a phosphoprotein. Activation of protein kinase C (PKC) has been found to enhance CD34 phosphorylation. 46256 pfam06366: Flagellar protein FlhE. This family consists of several Enterobacterial FlhE flagellar proteins. The exact function of this family is unknown. 46257 pfam06367: Diaphanous FH3 Domain. This region is found in the Formin-like and diaphanous proteins. 46258 pfam06368: Methylaspartate mutase E chain (MutE). This family consists of several methylaspartate mutase E chain proteins (EC:5.4.99.1). Glutamate mutase catalyses the first step in the fermentation of glutamate by Clostridium tetanomorphum. This is an unusual isomerisation in which L-glutamate is converted to threo-beta-methyl L-aspartate. 
46259 pfam06369: Sea anemone cytotoxic protein. Sea anemones are a rich source of cytotoxic proteins. Cytolysins comprise a group of more than 30 highly basic proteins with molecular masses of about 20 kDa. Cytolysins isolated from the sea anemone Heteractis magnifica include magnificalysin I (HMg I), magnificalysin II (HMg II) and Heteractis magnifica toxin (HMgtxn). These are highly homologous at their N-termini. HMg I and II have molecular masses of approximately 19 kDa, and pI values of 9.4 and 10.0, respectively. Cytolysins isolated from other sea anemones, Actinia tenebrosa (Tenebrosin-C, TN-C), Actinia equina (Equinatoxin, EqT) and Stichodactyla helianthus (ShC), exhibit pore-forming, haemolytic, cytotoxic, and heart stimulatory activities. 46260 pfam06370: Protein of unknown function (DUF1069). This family consists of several Maize streak virus 21.7 kDa proteins. The function of this family is unknown. 46261 pfam06371: Diaphanous GTPase-binding Domain. This domain is bound by GTP-bound Rho proteins, leading to activation of the Drf protein. 46262 pfam06372: Gemin6 protein. This family consists of several mammalian Gemin6 proteins. The exact function of Gemin6 is unknown but it has been found to form part of the SMN complex (pfam06003). The SMN complex plays a key role in the biogenesis of spliceosomal small nuclear ribonucleoproteins (snRNPs) and other ribonucleoprotein particles. 46263 pfam06373: Cocaine and amphetamine regulated transcript protein (CART). This family consists of several cocaine and amphetamine regulated transcript type I protein (CART) sequences. Cocaine and amphetamine regulated transcript (CART) peptide has been shown to be an anorectic peptide that inhibits both normal and starvation-induced feeding and completely blocks the feeding response induced by neuropeptide Y and regulated by leptin in the hypothalamus. 
The C-terminal part containing the three disulfide bridges is the biologically active part of the molecule affecting food intake. The solution structure of the active part of CART has a fold equivalent to other functionally distinct small proteins. CART consists mainly of turns and loops spanned by a compact framework composed of a few small stretches of antiparallel beta-sheet common to cystine knots. 46264 pfam06374: NADH-ubiquinone oxidoreductase subunit b14.5b (NDUFC2). This family consists of several NADH-ubiquinone oxidoreductase subunit b14.5b proteins (EC:1.6.5.3). 46265 pfam06375: Bovine leukaemia virus receptor (BLVR). This family consists of several bovine specific leukaemia virus receptors which are thought to function as transmembrane proteins, although their exact function is unknown. 46266 pfam06376: Protein of unknown function (DUF1070). This family consists of several short hypothetical plant proteins of unknown function. 46267 pfam06377: Adipokinetic hormone. This family consists of several insect adipokinetic hormones as well as the related crustacean red pigment concentrating hormone. Flight activity of insects comprises one of the most intense biochemical processes known in nature, and therefore provides an attractive model system to study the hormonal regulation of metabolism during physical exercise. In long-distance flying insects, such as the migratory locust, both carbohydrate and lipid reserves are utilised as fuels for sustained flight activity. The mobilisation of these energy stores in Locusta migratoria is mediated by three structurally related adipokinetic hormones (AKHs), which are all capable of stimulating the release of both carbohydrates and lipids from the fat body. 46268 pfam06378: Protein of unknown function (DUF1071). This family consists of several hypothetical bacterial and phage proteins of unknown function. 46269 pfam06379: L-rhamnose-proton symport protein (RhaT). 
This family consists of several bacterial L-rhamnose-proton symport protein (RhaT) sequences. 46270 pfam06380: Protein of unknown function (DUF1072). This family consists of several Barley yellow dwarf virus proteins of unknown function. 46271 pfam06381: Protein of unknown function (DUF1073). This family consists of several hypothetical Borrelia burgdorferi and Borrelia hermsii proteins. The function of this family is unknown. 46272 pfam06382: Protein of unknown function (DUF1074). This family consists of several proteins which appear to be specific to Drosophila melanogaster. The function of this family is unknown. 46273 pfam06383: Cobalamin biosynthesis protein CobS. This family consists of several bacterial CobS proteins which are related to the Pseudomonas denitrificans CobS. It is thought that CobS is involved in cobalt insertion-mediating reactions. This family is not related to the pfam02654 family. 46274 pfam06384: Beta-catenin-interacting protein ICAT. This family consists of several eukaryotic beta-catenin-interacting (ICAT) proteins. Beta-catenin is a multifunctional protein involved in both cell adhesion and transcriptional activation. Transcription mediated by the beta-catenin/Tcf complex is involved in embryological development and is upregulated in various cancers. ICAT selectively inhibits beta-catenin/Tcf binding in vivo, without disrupting beta-catenin/cadherin interactions. 46275 pfam06385: Baculovirus LEF-11 protein. This family consists of several Baculovirus LEF-11 proteins. The exact function of this family is unknown although it has been shown that LEF-11 is required for viral DNA replication during the infection cycle. 46276 pfam06386: Gas vesicle synthesis protein GvpL/GvpF. This family consists of several bacterial and archaeal gas vesicle synthesis protein (GvpL/GvpF) sequences. The exact function of this family is unknown. 46277 pfam06387: D1 dopamine receptor-interacting protein (calcyon). 
This family consists of several D1 dopamine receptor-interacting (calcyon) proteins. D1/D5 dopamine receptors in the basal ganglia, hippocampus, and cerebral cortex modulate motor, reward, and cognitive behaviour. D1-like dopamine receptors likely modulate neocortical and hippocampal neuronal excitability and synaptic function via Ca(2+) as well as cAMP-dependent signaling. Defective calcyon proteins have been implicated in both attention-deficit/hyperactivity disorder (ADHD) and schizophrenia. 46278 pfam06388: Protein of unknown function (DUF1075). This family consists of several eukaryotic proteins of unknown function. 46279 pfam06389: Filovirus membrane-associated protein VP24. This family consists of several membrane-associated protein VP24 sequences from a variety of Ebola and Marburg viruses. The VP24 protein of Ebola virus is believed to be a secondary matrix protein and minor component of virions. VP24 possesses structural features commonly associated with viral matrix proteins and may have a role in virus assembly and budding. 46280 pfam06390: Neuroendocrine-specific golgi protein P55 (NESP55). This family consists of several mammalian neuroendocrine-specific golgi protein P55 (NESP55) sequences. NESP55 is a novel member of the chromogranin family and is a soluble, acidic, heat-stable secretory protein that is expressed exclusively in endocrine and nervous tissues, although less widely than chromogranins. 46281 pfam06391: CDK-activating kinase assembly factor MAT1. MAT1 is an assembly/targeting factor for cyclin-dependent kinase-activating kinase (CAK), which interacts with the transcription factor TFIIH. The domain found to the N-terminal side of this domain is a C3HC4 RING finger. 46282 pfam06392: Acid shock protein repeat. The Asr protein is synthesised as a precursor and the cleavage is essential for moderate to high acid tolerance. 46283 pfam06393: BH3 interacting domain (BID). 
BID is a member of the BCL-2 superfamily of proteins, which are key regulators of programmed cell death; hence this family is related to pfam00452. BID is a pro-apoptotic member of the Bcl-2 superfamily and as such possesses the ability to target intracellular membranes and contains the BH3 death domain. The activity of BID is regulated by a Caspase 8-mediated cleavage event, exposing the BH3 domain and significantly changing the surface charge and hydrophobicity, which causes a change of cellular localisation. 46284 pfam06394: Pepsin inhibitor-3-like repeated domain. Pepsin inhibitor-3 consists of two domains, each comprising an antiparallel beta-sheet flanked by an alpha-helix. In the enzyme-inhibitor complex, the N-terminal beta-strand of PI-3 pairs with one strand of the active site flap region of pepsin. The two domains are tandem sequence repeats, and the domain has therefore been termed a repeated domain. 46285 pfam06395: CDC24 Calponin. This is a calponin homology domain. 46286 pfam06396: Angiotensin II, type I receptor-associated protein (AGTRAP). This family consists of several angiotensin II, type I receptor-associated protein (AGTRAP) sequences. AGTRAP is known to interact specifically with the carboxyl-terminal cytoplasmic region of the angiotensin II type 1 (AT(1)) receptor to regulate different aspects of AT(1) receptor physiology. The function of this family is unclear. 46287 pfam06397: Desulfoferrodoxin, N-terminal domain. Most members of this family are small (approximately 36 amino acids) proteins that form homodimeric complexes. Each subunit contains a high-spin iron atom tetrahedrally bound to four cysteinyl sulphur atoms. This family has a similar fold to the rubredoxin metal binding domain. It is also found as the N-terminal domain of desulfoferrodoxin (see pfam01880). 46288 pfam06398: Integral peroxisomal membrane peroxin. 
Peroxisomes play diverse roles in the cell, compartmentalising many activities related to lipid metabolism and functioning in the decomposition of toxic hydrogen peroxide. Sequence similarity was identified between two hypothetical proteins and the peroxin integral membrane protein Pex24p. 46289 pfam06399: GTP cyclohydrolase I feedback regulatory protein (GFRP). Tetrahydrobiopterin, the cofactor required for hydroxylation of aromatic amino acids, regulates its own synthesis via feedback inhibition of GTP cyclohydrolase I. This mechanism is mediated by the regulatory subunit called GTP cyclohydrolase I feedback regulatory protein (GFRP). 46290 pfam06400: Alpha-2-macroglobulin RAP, N-terminal domain. The alpha-2-macroglobulin receptor-associated protein (RAP) is an intracellular glycoprotein that binds to the alpha-2-macroglobulin receptor and other members of the low density lipoprotein receptor family. The protein inhibits binding of all currently known ligands of these receptors. The N-terminal domain is predominantly alpha helical. Two different studies have provided conflicting domain boundaries. 46291 pfam06401: Alpha-2-macroglobulin RAP, C-terminal domain. The alpha-2-macroglobulin receptor-associated protein (RAP) is an intracellular glycoprotein that binds to the alpha-2-macroglobulin receptor and other members of the low density lipoprotein receptor family. The protein inhibits binding of all currently known ligands of these receptors. Two different studies have provided conflicting domain boundaries. 46292 pfam06402: Histidine-rich actin-binding protein (hisactophilin). Dictyostelium hisactophilin, a unique actin-binding protein, is a submembranous pH sensor that signals slight changes of the H+ concentration to actin by inducing actin polymerisation and binding to microfilaments only at pH values below seven. Members of this family are histidine rich and typically contain the repeated motif HHXH. 46293 pfam06403: Lamprin. 
This family consists of several lamprin proteins from the sea lamprey Petromyzon marinus. Lamprin, an insoluble non-collagen, non-elastin protein, is the major connective tissue component of the fibrillar extracellular matrix of lamprey annular cartilage. Although not generally homologous to any other protein, soluble lamprins contain a tandemly repeated peptide sequence (GGLGY) which is present in both silkmoth chorion proteins and spider dragline silk. Strong homologies to this repeat sequence are also present in several mammalian and avian elastins. It is thought that these proteins share a structural motif which promotes self-aggregation and fibril formation in proteins through interdigitation of hydrophobic side chains in beta-sheet/beta-turn structures, a motif that has been preserved in recognisable form over several hundred million years of evolution. 46294 pfam06404: Phytosulfokine precursor protein (PSK). This family consists of several plant specific phytosulfokine precursor proteins. Phytosulfokines are active as either a pentapeptide or a C-terminally truncated tetrapeptide. These compounds were first isolated because of their ability to stimulate cell division in somatic embryo cultures of Asparagus officinalis. 46295 pfam06405: Red chlorophyll catabolite reductase (RCC reductase). This family consists of several red chlorophyll catabolite reductase (RCC reductase) proteins. Red chlorophyll catabolite (RCC) reductase (RCCR) and pheophorbide (Pheide) a oxygenase (PaO) catalyse the key reaction of chlorophyll catabolism, porphyrin macrocycle cleavage of Pheide a to a primary fluorescent catabolite (pFCC). 46296 pfam06406: StbA protein. This family consists of several bacterial StbA plasmid stability proteins. 46297 pfam06407: Borna disease virus P40 protein. This family consists of several Borna disease virus P40 proteins. 
Borna disease (BD) is a persistent viral infection of the central nervous system caused by the single-negative-strand, nonsegmented RNA Borna disease virus (BDV). P40 is known to be a nucleoprotein. 46298 pfam06408: Homospermidine synthase. This family consists of several homospermidine synthase proteins (EC:2.5.1.44). Homospermidine synthase (HSS) catalyses the synthesis of the polyamine homospermidine from 2 mol putrescine in an NAD(+)-dependent reaction. 46299 pfam06409: Nuclear pore complex interacting protein (NPIP). This family consists of a series of primate specific nuclear pore complex interacting protein (NPIP) sequences. The function of this family is unknown, but it is well conserved from African apes to humans. 46300 pfam06410: Gurmarin. Gurmarin is a 35-residue polypeptide from the Asclepiad vine Gymnema sylvestre. It has been utilised as a pharmacological tool in the study of sweet-taste transduction because of its ability to selectively inhibit the neural response to sweet tastants in rats. 46301 pfam06411: hns-dependent expression protein A (HdeA). HdeA is a single domain alpha-helical protein localised in the periplasmic space. HdeA is involved in acid resistance essential for infectivity of enteric bacterial pathogens. Functional studies demonstrate that HdeA is activated by a dimer-to-monomer transition at acidic pH, leading to suppression of aggregation by acid-denatured proteins. The gene encoding HdeA was initially identified as part of an operon regulated by the nucleoid protein H-NS. 46302 pfam06412: Conjugal transfer protein TraD. This family consists of several bacterial TraD conjugal transfer proteins. 46303 pfam06413: Neugrin. This family consists of several mouse and human neugrin proteins. Neugrin and m-neugrin are mainly expressed in neurons in the nervous system, and are thought to play an important role in the process of neuronal differentiation. 46304 pfam06414: Zeta toxin. 
This family consists of several bacterial zeta toxin proteins. Zeta toxin is thought to be part of a postsegregational killing system in bacteria. It relies on antitoxin/toxin systems that secure stable inheritance of low and medium copy number plasmids during cell division and kill cells that have lost the plasmid. 46305 pfam06415: BPG-independent PGAM N-terminus (iPGM_N). This family represents the N-terminal region of the 2,3-bisphosphoglycerate-independent phosphoglycerate mutase (or phosphoglyceromutase or BPG-independent PGAM) protein (EC:5.4.2.1). The family is found in conjunction with pfam01676 (located in the C-terminal region of the protein). 46306 pfam06416: Protein of unknown function (DUF1076). This family consists of several hypothetical bacterial proteins exclusive to Escherichia coli and Salmonella typhi. The function of this family is unknown. 46307 pfam06417: Protein of unknown function (DUF1077). This family consists of several hypothetical eukaryotic proteins of unknown function. 46308 pfam06418: CTP synthase N-terminus. This family consists of the N-terminal region of the CTP synthase protein (EC:6.3.4.2). This family is found in conjunction with pfam00117 located in the C-terminal region of the protein. CTP synthase catalyses the synthesis of CTP from UTP by amination of the pyrimidine ring at the 4-position. 46309 pfam06419: Conserved oligomeric complex COG6. COG6 is a component of the conserved oligomeric golgi complex, which is composed of eight different subunits and is required for normal golgi morphology and localisation. 46310 pfam06420: Mitochondrial genome maintenance MGM101. The mgm101 gene was identified as essential for maintenance of the mitochondrial genome in Saccharomyces cerevisiae. Based on its DNA-binding activity, and experimental work with a temperature-sensitive mgm101 mutant, it has been proposed that the mgm101 gene product performs an essential function in the repair of oxidatively damaged mitochondrial DNA. 
46311 pfam06421: GTP-binding protein LepA C-terminus. This family consists of the C-terminal region of several pro- and eukaryotic GTP-binding LepA proteins. 46312 pfam06422: CDR ABC transporter. Corresponds to a region of the PDR/CDR subgroup of ABC transporters comprising extracellular loop 3, transmembrane segment 6 and linker region. 46313 pfam06423: GWT1. Glycosylphosphatidylinositol (GPI) is a conserved post-translational modification to anchor cell surface proteins to plasma membrane in eukaryotes. GWT1 is involved in GPI anchor biosynthesis; it is required for inositol acylation in yeast. 46314 pfam06424: PRP1 splicing factor, N-terminal. This domain is specific to the N-terminal part of the prp1 splicing factor, which is involved in mRNA splicing (and possibly also poly(A)+ RNA nuclear export and cell cycle progression). 46315 pfam06425: Partner of SLD five, PSF3. The GINS complex is essential for initiation of DNA replication in Xenopus egg extracts. This 100 kDa stable complex includes Sld5, Psf1, Psf2, and Psf3. Homologues of these components are found also in yeasts and in humans. 46316 pfam06426: Serine acetyltransferase, N-terminal. The N-terminal domain of serine acetyltransferase has a sequence that is conserved in plants and bacteria. 46317 pfam06427: UDP-glucose:Glycoprotein Glucosyltransferase. The N-terminal region of this group of proteins is required for correct folding of the ER UDP-Glc: glucosyltransferase. 46318 pfam06428: GDP/GTP exchange factor Sec2p. In Saccharomyces cerevisiae, Sec2p is a GDP/GTP exchange factor for Sec4p, which is required for vesicular transport at the post-Golgi stage of yeast secretion. 46319 pfam06429: Domain of unknown function (DUF1078). This family consists of a number of C-terminal domains of unknown function. 
This domain seems to be specific to flagellar basal-body rod and flagellar hook proteins in which pfam00460 is often present at the extreme N terminus. 46320 pfam06430: Lactococcus lactis RepB C-terminus. This family consists of the C-terminal region of RepB proteins from Lactococcus lactis (see pfam01051). 46321 pfam06431: Polyomavirus large T antigen C-terminus. 46322 pfam06432: Phosphatidylinositol N-acetylglucosaminyltransferase. Glycosylphosphatidylinositol (GPI) represents an important anchoring molecule for cell surface proteins. The first step in its synthesis is the transfer of N-acetylglucosamine (GlcNAc) from UDP-N-acetylglucosamine to phosphatidylinositol (PI). This step involves products of three or four genes in yeast (GPI1, GPI2 and GPI3) and mammals (GPI1, PIG A, PIG H and PIG C), respectively. 46323 pfam06433: Methylamine dehydrogenase heavy chain (MADH). Methylamine dehydrogenase (EC:1.4.99.3) is a periplasmic quinoprotein found in several methylotrophic bacteria. Induced when the bacteria are grown on methylamine as a carbon source, MADH catalyses the oxidative deamination of amines to their corresponding aldehydes. MADH is a hetero-tetramer, comprised of two heavy chains (H) and two light chains (L). The H-chain forms a beta-propeller like structure. 46324 pfam06434: Aconitate hydratase 2 N-terminus. This family represents the N-terminal region of several bacterial Aconitate hydratase 2 proteins and is found in conjunction with pfam00330. 46325 pfam06435: Repeat of unknown function (DUF1079). This family consists of several repeats of 31 residues in length and seems to be exclusive to Moraxella catarrhalis UspA proteins. The UspA1 and UspA2 proteins of Moraxella catarrhalis are structurally related and are exposed on the bacterial cell surface where they can function as adhesins. This family is commonly found with the pfam03895 family. 46326 pfam06436: Pneumovirus matrix protein 2 (M2). This family consists of several Pneumovirus matrix glycoprotein M2 sequences. 
This family functions as a transcription processivity factor that is essential for virus replication. 46327 pfam06437: IMP-specific 5'-nucleotidase. The Saccharomyces cerevisiae ISN1 (YOR155c) gene encodes an IMP-specific 5'-nucleotidase, which catalyses degradation of IMP to inosine as part of the purine salvage pathway. 46328 pfam06438: Heme-binding protein A (HasA). Free iron is limited in vertebrate hosts, thus an alternative to siderophores has been developed by pathogenic bacteria to access host iron bound in protein complexes. HasA is a secreted hemophore that has the ability to obtain iron from hemoglobin. Once bound to HasA, the heme is shuttled to the receptor HasR, which releases the heme into the bacterium. 46329 pfam06439: Domain of Unknown Function (DUF1080). 46330 pfam06440: DNA polymerase III, theta subunit. DNA polymerase III (EC 2.7.7.7) is comprised of three tightly associated subunits, alpha, epsilon and theta. This family contains the theta subunit. The structure of the theta subunit shows that the N-terminal two thirds is comprised of three helices while the C-terminal third is disordered. The function of the theta subunit is poorly understood, but the interaction of the theta subunit with the epsilon subunit is thought to enhance the 3' to 5' exonucleolytic proofreading activity of epsilon. 46331 pfam06441: Epoxide hydrolase N terminus. This family represents the N-terminal region of the eukaryotic epoxide hydrolase protein. Epoxide hydrolases (EC:3.3.2.3) comprise a group of functionally related enzymes that catalyse the addition of water to oxirane compounds (epoxides), thereby usually generating vicinal trans-diols. EHs have been found in all types of living organisms, including mammals, invertebrates, plants, fungi and bacteria. 
In animals, the major interest in EH is directed towards their detoxification capacity for epoxides since they are important safeguards against the cytotoxic and genotoxic potential of oxirane derivatives that are often reactive electrophiles because of the high tension of the three-membered ring system and the strong polarisation of the C--O bonds. This is of significant relevance because epoxides are frequent intermediary metabolites which arise during the biotransformation of foreign compounds. This family is often found in conjunction with pfam00561. 46332 pfam06442: R67 dihydrofolate reductase. R67 dihydrofolate reductase is a plasmid-encoded enzyme that provides resistance to the antibacterial drug trimethoprim. The R67 dihydrofolate reductase does not share significant similarity to the chromosomally encoded dihydrofolate reductase. 46333 pfam06443: SEF14-like adhesin. Family of enterotoxigenic bacterial adhesins. 46334 pfam06444: NADH dehydrogenase subunit 2 C-terminus. This family consists of the C-terminal region specific to the eukaryotic NADH dehydrogenase subunit 2 protein and is found in conjunction with pfam00361. 46335 pfam06445: Bacterial transcription activator, effector binding domain. This family contains the probable effector binding domain of a number of different bacterial transcription activators. This family also contains DNA gyrase inhibitors. The absence of a HTH motif in the DNA gyrase inhibitors is thought to indicate that these do not bind DNA. 46336 pfam06446: Hepcidin. Hepcidin is an antibacterial and antifungal protein expressed in the liver and is also a signaling molecule in iron metabolism. The hepcidin protein is cysteine-rich and forms a distorted beta-sheet with an unusual disulphide bond found at the turn of the hairpin. 46337 pfam06447: TraB pilus assembly protein. This family consists of several bacterial TraB pilus assembly proteins. 
TraB is known to be essential for piliation and transfer but very little is known about its specific role in this process. It has been suggested that TraB extends into the periplasmic space and is anchored in the inner membrane via a single transmembrane segment near the N terminus. It is also thought that TraB may interact with TraP, in order to stabilise the proposed transmembrane complex formed by the tra operon products. 46338 pfam06448: Domain of Unknown Function (DUF1081). This region is found in Apolipophorin proteins. 46339 pfam06449: Mitochondrial domain of unknown function (DUF1082). This family consists of the C-terminal region of several plant mitochondria specific proteins. The function of this family is unknown. This family is found in conjunction with pfam02326. 46340 pfam06450: Bacterial Na+/H+ antiporter B (NhaB). This family consists of several bacterial Na+/H+ antiporter B (NhaB) proteins. The exact function of this family is unknown. 46341 pfam06451: Moricin. Moricin is an antibacterial peptide that is highly basic. The structure of moricin reveals that it is comprised of a long alpha-helix. The N-terminus of the helix is amphipathic, and the C-terminus of the helix is predominantly hydrophobic. The amphipathic N-terminal segment of the alpha-helix is mainly responsible for the increase in permeability of the bacterial membrane which kills the bacteria. 46342 pfam06452: Domain of unknown function (DUF1083). This family consists of several domains of unknown function exclusively found in bacterial xylanase proteins (usually at the C-terminus) although it is tandemly repeated in a number of family members. This family is always found in conjunction with pfam00331 and usually with either pfam02018 or pfam00395. The function of this family is unknown. 46343 pfam06453: Type II heat-labile enterotoxin, B subunit (LT-IIB). Family of B subunits from the type II heat-labile enterotoxin. 
The B subunits form a pentameric ring, which interacts with one A subunit. Thus, the structural arrangement of type I and type II heat-labile enterotoxins is very similar. 46344 pfam06454: Protein of unknown function (DUF1084). This family consists of several hypothetical plant specific proteins of unknown function. 46345 pfam06455: NADH dehydrogenase subunit 5 C-terminus. This family represents the C-terminal region of several NADH dehydrogenase subunit 5 proteins and is found in conjunction with pfam00361 and pfam00662. 46346 pfam06456: Arfaptin-like domain. Arfaptin interacts with ARF1, a small GTPase involved in vesicle budding at the Golgi complex and immature secretory granules. The structure of arfaptin shows that upon binding to a small GTPase, arfaptin forms an elongated, crescent-shaped dimer of three-helix coiled-coils. The N-terminal region of ICA69 is similar to arfaptin. 46347 pfam06457: Ectatomin. Ectatomin is a toxic component from the Ectatomma tuberculatum ant venom. It is comprised of two subunits, A and B, which are homologous. The structure of ectatomin reveals that each subunit is comprised of two helices and a connecting hinge region, which form a hairpin structure that is stabilised by disulphide bridges. The two hinges are connected by a disulphide bond. 46348 pfam06458: Repeat of unknown function (DUF1085). This family consists of a series of repeated sequences of around 50 residues in length. The repeat is found in bacterial peptidoglycan bound proteins and is often found in conjunction with pfam00746 and pfam00560. 46349 pfam06459: Ryanodine Receptor TM 4-6. This region covers TM regions 4-6 of the ryanodine receptor 1 family. 46350 pfam06460: Coronavirus NSP13. This family covers the NSP13 region of the coronavirus polyprotein. This protein is predicted to function as an mRNA cap-1 methyltransferase. 46351 pfam06461: Domain of Unknown Function (DUF1086). 
This family consists of several eukaryotic domains of unknown function which are present in chromodomain helicase DNA binding proteins. This domain is often found in conjunction with pfam00176, pfam00271, pfam06465, pfam00385 and pfam00628. 46352 pfam06462: Propeller. Probable beta-propeller. 46353 pfam06463: Molybdenum Cofactor Synthesis C. This region contains two iron-sulphur (3Fe-4S) binding sites. Mutations in this region of human MOCS1A cause MOCOD (Molybdenum Co-Factor Deficiency) type A. 46354 pfam06464: DMAP1-binding Domain. This domain binds DMAP1, a transcriptional co-repressor. 46355 pfam06465: Domain of Unknown Function (DUF1087). 46356 pfam06466: PCAF (P300/CBP-associated factor) N-terminal domain. This region is spliced out of hsGCN5 isoform 2. It is predicted to be of a mixed alpha/beta fold, though predominantly helical. 46357 pfam06467: MYM-type Zinc finger. MYM-type zinc fingers were identified in MYM family proteins. Human zinc finger protein 261 is involved in a chromosomal translocation and may be responsible for X-linked retardation in Xq13.1. Human zinc finger protein 198 is also involved in disease. In myeloproliferative disorders it is fused to FGF receptor 1; in atypical myeloproliferative disorders it is rearranged. Members of the family generally are involved in development. 46358 pfam06468: Spondin_N. This conserved region is found in the N-terminal half of several Spondin proteins. Spondins are involved in patterning axonal growth trajectory through either inhibiting or promoting adhesion of embryonic nerve cells. 46359 pfam06469: Domain of Unknown Function (DUF1088). This family is found in the neurobeachins. The function of this region is not known. 46360 pfam06470: SMC proteins Flexible Hinge Domain. This family represents the hinge region of the SMC (Structural Maintenance of Chromosomes) family of proteins. The hinge region is responsible for formation of the DNA interacting dimer. 
It is also possible that its precise structure is an essential determinant of the specificity of the DNA-protein interaction. 46361 pfam06471: NSP11. This region of coronavirus polyproteins encodes the NSP11 protein. 46362 pfam06472: ABC transporter N-terminus. This region covers the N-terminus and first two membrane regions of a small family of ABC transporters. Mutations in this domain in PMP70 are believed responsible for Zellweger Syndrome-2; mutations in ALDP are responsible for recessive X-linked adrenoleukodystrophy. A Saccharomyces cerevisiae homolog is involved in the import of long-chain fatty acids. 46363 pfam06473: FGF binding protein 1 (FGF-BP1). This family consists of several mammalian FGF binding protein 1. Fibroblast growth factors (FGFs) play important roles during fetal and embryonic development. Fibroblast growth factor-binding protein (FGF-BP) 1 is a secreted protein that can bind fibroblast growth factors (FGFs) 1 and 2. 46364 pfam06474: MLTD_N. 46365 pfam06475: Protein of unknown function (DUF1089). This family consists of several hypothetical bacterial proteins. The function of this family is unknown. 46366 pfam06476: Protein of unknown function (DUF1090). This family consists of several bacterial proteins of unknown function and is known as YqjC in E. coli. 46367 pfam06477: Protein of unknown function (DUF1091). This family consists of several Drosophila melanogaster specific proteins. The function of this family is unknown. 46368 pfam06478: Coronavirus RPol N-terminus. This family covers the N-terminal region of the coronavirus RNA-directed RNA Polymerase. 46369 pfam06479: Ribonuclease 2-5A. This domain is an endoribonuclease. Specifically it cleaves an intron from Hac1 mRNA in humans, which causes it to be much more efficiently translated. 46370 pfam06480: FtsH Extracellular. This domain is found in the FtsH family of proteins. FtsH is the only membrane-bound ATP-dependent protease universally conserved in prokaryotes. 
It only efficiently degrades proteins that have a low thermodynamic stability, i.e. it lacks robust unfoldase activity. This feature may be key and implies that this could be a criterion for degrading a protein. In Oenococcus oeni FtsH is involved in protection against environmental stress, and shows increased expression under heat or osmotic stress. These two lines of evidence suggest that it is a fundamental prokaryotic self-protection mechanism that checks if proteins are correctly folded (personal obs: Yeats C). The precise function of this N-terminal region is unclear. 46371 pfam06481: COX Aromatic Rich Motif. COX2 (Cytochrome O ubiquinol OXidase 2) is a major component of the respiratory complex during vegetative growth. It transfers electrons from a quinol to the binuclear centre of the catalytic subunit 1. The function of this region is not known. 46372 pfam06482: Collagenase NC10 and Endostatin. NC10 stands for Non-helical region 10. A mutation in this region in Collagen alpha 1(XVIII) precursor is associated with an increased risk of prostate cancer. This domain is cleaved from the precursor and forms endostatin. Endostatin is a key tumour suppressor and has been used highly successfully to treat cancer. It is a potent angiogenesis inhibitor. Endostatin also binds a zinc ion near the N-terminus; this is likely to be of structural rather than functional importance according to. 46373 pfam06483: Chitinase C. This ~170 aa region is found at the C-terminus of pfam00704. 46374 pfam06484: Teneurin Intracellular Region. This family is found in the intracellular N-terminal region of the Teneurin family of proteins. These proteins are 'pair-rule' genes and are involved in tissue patterning, specifically probably neural patterning. The intracellular domain is cleaved in response to homophilic interaction of the extracellular domain, and translocates to the nucleus. Here it probably carries out some transcriptional regulatory activity. 
The length of this region and the conservation suggests that there may be two structural domains here (personal obs: C Yeats). 46375 pfam06485: Protein of unknown function (DUF1092). This family consists of several hypothetical proteins of unknown function all from photosynthetic organisms including plants and cyanobacteria. 46376 pfam06486: Protein of unknown function (DUF1093). This family consists of several hypothetical bacterial proteins of unknown function. 46377 pfam06487: Sin3 associated polypeptide p18 (SAP18). This family consists of several eukaryotic Sin3 associated polypeptide p18 (SAP18) sequences. SAP18 is known to be a component of the Sin3-containing complex which is responsible for the repression of transcription via the modification of histone polypeptides. SAP18 is also present in the ASAP complex which is thought to be involved in the regulation of splicing during the execution of programmed cell death. 46378 pfam06488: Lactococcus lactis bacteriophage major structural protein. This family consists of several Lactococcus lactis bacteriophage major structural proteins. 46379 pfam06489: Orthopoxvirus A49R protein. This family consists of several Orthopoxvirus A49R proteins. The function of this family is unknown. 46380 pfam06490: Flagellar regulatory protein FleQ. This domain is found at the N terminus of a subset of sigma54-dependent transcriptional activators that are involved in regulation of flagellar motility, e.g. FleQ in Pseudomonas aeruginosa. It is clearly related to pfam00072, but lacks the conserved aspartate residue that undergoes phosphorylation in the classic two-component system response regulator (pfam00072). 46381 pfam06491: Protein of unknown function (DUF1094). This family consists of several hypothetical bacterial proteins of unknown function. 46382 pfam06492: Protein of unknown function (DUF1095). This domain is found in several bacterial transcriptional regulators and putative adenylate cyclases. 
46383 pfam06493: Protein of unknown function (DUF1096). This family represents the N-terminal region of several proteins found in C. elegans. The family is often found with pfam02363. 46384 pfam06495: Fruit fly transformer protein. This family consists of transformer proteins from several Drosophila species and also from Ceratitis capitata (Mediterranean fruit fly). The transformer locus (tra) produces an RNA processing protein that alternatively splices the doublesex pre-mRNA in the sex determination hierarchy of Drosophila melanogaster. 46385 pfam06496: Protein of unknown function (DUF1097). This family consists of several bacterial putative membrane proteins. 46386 pfam06497: Protein of unknown function (DUF1098). This family consists of several hypothetical Baculovirus proteins of unknown function. 46387 pfam06498: Adenylate cyclase associated. This region is found in several bacterial transcriptional regulators and putative adenylate cyclases. It appears to contain sequences that share some sequence similarity with pfam00515. 46388 pfam06499: Protein of unknown function (DUF1099). This family consists of several hypothetical bacterial proteins of unknown function. 46389 pfam06500: Protein of unknown function (DUF1100). This family consists of several hypothetical bacterial proteins of unknown function. 46390 pfam06501: Human herpesvirus U55 protein. This family consists of several human herpesvirus U55 proteins. The function of this family is unknown. 46391 pfam06502: Equine infectious anaemia virus S2 protein. This family consists of several equine infectious anaemia virus S2 proteins. The function of this family is unknown. 46392 pfam06503: Protein of unknown function (DUF1101). This family consists of several hypothetical Fijivirus proteins of unknown function. 46393 pfam06504: Replication protein C (RepC). This family consists of several bacterial replication protein C (RepC) sequences. 46394 pfam06505: Activator of aromatic catabolism. 
This domain is found at the N terminus of a subset of sigma54-dependent transcriptional activators in several proteobacteria, including activators of phenol degradation such as XylR. It is found adjacent to pfam02830. 46395 pfam06506: Propionate catabolism activator. This domain is found at the N terminus of several sigma54-dependent transcriptional activators including PrpR, which activates catabolism of propionate. 46396 pfam06507: Auxin response factor. A conserved region of auxin-responsive transcription factors. 46397 pfam06508: ExsB. This family includes putative transcriptional regulators from Bacteria and Archaea. 46398 pfam06509: HtpX N-terminus. This family represents the N-terminal region of the bacterial heat shock protein HtpX which is found in conjunction with pfam01435. 46399 pfam06510: Protein of unknown function (DUF1102). This family consists of several hypothetical archaeal proteins of unknown function. 46400 pfam06511: Invasion plasmid antigen IpaD. This family consists of several invasion plasmid antigen IpaD proteins. Entry of Shigella flexneri into epithelial cells and lysis of the phagosome involve the IpaB, IpaC, and IpaD proteins, which are secreted by type III secretion machinery. 46401 pfam06512: Sodium ion transport-associated. Members of this family contain a region found exclusively in eukaryotic sodium channels or their subunits, many of which are voltage-gated. Members very often also contain between one and four copies of pfam00520 and, less often, one copy of pfam00612. 46402 pfam06513: Repeat of unknown function (DUF1103). This family consists of several repeats of around 30 residues in length which are found specifically in mature-parasite-infected erythrocyte surface antigen proteins from Plasmodium falciparum. This family is often found in conjunction with pfam00226. 46403 pfam06514: Photosystem II 12 kDa extrinsic protein (PsbU). 
This family consists of several photosystem II 12 kDa extrinsic protein (PsbU) proteins from cyanobacteria and algae. PsbU is an extrinsic protein of the photosystem II complex of cyanobacteria and red algae. PsbU is known to stabilise the oxygen-evolving machinery of the photosystem II complex against heat-induced inactivation. 46404 pfam06515: Borna disease virus P10 protein. This family consists of several Borna disease virus P10 (or X) proteins. Borna disease virus (BDV) is unique among the non-segmented negative-strand RNA viruses of animals and man because it transcribes and replicates its genome in the nucleus of the infected cell. It has been suggested that the p10 protein plays a role in viral RNA synthesis or ribonucleoprotein transport. 46405 pfam06516: Purine nucleoside permease (NUP). This family consists of several purine nucleoside permeases from both bacteria and fungi. 46406 pfam06517: Orthopoxvirus A43R protein. This family consists of several Orthopoxvirus A43R proteins. The function of this family is unknown. 46407 pfam06518: Protein of unknown function (DUF1104). This family consists of several hypothetical proteins of unknown function which appear to be found exclusively in Helicobacter pylori. 46408 pfam06519: TolA protein. This family consists of several bacterial TolA proteins as well as two eukaryotic proteins of unknown function. Tol proteins are involved in the translocation of group A colicins. Colicins are bacterial protein toxins, which are active against Escherichia coli and other related species (See pfam01024). TolA is anchored to the cytoplasmic membrane by a single membrane spanning segment near the N-terminus, leaving most of the protein exposed to the periplasm. 46409 pfam06520: Protein of unknown function (DUF1105). This family consists of several hypothetical bacterial proteins of unknown function. 46410 pfam06521: PAR1 protein. 
This family consists of several plant specific PAR1 proteins from Nicotiana tabacum and Arabidopsis thaliana. The function of this family is unknown. 46411 pfam06522: B12D protein. This family consists of several plant specific B12D proteins. The function of this protein is unknown but in barley B12D transcripts are expressed mainly during seed maturation and germination. 46412 pfam06523: Protein of unknown function (DUF1106). This family consists of several hypothetical bacterial proteins found in Escherichia coli and Citrobacter rodentium. The function of this family is unknown. 46413 pfam06524: NOA36 protein. This family consists of several NOA36 proteins which contain 29 highly conserved cysteine residues. The function of this protein is unknown. 46414 pfam06525: Sulfocyanin (SoxE). This family consists of several archaeal sulfocyanin (or blue copper protein) sequences from a number of Sulfolobus species. 46415 pfam06526: Protein of unknown function (DUF1107). This family consists of several short, hypothetical bacterial proteins of unknown function. 46416 pfam06527: TniQ. This family consists of several bacterial TniQ proteins. TniQ along with TniA and B is involved in the transposition of the mercury-resistance transposon Tn5053 which carries the mer operon. It has been suggested that the tni genes are involved in the dissemination of integrons. 46417 pfam06528: Phage P2 GpE. This family consists of several phage and bacterial proteins which are closely related to the GpE tail protein from Phage P2. 46418 pfam06529: Vertebrate interleukin-3 regulated transcription factor. This family includes vertebrate transcription factors, some of which are regulated by IL-3/adenovirus E4 promoter binding protein. Others were found to strongly repress transcription in a DNA-binding-site-dependent manner. 46419 pfam06530: Phage antitermination protein Q. This family consists of several phage antitermination protein Q and related bacterial sequences. 
Antiterminator proteins control gene expression by recognising control signals near the promoter and preventing transcriptional termination which would otherwise occur at sites that may be a long way downstream. 46420 pfam06531: Protein of unknown function (DUF1108). This family consists of several bacterial proteins from Staphylococcus aureus as well as a number of phage proteins. The function of this family is unknown. 46421 pfam06532: Protein of unknown function (DUF1109). This family consists of several hypothetical bacterial proteins of unknown function. 46422 pfam06533: Protein of unknown function (DUF1110). This family consists of hypothetical proteins specific to Oryza sativa. One sequence appears to be tandemly repeated. 46423 pfam06534: Repulsive guidance molecule (RGM) C-terminus. This family consists of several mammalian and one bird sequence from Gallus gallus (Chicken). This family represents the C-terminal region of several sequences but in others it represents the full protein. All of the mammalian proteins are hypothetical and have no known function but a member from chicken is annotated as being a repulsive guidance molecule (RGM). RGM is a GPI-linked axon guidance molecule of the retinotectal system. RGM is repulsive for a subset of axons, those from the temporal half of the retina. Temporal retinal axons invade the anterior optic tectum in a superficial layer, and encounter RGM expressed in a gradient with increasing concentration along the anterior-posterior axis. Temporal axons are able to receive posterior-dependent information by sensing gradients or concentrations of guidance cues. Thus, RGM is likely to provide positional information for temporal axons invading the optic tectum in the stratum opticum. 46424 pfam06535: Repulsive guidance molecule (RGM) N-terminus. This family consists of the N-terminal region of several mammalian and one bird sequence from Gallus gallus (Chicken). 
All of the mammalian proteins are hypothetical and have no known function but a member from chicken is annotated as being a repulsive guidance molecule (RGM). RGM is a GPI-linked axon guidance molecule of the retinotectal system. RGM is repulsive for a subset of axons, those from the temporal half of the retina. Temporal retinal axons invade the anterior optic tectum in a superficial layer, and encounter RGM expressed in a gradient with increasing concentration along the anterior-posterior axis. Temporal axons are able to receive posterior-dependent information by sensing gradients or concentrations of guidance cues. Thus, RGM is likely to provide positional information for temporal axons invading the optic tectum in the stratum opticum. 46425 pfam06536: Avian adenovirus fibre. This family contains avian adenovirus fibre proteins, which have been linked to variations in virulence. Avian adenoviruses possess penton capsomers that consist of a pentameric base associated with two fibres. 46426 pfam06537: Protein of unknown function (DUF1111). This family consists of several hypothetical bacterial proteins of unknown function. 46427 pfam06538: Orthopoxvirus C5L protein. This family consists of several Orthopoxvirus C5L proteins. The function of this family is unknown. 46428 pfam06539: Protein of unknown function (DUF1112). This family consists of several hypothetical bacterial proteins of unknown function. 46429 pfam06540: Galanin message associated peptide (GMAP). This family consists of several galanin message associated peptides. In rat preprogalanin, galanin is C-terminally flanked by a 60 amino acid long peptide: galanin message-associated peptide (GMAP). GMAP sequences in different species show high degree of homology, but the biological function of this family is unknown. 46430 pfam06541: Protein of unknown function (DUF1113). This family consists of several bacterial proteins of unknown function. 46431 pfam06542: Protein of unknown function (DUF1114). 
This family consists of hypothetical C. elegans proteins. 46432 pfam06543: Lactococcus bacteriophage repressor. This family represents the C-terminus of Lactococcus bacteriophage repressor proteins. 46433 pfam06544: Protein of unknown function (DUF1115). This family represents the C-terminus of hypothetical eukaryotic proteins of unknown function. 46434 pfam06545: Protein of unknown function (DUF1116). This family contains hypothetical bacterial proteins of unknown function. 46435 pfam06546: Vertebrate heat shock transcription factor. This family represents the C-terminal region of vertebrate heat shock transcription factors. Heat shock transcription factors regulate the expression of heat shock proteins - a set of proteins that protect the cell from damage caused by stress and aid the cell's recovery after the removal of stress. This C-terminal region is found with the N-terminal pfam00447, and may contain a three-stranded coiled-coil trimerisation domain and a CE2 regulatory region, the latter of which is involved in sustained heat shock response. 46436 pfam06547: Protein of unknown function (DUF1117). This family represents the C-terminus of a number of hypothetical plant proteins. 46437 pfam06548: Kinesin-related. This family represents a region within kinesin-related proteins from higher plants. Many family members also contain the pfam00225 domain. Kinesins are ATP-driven microtubule motor proteins that produce directed force. Some family members are associated with the phragmoplast, a structure composed mainly of microtubules that executes cytokinesis in higher plants. 46438 pfam06549: Protein of unknown function (DUF1118). This family consists of several hypothetical plant proteins of unknown function. 46439 pfam06550: Protein of unknown function (DUF1119). This family consists of several hypothetical archaeal proteins of unknown function. 46440 pfam06551: Protein of unknown function (DUF1120). 
This family consists of several hypothetical bacterial proteins of unknown function. 46441 pfam06552: Plant specific mitochondrial import receptor subunit TOM20. This family consists of several plant specific mitochondrial import receptor subunit TOM20 (translocase of outer membrane 20 kDa subunit) proteins. Most mitochondrial proteins are encoded by the nuclear genome, and are synthesised in the cytosol. TOM20 is a general import receptor that binds to mitochondrial pre-sequences in the early step of protein import into the mitochondria. 46442 pfam06553: BNIP3. This family consists of several mammalian specific BCL2/adenovirus E1B 19-kDa protein-interacting protein 3 or BNIP3 sequences. BNIP3 belongs to the Bcl-2 homology 3 (BH3)-only family, a Bcl-2-related family possessing an atypical Bcl-2 homology 3 (BH3) domain, which regulates PCD from mitochondrial sites by selective Bcl-2/Bcl-XL interactions. BNIP3 family members contain a C-terminal transmembrane domain that is required for their mitochondrial localisation, homodimerisation, as well as regulation of their pro-apoptotic activities. BNIP3-mediated apoptosis has been reported to be independent of caspase activation and cytochrome c release and is characterised by early plasma membrane and mitochondrial damage, prior to the appearance of chromatin condensation or DNA fragmentation. 46443 pfam06554: Olfactory marker protein. This family consists of several olfactory marker proteins. Expression of the olfactory marker protein (OMP) is highly restricted to mature olfactory receptor neurons in virtually all vertebrate species from fish to man. 46444 pfam06555: Protein of unknown function (DUF1121). This family consists of several hypothetical proteins from bacteria and from Dictyostelium discoideum (Slime mold). The function of this family is unknown. 46445 pfam06556: IAP-like protein p27 C-terminus. This family represents the C-terminal region of the African swine fever virus IAP-like protein p27. 
This family is found in conjunction with pfam00653. It has been suggested that the family may be a host range gene involved in aspects of infection in the arthropod host, ticks of the genus Ornithodoros. 46446 pfam06557: Protein of unknown function (DUF1122). This family consists of several hypothetical archaeal and bacterial proteins of unknown function. 46447 pfam06558: Secretion monitor precursor protein (SecM). This family consists of several bacterial Secretion monitor precursor (SecM) proteins. SecM is known to regulate SecA expression. The eubacterial protein secretion machinery consists of a number of soluble and membrane associated components. One critical element is SecA ATPase, which acts as a molecular motor to promote protein secretion at translocation sites that consist of SecYE, the SecA receptor, and SecG and SecDFyajC proteins, which regulate SecA membrane cycling. 46448 pfam06559: 2'-deoxycytidine 5'-triphosphate deaminase (DCD). This family consists of several bacterial 2'-deoxycytidine 5'-triphosphate deaminase proteins (EC:3.5.4.13). 46449 pfam06560: Glucose-6-phosphate isomerase (GPI). This family consists of several bacterial and archaeal glucose-6-phosphate isomerase (GPI) proteins (EC:5.3.1.9). 46450 pfam06561: Protein of unknown function (DUF1123). This family consists of several hypothetical bacterial proteins of unknown function. 46451 pfam06562: Putative ribose 5-phosphate isomerase (DUF1124). This family consists of several hypothetical and putative bacterial ribose 5-phosphate isomerase proteins. 46452 pfam06563: Protein of unknown function (DUF1125). This family consists of several short Lactococcus lactis and bacteriophage proteins. The function of this family is unknown. 46453 pfam06564: YhjQ protein. This family consists of several bacterial YhjQ proteins. The function of this family is unknown. 46454 pfam06565: Repeat of unknown function (DUF1126). 
This family consists of several eukaryote specific repeats of around 35 residues in length. The function of this family is unknown. 46455 pfam06566: Chondroitin sulphate attachment domain. This family represents the chondroitin sulphate attachment domain of vertebrate neural transmembrane proteoglycans that contain EGF modules. Evidence has been accumulated to support the idea that neural proteoglycans are involved in various cellular events including mitogenesis, differentiation, axonal outgrowth and synaptogenesis. This domain contains several potential sites of chondroitin sulphate attachment, as well as potential sites of N-linked glycosylation. 46456 pfam06567: Neural chondroitin sulphate proteoglycan cytoplasmic domain. This family represents the C-terminal cytoplasmic domain of vertebrate neural chondroitin sulphate proteoglycans that contain EGF modules. Evidence has been accumulated to support the idea that neural proteoglycans are involved in various cellular events including mitogenesis, differentiation, axonal outgrowth and synaptogenesis. This domain contains a number of potential sites of phosphorylation by protein kinase C. 46457 pfam06568: Domain of unknown function (DUF1127). This family is found in several hypothetical bacterial proteins. In some cases it represents the C-terminal region whereas in others it represents the whole sequence. 46458 pfam06569: Protein of unknown function (DUF1128). This family consists of several short, hypothetical bacterial proteins of unknown function. 46459 pfam06570: Protein of unknown function (DUF1129). This family consists of several hypothetical bacterial proteins of unknown function. 46460 pfam06571: Protein of unknown function (DUF1130). This family consists of several hypothetical bacterial proteins of unknown function. 46461 pfam06572: Protein of unknown function (DUF1131). This family consists of several hypothetical bacterial proteins of unknown function. 46462 pfam06573: Churchill protein. 
This family consists of several eukaryotic Churchill proteins. This protein contains a novel zinc binding region that mediates FGF signaling during neural development (unpublished obs Sheng G and Stern C). 46463 pfam06574: Riboflavin kinase (Flavokinase). This family represents the N-terminal region of the bifunctional riboflavin biosynthesis protein known as RibC in Bacillus subtilis. pfam01687 is often found toward the C-terminal region of this protein. The RibC protein from Bacillus subtilis has both flavokinase and flavin adenine dinucleotide synthetase (FAD-synthetase) activities. RibC plays an essential role in the flavin metabolism. 46464 pfam06575: Protein of unknown function (DUF1132). This family consists of several hypothetical proteins from Neisseria meningitidis. The function of this family is unknown. 46465 pfam06576: Protein of unknown function (DUF1133). This family consists of a number of hypothetical proteins from Escherichia coli O157:H7 and Salmonella typhi. The function of this family is unknown. 46466 pfam06577: Protein of unknown function (DUF1134). This family consists of several hypothetical bacterial proteins of unknown function. 46467 pfam06578: YOP proteins translocation protein K (YscK). This family consists of several YscK proteins. The function of this protein is unknown but it belongs to an operon involved in the secretion of Yop proteins across bacterial membranes. 46468 pfam06579: Caenorhabditis elegans ly-6-related protein. This family consists of several Caenorhabditis elegans specific ly-6-related HOT and ODR proteins. These proteins are involved in the olfactory system. Odr-2 mutants are known to be defective in the ability to chemotax to odorants that are recognised by the two AWC olfactory neurons. Odr-2 encodes a membrane-associated protein related to the Ly-6 superfamily of GPI-linked signaling proteins. 46469 pfam06580: Histidine kinase. This family represents a region within bacterial histidine kinase enzymes. 
Two-component signal transduction systems such as those mediated by histidine kinase are integral parts of bacterial cellular regulatory processes, and are used to regulate the expression of genes involved in virulence. Members of this family often contain pfam02518 and/or pfam00672. 46470 pfam06581: Protein of unknown function (DUF1135). This family consists of several hypothetical mammalian proteins of unknown function. 46471 pfam06582: Repeat of unknown function (DUF1136). This family consists of several eukaryote specific repeats of unknown function. This repeat seems to always be found with pfam00047. 46472 pfam06583: Neogenin C-terminus. This family represents the C-terminus of eukaryotic neogenin precursor proteins, which contains several potential phosphorylation sites. Neogenin is a member of the N-CAM family of cell adhesion molecules (and therefore contains multiple copies of pfam00047 and pfam00041) and is closely related to the DCC tumour suppressor gene product - these proteins may play an integral role in regulating differentiation programmes and/or cell migration events within many adult and embryonic tissues. 46473 pfam06584: DIRP. DIRP (Domain in Rb-related Pathway) is postulated to be involved in the Rb-related pathway, which is encoded by multiple eukaryotic genomes and is present in proteins including lin-9 of Caenorhabditis elegans, aly of fruit fly and mustard weed. Studies of lin-9 and aly of fruit fly proteins containing DIRP suggest that this domain might be involved in development. Aly, lin-9, act in parallel to, or downstream of, activation of MAPK by the RTK-Ras signalling pathway. 46474 pfam06585: Haemolymph juvenile hormone binding protein (JHBP). This family consists of several insect specific haemolymph juvenile hormone binding proteins (JHBP). Juvenile hormone (JH) has a profound effect on insects. It regulates embryogenesis, maintains the status quo of larva development and stimulates reproductive maturation in the adult forms. 
JH is transported from the sites of its synthesis to target tissues by a haemolymph carrier called juvenile hormone-binding protein (JHBP). This protects the JH molecules from hydrolysis by non-specific esterases present in the insect haemolymph. 46475 pfam06586: TraK protein. This family consists of several TraK proteins from Escherichia coli, Salmonella typhi and Salmonella typhimurium. TraK is known to be essential for pilus assembly but its exact role in this process is unknown. 46476 pfam06587: Protein of unknown function (DUF1137). This family consists of several hypothetical proteins specific to Chlamydia species. The function of this family is unknown. 46477 pfam06588: Muskelin N-terminus. This family represents the N-terminal region of muskelin and is found in conjunction with several pfam01344 repeats. Muskelin is an intracellular, kelch repeat protein that is needed in cell-spreading responses to the matrix adhesion molecule, thrombospondin-1. 46478 pfam06589: Circumsporozoite-related antigen (CRA). This family consists of several circumsporozoite-related antigen (CRA) or exported protein-1 (EXP1) sequences found specifically in Plasmodium species. The function of this family is unknown. 46479 pfam06590: PerB protein. This family consists of several PerB or BfpV proteins found specifically in Escherichia coli. PerB is thought to play a role in regulating the expression of BfpA. 46480 pfam06591: T4-like phage nuclear disruption protein (Ndd). This family consists of several nuclear disruption (Ndd) proteins from T4-like phages. Early in a bacteriophage T4 infection, the phage ndd gene causes the rapid destruction of the structure of the Escherichia coli nucleoid. The targets of Ndd action may be the chromosomal sequences that determine the structure of the nucleoid. 46481 pfam06592: Protein of unknown function (DUF1138). This family consists of several hypothetical short plant proteins from Arabidopsis thaliana and Oryza sativa. 
The function of this family is unknown. 46482 pfam06593: Raspberry bushy dwarf virus coat protein. This family consists of several Raspberry bushy dwarf virus coat proteins. 46483 pfam06594: Haemolysin-type calcium binding protein related domain. This family consists of a number of bacteria specific domains which are found in haemolysin-type calcium binding proteins. This family is found in conjunction with pfam00353 and is often found in multiple copies. 46484 pfam06595: Borna disease virus P24 protein. This family consists of several Borna disease virus (BDV) P24 proteins. The function of this family is unknown. 46485 pfam06596: Photosystem II reaction centre X protein (PsbX). This family consists of several photosystem II reaction centre X protein (PsbX) sequences from both prokaryotes and eukaryotes. 46486 pfam06597: Clostridium P-47 protein. This family consists of several P-47 proteins from various Clostridium species as well as two related sequences from Pseudomonas putida. The function of this family is unknown. 46487 pfam06598: Chlorovirus glycoprotein repeat. This family consists of a number of repeats found in Chlorovirus glycoproteins. The function of this family is unknown. 46488 pfam06599: Protein of unknown function (DUF1139). This family consists of several hypothetical Fijivirus proteins of unknown function. 46489 pfam06600: Protein of unknown function (DUF1140). This family consists of several short, hypothetical phage and bacterial proteins. The function of this family is unknown. 46490 pfam06601: Orthopoxvirus F6 protein. This family consists of several Orthopoxvirus F6L proteins, the function of which is unknown. 46491 pfam06602: Myotubularin-related. This family represents a region within eukaryotic myotubularin-related proteins that is sometimes found with pfam02893. Myotubularin is a dual-specific lipid phosphatase that dephosphorylates phosphatidylinositol 3-phosphate and phosphatidylinositol (3,5)-bisphosphate. 
Mutations in genes encoding myotubularin-related proteins have been associated with disease. 46492 pfam06603: Protein of unknown function (DUF1141). This family consists of several hypothetical proteins of unknown function and seems to be specific to Bacteroides species. 46493 pfam06604: Bacterial outer membrane lipoprotein omp19. This family consists of several bacterial outer membrane lipoprotein omp19 sequences. 46494 pfam06605: Protein of unknown function (DUF1142). This family consists of hypothetical bacterial and viral proteins of unknown function. 46495 pfam06606: Phlebovirus nucleocapsid (N) protein. This family consists of several Phlebovirus nucleocapsid (N) proteins. 46496 pfam06607: Prokineticin. This family consists of several prokineticin proteins and related BM8 sequences. The suprachiasmatic nucleus (SCN) controls the circadian rhythm of physiological and behavioural processes in mammals. It has been shown that prokineticin 2 (PK2), a cysteine-rich secreted protein, functions as an output molecule from the SCN circadian clock. PK2 messenger RNA is rhythmically expressed in the SCN, and the phase of PK2 rhythm is responsive to light entrainment. Molecular and genetic studies have revealed that PK2 is a gene that is controlled by a circadian clock. 46497 pfam06608: Protein of unknown function (DUF1143). This family consists of several hypothetical mammalian proteins (from mouse and human). The function of this family is unknown. 46498 pfam06609: Fungal trichothecene efflux pump (TRI12). This family consists of several fungal specific trichothecene efflux pump proteins. Many of the genes involved in trichothecene toxin biosynthesis in Fusarium sporotrichioides are present within a gene cluster. It has been suggested that TRI12 may play a role in F. sporotrichioides self-protection against trichothecenes. 46499 pfam06610: Protein of unknown function (DUF1144). This family consists of several hypothetical bacterial proteins of unknown function. 
46500 pfam06611: Protein of unknown function (DUF1145). This family consists of several hypothetical bacterial proteins of unknown function. 46501 pfam06612: Protein of unknown function (DUF1146). This family consists of several hypothetical bacterial proteins of unknown function. 46502 pfam06613: KorB C-terminus. This family consists of several KorB transcriptional repressor proteins. The korB gene is a major regulatory element in the replication and maintenance of broad host-range plasmid RK2. It negatively controls the replication gene trfA, the host-lethal determinants kilA and kilB, and the korA-korB operon. This family is found in conjunction with pfam02195. 46503 pfam06614: Neuromodulin. This family consists of several neuromodulin (Axonal membrane protein GAP-43) sequences and is found in conjunction with pfam00612. GAP-43 is a neuronal calmodulin-binding phosphoprotein that is concentrated in growth cones and pre-synaptic terminals. 46504 pfam06615: Protein of unknown function (DUF1147). This family consists of several short Circovirus proteins of unknown function. 46505 pfam06616: BsuBI/PstI restriction endonuclease C-terminus. This family represents the C-terminus of bacterial enzymes similar to type II restriction endonucleases BsuBI and PstI (EC:3.1.21.4). The enzymes of the BsuBI restriction/modification (R/M) system recognise the target sequence 5'CTGCAG and are functionally identical with those of the PstI R/M system. 46506 pfam06617: M-phase inducer phosphatase. This family represents a region within eukaryotic M-phase inducer phosphatases (EC:3.1.3.48), which also contain the pfam00581 domain. These proteins are involved in the control of mitosis. 46507 pfam06618: Protein of unknown function (DUF1148). This family consists of several Maize streak virus proteins of unknown function. 46508 pfam06619: Protein of unknown function (DUF1149). This family consists of several hypothetical bacterial proteins of unknown function. 
46509 pfam06620: Protein of unknown function (DUF1150). This family consists of several hypothetical bacterial proteins of unknown function. 46510 pfam06621: Single-minded protein C-terminus. This family represents the C-terminal region of the eukaryotic single-minded (SIM) protein. Drosophila single-minded acts as a positive master gene regulator in central nervous system midline formation. There are two homologues in mammals: SIM1 and SIM2, which are members of the basic-helix-loop-helix PAS family of transcription factors. SIM1 and SIM2 are novel heterodimerisation partners for ARNT in vitro, and they may function both as positive and negative transcriptional regulators in vivo, during embryogenesis and in the adult organism. SIM2 is thought to contribute to some specific Down syndrome phenotypes. This family is found in conjunction with a pfam00989 domain and associated pfam00785 motif. 46511 pfam06622: SepQ protein. This family consists of several enterobacterial SepQ proteins from Escherichia coli and Citrobacter rodentium. The function of this family is unclear. 46512 pfam06623: MHC_I C-terminus. This family represents the C-terminal region of the MHC class I antigen. The family is found in conjunction with pfam00129 and pfam00047. 46513 pfam06624: Ribosome associated membrane protein RAMP4. This family consists of several ribosome associated membrane protein RAMP4 (or SERP1) sequences. Stabilisation of membrane proteins in response to stress involves the concerted action of a rescue unit in the ER membrane comprised of SERP1/RAMP4, other components of the translocon, and molecular chaperones in the ER. 46514 pfam06625: Protein of unknown function (DUF1151). This family consists of several hypothetical eukaryotic proteins of unknown function. 46515 pfam06626: Protein of unknown function (DUF1152). This family consists of several hypothetical archaeal proteins of unknown function. 46516 pfam06627: Protein of unknown function (DUF1153). 
This family consists of several short, hypothetical bacterial proteins of unknown function. 46517 pfam06628: Catalase-related. This family represents a small conserved region within catalase enzymes (EC:1.11.1.6). All members also contain the pfam00199 domain. Catalase decomposes hydrogen peroxide into water and oxygen, serving to protect cells from its toxic effects. 46518 pfam06629: MltA-interacting protein MipA. This family consists of several bacterial MltA-interacting protein (MipA) like sequences. As well as interacting with the membrane-bound lytic transglycosylase MltA, MipA is known to bind to PBP1B, a bifunctional murein transglycosylase/transpeptidase. MipA is considered to be a structural protein mediating the assembly of MltA to PBP1B into a complex. 46519 pfam06630: Enterobacterial exodeoxyribonuclease VIII. This family consists of several Enterobacterial exodeoxyribonuclease VIII proteins. 46520 pfam06631: Protein of unknown function (DUF1154). This family represents a small conserved region of unknown function within eukaryotic phospholipase C (EC:3.1.4.3). All members also contain pfam00387 and pfam00388. 46521 pfam06632: DNA double-strand break repair and V(D)J recombination protein XRCC4. This family consists of several mammalian specific DNA double-strand break repair and V(D)J recombination protein XRCC4 sequences. In the non-homologous end joining pathway of DNA double-strand break repair, the ligation step is catalysed by a complex of XRCC4 and DNA ligase IV. It is thought that XRCC4 and ligase IV are essential for alignment-based gap filling, as well as for final ligation of the breaks. 46522 pfam06633: Protein of unknown function (DUF1155). This family consists of several Cucumber mosaic virus ORF IIB proteins. The function of this family is unknown. 46523 pfam06634: Protein of unknown function (DUF1156). This family represents a conserved region within hypothetical prokaryotic and archaeal proteins of unknown function. 
46524 pfam06635: Nodulation protein NolV. This family consists of several nodulation protein NolV sequences from different Rhizobium species. The function of this family is unclear. 46525 pfam06636: Protein of unknown function (DUF1157). This family consists of several uncharacterised proteins from Melanoplus sanguinipes entomopoxvirus (MsEPV). The function of this family is unknown. 46526 pfam06637: PV-1 protein (PLVAP). This family consists of several PV-1 (PLVAP) proteins which seem to be specific to mammals. PV-1 is a novel protein component of the endothelial fenestral and stomatal diaphragms. The function of this family is unknown. 46527 pfam06638: Strabismus protein. This family consists of several strabismus (STB) or Van Gogh-like (VANGL) proteins 1 and 2. The exact function of this family is unknown. It is thought, however, that the STB1 and STB2 genes may be potent tumour suppressor gene candidates. 46528 pfam06639: Basal layer antifungal peptide (BAP). This family consists of several basal layer antifungal peptide (BAP) sequences specific to Zea mays. The BAP2 peptide exhibits potent broad-range activity against a range of filamentous fungi, including several plant pathogens. 46529 pfam06640: P protein C-terminus. This family represents the C-terminus of plant P proteins. The maize P gene is a transcriptional regulator of genes encoding enzymes for flavonoid biosynthesis in the pathway leading to the production of a red phlobaphene pigment, and P proteins are homologous to the DNA-binding domain of myb-like transcription factors. All members of this family contain the pfam00249 domain. 46530 pfam06641: Paramyxovirus structural protein V. This family consists of several Paramyxovirus structural protein V sequences from the Nipah and Hendra viruses. 46531 pfam06642: Taurine transport system periplasmic protein TauA C-terminus. This family represents the C-terminal region of the bacterial taurine transport system periplasmic protein. 
This family is found in conjunction with pfam04069. TauA is thought to be involved in taurine uptake. 46532 pfam06643: Protein of unknown function (DUF1158). This family consists of several enterobacterial YbdJ proteins. The function of this family is unknown. 46533 pfam06644: ATP11 protein. This family consists of several eukaryotic ATP11 proteins. In Saccharomyces cerevisiae, expression of functional F1-ATPase requires two proteins encoded by the ATP11 and ATP12 genes. 46534 pfam06645: Microsomal signal peptidase 12 kDa subunit (SPC12). This family consists of several microsomal signal peptidase 12 kDa subunit proteins. Translocation of polypeptide chains across the endoplasmic reticulum (ER) membrane is triggered by signal sequences. Subsequently, signal recognition particle interacts with its membrane receptor and the ribosome-bound nascent chain is targeted to the ER where it is transferred into a protein-conducting channel. At some point, a second signal sequence recognition event takes place in the membrane and translocation of the nascent chain through the membrane occurs. The signal sequence of most secretory and membrane proteins is cleaved off at this stage. Cleavage occurs by the signal peptidase complex (SPC) as soon as the lumenal domain of the translocating polypeptide is large enough to expose its cleavage site to the enzyme. The signal peptidase complex is possibly also involved in proteolytic events in the ER membrane other than the processing of the signal sequence, for example the further digestion of the cleaved signal peptide or the degradation of membrane proteins. Mammalian signal peptidase is a complex of five different polypeptide chains. This family represents the 12 kDa subunit (SPC12). 46535 pfam06646: High affinity transport system protein p37. This family consists of several high affinity transport system protein p37 sequences which are specific to Mycoplasma species. 
The p37 gene is part of an operon encoding two additional proteins which are highly similar to components of the periplasmic binding-protein-dependent transport systems of Gram-negative bacteria. It has been suggested that p37 is part of a homologous, high-affinity transport system in M. hyorhinis, a Gram-positive bacterium. 46536 pfam06647: Protein of unknown function (DUF1159). This family consists of several hypothetical bacterial proteins of unknown function. 46537 pfam06648: Protein of unknown function (DUF1160). This family consists of several hypothetical Baculovirus proteins of unknown function. 46538 pfam06649: Protein of unknown function (DUF1161). This family consists of several short, hypothetical bacterial proteins of unknown function. 46539 pfam06650: Protein of unknown function (DUF1162). This family represents a conserved region within several hypothetical eukaryotic proteins. Family members might be vacuolar protein sorting related-proteins. Vacuolar sorting protein is an ATPase required for endosomal trafficking. 46540 pfam06651: Protein of unknown function (DUF1163). This family represents the C-terminus of hypothetical Arabidopsis thaliana proteins of unknown function. 46541 pfam06652: Methuselah N-terminus. This family represents the N-terminal region of the Drosophila specific Methuselah protein. Drosophila Methuselah (Mth) mutants have a 35% increase in average lifespan and increased resistance to several forms of stress, including heat, starvation, and oxidative damage. The protein affected by this mutation is related to G protein-coupled receptors of the secretin receptor family. Mth, like secretin receptor family members, has a large N-terminal ectodomain, which may constitute the ligand binding site. This family is found in conjunction with pfam00002. 46542 pfam06653: Protein of unknown function (DUF1164). This family consists of several Caenorhabditis elegans specific proteins of unknown function. 
46543 pfam06654: Protein of unknown function (DUF1165). This family represents a conserved region approximately 150 residues long within a number of hypothetical Oryza sativa proteins of unknown function. 46544 pfam06656: Tenuivirus PVC2 protein. This family consists of several Tenuivirus PVC2 proteins from Rice grassy stunt virus, Maize stripe virus and Rice hoja blanca virus. The function of this family is unknown. 46545 pfam06657: Protein of unknown function (DUF1167). This family consists of several uncharacterised mammalian proteins of unknown function. 46546 pfam06658: Protein of unknown function (DUF1168). This family consists of several hypothetical eukaryotic proteins of unknown function. 46547 pfam06660: Flocculin carbohydrate-binding. This family represents the carbohydrate-binding N-terminus of yeast flocculin. Flocculin is a cell-surface lectin that causes adhesion of cells in clumps resulting in their sedimentation. 46548 pfam06661: VirE3. This family represents a conserved region within Agrobacterium tumefaciens VirE3. Agrobacterium tumefaciens (a plant pathogen) has a tumour-inducing (Ti) plasmid of which part, the transfer (T)-region, is transferred to plant cells during the infection process. Vir proteins mediate the processing of the T-region and the transfer of a single-stranded (ss) DNA copy of this region, the T-strand, into the recipient cells. VirE3 is a translocated effector protein, but its specific role has not been established. 46549 pfam06662: D-glucuronyl C5-epimerase C-terminus. This family represents the C-terminus of D-glucuronyl C5-epimerase (EC:5.1.3.-). Glucuronyl C5-epimerases catalyse the conversion of D-glucuronic acid (GlcUA) to L-iduronic acid (IdceA) units during the biosynthesis of glycosaminoglycans. 46550 pfam06663: Protein of unknown function (DUF1170). This family represents a conserved region of unknown function within MAGUIN, a neuronal membrane-associated guanylate kinase-interacting protein. 
This region is situated between the pfam00595 and pfam00169 domains. All family members also contain an N-terminal pfam00536 domain. 46551 pfam06664: Protein of unknown function (DUF1171). This family represents a conserved region of unknown function within a number of hypothetical eukaryotic proteins. 46552 pfam06665: Protein of unknown function (DUF1172). This family represents a conserved region of unknown function within NAC1 and a number of hypothetical proteins whose sequences bear resemblance to it. NAC1 is a constitutively-expressed POZ/BTB transcription factor found in mammalian neurones that can regulate behaviours associated with cocaine use. All family members contain the pfam00651 domain. 46553 pfam06666: Protein of unknown function (DUF1173). This family contains a group of hypothetical bacterial proteins that contain three conserved cysteine residues towards the N-terminal. The function of these proteins is unknown. 46554 pfam06667: Phage shock protein B. This family consists of several bacterial phage shock protein B (PspB) sequences. The phage shock protein (psp) operon is induced in response to heat, ethanol, osmotic shock and infection by filamentous bacteriophages. Expression of the operon requires the alternative sigma factor sigma54 and the transcriptional activator PspF. In addition, PspA plays a negative regulatory role, and the integral-membrane proteins PspB and PspC play a positive one. 46555 pfam06668: Inter-alpha-trypsin inhibitor heavy chain C-terminus. This family represents the C-terminal region of inter-alpha-trypsin inhibitor heavy chains. Inter-alpha-trypsin inhibitors are glycoproteins with a high inhibitory activity against trypsin, built up from different combinations of four polypeptides: bikunin and the three heavy chains that belong to this family (HC1, HC2, HC3). 
The heavy chains do not have any protease inhibitory properties but have the capacity to interact in vitro and in vivo with hyaluronic acid, which promotes the stability of the extra-cellular matrix. All family members contain the pfam00092 domain. 46556 pfam06669: Xylella fastidiosa surface protein related. This family consists of several Xylella fastidiosa surface protein specific repeats which are found in conjunction with pfam05662, pfam05658 and pfam03895. 46557 pfam06670: Microneme protein Etmic-2. This family consists of several Microneme protein Etmic-2 sequences from Eimeria tenella. Etmic-2 is a 50 kDa acidic protein, which is found within the microneme organelles of Eimeria tenella sporozoites and merozoites. 46558 pfam06671: Repeat of unknown function (DUF1174). This family consists of a number of Caenorhabditis elegans specific repeats of around 36 residues in length which are found in two hypothetical proteins. This family is found in conjunction with pfam00024. 46559 pfam06672: Protein of unknown function (DUF1175). This family consists of several hypothetical bacterial proteins of around 210 residues in length. The function of this family is unknown. 46560 pfam06673: Lactococcus lactis bacteriophage major capsid protein. This family consists of several Lactococcus lactis bacteriophage major capsid proteins. 46561 pfam06674: Protein of unknown function (DUF1176). This family consists of several hypothetical bacterial proteins of around 340 residues in length. Members of this family contain six highly conserved cysteine residues. The function of this family is unknown. 46562 pfam06675: Protein of unknown function (DUF1177). This family consists of several hypothetical archaeal and bacterial proteins of around 300 residues in length. The function of this family is unknown. 46563 pfam06676: Protein of unknown function (DUF1178). This family consists of several hypothetical bacterial proteins of around 150 residues in length. 
The function of this family is unknown. 46564 pfam06677: Sjogren's syndrome/scleroderma autoantigen 1 (Autoantigen p27). This family consists of several Sjogren's syndrome/scleroderma autoantigen 1 (Autoantigen p27) sequences. It is thought that the potential association of anti-p27 with anti-centromere antibodies suggests that autoantigen p27 might play a role in mitosis. 46565 pfam06678: Protein of unknown function (DUF1179). This family consists of several hypothetical Caenorhabditis elegans proteins of around 106 residues in length. The function of the family is unknown. 46566 pfam06679: Protein of unknown function (DUF1180). This family consists of several hypothetical mammalian proteins of around 190 residues in length. The function of this family is unknown. 46567 pfam06680: Protein of unknown function (DUF1181). This family consists of several hypothetical proteins of around 120 residues in length which are found specifically in Trypanosoma brucei. The function of this family is unknown. 46568 pfam06681: Protein of unknown function (DUF1182). This family consists of several hypothetical proteins of around 360 residues in length and seems to be specific to Caenorhabditis elegans. The function of this family is unknown. 46569 pfam06682: Protein of unknown function (DUF1183). This family consists of several eukaryotic proteins of around 360 residues in length. The function of this family is unknown. 46570 pfam06683: Protein of unknown function (DUF1184). This family contains a number of hypothetical proteins of unknown function from Arabidopsis thaliana. 46571 pfam06684: Protein of unknown function (DUF1185). This family consists of several hypothetical bacterial proteins of around 180 residues in length. The function of this family is unknown. 46572 pfam06685: Protein of unknown function (DUF1186). This family consists of several hypothetical bacterial proteins of around 250 residues in length and is found in several Chlamydia and Anabaena species. 
The function of this family is unknown. 46573 pfam06686: Stage III sporulation protein AC (SpoIIIAC). This family consists of several bacterial stage III sporulation protein AC (SpoIIIAC) sequences. The exact function of this family is unknown. 46574 pfam06687: SUR7 protein. This family consists of several fungal specific SUR7 proteins. In Saccharomyces cerevisiae the SUR7 gene encodes a putative integral membrane protein with four transmembrane domains. It has been suggested that the Rvs161 and Rvs167 proteins act together in relation with SUR7. The transmembranous character of SUR7 suggests a membrane localisation of the Rvs function, a localisation which is consistent with the different rvs phenotypes and the actin-Rvs167p interaction. It has also been suggested that SUR7 may play a role in sporulation. 46575 pfam06688: Protein of unknown function (DUF1187). This family consists of several short, hypothetical bacterial proteins of around 62 residues in length. Members of this family are found in Escherichia coli and Salmonella typhi. The function of this family is unknown. 46576 pfam06689: ClpX C4-type zinc finger. The ClpX heat shock protein of Escherichia coli is a member of the universally conserved Hsp100 family of proteins, and possesses a putative zinc finger motif of the C4 type. This presumed zinc binding domain is found at the N-terminus of the ClpX protein. ClpX is an ATPase which functions both as a substrate specificity component of the ClpXP protease and as a molecular chaperone. The molecular function of this domain is now known. 46577 pfam06690: Protein of unknown function (DUF1188). This family consists of several hypothetical archaeal proteins of around 260 residues in length which seem to be specific to Methanobacterium, Methanococcus and Methanopyrus species. The function of this family is unknown. 46578 pfam06691: Protein of unknown function (DUF1189). 
This family consists of several hypothetical bacterial proteins of around 260 residues in length. The function of this family is unknown. 46579 pfam06692: Melon necrotic spot virus P7B protein. This family consists of several Melon necrotic spot virus (MNSV) P7B proteins. The function of this family is unknown. 46580 pfam06693: Protein of unknown function (DUF1190). This family consists of several hypothetical Enterobacterial proteins of around 212 residues in length and is known as YjfM in Escherichia coli. The function of this family is unknown. 46581 pfam06694: Plant nuclear matrix protein 1 (NMP1). This family consists of several plant specific nuclear matrix protein 1 (NMP1) sequences. Nuclear Matrix Protein 1 is a ubiquitously expressed 36 kDa protein, which has no homologues in animals and fungi, but is highly conserved among flowering and non-flowering plants. NMP1 is located both in the cytoplasm and nucleus, and the nuclear fraction is associated with the nuclear matrix. NMP1 is a candidate for a plant-specific structural protein with a function both in the nucleus and cytoplasm. 46582 pfam06695: Putative small multi-drug export protein. This family contains a small number of putative small multi-drug export proteins. 46583 pfam06696: Streptococcal surface antigen repeat. This family consists of a number of ~25 residue long repeats found commonly in Streptococcal surface antigens although one copy is present in the HPSR2-heavy chain potential motor protein of Giardia lamblia. This family is often found in conjunction with pfam00746. 46584 pfam06697: Protein of unknown function (DUF1191). This family contains hypothetical plant proteins of unknown function. 46585 pfam06698: Protein of unknown function (DUF1192). This family consists of several short, hypothetical, bacterial proteins of around 60 residues in length. The function of this family is unknown. 46586 pfam06699: Phospho-ethanolamine N-methyltransferase. 
Phospho-ethanolamine N-methyltransferase is involved in glycosylphosphatidylinositol (GPI) anchor biosynthesis. 46587 pfam06701: Mib_herc2. Named "mib/herc2 domain". Usually the protein also contains an E3 ligase domain (either Ring or Hect). 46588 pfam06702: Protein of unknown function (DUF1193). This family represents the C-terminus of several hypothetical eukaryotic proteins of unknown function. Family members contain two conserved motifs: DRHHYE and QCC, as well as a number of conserved cysteine residues. 46589 pfam06703: Microsomal signal peptidase 25 kDa subunit (SPC25). This family consists of several microsomal signal peptidase 25 kDa subunit proteins. Translocation of polypeptide chains across the endoplasmic reticulum (ER) membrane is triggered by signal sequences. Subsequently, signal recognition particle interacts with its membrane receptor and the ribosome-bound nascent chain is targeted to the ER where it is transferred into a protein-conducting channel. At some point, a second signal sequence recognition event takes place in the membrane and translocation of the nascent chain through the membrane occurs. The signal sequence of most secretory and membrane proteins is cleaved off at this stage. Cleavage occurs by the signal peptidase complex (SPC) as soon as the lumenal domain of the translocating polypeptide is large enough to expose its cleavage site to the enzyme. The signal peptidase complex is possibly also involved in proteolytic events in the ER membrane other than the processing of the signal sequence, for example the further digestion of the cleaved signal peptide or the degradation of membrane proteins. Mammalian signal peptidase is a complex of five different polypeptide chains. This family represents the 25 kDa subunit (SPC25). 46590 pfam06704: DspF/AvrF protein. This family consists of several DspF and related sequences from several plant pathogenic bacteria. 
The "disease-specific" (dsp) region next to the hrp gene cluster of Erwinia amylovora is required for pathogenicity but not for elicitation of the hypersensitive reaction. DspF and AvrF are small (16 kDa and 14 kDa) and acidic with predicted amphipathic alpha helices in their C termini; they resemble chaperones for virulence factors secreted by type III secretion systems of animal pathogens. 46591 pfam06705: SF-assemblin/beta giardin. This family consists of several eukaryotic SF-assemblin and related beta giardin proteins. During mitosis the SF-assemblin-based cytoskeleton is reorganised; it divides in prophase and is reduced to two dot-like structures at each spindle pole in metaphase. During anaphase, the two dots present at each pole are connected again. In telophase there is an asymmetrical outgrowth of new fibres. It has been suggested that SF-assemblin is involved in re-establishing the microtubular root system characteristic of interphase cells after mitosis. 46592 pfam06706: Citrus tristeza virus 6-kDa protein. This family consists of several Citrus tristeza virus (CTV) 6-kDa, 51 residue long hydrophobic (P6) proteins. The function of this family is unknown. 46593 pfam06707: Protein of unknown function (DUF1194). This family consists of several hypothetical Rhizobiales specific proteins of around 270 residues in length. The function of this family is unknown. 46594 pfam06708: Protein of unknown function (DUF1195). This family consists of several plant specific hypothetical proteins of around 160 residues in length. The function of this family is unknown. 46595 pfam06709: Protein of unknown function (DUF1196). This family consists of several hypothetical bacterial proteins of around 51 residues in length which seem to be specific to Vibrio cholerae. The function of this family is unknown. 46596 pfam06710: Protein of unknown function (DUF1197). 
This family represents a conserved region within the Z subunit of bacterial chlorophyllide reductase (EC:1.18.-.-). This enzyme converts chlorophylls to bacteriochlorophylls by reducing ring B of the tetrapyrrole. All family members contain the pfam00148 domain. 46597 pfam06711: Protein of unknown function (DUF1198). This family consists of several bacterial proteins of around 150 residues in length which are specific to Escherichia coli, Salmonella species and Yersinia pestis. The function of this family is unknown. 46598 pfam06712: Protein of unknown function (DUF1199). This family consists of several hypothetical Feline immunodeficiency virus (FIV) proteins. Members of this family are typically around 67 residues long and are often annotated as ORF3 proteins. The function of this family is unknown. 46599 pfam06713: Protein of unknown function (DUF1200). This family consists of several hypothetical proteins specific to Oceanobacillus and Bacillus species. Members of this family are typically around 130 residues in length. The function of this family is unknown. 46600 pfam06714: Gp5 N-terminal OB domain. This domain is found at the N terminus of the Gp5 baseplate protein of bacteriophage T4. This domain binds to the Gp27 protein. This domain has the common OB fold. 46601 pfam06715: Gp5 C-terminal repeat (3 copies). This repeat composes the C-terminal part of the bacteriophage T4 baseplate protein Gp5. This region of the protein forms a needle like projection from the baseplate that is presumed to puncture the bacterial cell membrane. Structurally three copies of the repeated region trimerise to form a beta solenoid type structure. This family also includes repeats from bacterial Vgr proteins. 46602 pfam06716: Protein of unknown function (DUF1201). This family consists of several Sugar beet yellow virus (SBYV) putative membrane-binding proteins of around 54 residues in length. The function of this family is unknown. 
46603 pfam06717: Protein of unknown function (DUF1202). This family consists of several hypothetical bacterial proteins of around 335 residues in length. Members of this family are found exclusively in Escherichia coli and Salmonella species and are often referred to as YggM proteins. The function of this family is unknown. 46604 pfam06718: Protein of unknown function (DUF1203). This family consists of several hypothetical bacterial proteins of around 155 residues in length. Family members are present in Rhizobium, Agrobacterium and Streptomyces species. 46605 pfam06719: AraC-type transcriptional regulator N-terminus. This family represents the N-terminus of bacterial ARAC-type transcriptional regulators. In E. coli, these regulate the L-arabinose operon through sensing the presence of arabinose, and when the sugar is present, transmitting this information from the arabinose-binding domains to the protein's DNA-binding domains. This family might represent the N-terminal arm of the protein, which binds to the C-terminal DNA binding domains to hold them in a state where the protein prefers to loop and remain non-activating. All family members contain the pfam00165 domain. 46606 pfam06720: Bacteriophage phi-29 early protein GP16.7. This family consists of several bacteriophage phi-29 early protein GP16.7 sequences of around 130 residues in length. The function of this family is unknown. 46607 pfam06721: Protein of unknown function (DUF1204). This family represents the C-terminus of a number of Arabidopsis thaliana hypothetical proteins of unknown function. Family members contain a conserved DFD motif. 46608 pfam06722: Protein of unknown function (DUF1205). This family represents a conserved region of unknown function within bacterial glycosyl transferases. Many family members contain pfam03033. 46609 pfam06723: MreB/Mbl protein. This family consists of bacterial MreB and Mbl proteins as well as two related archaeal sequences. 
MreB is known to be a rod shape-determining protein in bacteria and goes to make up the bacterial cytoskeleton. Genes coding for MreB/Mbl are only found in elongated bacteria, not in coccoid forms. It has been speculated that constituents of the eukaryotic cytoskeleton (tubulin, actin) may have evolved from prokaryotic precursor proteins closely related to today's bacterial proteins FtsZ and MreB/Mbl. 46610 pfam06724: Domain of Unknown Function (DUF1206). This region consists of a pair of transmembrane helices and occurs three times in each of the family member proteins. 46611 pfam06725: 3D domain. This short presumed domain contains three conserved aspartate residues, hence the name 3D. This conservation is suggestive of a cation binding function. The central aspartate is found in a DTG motif that is suggestive of a peptidase like active site (Bateman A pers. obs.). 46612 pfam06726: Bladder cancer-related protein BC10. This family consists of a series of short proteins of around 90 residues in length. The human protein BC10 has been implicated in bladder cancer where the transcription of the gene coding for this protein is nearly completely abolished in highly invasive transitional cell carcinomas (TCCs). The function of this family is unknown. 46613 pfam06727: Protein of unknown function (DUF1207). This family consists of a number of hypothetical bacterial proteins of around 410 residues in length which seem to be specific to Chlamydia species. The function of this family is unknown. 46614 pfam06728: GPI transamidase subunit PIG-U. Many eukaryotic proteins are anchored to the cell surface via glycosylphosphatidylinositol (GPI), which is posttranslationally attached to the carboxyl-terminus by GPI transamidase. The mammalian GPI transamidase is a complex of at least four subunits, GPI8, GAA1, PIG-S, and PIG-T. 
PIG-U is thought to represent a fifth subunit in this complex and may be involved in the recognition of either the GPI attachment signal or the lipid portion of GPI. 46615 pfam06729: Nuclear receptor co-activator NRIF3. This family consists of mammalian nuclear receptor co-activator NRIF3 proteins. NRIF3 exhibits a distinct receptor specificity in interacting with and potentiating the activity of only TRs and RXRs but not other examined nuclear receptors. NRIF3 is a coregulator that possesses both transactivation and transrepression domains and/or functions. Collectively, the NRIF3 family of coregulators may play dual roles in mediating both positive and negative regulatory effects on gene expression. 46616 pfam06730: Protein of unknown function (DUF1208). This family consists of several eukaryotic sequences of around 270 residues in length. Members of this family are found in mouse, human and Drosophila melanogaster. The function of this family is unknown. 46617 pfam06731: Protein of unknown function (DUF1209). This family consists of several hypothetical bacterial proteins of around 155 residues in length. Members of this family are found in Rhizobium, Xanthomonas, Mycobacterium and Agrobacterium species. The function of this family is unknown. 46618 pfam06732: Pescadillo N-terminus. This family represents the N-terminal region of Pescadillo. Pescadillo protein localises to distinct substructures of the interphase nucleus including nucleoli, the site of ribosome biogenesis. During mitosis pescadillo closely associates with the periphery of metaphase chromosomes and by late anaphase is associated with nucleolus-derived foci and prenucleolar bodies. Blastomeres in mouse embryos lacking pescadillo arrest at morula stages of development, the nucleoli fail to differentiate and accumulation of ribosomes is inhibited. 
It has been proposed that in mammalian cells pescadillo is essential for ribosome biogenesis and nucleologenesis and that disruption to its function results in cell cycle arrest. This family is often found in conjunction with a pfam00533 domain. 46619 pfam06733: DEAD_2. This represents a conserved region within a number of RAD3-like DNA-binding helicases that are seemingly ubiquitous - members include proteins of eukaryotic, bacterial and archaeal origin. RAD3 is involved in nucleotide excision repair, and forms part of the transcription factor TFIIH in yeast. 46620 pfam06734: UL97. This family represents a conserved region within viral UL97 phosphotransferases. UL97 participates in the phosphorylation of the nucleoside analog ganciclovir (GCV) to produce GCV-monophosphate. 46621 pfam06735: Protein of unknown function (DUF1210). This family represents a conserved region within plant proline-rich proteins. 46622 pfam06736: Protein of unknown function (DUF1211). This family represents a conserved region within a number of hypothetical proteins of unknown function found in eukaryotes, bacteria and archaea. These may possibly be integral membrane proteins. 46623 pfam06737: Transglycosylase-like domain. This family of proteins is very likely to act as transglycosylase enzymes related to pfam00062 and pfam01464. These other families are weakly matched by this family, and include the known active site residues. 46624 pfam06738: Protein of unknown function (DUF1212). This family represents a conserved region within a number of hypothetical proteins of unknown function found in eukaryotes, bacteria and archaea. Some family members are membrane proteins. 46625 pfam06739: Beta-propeller repeat. This family is related to pfam00400 and is likely to also form a beta-propeller. SBBP stands for Seven Bladed Beta Propeller. 46626 pfam06740: Protein of unknown function (DUF1213). 
This family represents a short conserved repeat within Drosophila melanogaster proteins of unknown function. Approximately 50 copies of this repeat are present in each protein. 46627 pfam06741: Ataxin-2 N-terminal region. This family represents a conserved region approximately 250 residues long located towards the C-terminus of eukaryotic ataxin-2. Ataxin-2 is a protein of unknown function, within which expansion of a polyglutamine tract (due to expansion of unstable CAG repeats in the coding region of the SCA2 gene) causes spinocerebellar ataxia type 2 (SCA2), a late-onset neurodegenerative disorder. The expanded polyglutamine repeat in ataxin-2 causes disruption of the normal morphology of the Golgi complex and increased incidence of cell death. Ataxin-2 is predicted to consist of mostly non-globular domains. 46628 pfam06742: Protein of unknown function (DUF1214). This family represents the C-terminal region of several hypothetical proteins of unknown function. Family members are mostly bacterial, but a few are also found in eukaryotes and archaea. 46629 pfam06743: FAST kinase leucine-rich. This family represents a conserved region of eukaryotic Fas-activated serine/threonine (FAST) kinases (EC:2.7.1.-) that contains several conserved leucine residues. FAST kinase is rapidly activated during Fas-mediated apoptosis, when it phosphorylates TIA-1, a nuclear RNA-binding protein that has been implicated as an effector of apoptosis. Note that many family members are hypothetical proteins. 46630 pfam06744: Protein of unknown function (DUF1215). This family represents a conserved region situated towards the C-terminal end of several hypothetical bacterial proteins of unknown function. A few members resemble the ImcF protein, which has been proposed to be involved in Vibrio cholerae cell surface reorganisation that results in increased adherence to epithelial cells and increased conjugation frequency. 46631 pfam06745: KaiC. 
This family represents a conserved region within bacterial and archaeal proteins, most of which are hypothetical. More than one copy is sometimes found in each protein. This family includes KaiC, which is one of the Kai proteins among which direct protein-protein association may be a critical process in the generation of circadian rhythms in cyanobacteria. 46632 pfam06746: Protein of unknown function (DUF1216). This family represents a conserved region within Arabidopsis thaliana proteins of unknown function. Family members sometimes contain more than one copy. 46633 pfam06747: CHCH domain. We have identified a conserved motif in the LOC118487 protein that we have called the CHCH motif. Alignment of this protein with related members showed the presence of three subgroups of proteins, which are called the S (Small), N (N-terminal extended) and C (C-terminal extended) subgroups. All three subgroups of proteins have in common that they contain a predicted conserved [coiled coil 1]-[helix 1]-[coiled coil 2]-[helix 2] domain (CHCH domain). Within each helix of the CHCH domain, there are two cysteines present in a C-X9-C motif. The N-group contains an additional double helix domain, and each helix contains the C-X9-C motif. This family contains a number of characterised proteins: Cox19 protein - a nuclear gene of Saccharomyces cerevisiae, codes for an 11-kDa protein (Cox19p) required for expression of cytochrome oxidase. Because cox19 mutants are able to synthesise the mitochondrial and nuclear gene products of cytochrome oxidase, Cox19p probably functions post-translationally during assembly of the enzyme. Cox19p is present in the cytoplasm and mitochondria, where it exists as a soluble intermembrane protein. This dual location is similar to what was previously reported for Cox17p, a low molecular weight copper protein thought to be required for maturation of the CuA centre of subunit 2 of cytochrome oxidase. 
Cox19p has four conserved potential metal ligands: three cysteines and one histidine. Mrp10 - belongs to the class of yeast mitochondrial ribosomal proteins that are essential for translation. Eukaryotic NADH-ubiquinone oxidoreductase 19 kDa (NDUFA8) subunit. 46634 pfam06748: Protein of unknown function (DUF1217). This family represents a conserved region that is found within bacterial proteins, most of which are hypothetical. Some members contain multiple copies. 46635 pfam06749: Protein of unknown function (DUF1218). This family contains hypothetical plant proteins of unknown function. Family members contain a number of conserved cysteine residues. 46636 pfam06750: Bacterial Peptidase A24 N-terminal domain. This family is found at the N-terminus of the prepilin peptidases (pfam01478). Its function has not been specifically determined; however some of the family have been characterised as bifunctional, and this domain may contain the N-methylation activity (EC:2.1.1.-). It consists of an intracellular region between a pair of transmembrane regions. This region contains an invariant proline and two almost fully conserved disulphide bridges - hence the name DiS-P-DiS. The cysteines have been shown to be essential to the overall function of the enzyme, but their role was incorrectly ascribed. 46637 pfam06751: Ethanolamine ammonia lyase large subunit (EutB). This family consists of several bacterial ethanolamine ammonia lyase large subunit (EutB) proteins (EC:4.3.1.7). Ethanolamine ammonia-lyase is a bacterial enzyme that catalyses the adenosylcobalamin-dependent conversion of certain vicinal amino alcohols to oxo compounds and ammonia. The enzyme is a heterodimer composed of subunits of Mr approximately 55,000 (EutB) and 35,000 (EutC). 46638 pfam06752: Enhancer of Polycomb C-terminus. This family represents the C-terminus of eukaryotic enhancer of polycomb proteins, which have roles in heterochromatin formation. This family contains several conserved motifs. 
46639 pfam06753: Bradykinin. This family consists of several Bombina species specific bradykinin sequences. The skins of anuran amphibians, in addition to mucus glands, contain highly specialised poison glands, which, in reaction to stress or attack, exude a complex noxious cocktail of biologically active molecules. These secretions often contain a plethora of peptides among which bradykinin or structural variants have been identified. 46640 pfam06754: Phosphonate metabolism protein PhnG. This family consists of several bacterial phosphonate metabolism protein PhnG sequences. In Escherichia coli, the phn operon encodes proteins responsible for the uptake and breakdown of phosphonates. The exact function of PhnG is unknown, however it is thought likely that along with six other proteins PhnG makes up the C-P (carbon-phosphorus) lyase. 46641 pfam06755: Protein of unknown function (DUF1219). This family consists of several hypothetical proteins which seem to be specific to the Enterobacteria Escherichia coli and Shigella flexneri. Family members are often known as YeeV proteins and are around 125 residues in length. The function of this family is unknown. 46642 pfam06756: Chorion protein S19 C-terminal. This family represents the C-terminal region of eukaryotic chorion protein S19. In Drosophilidae, the S19 gene is known to form part of an autosomal cluster that also contains s16, s15 and s18. Note that members of this family contain a conserved PVA motif, and many contain pfam03964. 46643 pfam06757: Insect allergen related repeat. This family consists of several insect specific allergen repeats. Members of this family are commonly found in cockroaches, fruit flies and mosquitos. It has been suggested that the repeat sequences have evolved by duplication of an ancestral amino acid domain, which may have arisen from the mitochondrial energy transfer proteins. 46644 pfam06758: Repeat of unknown function (DUF1220). 
This family consists of several mammalian specific repeats of around 65 residues in length and is found in multiple copies in several human proteins. The function of this family is unknown. 46645 pfam06759: Mosquito specific cecropin. This family consists of the mosquito specific antibiotic peptide cecropin. Cecropins have been implicated in reducing the development of parasites in mosquitoes and have an antimicrobial activity. 46646 pfam06760: Protein of unknown function (DUF1221). This is a family of plant proteins, most of which are hypothetical and of unknown function. All members contain the pfam00069 domain, suggesting that they may possess kinase activity. 46647 pfam06761: ImcF-related. This family represents a conserved region within several bacterial proteins that resemble ImcF, which has been proposed to be involved in Vibrio cholerae cell surface reorganisation, resulting in increased adherence to epithelial cells and increased conjugation frequency. Note that many family members are hypothetical proteins. 46648 pfam06762: Protein of unknown function (DUF1222). This family, which includes bacterial and eukaryotic members, represents a conserved region located towards the C-terminal end of a number of hypothetical proteins of unknown function. These are possibly integral membrane proteins. 46649 pfam06763: Prophage minor tail protein Z (GPZ). This family consists of several prophage minor tail protein Z like sequences from Escherichia coli, Salmonella typhimurium and Lambda-like bacteriophages. 46650 pfam06764: Protein of unknown function (DUF1223). This family consists of several hypothetical proteins of around 250 residues in length which are found in both plants and bacteria. The function of this family is unknown. 46651 pfam06765: Heparan sulfate 6-sulfotransferase (HS6ST). This family consists of several heparan sulfate 6-sulfotransferase (HS6ST) proteins. 
Heparan sulphate 6-O-sulphotransferase (HS6ST) catalyses the transfer of sulphate from adenosine 3'-phosphate, 5'-phosphosulphate to the 6th position of the N-sulphoglucosamine residue in heparan sulphate. 46652 pfam06766: Fungal hydrophobin. This is a family of fungal hydrophobins that seems to be restricted to ascomycetes. These are small, moderately hydrophobic extracellular proteins that have eight cysteine residues arranged in a strictly conserved motif. Hydrophobins are generally found on the outer surface of conidia and of the hyphal wall, and may be involved in mediating contact and communication between the fungus and its environment. Note that some family members contain multiple copies. 46653 pfam06767: Sif protein. This family consists of several SifA and SifB and SseJ proteins which seem to be specific to the Salmonella species. SifA, SifB and SseJ have been demonstrated to localise to the Salmonella-containing vacuole (SCV) and to Salmonella-induced filaments (Sifs). Trafficking of SseJ and SifB away from the SCV requires the SPI-2 effector SifA. SseJ trafficking away from the SCV along Sifs is unnecessary for its virulence function. 46654 pfam06768: G6b protein. This family consists of several G6b proteins which seem to be specific to humans. The G6b gene, located in the class III region of the human major histocompatibility complex, has been suggested to encode a putative receptor of the immunoglobulin superfamily. 46655 pfam06769: Protein of unknown function (DUF1224). This family consists of several short, hypothetical bacterial proteins of around 85 residues in length. The function of this family is unknown. 46656 pfam06770: Actin-rearrangement-inducing factor (Arif-1). This family consists of several Nucleopolyhedrovirus actin-rearrangement-inducing factor (Arif-1) proteins. 
In response to Autographa californica multicapsid nuclear polyhedrosis virus (AcMNPV) infection, a sequential rearrangement of the actin cytoskeleton occurs; this is induced by Arif-1. Arif-1 is tyrosine phosphorylated and is located at the plasma membrane as a component of the actin rearrangement-inducing complex. 46657 pfam06771: Viral Desmoplakin N-terminus. This family represents the N-terminus of viral desmoplakin. Desmoplakin is a component of mature desmosomes, which are the main adhesive junctions in epithelia and cardiac muscle. Desmoplakin is also essential for the maturation of adherens junctions. Note that many family members are hypothetical. 46658 pfam06772: Bacterial low temperature requirement A protein (LtrA). This family consists of several bacteria specific low temperature requirement A (LtrA) protein sequences which have been found to be essential for growth at low temperatures in Listeria monocytogenes. 46659 pfam06773: Bim protein N-terminus. This family represents the N-terminal region of several mammal specific Bim proteins. The Bim protein is one of the BH3-only proteins, members of the Bcl-2 family that have only one of the Bcl-2 homology regions, BH3. BH3-only proteins are essential initiators of apoptotic cell death. 46660 pfam06774: Protein of unknown function (DUF1225). This family represents the N-terminus of archaeal neutral proteases. Note that many family members are hypothetical proteins. 46661 pfam06775: Protein of unknown function (DUF1226). This family consists of several hypothetical eukaryotic proteins of unknown function. 46662 pfam06776: Invasion associated locus B (IalB) protein. This family consists of several invasion associated locus B (IalB) proteins and related sequences. IalB is known to be a major virulence factor in Bartonella bacilliformis where it was shown to have a direct role in human erythrocyte parasitism. IalB is upregulated in response to environmental cues signaling vector-to-host transmission. 
Such environmental cues would include, but not be limited to, temperature, pH, oxidative stress, and haemin limitation. It is also thought that IalB would aid B. bacilliformis survival under stress-inducing environmental conditions. The role of this protein in other bacterial species is unknown. 46663 pfam06777: Protein of unknown function (DUF1227). This family represents a conserved region within a number of eukaryotic DNA repair helicases (EC:3.6.1.-). 46664 pfam06778: Chlorite dismutase. This family contains chlorite dismutase enzymes of bacterial and archaeal origin. This enzyme catalyses the disproportionation of chlorite into chloride and oxygen. Note that many family members are hypothetical proteins. 46665 pfam06779: Protein of unknown function (DUF1228). This family represents the N-terminus of several putative bacterial membrane proteins, which may be sugar transporters. Note that many family members are hypothetical proteins. 46666 pfam06780: Erp protein C-terminus. This family represents the C-terminus of bacterial Erp proteins that seem to be specific to Borrelia burgdorferi (a causative agent of Lyme disease). Borrelia Erp proteins are particularly heterogeneous, which might enable them to interact with a wide variety of host components. 46667 pfam06781: Uncharacterised protein family (UPF0233). 46668 pfam06782: Uncharacterised protein family (UPF0236). 46669 pfam06783: Uncharacterised protein family (UPF0239). 46670 pfam06784: Uncharacterised protein family (UPF0240). 46671 pfam06785: Uncharacterised protein family (UPF0242). 46672 pfam06786: Uncharacterised protein family (UPF0253). 46673 pfam06787: Uncharacterised protein family (UPF0254). 46674 pfam06788: Uncharacterised protein family (UPF0257). 46675 pfam06789: Uncharacterised protein family (UPF0258). 46676 pfam06790: Uncharacterised protein family (UPF0259). 46677 pfam06791: Prophage tail length tape measure protein. 
This family represents a conserved region located towards the N-terminal end of prophage tail length tape measure protein (TMP). TMP is important for assembly of phage tails and involved in tail length determination. Mutated forms of TMP cause tail fibres to be shortened. 46678 pfam06792: Uncharacterised protein family (UPF0261). 46679 pfam06793: Uncharacterised protein family (UPF0262). 46680 pfam06794: Uncharacterised protein family (UPF0270). 46681 pfam06795: Erythrovirus X protein. This family consists of several Erythrovirus X proteins which seem to be found exclusively in human parvovirus and human erythrovirus. The function of this family is unknown. 46682 pfam06796: Periplasmic nitrate reductase protein NapE. This family consists of several bacterial periplasmic nitrate reductase NapE proteins. Seven genes, napKEFDABC, encoding the periplasmic nitrate reductase system were cloned from the denitrifying phototrophic bacterium Rhodobacter sphaeroides f. sp. denitrificans IL106. NapE is thought to be a transmembrane protein. 46683 pfam06797: Protein of unknown function (DUF1229). This family consists of several hypothetical proteins of around 415 residues in length which seem to be specific to the bacterium Leptospira interrogans. 46684 pfam06798: PrkA serine protein kinase. This is a family of PrkA bacterial and archaeal serine kinases approximately 630 residues long. PrkA possesses the A-motif of nucleotide-binding proteins and exhibits distant homology to eukaryotic protein kinases. Note that many family members are hypothetical. 46685 pfam06799: Protein of unknown function (DUF1230). This family consists of several hypothetical plant and photosynthetic bacterial proteins of around 160 residues in length. The function of this family is unknown although looking at the species distribution the protein may play a part in photosynthesis. 46686 pfam06800: Sugar transport protein. This is a family of bacterial sugar transporters approximately 300 residues long. 
Members include glucose uptake proteins, ribose transport proteins, and several putative and hypothetical membrane proteins probably involved in sugar transport across bacterial membranes. 46687 pfam06801: Metallocarboxypeptidase inhibitor (MCPI). This family consists of several metallocarboxypeptidase inhibitor (MCPI) proteins from a number of plant species including potato and tomato. 46688 pfam06802: Protein of unknown function (DUF1231). This family consists of several Orthopoxvirus specific proteins predominantly of around 340 residues in length. This family contains both B17 and B15 proteins, the functions of which are unknown. 46689 pfam06803: Protein of unknown function (DUF1232). This family represents a conserved region of approximately 60 residues within a number of hypothetical bacterial and archaeal proteins of unknown function. 46690 pfam06804: NlpB/DapX lipoprotein. This family consists of a number of bacterial lipoproteins often known as NlpB or DapX. This lipoprotein is detected in outer membrane vesicles in Escherichia coli and appears to be nonessential. 46691 pfam06805: Bacteriophage lambda tail assembly protein I. This family consists of several Bacteriophage lambda tail assembly protein I and related phage and bacterial sequences. Members of this family are typically around 200 residues in length. The function of this family is unknown. 46692 pfam06806: Putative excisionase (DUF1233). This family consists of several putative phage excisionase proteins of around 80 residues in length. 46693 pfam06807: Pre-mRNA cleavage complex II protein Clp1. This family consists of several pre-mRNA cleavage complex II Clp1 (or HeaB) proteins. Six different protein factors are required in vitro for 3' end formation of mammalian pre-mRNAs by endonucleolytic cleavage and polyadenylation. Clp1 is a subunit of cleavage complex IIA, which is required for cleavage, but not for polyadenylation of pre-mRNA. 
46694 pfam06808: TRAP C4-dicarboxylate transport (Dct) system permease DctM subunit. This family represents a conserved region located towards the N-terminus of the DctM subunit of the bacterial and archaeal TRAP C4-dicarboxylate transport (Dct) system permease. In general, C4-dicarboxylate transport systems allow C4-dicarboxylates like succinate, fumarate, and malate to be taken up. TRAP C4-dicarboxylate carriers are secondary carriers that use an electrochemical H+ gradient as the driving force for transport. DctM is an integral membrane protein that is one of the constituents of TRAP carriers. Note that many family members are hypothetical proteins. 46695 pfam06809: Neural proliferation differentiation control-1 protein (NPDC1). This family consists of several neural proliferation differentiation control-1 (NPDC1) proteins. NPDC1 plays a role in the control of neural cell proliferation and differentiation. It has been suggested that NPDC1 may be involved in the development of several secretion glands. This family also contains the C-terminal region of the C. elegans protein CAB-1 which is known to interact with AEX-3. 46696 pfam06810: Phage minor structural protein GP20. This family consists of several phage minor structural protein GP20 sequences of around 180 residues in length. The function of this family is unknown. 46697 pfam06812: ImpA-related N-terminal. This family represents a conserved region located towards the N-terminal end of ImpA and related proteins. ImpA is an inner membrane protein, which has been suggested to be involved with proteins that are exported and associated with colony variations in Actinobacillus actinomycetemcomitans. Note that many family members are hypothetical proteins. 46698 pfam06813: Nodulin-like. This family represents a conserved region within plant nodulin-like proteins. 46699 pfam06814: Lung seven transmembrane receptor. 
This family represents a conserved region within eukaryotic lung seven transmembrane receptors and related proteins. 46700 pfam06815: Reverse transcriptase connection domain. This domain is known as the connection domain. This domain lies between the thumb and palm domains. 46701 pfam06816: NOTCH protein. NOTCH signalling plays a fundamental role during a great number of developmental processes in multicellular animals. NOD (NOTCH protein domain) represents a region present in many NOTCH proteins and NOTCH homologs in multiple species such as 0, NOTCH2 and NOTCH3, LIN12, SC1 and TAN1. The role of the NOD domain remains to be elucidated. 46702 pfam06817: Reverse transcriptase thumb domain. This domain is known as the thumb domain. It is composed of a four helix bundle. 46703 pfam06818: Fez1. This family represents the eukaryotic Fez1 protein. Fez1 contains a leucine-zipper region with similarity to the DNA-binding domain of the cAMP-responsive activating-transcription factor 5. There is evidence that Fez1 inhibits cancer cell growth through regulation of mitosis, and that its alterations result in abnormal cell growth. Note that some family members contain more than one copy of this region. 46704 pfam06819: Archaeal Peptidase A24 C-terminal Domain. This region is of unknown function but is found in some archaeal pfam01478. It is predicted to be of mixed alpha/beta secondary structure by JPred. 46705 pfam06820: Putative prophage tail fibre C-terminus. This family represents the C-terminus of a prophage tail fibre protein found mostly in E. coli. All family members contain a conserved RLGP motif. 46706 pfam06821: Protein of unknown function (DUF1234). This family contains a number of hypothetical bacterial proteins of unknown function, which may be cytosolic. 46707 pfam06822: Protein of unknown function (DUF1235). This family contains a number of viral proteins of unknown function. 46708 pfam06823: Protein of unknown function (DUF1236). 
This family contains a number of hypothetical bacterial proteins of unknown function. Some family members contain more than one copy of the region represented by this family. 46709 pfam06824: Protein of unknown function (DUF1237). This family contains a number of hypothetical proteins of about 450 residues in length. Their function is unknown, and most are bacterial. 46710 pfam06825: Heat shock factor binding protein 1. Heat shock factor binding protein 1 (HSBP1) appears to be a negative regulator of the heat shock response. 46711 pfam06826: Predicted Permease Membrane Region. This family represents five transmembrane helices that are normally found flanking (five either side) a pair of pfam02080 domains. This suggests that the paired regions form a ten helical structure, probably forming the pore, whereas pfam02080 binds a ligand for export or regulation of the pore. One member is described as an aspartate-alanine antiporter. In conjunction with another component it forms a 'proton motive metabolic cycle catalysed by an aspartate-alanine exchange'. The general conservation of domain architecture in this family suggests that they are functional orthologues. 46712 pfam06827: Zinc finger found in FPG and IleRS. This zinc binding domain is found at the C-terminus of isoleucyl tRNA synthetase and the enzyme Formamidopyrimidine-DNA glycosylase EC:3.2.2.23. 46713 pfam06828: Fukutin-related. Fukutin is a eukaryotic protein necessary for the maintenance of muscle integrity, cortical histiogenesis, and normal ocular development. Mutations in the fukutin gene have been shown to result in Fukuyama-type congenital muscular dystrophy characterised by brain malformation - one of the most common autosomal-recessive disorders in Japan. This family represents a short conserved region within fukutin-related proteins that is sometimes repeated. Note that many family members are hypothetical proteins. 46714 pfam06829: Protein of unknown function (DUF1238). 
This family consists of several hypothetical bacterial proteins of around 325 residues in length. The function of this family is unknown. 46715 pfam06830: Root cap. The cells at the periphery of the root cap are continuously sloughed off from the root into the mucilage, and are thought to be programmed to die. This family represents a conserved region approximately 60 residues in length within plant root cap proteins, which may be involved in the process. 46716 pfam06831: Formamidopyrimidine-DNA glycosylase H2TH domain. Formamidopyrimidine-DNA glycosylase (Fpg) is a DNA repair enzyme that excises oxidised purines from damaged DNA. This family is the central domain containing the DNA-binding helix-two-turn-helix domain. 46717 pfam06832: Penicillin-Binding Protein C-terminus Family. This conserved region of approximately 90 residues is found in a sub-group of bacterial Penicillin-Binding Proteins (PBPs). A variable length loop region separates this region from the transpeptidase unit (pfam00905). It is predicted by PROF to be an all beta fold. 46718 pfam06833: Malonate decarboxylase gamma subunit (MdcE). This family consists of several bacterial malonate decarboxylase gamma subunit proteins. Malonate decarboxylase of Klebsiella pneumoniae consists of four different subunits and catalyses the conversion of malonate plus H+ to acetate and CO2. The catalysis proceeds via acetyl and malonyl thioester residues with the phosphoribosyl-dephospho-CoA prosthetic group of the acyl carrier protein (ACP) subunit. MdcD and MdcE together probably function as malonyl-S-ACP decarboxylase. 46719 pfam06834: TraU protein. This family consists of several bacterial TraU proteins. TraU appears to be more essential to conjugal DNA transfer than to assembly of pilus filaments. 46720 pfam06835: Protein of unknown function (DUF1239). This family consists of several hypothetical bacterial proteins of around 190 residues in length. The function of this family is unknown. 
46721 pfam06836: Protein of unknown function (DUF1240). This family consists of a number of hypothetical putative membrane proteins which seem to be specific to Yersinia pestis. The function of this family is unknown. 46722 pfam06837: Fijivirus P9-2 protein. This family consists of several Fijivirus specific P9-2 proteins from Rice black streaked dwarf virus (RBSDV) and Fiji disease virus. The function of this family is unknown. 46723 pfam06838: Aluminium resistance protein. This family represents the aluminium resistance protein, which confers resistance to aluminium in bacteria. 46724 pfam06839: GRF zinc finger. This presumed zinc binding domain is found in a variety of DNA-binding proteins. It seems likely that this domain is involved in nucleic acid binding. It is named GRF after three conserved residues in the centre of the alignment of the domain. This zinc finger may be related to pfam01396. 46725 pfam06840: Protein of unknown function (DUF1241). This family consists of several programmed cell death 10 protein (PDCD10 or TFAR15) sequences. The function of this family is unknown. 46726 pfam06841: T4-like virus tail tube protein gp19. This family consists of several tail tube protein gp19 sequences from the T4-like viruses. 46727 pfam06842: Protein of unknown function (DUF1242). This family consists of a number of eukaryotic proteins of around 72 residues in length. The function of this family is unknown. 46728 pfam06843: Protein of unknown function (DUF1243). This family consists of a number of hypothetical bacterial proteins of around 200 residues in length. The function of this family is unknown. 46729 pfam06844: Protein of unknown function (DUF1244). This family consists of several short bacterial proteins of around 100 residues in length. The function of this family is unknown. 46730 pfam06845: Myo-inositol catabolism protein IolB. This family consists of several bacterial Myo-inositol catabolism (IolB) proteins. 
The Bacillus subtilis inositol operon (iolABCDEFGHIJ) is involved in myo-inositol catabolism. Glucose repression of the iol operon induced by inositol is exerted through catabolite repression mediated by CcpA and the iol induction system mediated by IolR. The exact function of IolB is unknown. 46731 pfam06846: Protein of unknown function (DUF1245). This family represents a conserved region approximately 80 residues long within Pyrobaculum aerophilum family 1964 protein. 46732 pfam06847: Archaeal Peptidase A24 C-terminus Type II. This region is of unknown function but is found in some archaeal pfam01478. It is predicted to be of mixed alpha/beta secondary structure by Prof. 46733 pfam06848: Disaggregatase related repeat. This family consists of several repeats which seem to be specific to the Methanosarcina archaea species and are often found in multiple copies in disaggregatase proteins. Members of this family are also found in single copies in several hypothetical proteins. 46734 pfam06849: Protein of unknown function (DUF1246). This family represents the N-terminus of a number of hypothetical archaeal proteins of unknown function. 46735 pfam06850: PHB de-polymerase C-terminus. This family represents the C-terminus of bacterial poly(3-hydroxybutyrate) (PHB) de-polymerase. This degrades PHB granules to oligomers and monomers of 3-hydroxy-butyric acid. 46736 pfam06851: Protein of unknown function (DUF1247). This family contains a number of hypothetical viral proteins of unknown function approximately 200 residues long. 46737 pfam06852: Protein of unknown function (DUF1248). This family represents a conserved region within a number of proteins of unknown function that seem to be specific to C. elegans. Note that some family members contain more than one copy of this region. 46738 pfam06853: Protein of unknown function (DUF1249). This family consists of several hypothetical bacterial proteins of around 150 residues in length. The function of this family is unknown. 
46739 pfam06854: Bacteriophage Gp15 protein. This family consists of bacteriophage Gp15 proteins and related bacterial sequences. The function of this family is unknown. 46740 pfam06855: Protein of unknown function (DUF1250). This family consists of several short hypothetical bacterial proteins of around 70 residues in length. Members of this family seem to all belong to the order Bacillales or Lactobacillales. The function of this family is unknown. 46741 pfam06856: Protein of unknown function (DUF1251). This family consists of the N-terminal region of several hypothetical Nucleopolyhedrovirus proteins of unknown function. 46742 pfam06857: Malonate decarboxylase delta subunit (MdcD). This family consists of several bacterial malonate decarboxylase delta subunit (MdcD) proteins. Malonate decarboxylase of Klebsiella pneumoniae consists of four different subunits and catalyses the conversion of malonate plus H+ to acetate and CO2. The catalysis proceeds via acetyl and malonyl thioester residues with the phosphoribosyl-dephospho-CoA prosthetic group of the acyl carrier protein (ACP) subunit. MdcC is the (apo) ACP subunit. 46743 pfam06858: Nucleolar GTP-binding protein 1 (NOG1). This family represents a conserved region of approximately 60 residues in length within nucleolar GTP-binding protein 1 (NOG1). In S. cerevisiae, the NOG1 gene has been shown to be essential for cell viability, suggesting that NOG1 may play an important role in nucleolar functions. Family members include eukaryotic, bacterial and archaeal proteins. 46744 pfam06859: Bicoid-interacting protein 3 (Bin3). This family represents a conserved region of approximately 120 residues within eukaryotic Bicoid-interacting protein 3 (Bin3). Bin3, which shows similarity to a number of protein methyltransferases that modify RNA-binding proteins, interacts with Bicoid, which itself directs pattern formation in the early Drosophila embryo. 
The interaction might allow Bicoid to switch between its dual roles in transcription and translation. Note that family members contain a conserved HLN motif. 46745 pfam06860: Protein of unknown function (DUF1252). This family consists of several bacterial proteins of around 180 residues in length. Members of this family seem to be specific to Listeria species and the function of the family is unknown. 46746 pfam06861: BALF1 protein. This family consists of several BALF1 proteins which seem to be specific to the Lymphocryptoviruses. BALF1 inhibits the antiapoptotic activity of EBV BHRF1 and of KSBcl-2. 46747 pfam06862: Protein of unknown function (DUF1253). This family represents the C-terminal portion (approximately 500 residues) of several hypothetical eukaryotic proteins of unknown function. 46748 pfam06863: Protein of unknown function (DUF1254). This family represents a conserved region about 130 residues long within hypothetical proteins of unknown function. Family members include eukaryotic, bacterial and archaeal proteins. 46749 pfam06864: Pilin accessory protein (PilO). This family consists of several enterobacterial PilO proteins. The function of PilO is unknown, although it has been suggested that it is a cytoplasmic protein in the absence of other Pil proteins but is translocated to the outer membrane in their presence. Alternatively, PilO protein may form a complex with other Pil protein(s). PilO has been predicted to function as a component of the pilin transport apparatus and thin-pilus basal body. This family does not seem to be related to pfam04350. 46750 pfam06865: Protein of unknown function (DUF1255). This family consists of several conserved hypothetical bacterial proteins of around 95 residues in length. The function of this family is unknown. 46751 pfam06866: Protein of unknown function (DUF1256). 
This family consists of several uncharacterised bacterial proteins which seem to be specific to the orders Clostridia and Bacillales. Family members are typically around 180 residues in length. The function of this family is unknown. 46752 pfam06868: Protein of unknown function (DUF1257). This family contains hypothetical proteins of unknown function that are approximately 120 residues long. Family members include eukaryotic and bacterial proteins. 46753 pfam06869: Protein of unknown function (DUF1258). This family represents a conserved region approximately 260 residues long within a number of hypothetical proteins of unknown function that seem to be specific to C. elegans. Note that this family contains a number of conserved cysteine and histidine residues. 46754 pfam06870: A49-like RNA polymerase I associated factor. Saccharomyces cerevisiae A49 is a specific subunit associated with RNA polymerase I (Pol I) in eukaryotes. Pol I maintains transcription activities in A49 deletion mutants. However, such mutants are deficient in transcription activity at low temperatures. Deletion analysis of the fission yeast homolog indicates that only the C-terminal two thirds are required for function. Transcript analysis has demonstrated that A49 maximises transcription of ribosomal DNA. 46755 pfam06871: TraH_2. This family consists of several TraH proteins which seem to be specific to Agrobacterium and Rhizobium species. This protein is thought to be involved in conjugal transfer but its function is unknown. This family does not appear to be related to pfam06122. 46756 pfam06872: EspG protein. This family consists of several EspG like proteins from Citrobacter rodentium and Escherichia coli. EspG is secreted by the type III secretory system and is translocated into host epithelial cells. EspG is homologous with Shigella flexneri protein VirA and can rescue invasion in a Shigella virA mutant, indicating that these proteins are functionally equivalent in Shigella. 
EspG plays an accessory but as yet undefined role in EPEC virulence that may involve intestinal colonisation. 46757 pfam06873: Cell surface immobilisation antigen SerH. This family consists of several cell surface immobilisation antigen SerH proteins which seem to be specific to Tetrahymena thermophila. The SerH locus of Tetrahymena thermophila is one of several paralogous loci with genes encoding variants of the major cell surface protein known as the immobilisation antigen (i-ag). 46758 pfam06874: Firmicute fructose-1,6-bisphosphatase. This family consists of several bacterial fructose-1,6-bisphosphatase proteins (EC:3.1.3.11) which seem to be specific to phylum Firmicutes. Fructose-1,6-bisphosphatase (FBPase) is a well known enzyme involved in gluconeogenesis. This family does not seem to be structurally related to pfam00316. 46759 pfam06875: Plethodontid receptivity factor PRF. This family consists of several plethodontid receptivity factor (PRF) proteins which seem to be specific to Plethodon jordani (Jordan's salamander). PRF is a courtship pheromone produced by males to increase female receptivity. 46760 pfam06876: Plant self-incompatibility response (SCRL) protein. This family consists of several Plant self-incompatibility response (SCRL) proteins. The male component of the self-incompatibility response in Brassica has been shown to be encoded by the S locus cysteine-rich gene (SCR). SCR is related, at the sequence level, to the pollen coat protein (PCP) gene family whose members encode small, cysteine-rich proteins located in the proteo-lipidic surface layer (tryphine) of Brassica pollen grains. 46761 pfam06877: Protein of unknown function (DUF1260). This family consists of several hypothetical bacterial proteins of around 120 residues in length. The function of this family is unknown. 46762 pfam06878: Pkip-1 protein. This family consists of several Pkip-1 proteins which seem to be specific to Nucleopolyhedroviruses. 
The function of this family is unknown although it has been found that Pkip-1 is not essential for virus replication in cell culture or by in vivo intrahaemocoelic injection. 46763 pfam06879: Protein of unknown function (DUF1261). This family contains hypothetical proteins of unknown function that are approximately 200 residues long. They seem to be specific to C. elegans. 46764 pfam06880: Protein of unknown function (DUF1262). This family represents a conserved region within a number of proteins of unknown function that seem to be specific to Arabidopsis thaliana. Note that some family members contain more than one copy of this region. 46765 pfam06881: RNA polymerase II transcription factor SIII (Elongin) subunit A. This family represents a conserved region within RNA polymerase II transcription factor SIII (Elongin) subunit A. In mammals, the Elongin complex activates elongation by RNA polymerase II by suppressing transient pausing of the polymerase at many sites within transcription units. Elongin is a heterotrimer composed of A, B, and C subunits of 110, 18, and 15 kilodaltons, respectively. Subunit A has been shown to function as the transcriptionally active component of Elongin. 46766 pfam06882: Protein of unknown function (DUF1263). This family represents a conserved region located towards the C-terminus of a number of proteins of unknown function that seem to be specific to Oryza sativa. 46767 pfam06883: RNA polymerase I, Rpa2 specific domain. This domain is found between domain 3 (pfam04565) and domain 5 (pfam04565), but shows no homology to domain 4 of Rpb2. The external domains in multisubunit RNA polymerase (those most distant from the active site) are known to demonstrate more sequence variability. 46768 pfam06884: Protein of unknown function (DUF1264). This family contains a number of bacterial and eukaryotic proteins of unknown function that are approximately 200 residues long. Some family members are annotated as putative lipoproteins. 
46769 pfam06885: MMS19 N-terminus. This family represents the N-terminus of the eukaryotic repair/transcription protein MMS19. MMS19 is involved in excision repair of DNA damaged by UV radiation and by other agents that distort the DNA helix, as well as having a role in transcription as a component of TFIIH. Mutations cause deficiencies in transcription-coupled and nucleotide excision repair. 46770 pfam06886: Targeting protein for Xklp2 (TPX2). This family represents a conserved region approximately 60 residues long within the eukaryotic targeting protein for Xklp2 (TPX2). Xklp2 is a kinesin-like protein localised on centrosomes throughout the cell cycle and on spindle pole microtubules during metaphase. In Xenopus, it has been shown that Xklp2 protein is required for centrosome separation and maintenance of spindle bi-polarity. TPX2 is a microtubule-associated protein that mediates the binding of the C-terminal domain of Xklp2 to microtubules. It is phosphorylated during mitosis in a microtubule-dependent way. 46771 pfam06887: Protein of unknown function (DUF1265). This family represents a conserved region approximately 50 residues long within a number of proteins of unknown function that seem to be restricted to C. elegans. 46772 pfam06888: Putative Phosphatase. This family contains a number of putative eukaryotic acid phosphatases. Some family members represent the products of the PSI14 phosphatase family in Lycopersicon esculentum (Tomato). 46773 pfam06889: Protein of unknown function (DUF1266). This family consists of several hypothetical bacterial proteins of around 235 residues in length. Members of this family seem to be found exclusively in the Enterobacteria Salmonella typhimurium and Escherichia coli. The function of this family is unknown. 46774 pfam06890: Bacteriophage Mu Gp45 protein. This family consists of Bacteriophage Mu Gp45 related proteins from both phages and bacteria. 
The function of this family is unknown although it has been suggested that family members may be involved in baseplate assembly. 46775 pfam06891: P2 phage tail completion protein R (GpR). This family consists of P2 phage tail completion protein R (GpR) like sequences. GpR is thought to be a tail completion protein which is essential for stable head joining. 46776 pfam06892: Phage regulatory protein CII (CP76). This family consists of several phage regulatory protein CII (CP76) sequences which are thought to be DNA binding proteins which are involved in the establishment of lysogeny. 46777 pfam06893: Bacteriophage Mu P protein. This family consists of Bacteriophage Mu P proteins and related sequences. The function of this family is unknown. 46778 pfam06894: Bacteriophage lambda minor tail protein (GpG). This family consists of Bacteriophage lambda minor tail protein G and related sequences. The role of GpG in tail assembly is not known. 46779 pfam06895: Protein of unknown function (DUF1267). This family consists of several Lactococcus lactis and Lactococcus phage proteins of around 74 residues in length. The function of this family is unknown. 46780 pfam06896: Protein of unknown function (DUF1268). This family consists of several bacterial and phage proteins of around 115 residues in length. The function of this family is unknown. 46781 pfam06897: Protein of unknown function (DUF1269). This family consists of several bacterial and archaeal proteins of around 200 residues in length. The function of this family is unknown. 46782 pfam06898: Putative stage IV sporulation protein YqfD. This family consists of several putative bacterial stage IV sporulation (SpoIV) proteins. YqfD of Bacillus subtilis is known to be essential for efficient sporulation although its exact function is unknown. 46783 pfam06899: WzyE protein. This family consists of several WzyE proteins which appear to be specific to Enterobacteria. 
Members of this family are described as putative ECA polymerases; this has been found to be incorrect. The function of this family is unknown. 46784 pfam06900: Protein of unknown function (DUF1270). This family consists of several hypothetical Staphylococcus aureus and phage proteins of 53 residues in length. The function of this family is unknown. 46785 pfam06901: RTX iron-regulated protein FrpC. This family consists of several RTX iron-regulated FrpC proteins which appear to be found exclusively in Neisseria meningitidis. FrpC has been shown to be related to the RTX family of bacterial cytotoxins. FrpC is found in the meningococcal outer membrane. The function of this family is unknown although it is thought to be a virulence factor. 46786 pfam06902: Protein of unknown function (DUF1271). This family consists of a number of hypothetical bacterial proteins of around 70 residues in length. Members of this family contain three highly conserved cysteine residues. The function of this family is unknown. 46787 pfam06903: VirK protein. This family consists of several bacterial VirK proteins of around 145 residues in length. The function of this family is unknown. 46788 pfam06904: Extensin-like protein C-terminus. This family represents the C-terminus (approx. 120 residues) of a number of bacterial extensin-like proteins. Extensins are cell wall glycoproteins normally associated with plants, where they strengthen the cell wall in response to mechanical stress. Note that many members of this family are hypothetical. 46789 pfam06905: Fas apoptotic inhibitory molecule (FAIM). This family consists of several fas apoptotic inhibitory molecule (FAIM) proteins. FAIM expression is upregulated in B cells by anti-Ig treatment that induces Fas-resistance, and overexpression of FAIM diminishes sensitivity to Fas-mediated apoptosis of B and non-B cell lines. 
FAIM is highly evolutionarily conserved and is widely expressed in murine tissues, suggesting that FAIM plays an important role in cellular physiology. 46790 pfam06906: Protein of unknown function (DUF1272). This family consists of several hypothetical bacterial proteins of around 80 residues in length. This family contains a number of conserved cysteine residues and its function is unknown. 46791 pfam06907: Latexin. This family consists of several animal specific latexin proteins. Latexin is a carboxypeptidase A inhibitor and is expressed in a cell type-specific manner in both central and peripheral nervous systems in the rat. 46792 pfam06908: Protein of unknown function (DUF1273). This family consists of several hypothetical bacterial proteins of around 180 residues in length. The function of this family is unknown. 46793 pfam06909: Protein of unknown function (DUF1274). This family consists of several Chordopoxvirus proteins of around 160 residues in length. The function of this family is unknown. 46794 pfam06910: Male enhanced antigen 1 (MEA1). This family consists of several mammalian male enhanced antigen 1 (MEA1) proteins. The Mea-1 gene is found to be localised in primary and secondary spermatocytes and spermatids, but the protein products are detected only in spermatids. Intensive transcription of the Mea-1 gene and specific localisation of the gene product suggest that Mea-1 may play an important role in the late stage of spermatogenesis. 46795 pfam06911: Senescence-associated protein. This family contains a number of plant senescence-associated proteins of approximately 450 residues in length. In Hemerocallis, petals have a genetically based program that leads to senescence and cell death approximately 24 hours after the flower opens, and it is believed that senescence proteins produced around that time have a role in this program. 46796 pfam06912: Protein of unknown function (DUF1275). 
This family consists of several hypothetical bacterial proteins of around 200 residues in length. The function of this family is unknown although a few members are thought to be membrane proteins. 46797 pfam06913: Protein of unknown function (DUF1276). This family consists of several ESAT-6 like proteins from Mycobacterium leprae and Mycobacterium tuberculosis. Members of this family are around 100 residues in length and their function is unknown. 46798 pfam06914: Protein of unknown function (DUF1277). This family contains a number of hypothetical proteins of unknown function approximately 350 residues long. These are of bacterial and viral origin. 46799 pfam06915: Protein of unknown function (DUF1278). This family consists of several hypothetical plant specific proteins of around 150 residues in length. Members of this family contain several conserved cysteine residues. The function of the family is unknown. 46800 pfam06916: Protein of unknown function (DUF1279). This family represents the C-terminus (approx. 120 residues) of a number of eukaryotic proteins of unknown function. 46801 pfam06917: Periplasmic pectate lyase. This family consists of several Enterobacterial periplasmic pectate lyase proteins (EC:4.2.2.2). A major virulence determinant of the plant-pathogenic enterobacterium Erwinia chrysanthemi is the production of pectate lyase enzymes that degrade plant cell walls. 46802 pfam06918: Protein of unknown function (DUF1280). This family represents a conserved region approximately 200 residues long within a number of proteins of unknown function that seem to be specific to C. elegans. 46803 pfam06919: Phage Gp30.7 protein. This family consists of several phage Gp30.7 proteins of 121 residues in length. Family members seem to be exclusively from the T4-like viruses. The function of this family is unknown. 46804 pfam06920: Dedicator of cytokinesis. 
This family represents a conserved region approximately 200 residues long within a number of eukaryotic dedicator of cytokinesis proteins. These are potential guanine nucleotide exchange factors, which activate some small GTPases by exchanging bound GDP for free GTP. 46805 pfam06921: VIRB2 type IV secretion protein. This family consists of several VIRB2 type IV secretion proteins. The virB2 gene encodes a putative type IV secretion system and is known to be a pathogenicity factor in Bartonella species. 46806 pfam06922: Citrus tristeza virus P13 protein. This family consists of several Citrus tristeza virus (CTV) P13 13-kDa proteins. Citrus tristeza virus (CTV), a member of the closterovirus group, is one of the more complex single-stranded RNA viruses. The function of this family is unknown. 46807 pfam06923: Glucitol operon activator protein (GutM). This family consists of several glucitol operon activator (GutM) proteins. Expression of the glucitol (gut) operon in Escherichia coli is regulated by an unusual, complex system which consists of an activator (encoded by the gutM gene) and a repressor (encoded by the gutR gene) in addition to the cAMP-CRP complex (CRP, cAMP receptor protein). Synthesis of the mRNA, which initiates at the promoter specific to the gutR gene, occurs within the gutM gene. Expressional control of the gut operon appears to occur as a consequence of the antagonistic action of the products of the autogenously regulated gutM and gutR genes. 46808 pfam06924: Protein of unknown function (DUF1281). This family consists of several hypothetical enterobacterial proteins of around 170 residues in length. Members of this family are found in Escherichia coli, Salmonella typhimurium and Shigella species. The function of this family is unknown. 46809 pfam06925: Monogalactosyldiacylglycerol (MGDG) synthase. 
This family represents a conserved region of approximately 180 residues within plant and bacterial monogalactosyldiacylglycerol (MGDG) synthase (EC:2.4.1.46). In Arabidopsis, there are two types of MGDG synthase which differ in their N-terminal portion: type A and type B. 46810 pfam06926: Putative replisome organiser protein C-terminus. This family represents the C-terminus (approximately 100 residues) of a putative replisome organiser protein in Lactococcus bacteriophages. 46811 pfam06927: Splicing factor 3B subunit 1 (Spliceosome associated protein 155). This family represents a conserved region approximately 100 residues long within eukaryotic splicing factor 3B subunit 1 (Spliceosome associated protein 155). This is positioned near the spliceosome catalytic centre, and contacts pre-mRNA on both sides of the branch site early in spliceosome assembly. It is phosphorylated, so is an example of a protein modification regulated with splicing catalysis. 46812 pfam06928: Mast C-terminus. This family represents the C-terminus (approximately 150 residues) of Mast, a microtubule-associated protein. It has been suggested that Mast plays an essential role in centrosome separation and organisation of the bipolar mitotic spindle. Mast mutations in neuroblasts cause them to be highly polyploid and show severe mitotic abnormalities. Note that many family members are hypothetical proteins. 46813 pfam06929: Rotavirus VP3 protein. This family consists of several Rotavirus specific VP3 proteins. VP3 is known to be a viral guanylyltransferase and is thought to possess methyltransferase activity; VP3 is therefore a predicted multifunctional capping enzyme. 46814 pfam06930: Protein of unknown function (DUF1282). This family consists of several hypothetical proteins of around 200 residues in length. The function of this family is unknown although a number of family members are thought to be putative membrane proteins. 46815 pfam06931: Mastadenovirus E4 ORF3 protein. 
This family consists of several Mastadenovirus E4 ORF3 proteins. Early proteins E4 ORF3 and E4 ORF6 have complementary functions during viral infection. Both proteins facilitate efficient viral DNA replication, late protein expression, and prevention of concatenation of viral genomes. A unique function of E4 ORF3 is the reorganisation of nuclear structures known as PML oncogenic domains (PODs). The function of these domains is unclear, but PODs have been implicated in a number of important cellular processes, including transcriptional regulation, apoptosis, transformation, and response to interferon. 46816 pfam06932: Protein of unknown function (DUF1283). This family consists of several hypothetical proteins of around 115 residues in length which seem to be specific to Enterobacteria. The function of the family is unknown. 46817 pfam06933: Special lobe-specific silk protein SSP160. This family consists of several special lobe-specific silk protein SSP160 sequences which appear to be specific to Chironomus (Midge) species. 46818 pfam06934: Fatty acid cis/trans isomerase (CTI). This family consists of several fatty acid cis/trans isomerase proteins which appear to be found exclusively in bacteria of the orders Vibrionales and Pseudomonadales. Cis/trans isomerase (CTI) catalyses the cis-trans isomerisation of esterified fatty acids in phospholipids, mainly cis-oleic acid (C(16:1,9)) and cis-vaccenic acid (C(18:1,11)), in response to solvents. The CTI protein has been shown to be involved in solvent resistance in Pseudomonas putida. 46819 pfam06935: Protein of unknown function (DUF1284). This family consists of several hypothetical bacterial and archaeal proteins of around 130 residues in length. The function of this family is unknown, although it is thought that they may be iron-sulphur binding proteins. 46820 pfam06936: Selenoprotein S (SelS). This family consists of several mammalian selenoprotein S (SelS) sequences. 
SelS is a plasma membrane protein and is present in a variety of tissues and cell types. The function of this family is unknown. 46821 pfam06937: EURL protein. This family consists of several animal EURL proteins. EURL is preferentially expressed in chick retinal precursor cells as well as in the anterior epithelial cells of the lens at early stages of development. EURL transcripts are found primarily in the peripheral dorsal retina, i.e., the most undifferentiated part of the dorsal retina. EURL transcripts are also detected in the lens at stage 18 and remain abundant in the proliferating epithelial cells of the lens until at least day 11. The distribution pattern of EURL in the developing retina and lens suggests a role before the events leading to cell determination and differentiation. 46822 pfam06938: Protein of unknown function (DUF1285). This family consists of several hypothetical bacterial proteins of around 200 residues in length. The function of this family is unknown. 46823 pfam06939: Protein of unknown function (DUF1286). This family consists of several hypothetical archaeal proteins of around 120 residues in length. All members of this family seem to be Sulfolobus species specific. The function of this family is unknown. 46824 pfam06940: Protein of unknown function (DUF1287). This family consists of several hypothetical bacterial proteins of around 200 residues in length. The function of this family is unknown. 46825 pfam06941: 5' nucleotidase, deoxy (Pyrimidine), cytosolic type C protein (NT5C). This family consists of several 5' nucleotidase, deoxy (Pyrimidine), cytosolic type C (NT5C) proteins. 5'(3')-Deoxyribonucleotidase is a ubiquitous enzyme in mammalian cells whose physiological function is not known. 46826 pfam06942: GlpM protein. This family consists of several bacterial GlpM membrane proteins. GlpM is a hydrophobic protein containing 109 amino acids. 
It is thought that GlpM may play a role in alginate biosynthesis in Pseudomonas aeruginosa. 46827 pfam06943: LSD1 zinc finger. This family consists of several plant-specific LSD1 zinc finger domains. Arabidopsis lsd1 mutants are hyper-responsive to cell death initiators and fail to limit the extent of cell death. Superoxide is a necessary and sufficient signal for cell death propagation. LSD1 monitors a superoxide-dependent signal and negatively regulates a plant cell death pathway. LSD1 protein contains three zinc finger domains, defined by CxxCxRxxLMYxxGASxVxCxxC. It has been suggested that LSD1 defines a zinc finger protein subclass and that LSD1 regulates transcription, via either repression of a pro-death pathway or activation of an anti-death pathway, in response to signals emanating from cells undergoing pathogen-induced hypersensitive cell death. 46828 pfam06944: Protein of unknown function (DUF1288). This family consists of several archaeal proteins of around 150 residues in length. The function of this family is unknown. 46829 pfam06945: Protein of unknown function (DUF1289). This family consists of a number of hypothetical bacterial proteins. The aligned region spans around 56 residues and contains 4 highly conserved cysteine residues towards the N-terminus. The function of this family is unknown. 46830 pfam06946: Phage holin. This family consists of several Listeria bacteriophage holin proteins and related bacterial sequences. Holins are a diverse family of proteins that cause bacterial membrane lysis during late-protein synthesis. It is thought that the temporal precision of holin-mediated lysis may occur through the build-up of a holin oligomer which causes the lysis. 46831 pfam06947: Protein of unknown function (DUF1290). This family consists of several bacterial small basic proteins of around 100 residues in length. The function of this family is unknown. 46832 pfam06948: Protein of unknown function (DUF1291). 
This family consists of several hypothetical archaeal and one bacterial protein of around 115 residues in length. The function of this family is unknown. 46833 pfam06949: Protein of unknown function (DUF1292). This family consists of several hypothetical bacterial proteins of around 90 residues in length. The function of this family is unknown. 46834 pfam06950: Protein of unknown function (DUF1293). This family consists of several bacterial and phage proteins of around 115 residues in length. The function of this family is unknown. 46835 pfam06951: Group XII secretory phospholipase A2 precursor (PLA2G12). This family consists of several group XII secretory phospholipase A2 precursor (PLA2G12) (EC:3.1.1.4) proteins. Group XII and group V PLA(2)s are thought to participate in helper T cell immune response through release of immediate second signals and generation of downstream eicosanoids. 46836 pfam06952: PsiA protein. This family consists of several Enterobacterial PsiA proteins. The function of PsiA is unknown although it is thought that it may affect the generation of an SOS signal in Escherichia coli. 46837 pfam06953: Arsenical resistance operon trans-acting repressor ArsD. This family consists of several bacterial arsenical resistance operon trans-acting repressor ArsD proteins. ArsD is a trans-acting repressor of the arsRDABC operon that confers resistance to arsenicals and antimonials in Escherichia coli. It possesses two pairs of vicinal cysteine residues, Cys(12)-Cys(13) and Cys(112)-Cys(113), that potentially form separate binding sites for the metalloids that trigger dissociation of ArsD from the operon. However, as a homodimer it has four vicinal cysteine pairs. 46838 pfam06954: Resistin. This family consists of several mammalian resistin proteins. Resistin is a 12.5-kDa cysteine-rich secreted polypeptide first reported from rodent adipocytes. It belongs to a multigene family termed RELMs or FIZZ proteins. 
Plasma resistin levels are significantly increased in both genetically susceptible and high-fat-diet-induced obese mice. Immunoneutralisation of resistin improves hyperglycemia and insulin resistance in high-fat-diet-induced obese mice, while administration of recombinant resistin impairs glucose tolerance and insulin action in normal mice. It has been demonstrated that increases in circulating resistin levels markedly stimulate glucose production in the presence of fixed physiological insulin levels, whereas insulin suppressed resistin expression. It has been suggested that resistin could be a link between obesity and type 2 diabetes. 46839 pfam06955: Xyloglucan endo-transglycosylase (XET) C-terminus. This family represents the C-terminus (approximately 60 residues) of plant xyloglucan endo-transglycosylase (XET). Xyloglucan is the predominant hemicellulose in the cell walls of most dicotyledons. With cellulose, it forms a network that strengthens the cell wall. XET catalyses the splitting of xyloglucan chains and the linking of the newly generated reducing end to the non-reducing end of another xyloglucan chain, thereby loosening the cell wall. Note that all family members contain the pfam00722 domain. 46840 pfam06956: Regulator of RNA terminal phosphate cyclase. RtcR is a sigma54-dependent enhancer binding protein, which activates transcription of the rtcBA operon. The product of the rtcA gene is an RNA 3'-terminal phosphate cyclase. This domain is found at the N terminus of the RtcR sequence. RtcR, and other sigma54-dependent activators, contain pfam00158 in the central region of the protein sequence. 46841 pfam06957: Coatomer (COPI) alpha subunit C-terminus. This family represents the C-terminus (approximately 500 residues) of the eukaryotic coatomer alpha subunit. Coatomer (COPI) is a large cytosolic protein complex which forms a coat around vesicles budding from the Golgi apparatus. 
Such coatomer-coated vesicles have been proposed to play a role in many distinct steps of intracellular transport. Note that many family members also contain the pfam04053 domain. 46842 pfam06958: S-type Pyocin. This family represents a conserved region approximately 180 residues long within bacterial S-type pyocins. Pyocins are polypeptide toxins produced by, and active against, bacteria. S-type pyocins cause cell death by DNA breakdown due to endonuclease activity. 46843 pfam06959: RecQ helicase protein-like 5 (RecQ5). This family represents a conserved region approximately 200 residues long within eukaryotic RecQ helicase protein-like 5 (RecQ5). The RecQ helicases have been implicated in DNA repair and recombination, and RecQ5 may have an important role in DNA metabolism. 46844 pfam06960: Myo-inositol catabolism protein N-terminus. This family represents the N-terminus (approximately 120 residues) of bacterial myo-inositol catabolism proteins. These are involved in the myo-inositol catabolism pathway, and are required for growth on myo-inositol in Rhizobium leguminosarum bv. viciae. 46845 pfam06961: Protein of unknown function (DUF1294). This family includes a number of hypothetical bacterial and archaeal proteins of unknown function. 46846 pfam06962: Putative rRNA methylase. This family contains a number of putative rRNA methylases. Note that many family members are hypothetical proteins. 46847 pfam06963: Ferroportin1 (FPN1). This family represents a conserved region approximately 100 residues long within eukaryotic Ferroportin1 (FPN1), a protein that may play a role in iron export from the cell. This family may represent a number of transmembrane regions in Ferroportin1. 46848 pfam06964: Alpha-L-arabinofuranosidase C-terminus. This family represents the C-terminus (approximately 200 residues) of bacterial and eukaryotic alpha-L-arabinofuranosidase (EC:3.2.1.55). 
This catalyses the hydrolysis of nonreducing terminal alpha-L-arabinofuranosidic linkages in L-arabinose-containing polysaccharides. 46849 pfam06965: Na+/H+ antiporter 1. This family contains a number of bacterial Na+/H+ antiporter 1 proteins. These are integral membrane proteins that catalyse the exchange of H+ for Na+ in a manner that is highly dependent on the pH. 46850 pfam06966: Protein of unknown function (DUF1295). This family contains a number of bacterial and eukaryotic proteins of unknown function that are approximately 300 residues long. 46851 pfam06967: Mo-dependent nitrogenase C-terminus. This family represents the C-terminus (approximately 80 residues) of a number of bacterial Mo-dependent nitrogenases. These are involved in nitrogen fixation in cyanobacteria. Note that many family members are hypothetical proteins. 46852 pfam06968: Biotin and Thiamin Synthesis associated domain. Biotin synthase (BioB), EC:2.8.1.6, catalyses the last step of the biotin biosynthetic pathway. The reaction consists of the introduction of a sulphur atom into dethiobiotin. BioB functions as a homodimer. Thiamin synthesis is a complex process involving at least six gene products (ThiFSGH, ThiI and ThiJ). Two of the proteins required for the biosynthesis of the thiazole moiety of thiamine (vitamin B(1)) are ThiG and ThiH (this family), which form a heterodimer. Both of these reactions are thought to involve the binding of co-factors, and both enzymes function as dimers. This domain therefore may be involved in co-factor binding or dimerisation (Finn, RD personal observation). 46853 pfam06969: HemN C-terminal region. Members of this family are all oxygen-independent coproporphyrinogen-III oxidases (HemN). This enzyme catalyses the oxygen-independent conversion of coproporphyrinogen-III to protoporphyrinogen-IX, one of the last steps in haem biosynthesis. 
The function of this domain is unclear, but comparison to other proteins containing a radical SAM domain (pfam04055) suggests it may be a substrate binding domain. 46854 pfam06970: Replication initiator protein A (RepA) N-terminus. This family represents the N-terminus (approximately 80 residues) of replication initiator protein A (RepA), a DNA replication initiator in plasmids. Most proteins in this family are bacterial, but archaeal and eukaryotic members are also included. 46855 pfam06971: Putative DNA-binding protein C-terminus. This family represents the C-terminus (approximately 30 residues) of a number of putative bacterial DNA-binding proteins. 46856 pfam06972: Protein of unknown function (DUF1296). This family represents a conserved region approximately 60 residues long within a number of plant proteins of unknown function. 46857 pfam06973: Protein of unknown function (DUF1297). This family represents the C-terminus (approximately 200 residues) of a number of archaeal proteins of unknown function. One member is annotated as being a possible carboligase enzyme. 46858 pfam06974: Protein of unknown function (DUF1298). This family represents the C-terminus (approximately 170 residues) of a number of hypothetical plant proteins of unknown function. 46859 pfam06975: Protein of unknown function (DUF1299). This family represents a conserved region approximately 50 residues long within a number of proteins of unknown function that seem to be specific to Arabidopsis thaliana. Note that many family members contain multiple copies of this region. 46860 pfam06976: Protein of unknown function (DUF1300). This family represents a conserved region approximately 80 residues long within a number of proteins of unknown function that seem to be specific to C. elegans. Some family members contain more than one copy of this region. 46861 pfam06977: SdiA-regulated. 
This family represents a conserved region approximately 100 residues long within a number of hypothetical bacterial proteins that may be regulated by SdiA, a member of the LuxR family of transcriptional regulators. Some family members contain the pfam01436 repeat. 46862 pfam06978: Ribonucleases P/MRP protein subunit POP1. This family represents a conserved region approximately 150 residues long located towards the N-terminus of the POP1 subunit that is common to both the RNase MRP and RNase P ribonucleoproteins (EC:3.1.26.5). These RNA-containing enzymes generate mature tRNA molecules by cleaving their 5' ends. 46863 pfam06979: Protein of unknown function (DUF1301). This family contains a number of eukaryotic proteins of unknown function that are approximately 160 residues long. 46864 pfam06980: Protein of unknown function (DUF1302). This family contains a number of hypothetical bacterial proteins of unknown function that are approximately 600 residues long. Most family members seem to be from Pseudomonas. 46865 pfam06981: Flp pilus assembly protein CpaB. This family represents a conserved region approximately 120 residues long within the bacterial Flp pilus assembly protein CpaB. 46866 pfam06982: Anaerobic glycerol-3-phosphate dehydrogenase subunit B (GlpB). This family represents the B subunit of anaerobic glycerol-3-phosphate dehydrogenase (EC:1.1.99.5) in bacteria and archaea. Glycerol-3-phosphate dehydrogenase converts glycerol-3-phosphate to dihydroxyacetone phosphate. In E. coli, GlpB is thought to mediate electron transfer from the enzyme to the terminal electron acceptor, fumarate. 46867 pfam06983: 3-demethylubiquinone-9 3-methyltransferase. This family represents a conserved region approximately 100 residues long within a number of bacterial and archaeal 3-demethylubiquinone-9 3-methyltransferases (EC:2.1.1.64). Note that some family members contain more than one copy of this region, and that many members are hypothetical proteins. 
46868 pfam06984: Mitochondrial 39-S ribosomal protein L47 (MRP-L47). This family represents the N-terminal region (approximately 8 residues) of the eukaryotic mitochondrial 39-S ribosomal protein L47 (MRP-L47). Mitochondrial ribosomal proteins (MRPs) are the counterparts of the cytoplasmic ribosomal proteins, in that they fulfil similar functions in protein biosynthesis. However, they are distinct in number, features and primary structure. 46869 pfam06985: Heterokaryon incompatibility protein (HET). This family represents a conserved region approximately 150 residues long within various heterokaryon incompatibility proteins that seem to be restricted to ascomycete fungi. Genetic differences in specific het genes prevent a viable heterokaryotic fungal cell from being formed by the fusion of filaments from two different wild-type strains. Many family members also contain the pfam00400 repeat and the pfam05729 domain. 46870 pfam06986: Mating pair stabilisation protein TraN. This family represents a short conserved region (approximately 40 residues) within TraN bacterial mating pair stabilisation proteins. TraN is thought to be required for the formation of stable mating aggregates during F-directed conjugation. This region contains five conserved cysteine residues. 46871 pfam06987: Protein of unknown function (DUF1303). This family consists of several highly conserved Orthopoxvirus proteins known as the C5L protein in Variola virus. The function of this family is unknown. 46872 pfam06988: NifT/FixU protein. This family consists of several NifT and FixU bacterial proteins. The function of NifT is unknown. It is thought that the protein may be involved in biosynthesis of the FeMo cofactor of nitrogenase, although perturbation of nifT expression in K. pneumoniae has only a limited effect on nitrogen fixation. 46873 pfam06989: BAALC N-terminus. This family represents the N-terminal region of the mammalian BAALC proteins. 
BAALC (brain and acute leukaemia, cytoplasmic) is highly conserved among mammals but evidently absent from lower organisms. Two isoforms are specifically expressed in neuroectoderm-derived tissues, but not in tumours or cancer cell lines of non-neural tissue origin. It has been shown that blasts from a subset of patients with acute leukaemia greatly overexpress eight different BAALC transcripts, resulting in five protein isoforms. Among patients with acute myeloid leukaemia, those overexpressing BAALC show distinctly poor prognosis, pointing to a key role of the BAALC products in leukaemia. It has been suggested that BAALC is a gene implicated in both neuroectodermal and hematopoietic cell functions. 46874 pfam06990: Galactose-3-O-sulfotransferase. This family consists of several mammalian galactose-3-O-sulfotransferase proteins. Gal-3-O-sulfotransferase is thought to play a critical role in 3'-sulfation of N-acetyllactosamine in both O- and N-glycans. 46875 pfam06991: Micro-fibrillar-associated protein 1 C-terminus. This family represents the C-terminus (approximately 300 residues) of eukaryotic micro-fibrillar-associated protein 1, which is a component of elastin-associated microfibrils in the extracellular matrix. 46876 pfam06992: Replication protein P. This family consists of several Bacteriophage lambda replication protein P like proteins. The bacteriophage lambda P protein promotes replication of the phage chromosome by recruiting a key component of the cellular replication machinery to the viral origin. Specifically, P protein delivers one or more molecules of Escherichia coli DnaB helicase to a nucleoprotein structure formed by the lambda O initiator at the lambda replication origin. 46877 pfam06993: Protein of unknown function (DUF1304). This family consists of several hypothetical bacterial proteins of around 120 residues in length. The function of this family is unknown. 46878 pfam06994: Involucrin. 
This family represents a conserved region approximately 60 residues long, multiple copies of which are found within eukaryotic involucrin, and which is rich in glutamine and glutamic acid residues. Involucrin forms part of the insoluble cornified cell envelope (a specialised protective barrier) of stratified squamous epithelia. Members of this family seem to be restricted to mammals. 46879 pfam06995: Phage P2 GpU. This family consists of several bacterial and phage proteins of around 130 residues in length which seem to be related to the bacteriophage P2 GpU protein which is thought to be involved in tail assembly. 46880 pfam06996: Protein of unknown function (DUF1305). This family consists of several hypothetical bacterial proteins of around 300 residues in length. The function of this family is unknown although one member from Salmonella enterica is thought to be involved in virulence. 46881 pfam06997: Protein of unknown function (DUF1306). This family consists of a number of bacterial and phage proteins of around 250 residues in length. The family contains several hypothetical proteins and the Gp17 protein from bacteriophage A118. The function of this family is unknown. 46882 pfam06998: Protein of unknown function (DUF1307). This family consists of several hypothetical bacterial proteins of around 150 residues in length. Some family members are described as putative lipoproteins but the function of the family is unknown. 46883 pfam06999: Sucrase/ferredoxin-like. This family contains a number of bacterial and eukaryotic proteins approximately 400 residues long that resemble ferredoxin and appear to have sucrolytic activity. 46884 pfam07000: Protein of unknown function (DUF1308). This family consists of several hypothetical eukaryotic sequences of around 400 residues in length. The function of this family is unknown. 46885 pfam07001: BAT2 N-terminus. This family represents the N-terminus (approximately 200 residues) of the proline-rich protein BAT2. 
BAT2 is similar to other proteins with large proline-rich domains, such as some nuclear proteins, collagens, elastin, and synapsin. 46886 pfam07002: Copine. This family represents a conserved region approximately 180 residues long within eukaryotic copines. Copines are Ca(2+)-dependent phospholipid-binding proteins that are thought to be involved in membrane-trafficking, and may also be involved in cell division and growth. 46887 pfam07003: RofA transcriptional regulator. This family contains a number of bacterial RofA transcriptional regulators that seem to be largely restricted to streptococci. These proteins have been shown to regulate the expression of important bacterial adhesins. 46888 pfam07004: Protein of unknown function (DUF1309). This family represents a short conserved region (approximately 30 residues long) that is repeated in several eukaryotic proteins of unknown function. One member of this family is annotated as possibly being related to alpha collagen. 46889 pfam07005: Type III effector Hrp-dependent outer proteins (Hop). This family represents a conserved region approximately 200 residues long within bacterial type III effector Hrp-dependent outer proteins (Hop). These form part of a secretion system in certain Gram-negative bacterial pathogens of plants and animals that allows them to inject virulence effector proteins into host cells. Many members of this family are hypothetical proteins. 46890 pfam07006: Protein of unknown function (DUF1310). This family consists of several hypothetical proteins of around 125 residues in length. Members of this family seem to be specific to Listeria and Streptococcus species. The function of this family is unknown. 46891 pfam07007: Protein of unknown function (DUF1311). This family consists of several bacterial proteins of around 120 residues in length. Members of this family contain four highly conserved cysteine residues. The function of this family is unknown. 
46892 pfam07008: Polyadenylate binding protein-interacting protein 2 (Paip2). This family consists of several eukaryotic polyadenylate binding protein-interacting protein 2 (Paip2) sequences. The cap structure and the poly(A) tail of eukaryotic mRNAs act synergistically to enhance translation. This effect is mediated by a direct interaction of eukaryotic initiation factor 4G and poly(A) binding protein (PABP), which brings about circularisation of the mRNA. There are two PABP-interacting proteins: Paip1 stimulates translation, whereas Paip2 competes with Paip1 for binding to PABP, repressing translation. 46893 pfam07009: Protein of unknown function (DUF1312). This family consists of several bacterial proteins of around 120 residues in length. The function of this family is unknown. 46894 pfam07010: Endomucin. This family consists of several mammalian endomucin proteins. Endomucin is an early endothelial-specific antigen that is also expressed on putative hematopoietic progenitor cells. 46895 pfam07011: Protein of unknown function (DUF1313). This family consists of several hypothetical plant proteins of around 100 residues in length. The function of this family is unknown. 46896 pfam07012: Curlin associated repeat. This family consists of several bacterial repeats of around 30 residues in length. These repeats are often found in multiple copies in the curlin proteins CsgA and CsgB. Curli fibres are thin aggregative surface fibres, connected with adhesion, which bind laminin, fibronectin, plasminogen, human contact phase proteins, and major histocompatibility complex (MHC) class I molecules. Curli fibres are coded for by the csg gene cluster, which is comprised of two divergently transcribed operons. One operon encodes the csgB, csgA, and csgC genes, while the other encodes csgD, csgE, csgF, and csgG. 
The assembly of the fibres is unique and involves extracellular self-assembly of the curlin subunit (CsgA), dependent on a specific nucleator protein (CsgB). CsgD is a transcriptional activator essential for expression of the two curli fibre operons, and CsgG is an outer membrane lipoprotein involved in extracellular stabilisation of CsgA and CsgB. 46897 pfam07013: Protein of unknown function (DUF1314). This family consists of several Alphaherpesvirus proteins of around 200 residues in length. The function of this family is unknown. 46898 pfam07014: Hs1pro-1 protein C-terminus. This family represents the C-terminus (approximately 270 residues) of a number of plant Hs1pro-1 proteins, which are believed to confer nematode resistance. 46899 pfam07015: VirC1 protein. This family consists of several bacterial VirC1 proteins. In Agrobacterium tumefaciens, a cis-active 24-base-pair sequence adjacent to the right border of the T-DNA, called overdrive, stimulates tumour formation by increasing the level of T-DNA processing. It is thought that the virC operon which enhances T-DNA processing probably does so because the VirC1 protein interacts with overdrive. It has now been shown that the virC1 gene product binds to overdrive but not to the right border of T-DNA. 46900 pfam07016: Cysteine-rich, acidic integral membrane protein precursor (CRAM) repeat. This family consists of several 24 residue repeats from the Trypanosoma brucei cysteine-rich, acidic integral membrane protein precursor (CRAM). CRAM is concentrated in the flagellar pocket, an invagination of the cell surface of the trypanosome where endocytosis has been documented. 46901 pfam07017: Antimicrobial peptide resistance and lipid A acylation protein PagP. This family consists of several bacterial antimicrobial peptide resistance and lipid A acylation (PagP) proteins. The bacterial outer membrane enzyme PagP transfers a palmitate chain from a phospholipid to lipid A. 
In a number of pathogenic Gram-negative bacteria, PagP confers resistance to certain cationic antimicrobial peptides produced during the host innate immune response. 46902 pfam07018: SepL/SsaL protein. This family consists of several bacterial SepL and SsaL proteins. SepL plays an essential role in the infection process of enterohemorrhagic Escherichia coli and is thought to be responsible for the secretion of EspA, EspD, and EspB. SsaL of Salmonella typhimurium is thought to be a component of the type III secretion system. 46903 pfam07019: Rab5-interacting protein (Rab5ip). This family consists of several Rab5-interacting protein (RIP5 or Rab5ip) sequences. The ras-related GTPase rab5 is rate-limiting for homotypic early endosome fusion. Rab5ip represents a novel rab5 interacting protein that may function on endocytic vesicles as a receptor for rab5-GDP and participate in the activation of rab5. 46904 pfam07020: Orthopoxvirus C10L protein. This family consists of several Orthopoxvirus C10L proteins. The C10L protein can play an important role in vaccinia virus evasion of the host immune system. This may involve blockade of IL-1 receptors by the C10L protein, a homologue of the IL-1 Ra. 46905 pfam07021: Methionine biosynthesis protein MetW. This family consists of several bacterial and one archaeal methionine biosynthesis MetW proteins. Biosynthesis of methionine from homoserine in Pseudomonas putida takes place in three steps. The first step is the acylation of homoserine to yield an acyl-L-homoserine. This reaction is catalysed by the products of the metXW genes and is equivalent to the first step in enterobacteria, gram-positive bacteria and fungi, except that in these microorganisms the reaction is catalysed by a single polypeptide (the product of the metA gene in Escherichia coli and the met5 gene product in Neurospora crassa). 
In Pseudomonas putida, as in gram-positive bacteria and certain fungi, the second and third steps are a direct sulfhydrylation that converts the O-acyl-L-homoserine into homocysteine and further methylation to yield methionine. The latter reaction can be mediated by either of the two methionine synthetases present in the cells. 46906 pfam07022: Bacteriophage CI repressor protein. This family consists of several phage CI repressor proteins and related bacterial sequences. The CI repressor is known to function as a transcriptional switch, determining whether transcription is lytic or lysogenic. 46907 pfam07023: Protein of unknown function (DUF1315). This family consists of several bacterial proteins of around 90 residues in length. The function of this family is unknown. 46908 pfam07024: ImpE protein. This family consists of several bacterial proteins including ImpE from Rhizobium leguminosarum. It has been suggested that the imp locus is involved in the secretion to the environment of proteins, including periplasmic RbsB protein, that cause blocking of infection specifically in pea plants. The exact function of this family is unknown. 46909 pfam07025: Protein of unknown function (DUF1316). This family consists of several hypothetical bacterial proteins of around 150 residues in length. The function of this family is unknown. 46910 pfam07026: Protein of unknown function (DUF1317). This family consists of several hypothetical bacterial and phage proteins of around 60 residues in length. The function of this family is unknown. 46911 pfam07027: Protein of unknown function (DUF1318). This family consists of several bacterial proteins of around 100 residues in length and is often known as YdbL. The function of this family is unknown. 46912 pfam07028: Protein of unknown function (DUF1319). This family contains a number of viral proteins of unknown function approximately 200 residues long. Family members seem to be restricted to badnaviruses. 
46913 pfam07029: CryBP1 protein. This family consists of several CryBP1 like proteins from Bacillus thuringiensis and Paenibacillus popilliae. Members of this family are thought to be involved in the overall toxicity of the bacteria to their hosts. 46914 pfam07030: Protein of unknown function (DUF1320). This family consists of both hypothetical bacterial and phage proteins of around 145 residues in length. The function of this family is unknown. 46915 pfam07031: Protein of unknown function (DUF1321). This family consists of several hypothetical bacterial proteins of around 170 residues in length. The function of this family is unknown. 46916 pfam07032: Protein of unknown function (DUF1322). This family consists of several hypothetical 9.4 kDa Borrelia burgdorferi (Lyme disease spirochete) proteins of around 78 residues in length. The function of this family is unknown. 46917 pfam07033: Orthopoxvirus B11R protein. This family consists of several Orthopoxvirus B11R proteins of around 70 residues in length. The function of this family is unknown. 46918 pfam07034: Origin recognition complex (ORC) subunit 3 N-terminus. This family represents the N-terminus (approximately 300 residues) of subunit 3 of the eukaryotic origin recognition complex (ORC). Origin recognition complex (ORC) is composed of six subunits that are essential for cell viability. They collectively bind to the autonomously replicating sequence (ARS) in a sequence-specific manner and lead to the chromatin loading of other replication factors that are essential for initiation of DNA replication. 46919 pfam07035: Colon cancer-associated protein Mic1-like. This family represents the C-terminus (approximately 160 residues) of a number of proteins that resemble colon cancer-associated protein Mic1. 46920 pfam07036: Starch synthase III. This family represents a conserved region approximately 160 residues long that is repeated multiple times in plant starch synthase III (EC:2.4.1.21). 
Starch synthases extend alpha-1,4 glucan chains by catalysing the transfer of the glucosyl moiety of ADP-Glc to the non-reducing end of a pre-existing alpha-1,4 glucan. SS-III is thought to be primarily involved in the synthesis of amylopectin rather than amylose. 46921 pfam07037: Protein of unknown function (DUF1323). This family consists of several hypothetical Enterobacterial proteins of around 120 residues in length. The function of this family is unknown. 46922 pfam07038: Protein of unknown function (DUF1324). This family consists of several Circovirus proteins of around 60 residues in length. The function of this family is unknown. 46923 pfam07039: Protein of unknown function (DUF1325). This family consists of several hypothetical eukaryotic proteins of around 300 residues in length. The function of this family is unknown. 46924 pfam07040: Protein of unknown function (DUF1326). This family consists of several hypothetical bacterial proteins which seem to be found exclusively in Rhizobium and Ralstonia species. Members of this family are typically around 210 residues in length and contain 5 highly conserved cysteine residues at their N-terminus. The function of this family is unknown. 46925 pfam07041: Protein of unknown function (DUF1327). This family consists of several hypothetical bacterial proteins of around 115 residues in length which seem to be specific to Escherichia coli. The function of this family is unknown. 46926 pfam07042: TrfA protein. This family consists of several bacterial TrfA proteins. The trfA operon of broad-host-range IncP plasmids is essential to activate the origin of vegetative replication in diverse species. The trfA operon encodes two ORFs. The first ORF is highly conserved and encodes a putative single-stranded DNA binding protein (Ssb). The second, trfA, contains two translational starts as in the IncP alpha plasmids, generating related polypeptides of 406 (TrfA1) and 282 (TrfA2) amino acids. 
TrfA2 is very similar to the IncP alpha product, whereas the N-terminal region of TrfA1 shows very little similarity to the equivalent region of IncP alpha TrfA1. This region has been implicated in the ability of IncP alpha plasmids to replicate efficiently in Pseudomonas aeruginosa. 46927 pfam07043: Protein of unknown function (DUF1328). This family consists of several hypothetical bacterial proteins of around 50 residues in length. The function of this family is unknown. 46928 pfam07044: Protein of unknown function (DUF1329). This family consists of several hypothetical bacterial proteins of around 475 residues in length. The majority of family members are from Pseudomonas species but the family also contains sequences from Shewanella oneidensis and Thauera aromatica. 46929 pfam07045: Protein of unknown function (DUF1330). This family consists of several hypothetical bacterial proteins of around 90 residues in length. The function of this family is unknown. 46930 pfam07046: Cytoplasmic repetitive antigen (CRA) like repeat. This family consists of several repeats of around 42 residues in length. These repeated sequences are found in multiple copies in Trypanosoma cruzi antigens; one member contains 23 copies of this repeat. 46931 pfam07047: Optic atrophy 3 protein (OPA3). This family consists of several optic atrophy 3 (OPA3) proteins. OPA3 deficiency causes type III 3-methylglutaconic aciduria (MGA) in humans. This disease manifests with early bilateral optic atrophy, spasticity, extrapyramidal dysfunction, ataxia, and cognitive deficits, but normal longevity. 46932 pfam07048: Protein of unknown function (DUF1331). This family consists of several Circovirus proteins of around 35 residues in length. Members of this family are described as ORF-10 proteins and their function is unknown. 46933 pfam07049: Protein of unknown function (DUF1332). This family consists of several hypothetical bacterial proteins of around 165 residues in length. 
The function of this family is unknown. 46934 pfam07050: Protein of unknown function (DUF1333). This family consists of several hypothetical bacterial proteins of around 145 residues in length. Members of this family appear to be specific to the Orders Bacillales and Lactobacillales. The function of this family is unknown. 46935 pfam07051: Ovarian carcinoma immunoreactive antigen (OCIA). This family consists of several ovarian carcinoma immunoreactive antigen (OCIA) and related eukaryotic sequences. The function of this family is unknown. 46936 pfam07052: Hepatocellular carcinoma-associated antigen 59. This family represents a conserved region approximately 100 residues long within mammalian hepatocellular carcinoma-associated antigen 59 and similar proteins. Family members are found in a variety of eukaryotes, mainly as hypothetical proteins. 46937 pfam07053: Protein of unknown function (DUF1334). This family contains a number of hypothetical archaeal, bacterial and eukaryotic proteins of unknown function. These may possibly be integral membrane proteins. 46938 pfam07054: Pericardin like repeat. This family consists of several repeated sequences of around 34 residues in length. This repeat is found in multiple copies in the Drosophila pericardin and other extracellular matrix proteins. 46939 pfam07055: Short-chain alcohol dehydrogenase. This family contains a number of bacterial short-chain alcohol dehydrogenases that are approximately 400 residues long. Alcohol dehydrogenases display a wide variety of substrate specificities, and play an important role in a broad range of physiological processes. Short-chain alcohol dehydrogenases form part of a group of alcohol dehydrogenases that are dependent upon NADP. 46940 pfam07056: Protein of unknown function (DUF1335). This family represents a conserved region approximately 130 residues long within a number of proteins of unknown function that seem to be specific to the white spot syndrome virus (WSSV). 
46941 pfam07057: DNA helicase TraI. This family represents a conserved region approximately 130 residues long within the bacterial DNA helicase TraI (EC:3.6.1.-). TraI is a bifunctional protein that catalyses the unwinding of duplex DNA and also acts as a sequence-specific DNA trans-esterase, providing the site- and strand-specific nick required to initiate DNA transfer. 46942 pfam07058: Myosin II heavy chain-like. This family represents a conserved region within a number of myosin II heavy chain-like proteins that seem to be specific to Arabidopsis thaliana. 46943 pfam07059: Protein of unknown function (DUF1336). This family represents the C-terminus (approximately 250 residues) of a number of hypothetical plant proteins of unknown function. 46944 pfam07060: ProFAR isomerase-like. This family contains a number of ProFAR isomerase-like proteins found in eukaryotes, bacteria and archaea. ProFAR isomerase (EC:5.3.1.16) is involved in the biosynthesis of the amino acid histidine, through catalysis of the irreversible isomerisation of an amino-aldose to an amino-ketose. 46945 pfam07061: Protein of unknown function (DUF1337). 46946 pfam07062: Clc-like. This family contains a number of Clc-like proteins that are approximately 250 residues long. These seem to be specific to Caenorhabditis elegans. 46947 pfam07063: Protein of unknown function (DUF1338). This family represents the C-terminus (approximately 100 residues) of a number of hypothetical bacterial proteins of unknown function. 46948 pfam07064: Protein of unknown function (DUF1339). This family represents a conserved region approximately 300 residues long within a number of hypothetical eukaryotic proteins of unknown function. These are possibly integral membrane proteins. 46949 pfam07065: D123. This family contains a number of eukaryotic D123 proteins approximately 330 residues long. It has been shown that mutated variants of D123 exhibit temperature-dependent differences in their degradation rate. 
46950 pfam07066: Lactococcus phage M3 protein. This family consists of several Lactococcus phage middle-3 (M3) proteins of around 160 residues in length. The function of this family is unknown. 46951 pfam07067: Protein of unknown function (DUF1340). This family consists of several hypothetical Streptococcus thermophilus bacteriophage proteins of around 235 residues in length. The function of this family is unknown. 46952 pfam07068: Major capsid protein Gp23. This family contains a number of major capsid Gp23 proteins approximately 500 residues long, from T4-like bacteriophages. 46953 pfam07069: Porcine reproductive and respiratory syndrome virus (PRRSV) 2b protein. This family consists of several Porcine reproductive and respiratory syndrome virus (PRRSV) ORF2b proteins. The function of this family is unknown; however, large amounts of 2b protein are known to be present in the virion, and it is thought that this protein may be an integral component of the virion. 46954 pfam07070: SpoOM protein. This family consists of several bacterial SpoOM proteins which are thought to control sporulation in Bacillus subtilis. Spo0M exerts certain negative effects on sporulation and its gene expression is controlled by sigmaH. 46955 pfam07071: Protein of unknown function (DUF1341). This family consists of several hypothetical bacterial proteins of around 220 residues in length. The function of this family is unknown. 46956 pfam07072: Protein of unknown function (DUF1342). This family consists of several hypothetical bacterial proteins of around 250 residues in length. Members of this family are often known as YacF after the Escherichia coli protein. The function of this family is unknown. 46957 pfam07073: Modulator of Rho-dependent transcription termination (ROF). This family consists of several bacterial modulator of Rho-dependent transcription termination (ROF) proteins. ROF binds transcription termination factor Rho and inhibits Rho-dependent termination in vivo. 
46958 pfam07074: Translocon-associated protein, gamma subunit (TRAP-gamma). This family consists of several eukaryotic translocon-associated protein, gamma subunit (TRAP-gamma) sequences. The translocation site (translocon), at which nascent polypeptides pass through the endoplasmic reticulum membrane, contains a component previously called 'signal sequence receptor' that is now renamed as 'translocon-associated protein' (TRAP). The TRAP complex is comprised of four membrane proteins alpha, beta, gamma and delta which are present in a stoichiometric relation, and are genuine neighbours in intact microsomes. The gamma subunit is predicted to span the membrane four times. 46959 pfam07075: Protein of unknown function (DUF1343). This family consists of several hypothetical bacterial proteins of around 400 residues in length. The function of this family is unknown. 46960 pfam07076: Protein of unknown function (DUF1344). This family consists of several short, hypothetical bacterial proteins of around 80 residues in length. Members of this family are found in Rhizobium, Agrobacterium and Brucella species. The function of this family is unknown. 46961 pfam07077: Protein of unknown function (DUF1345). This family consists of several hypothetical bacterial proteins of around 230 residues in length. The function of this family is unknown. 46962 pfam07078: Protein of unknown function (DUF1346). This family consists of several hypothetical mammalian proteins of around 320 residues in length. The function of this family is unknown although several of the family members are annotated as putative 40-2-3 proteins. 46963 pfam07079: Protein of unknown function (DUF1347). This family consists of several hypothetical bacterial proteins of around 610 residues in length. Members of this family are highly conserved and seem to be specific to Chlamydia species. The function of this family is unknown. 46964 pfam07080: Protein of unknown function (DUF1348). 
This family consists of several highly conserved hypothetical proteins of around 150 residues in length. The function of this family is unknown. 46965 pfam07081: Protein of unknown function (DUF1349). This family consists of several hypothetical bacterial proteins but contains one sequence from Saccharomyces cerevisiae. Members of this family are typically around 200 residues in length. The function of this family is unknown. 46966 pfam07082: Protein of unknown function (DUF1350). This family consists of several hypothetical proteins from both cyanobacteria and plants. Members of this family are typically around 250 residues in length. The function of this family is unknown but the species distribution indicates that the family may be involved in photosynthesis. 46967 pfam07083: Protein of unknown function (DUF1351). This family consists of several bacterial and phage proteins of around 230 residues in length. The function of this family is unknown. 46968 pfam07084: Thyroid hormone-inducible hepatic protein Spot 14. This family consists of several thyroid hormone-inducible hepatic protein (Spot 14 or S14) sequences. Mainly expressed in tissues that synthesise triglycerides, the mRNA coding for Spot 14 has been shown to be increased in rat liver by insulin, dietary carbohydrates, glucose in hepatocyte culture medium, as well as thyroid hormone. In contrast, dietary fats and polyunsaturated fatty acids, have been shown to decrease the amount of Spot 14 mRNA, while an elevated level of cAMP acts as a dominant negative factor. In addition, liver-specific factors or chromatin organisation of the gene have been shown to contribute to the regulation of its expression. Spot 14 protein is thought to be required for induction of hepatic lipogenesis. 46969 pfam07085: DRTGG domain. This presumed domain is about 120 amino acids in length. It is found associated with CBS domains pfam00571, as well as the CbiA domain pfam01656. The function of this domain is unknown. 
It is named the DRTGG domain after some of the most conserved residues. This domain may be very distantly related to a pair of CBS domains. There are no significant sequence similarities, but its length and association with CBS domains support this idea (Bateman A, pers. obs.). 46970 pfam07086: Protein of unknown function (DUF1352). This family consists of several hypothetical eukaryotic proteins of around 190 residues in length. The function of this family is unknown. 46971 pfam07087: Protein of unknown function (DUF1353). This family consists of several hypothetical bacterial proteins of around 100 residues in length. The function of this family is unknown. 46972 pfam07088: GvpD gas vesicle protein. This family consists of several archaeal GvpD gas vesicle proteins. GvpD is thought to be involved in the regulation of gas vesicle formation. 46973 pfam07089: Protein of unknown function (DUF1354). This family consists of several hypothetical bacterial proteins of around 570 residues in length. Members of this family are found in Escherichia coli, Yersinia pestis, Salmonella and Vibrio species. 46974 pfam07090: Protein of unknown function (DUF1355). This family consists of several hypothetical bacterial proteins of around 250 residues in length. The function of this family is unknown. 46975 pfam07091: Ribosomal RNA methyltransferase (FmrO). This family consists of several bacterial ribosomal RNA methyltransferase (aminoglycoside-resistance methyltransferase) proteins. 46976 pfam07092: Protein of unknown function (DUF1356). This family consists of several hypothetical mammalian proteins of around 250 residues in length. The function of this family is unknown. 46977 pfam07093: SGT1 protein. This family consists of several eukaryotic SGT1 proteins. Human SGT1 or hSGT1 is known to suppress GCR2 and is highly expressed in the muscle and heart. 
The function of this family is unknown although it has been speculated that SGT1 may be functionally analogous to the Gcr2p protein of Saccharomyces cerevisiae which is known to be a regulatory factor of glycolytic gene expression. 46978 pfam07094: Protein of unknown function (DUF1357). This family consists of several hypothetical bacterial proteins of around 225 residues in length. Members of this family appear to be specific to Borrelia burgdorferi (Lyme disease spirochete). The function of this family is unknown. 46979 pfam07095: Intracellular growth attenuator protein IgaA. This family consists of several bacterial intracellular growth attenuator (IgaA) proteins. IgaA is involved in negative control of bacterial proliferation within fibroblasts. IgaA is homologous to the E. coli YrfF and P. mirabilis UmoB proteins. Whereas the biological function of YrfF is currently unknown, UmoB has been shown elsewhere to act as a positive regulator of FlhDC, the master regulator of flagella and swarming. FlhDC has been shown to repress cell division during P. mirabilis swarming, suggesting that UmoB could repress cell division via FlhDC. This biological function, if maintained in S. enterica, could sustain a putative negative control of cell division and growth exerted by IgaA in intracellular bacteria. 46980 pfam07096: Protein of unknown function (DUF1358). This family consists of several hypothetical eukaryotic proteins of around 125 residues in length. The function of this family is unknown. 46981 pfam07097: Protein of unknown function (DUF1359). This family consists of several hypothetical bacterial and phage proteins of around 100 residues in length. Members of this family seem to be found exclusively in Lactococcus lactis and the bacteriophages that infect this species. The function of this family is unknown. 46982 pfam07098: Protein of unknown function (DUF1360). This family consists of several bacterial proteins of around 115 residues in length. 
Members of this family are found in Bacillus species and Streptomyces coelicolor. The function of this family is unknown. 46983 pfam07099: Protein of unknown function (DUF1361). This family consists of several hypothetical bacterial proteins of around 200 residues in length. The function of this family is unknown although some members are annotated as being putative integral membrane proteins. 46984 pfam07100: Protein of unknown function (DUF1362). This family consists of several hypothetical bacterial proteins of around 125 residues in length. The function of this family is unknown. 46985 pfam07101: Protein of unknown function (DUF1363). This family consists of several Trypanosoma brucei putative variant specific antigen proteins of around 80 residues in length. 46986 pfam07102: Protein of unknown function (DUF1364). This family consists of several bacterial and phage proteins of around 95 residues in length. The function of this family is unknown. 46987 pfam07103: Protein of unknown function (DUF1365). This family consists of several bacterial and plant proteins of around 250 residues in length. The function of this family is unknown. 46988 pfam07104: Protein of unknown function (DUF1366). This family consists of several hypothetical Streptococcus thermophilus bacteriophage proteins of around 130 residues in length. One of the sequences in this family, from phage Sfi11, is known as Gp149. The function of this family is unknown. 46989 pfam07105: Protein of unknown function (DUF1367). This family consists of several highly conserved, hypothetical bacterial and phage proteins of around 200 residues in length. The function of this family is unknown. 46990 pfam07106: Tat binding protein 1(TBP-1)-interacting protein (TBPIP). This family consists of several eukaryotic TBP-1 interacting protein (TBPIP) sequences. 
TBP-1 has been demonstrated to interact with the human immunodeficiency virus type 1 (HIV-1) viral protein Tat and then modulate the essential replication process of HIV. In addition, TBP-1 has been shown to be a component of the 26S proteasome, a basic multiprotein complex that degrades ubiquitinated proteins in an ATP-dependent fashion. Human TBPIP interacts with human TBP-1 and then modulates the inhibitory action of human TBP-1 on HIV-Tat-mediated transactivation. 46991 pfam07107: Wound-induced protein WI12. This family consists of several plant wound-induced protein sequences related to WI12 from Mesembryanthemum crystallinum. Wounding, methyl jasmonate, and pathogen infection are known to induce local WI12 expression. WI12 expression is also thought to be developmentally controlled in the placenta and developing seeds. WI12 preferentially accumulates in the cell wall and it has been suggested that it plays a role in the reinforcement of cell wall composition after wounding and during plant development. 46992 pfam07108: PipA protein. This family consists of several Salmonella PipA (pathogenicity island-encoded protein A) and related phage sequences. PipA is thought to contribute to enteric but not to systemic salmonellosis. 46993 pfam07109: Magnesium-protoporphyrin IX methyltransferase C-terminus. This family represents the C-terminus (approximately 100 residues) of bacterial and eukaryotic Magnesium-protoporphyrin IX methyltransferase (EC:2.1.1.11). This converts magnesium-protoporphyrin IX to magnesium-protoporphyrin IX methyl ester using S-adenosyl-L-methionine as a cofactor. 46994 pfam07110: EthD protein. This family consists of several bacterial sequences which are related to the EthD protein of Rhodococcus ruber. In Rhodococcus ruber, EthD is thought to be involved in the degradation of ethyl tert-butyl ether (ETBE). EthD synthesis is induced by ETBE but its exact function is unknown; it is, however, thought to be essential to the ETBE degradation system. 
46995 pfam07111: Alpha helical coiled-coil rod protein (HCR). This family consists of several mammalian alpha helical coiled-coil rod HCR proteins. The function of HCR is unknown but it has been implicated in psoriasis in humans and is thought to affect keratinocyte proliferation. 46996 pfam07112: Protein of unknown function (DUF1368). This family consists of several proteins which seem to be specific to red algae plasmids. Members of this family are typically around 415 residues in length. The function of this family is unknown. 46997 pfam07113: Protein of unknown function (DUF1369). This family consists of several hypothetical bacterial proteins of around 95 residues in length. Members of this family seem to be specific to the Orders Bacillales and Lactobacillales. The function of this family is unknown. 46998 pfam07114: Protein of unknown function (DUF1370). This family consists of several hypothetical eukaryotic proteins of around 200 residues in length. Members of this family seem to be specific to mammals and their function is unknown. 46999 pfam07115: Protein of unknown function (DUF1371). This family consists of several hypothetical bacterial proteins of around 110 residues in length. The function of this family is unknown but members seem to be specific to Borrelia burgdorferi (Lyme disease spirochete). 47000 pfam07116: Protein of unknown function (DUF1372). This family consists of several Streptococcus bacteriophage sequences and related proteins from Streptococcus species. Members of this family are typically around 100 residues in length and their function is unknown. 47001 pfam07117: Protein of unknown function (DUF1373). This family consists of several hypothetical proteins which seem to be specific to Oryzias latipes (Japanese ricefish). Members of this family are typically around 200 residues in length. The function of this family is unknown. 47002 pfam07118: Protein of unknown function (DUF1374). 
This family consists of several hypothetical Sulfolobus virus proteins of around 100 residues in length. The function of this family is unknown. 47003 pfam07119: Protein of unknown function (DUF1375). This family consists of several hypothetical, putative lipoproteins of around 80 residues in length. Members of this family seem to be specific to the Class Gammaproteobacteria. The function of this family is unknown. 47004 pfam07120: Protein of unknown function (DUF1376). This family consists of several hypothetical bacterial proteins of around 95 residues in length. The function of this family is unknown. 47005 pfam07121: Protein of unknown function (DUF1377). This family consists of several Drosophila melanogaster proteins of around 400 residues in length. The function of this family is unknown. 47006 pfam07122: Variable length PCR target protein (VLPT). This family consists of a number of 29 residue repeats which seem to be specific to the Ehrlichia chaffeensis variable length PCR target (VLPT) protein. Ehrlichia chaffeensis is a tick-transmitted rickettsial agent and is responsible for human monocytic ehrlichiosis (HME). The function of this family is unknown. 47007 pfam07123: Photosystem II reaction centre W protein (PsbW). This family consists of several plant specific photosystem II reaction centre W (PsbW) proteins. PsbW is a nuclear-encoded protein located in the thylakoid membrane of the chloroplast. PsbW is a core component of photosystem II but not photosystem I. This family does not appear to be related to pfam03912. 47008 pfam07124: Phytoreovirus outer capsid protein P8. This family consists of several Phytoreovirus outer capsid protein P8 sequences. 47009 pfam07125: Protein of unknown function (DUF1378). This family consists of hypothetical bacterial and phage proteins of around 59 residues in length. Bacterial members of this family seem to be specific to Enterobacteria. The function of this family is unknown. 
47010 pfam07126: Protein of unknown function (DUF1379). This family consists of several hypothetical bacterial proteins of around 180 residues in length. The function of this family is unknown. 47011 pfam07127: Late nodulin protein. This family consists of several plant specific late nodulin sequences which are homologous to the Pisum sativum (Garden pea) ENOD3 protein. ENOD3 is expressed in the late stages of root nodule formation and contains two pairs of cysteine residues toward the protein's C-terminus which may be involved in metal-binding. 47012 pfam07128: Protein of unknown function (DUF1380). This family consists of several hypothetical bacterial proteins of around 140 residues in length. Members of this family seem to be specific to Enterobacteria. The function of this family is unknown. 47013 pfam07129: Protein of unknown function (DUF1381). This family consists of several hypothetical Staphylococcus aureus and Staphylococcus aureus bacteriophage proteins of around 65 residues in length. The function of this family is unknown. 47014 pfam07130: YebG protein. This family consists of several bacterial YebG proteins of around 75 residues in length. The exact function of this protein is unknown but it is thought to be involved in the SOS response. The induction of the yebG gene occurs as cells enter the stationary growth phase and is dependent on cyclic AMP and H-NS. 47015 pfam07131: Protein of unknown function (DUF1382). This family consists of several hypothetical Escherichia coli and bacteriophage lambda-like proteins of around 60 residues in length. The function of this family is unknown. 47016 pfam07132: Harpin protein (HrpN). This family consists of several bacterial HrpN harpin proteins. HrpN is a virulence determinant which elicits lesion formation in Arabidopsis and tobacco and triggers systemic resistance in Arabidopsis. 47017 pfam07133: Merozoite surface protein (SPAM). 
This family consists of several Plasmodium falciparum SPAM (secreted polymorphic antigen associated with merozoites) proteins. Variation among SPAM alleles is the result of deletions and amino acid substitutions in non-repetitive sequences within and flanking the alanine heptad-repeat domain. Heptad repeats in which the a and d position contain hydrophobic residues generate amphipathic alpha-helices which give rise to helical bundles or coiled-coil structures in proteins. SPAM is an example of a P. falciparum antigen in which a repetitive sequence has features characteristic of a well-defined structural element. 47018 pfam07134: Protein of unknown function (DUF1383). This family consists of several hypothetical Nucleopolyhedrovirus proteins of around 375 residues in length. The function of this family is unknown. 47019 pfam07135: Protein of unknown function (DUF1384). This family consists of several hypothetical Enterobacterial proteins of around 250 residues in length. The function of this family is unknown. 47020 pfam07136: Protein of unknown function (DUF1385). This family contains a number of hypothetical bacterial proteins of unknown function approximately 300 residues in length. Some family members are predicted to be metal-dependent. 47021 pfam07137: Violaxanthin de-epoxidase (VDE). This family represents a conserved region approximately 350 residues long within plant violaxanthin de-epoxidase (VDE). In higher plants, violaxanthin de-epoxidase forms part of a conserved system that dissipates excess energy as heat in the light-harvesting complexes of photosystem II (PSII), thus protecting them from photo-inhibitory damage. 47022 pfam07138: Protein of unknown function (DUF1386). This family consists of several hypothetical Nucleopolyhedrovirus proteins of around 350 residues in length. The function of this family is unknown. 47023 pfam07139: Protein of unknown function (DUF1387). 
This family represents a conserved region approximately 300 residues long within a number of hypothetical proteins of unknown function that seem to be restricted to mammals. 47024 pfam07140: Interferon gamma receptor alpha chain (IFNGR1). This family consists of several mammalian interferon gamma receptor (IFNGR1) proteins. Molecular interactions among cytokines and cytokine receptors form the basis of many cell-signaling pathways relevant to immune function. Human interferon-gamma (IFN-gamma) signals through a multimeric receptor complex consisting of two different but structurally related transmembrane chains: the high-affinity receptor-binding subunit (IFN-gammaRalpha) and a species specific accessory factor (AF-1 or IFN-gammaRbeta). 47025 pfam07141: Putative bacteriophage terminase small subunit. This family consists of several putative Lactococcus bacteriophage terminase small subunit proteins. The exact function of this family is unknown. 47026 pfam07142: Repeat of unknown function (DUF1388). This family consists of several repeats of around 29 residues in length. Members of this family are found in the variable surface lipoproteins in Mycoplasma bovis and in mammalian neurofilament triplet H (NefH or NF-H) proteins. This repeat contains several Lys-Ser-Pro (KSP) motifs and in NefH these are thought to function as the main target for neurofilament-directed protein kinases in vivo. 47027 pfam07143: Hydroxyneurosporene synthase (CrtC). This family consists of several purple photosynthetic bacterial hydroxyneurosporene synthase (CrtC) proteins. The enzyme catalyses the conversion of various acyclic carotenes including 1-hydroxy derivatives. This broad substrate specificity reflects the participation of CrtC in 1'-HO-spheroidene and in spirilloxanthin biosynthesis. 47028 pfam07144: Varicellovirus UL45 protein. This family consists of several Varicellovirus UL45 or gene 15 proteins. 
The Equine herpesvirus 1 UL45 protein represents a type II membrane glycoprotein which has been found to be non-essential for EHV-1 growth in vitro, although its deletion reduces the virus's replication efficiency. 47029 pfam07145: Ataxin-2 C-terminal region. This family represents a conserved region approximately 250 residues long located towards the C-terminus of eukaryotic ataxin-2. Ataxin-2 is a protein of unknown function, within which expansion of a polyglutamine tract (due to expansion of unstable CAG repeats in the coding region of the SCA2 gene) causes spinocerebellar ataxia type 2 (SCA2), a late-onset neurodegenerative disorder. The expanded polyglutamine repeat in ataxin-2 causes disruption of the normal morphology of the Golgi complex and increased incidence of cell death. Ataxin-2 is predicted to consist of mostly non-globular domains. 47030 pfam07146: Protein of unknown function (DUF1389). This family consists of several hypothetical bacterial proteins which seem to be specific to Chlamydia pneumoniae. Members of this family are typically around 400 residues in length. The function of this family is unknown. 47031 pfam07147: Mitochondrial 28S ribosomal protein S30 (PDCD9). This family consists of several eukaryotic mitochondrial 28S ribosomal protein S30 (or programmed cell death protein 9 PDCD9) sequences. The exact function of this family is unknown although it is known to be a component of the mitochondrial ribosome and a component in cellular apoptotic signaling pathways. 47032 pfam07148: Maltose operon periplasmic protein precursor (MalM). This family consists of several maltose operon periplasmic protein precursor (MalM) sequences. The function of this family is unknown. 47033 pfam07149: Pes-10. This family consists of several Caenorhabditis elegans pes-10 and related proteins. Members of this family are typically around 400 residues in length. The function of this family is unknown. 47034 pfam07150: Protein of unknown function (DUF1390). 
This family consists of several Paramecium bursaria chlorella virus 1 (PBCV-1) proteins of around 250 residues in length. The function of this family is unknown. 47035 pfam07151: Protein of unknown function (DUF1391). This family consists of several Enterobacterial proteins of around 50 residues in length. Members of this family are found in Escherichia coli and Salmonella typhi where they are often known as YdfA. The function of this family is unknown. 47036 pfam07152: YaeQ protein. This family consists of several hypothetical bacterial proteins of around 180 residues in length which are often known as YaeQ. YaeQ is homologous to RfaH, a specialised transcription elongation protein. YaeQ is known to compensate for loss of RfaH function. 47037 pfam07153: Marek's disease-like virus SORF3 protein. This family consists of several SORF3 proteins from the Marek's disease-like viruses. Members of this family are around 350 residues in length. The function of this family is unknown. 47038 pfam07154: Protein of unknown function (DUF1392). This family consists of several hypothetical cyanobacterial proteins of around 150 residues in length which seem to be specific to Anabaena species. The function of this family is unknown. 47039 pfam07155: Protein of unknown function (DUF1393). This family consists of several bacterial proteins of around 180 residues in length. The function of this family is unknown. 47040 pfam07156: Prenylcysteine lyase. This family contains prenylcysteine lyases (EC:1.8.3.5) that are approximately 500 residues long. Prenylcysteine lyase is a FAD-dependent thioether oxidase that degrades a variety of prenylcysteines, producing free cysteine, an isoprenoid aldehyde and hydrogen peroxide as products of the reaction. It has been noted that this enzyme has considerable homology with ClP55, a 55 kDa protein that is associated with chloride ion pumps. 47041 pfam07157: DNA circulation protein N-terminus. 
This family represents the N-terminus (approximately 100 residues) of a number of phage DNA circulation proteins. 47042 pfam07158: Dicarboxylate carrier protein MatC N-terminus. This family represents the N-terminal region of the bacterial dicarboxylate carrier protein MatC. The MatC protein is an integral membrane protein that could function as a malonate carrier. 47043 pfam07159: Protein of unknown function (DUF1394). This family consists of several hypothetical eukaryotic proteins of around 320 residues in length. The function of this family is unknown. 47044 pfam07160: Protein of unknown function (DUF1395). This family consists of several hypothetical eukaryotic proteins of around 250 residues in length. The function of this family is unknown. 47045 pfam07161: Protein of unknown function (DUF1396). This family consists of several putative lipoproteins from Mycobacterium species. The function of this family is unknown. 47046 pfam07162: B9 protein. This family represents a conserved region approximately 100 residues long within the eukaryotic protein B9. B9 has been isolated from endothelial precursor cells. 47047 pfam07163: Pex26 protein. This family consists of Pex26 and related mammalian proteins. Pex26 is a type II peroxisomal membrane protein which recruits Pex6-Pex1 complexes to peroxisomes. Mutations in Pex26 can lead to human disorders. 47048 pfam07164: Putative flagellar hook-associated protein 3 (HAP3). This family consists of several putative bacterial flagellar hook associated protein 3 (HAP3 or FlgL) sequences. Members of this family appear to be specific to the Order Rhizobiales. No experimental evidence could be found to support the function assigned to family members. 47049 pfam07165: Protein of unknown function (DUF1397). This family consists of several insect specific proteins. One member is annotated as being a haemolymph glycoprotein precursor. The function of this family is unknown. 47050 pfam07166: Protein of unknown function (DUF1398). 
This family consists of several hypothetical Enterobacterial proteins of around 130 residues in length. Members of this family seem to be found exclusively in Escherichia coli and Salmonella species. The function of this family is unknown. 47051 pfam07167: Poly-beta-hydroxybutyrate polymerase (PhaC) N-terminus. This family represents the N-terminal region of the bacterial poly-beta-hydroxybutyrate polymerase (PhaC). Polyhydroxyalkanoic acids (PHAs) are carbon and energy reserve polymers produced in some bacteria when carbon sources are plentiful and another nutrient, such as nitrogen, phosphate, oxygen, or sulfur, becomes limiting. PHAs composed of monomeric units ranging from 3 to 14 carbons exist in nature. When the carbon source is exhausted, PHA is utilised by the bacterium. PhaC links D-(-)-3-hydroxybutyryl-CoA to an existing PHA molecule by the formation of an ester bond. 47052 pfam07168: Fatty acid elongase 3-ketoacyl-CoA synthase 1. This family contains fatty acid elongase 3-ketoacyl-CoA synthase 1, a plant enzyme approximately 350 residues long. 47053 pfam07169: Triadin. This family consists of several eukaryotic triadin proteins. Triadin is a ryanodine receptor and calsequestrin binding protein located in junctional sarcoplasmic reticulum of striated muscles. 47054 pfam07170: Sortase B. This family contains bacterial sortase B proteins that are approximately 200 residues long. Sortase, a transpeptidase present in almost all Gram-positive bacteria, anchors a range of important surface proteins to the cell wall. 47055 pfam07171: MlrC C-terminus. This family represents the C-terminus (approximately 200 residues) of the product of a bacterial gene cluster that is involved in the degradation of the cyanobacterial toxin microcystin LR. Many members of this family are hypothetical proteins. 47056 pfam07172: Glycine rich protein family. This family of proteins includes several glycine rich proteins as well as two nodulins 16 and 24. 
The family also contains proteins that are induced in response to various stresses. 47057 pfam07173: Protein of unknown function (DUF1399). This family represents a conserved region approximately 150 residues long within a number of hypothetical plant proteins of unknown function. 47058 pfam07174: Fibronectin-attachment protein (FAP). This family contains bacterial fibronectin-attachment proteins (FAP). Family members are rich in alanine and proline, are approximately 300 residues long, and seem to be restricted to mycobacteria. These proteins contain a fibronectin-binding motif that allows mycobacteria to bind to fibronectin in the extracellular matrix. 47059 pfam07175: Osteoregulin. This family represents a conserved region approximately 180 residues long within osteoregulin, a bone-remodelling protein expressed highly in osteocytes within trabecular and cortical bone. A conserved RGD motif is found towards the C-terminal end of this region, and this is potentially involved in integrin recognition. 47060 pfam07176: Protein of unknown function (DUF1400). This family contains a number of hypothetical proteins of unknown function that seem to be specific to cyanobacteria. 47061 pfam07177: Neuralized. This family contains a conserved region approximately 60 residues long within eukaryotic neuralized and neuralized-like proteins. Neuralized belongs to a group of ubiquitin ligases and is required in a subset of Notch pathway-mediated cell fate decisions during development of the Drosophila nervous system. Some family members contain multiple copies of this region. 47062 pfam07178: TraL protein. This family consists of several bacterial TraL proteins. TraL is a predicted peripheral membrane protein which is thought to be involved in bacterial sex pilus assembly. The exact function of this family is unclear. 47063 pfam07179: SseB protein. This family consists of several SseB proteins which appear to be found exclusively in Enterobacteria. 
SseB is known to enhance serine-sensitivity in Escherichia coli, and is part of the Salmonella pathogenicity island 2 (SPI-2) translocon. 47064 pfam07180: Protein of unknown function (DUF1401). This family consists of several hypothetical bacterial proteins of around 135 residues in length. Members of this family appear to be found exclusively in the Enterobacteria Escherichia coli, Citrobacter rodentium and Salmonella typhi. The function of this family is unknown. 47065 pfam07181: VirC2 protein. This family consists of several VirC2 proteins which seem to be found exclusively in Agrobacterium species and Rhizobium etli. VirC2 is known to be involved in virulence in Agrobacterium species but its exact function is unclear. 47066 pfam07182: Protein of unknown function (DUF1402). This family consists of several hypothetical bacterial proteins of around 310 residues in length. Members of this family seem to be found exclusively in Agrobacterium, Rhizobium and Brucella species. The function of this family is unknown. 47067 pfam07183: Protein of unknown function (DUF1403). This family consists of several hypothetical bacterial proteins of around 320 residues in length. Members of this family are mainly found in Rhizobium and Agrobacterium species. The function of this family is unknown. 47068 pfam07184: Citrus tristeza virus P33 protein. This family consists of several Citrus tristeza virus (CTV) P33 proteins. The function of P33 is unclear although it is known that the protein is not needed for virion formation. 47069 pfam07185: Protein of unknown function (DUF1404). This family consists of several archaeal proteins of around 180 residues in length. Members of this family seem to be found exclusively in Sulfolobus tokodaii and Sulfolobus solfataricus. The function of this family is unknown. 47070 pfam07186: TraB protein. This family consists of several TraB proteins which seem to be found exclusively in Agrobacterium species. 
TraB is known to be involved in conjugal transfer. This family does not appear to be related to pfam01963 or pfam06447. 47071 pfam07187: Protein of unknown function (DUF1405). This family consists of several bacterial and related archaeal proteins of around 180 residues in length. The function of this family is unknown. 47072 pfam07188: Kaposi's sarcoma-associated herpesvirus (KSHV) K8 protein. This family consists of Kaposi's sarcoma-associated herpesvirus (KSHV) K8 proteins. KSHV is a human Gammaherpesvirus related to Epstein-Barr virus (EBV) and herpesvirus saimiri. KSHV open reading frame K8 encodes a basic region-leucine zipper protein of 237 aa that homodimerises. K8 interacts and co-localises with human pfam04855, a cellular chromatin-remodelling factor, both in vivo and in vitro. K8 is thought to function as a transcriptional activator under specific conditions and its transactivation activity requires its interaction with the cellular chromatin remodelling factor hSNF5. 47073 pfam07189: Splicing factor 3B subunit 10 (SF3b10). This family consists of several eukaryotic splicing factor 3B subunit 10 (SF3b10) proteins. SF3b10 is a 10 kDa subunit of the splicing factor SF3b. SF3b associates with the splicing factor SF3a and a 12S RNA unit to form the U2 small nuclear ribonucleoproteins complex. SF3b10 and SF3b14b are also thought to facilitate the interaction of U2 with the branch site. 47074 pfam07190: Protein of unknown function (DUF1406). This family consists of several Orthopoxvirus proteins of around 185 residues in length. Members of this family seem to be exclusive to Vaccinia, Camelpox and Cowpox viruses. Some family members are annotated as being C8 proteins but their function is unknown. 47075 pfam07191: Protein of unknown function (DUF1407). This family consists of several short, hypothetical bacterial proteins of around 70 residues in length. Members of this family have 8 highly conserved cysteine residues. The function of the family is unknown. 
47076 pfam07192: SNURF/RPN4 protein. This family consists of several mammalian SNRPN upstream reading frame (SNURF) proteins. SNURF or RPF4 is a RING-finger protein and a coregulator of androgen receptor-dependent transcription. It has been suggested that SNURF is involved in the regulation of processes required for late steps of spermatid maturation. 47077 pfam07193: Protein of unknown function (DUF1408). This family consists of several hypothetical Lactococcus lactis and related phage proteins of around 75 residues in length. The function of this family is unknown. 47078 pfam07194: P2 response regulator binding domain. The response regulators for CheA bind to the P2 domain, which is found between pfam01627 and pfam02895 as either one or two copies. Highly flexible linkers connect P2 to the rest of CheA and impart remarkable mobility to the P2 domain. This feature is thought to enhance the inter CheA dimer phosphotransfer reactions within the signalling complex, thereby amplifying the phosphorylation signal. 47079 pfam07195: Flagellar hook-associated protein 2 C-terminus. The flagellar hook-associated protein 2 (HAP2 or FliD) forms the distal end of the flagella, and plays a role in mucin specific adhesion of the bacteria. This alignment covers the C-terminal region of this family of proteins. 47080 pfam07196: Flagellin hook IN motif. The function of this region is not clear, but it is found in many flagellar hook proteins, including FliD homologues. It is normally repeated, but is also apparently seen as a singleton. A conserved IN is seen at the centre of the motif. The diversity of these motifs makes it likely that some members of the family are not identified. 47081 pfam07197: Protein of unknown function (DUF1409). This family represents a short conserved region (approximately 50 residues long), sometimes repeated, within a number of hypothetical Oryza sativa proteins of unknown function. 47082 pfam07198: Protein of unknown function (DUF1410). 
This family represents a conserved region approximately 180 residues long, multiple copies of which are sometimes found within hypothetical Ureaplasma parvum proteins of unknown function. 47083 pfam07199: Protein of unknown function (DUF1411). This family represents a conserved region approximately 150 residues long that is sometimes repeated within some Babesia bovis proteins of unknown function. 47084 pfam07200: Modifier of rudimentary (Mod(r)) protein. This family represents a conserved region approximately 150 residues long within a number of eukaryotic proteins that show homology with Drosophila melanogaster Modifier of rudimentary (Mod(r)) proteins. The N-terminal half of Mod(r) proteins is acidic, whereas the C-terminal half is basic, and both of these regions are represented in this family. 47085 pfam07201: Hypersensitivity response secretion protein HrpJ. This family represents a conserved region approximately 200 residues long within a number of bacterial hypersensitivity response secretion protein HrpJ and similar proteins. HrpJ forms part of a type III secretion system through which, in phytopathogenic bacterial species, virulence factors are thought to be delivered to plant cells. 47086 pfam07202: T-complex protein 10 C-terminus. This family represents the C-terminus (approximately 180 residues) of eukaryotic T-complex protein 10. The T-complex is involved in spermatogenesis in mice. 47087 pfam07203: Protein of unknown function (DUF1412). This family consists of several Caenorhabditis elegans proteins of around 70-75 residues in length. The function of this family is unknown. 47088 pfam07204: Orthoreovirus membrane fusion protein p10. This family consists of several Orthoreovirus membrane fusion protein p10 sequences. p10 is thought to be a multifunctional protein that plays a key role in virus-host interaction. 47089 pfam07205: Protein of unknown function (DUF1413). 
This family consists of several hypothetical bacterial proteins which seem to be specific to Staphylococcus species. Members of this family are typically around 100 residues in length. The function of this family is unknown. 47090 pfam07206: Baculovirus late expression factor 10 (LEF-10). This family consists of several Baculovirus specific late expression factor 10 (LEF-10) sequences. LEF-10 is thought to be a late expressed structural protein although its exact function is unknown. 47091 pfam07207: Light regulated protein Lir1. This family consists of several plant specific light regulated Lir1 proteins. Lir1 mRNA accumulates in the light, reaching maximum and minimum steady-state levels at the end of the light and dark period, respectively. Plants germinated in the dark have very low levels of lir1 mRNA, whereas plants germinated in continuous light express lir1 at an intermediate but constant level. It is thought that lir1 expression is controlled by light and a circadian clock. The exact function of this family is unclear. 47092 pfam07208: Protein of unknown function (DUF1414). This family consists of several hypothetical bacterial proteins of around 70 residues in length. Members of this family are often referred to as YejL. The function of this family is unknown. 47093 pfam07209: Protein of unknown function (DUF1415). This family consists of several hypothetical bacterial proteins of around 180 residues in length. The function of this family is unknown. 47094 pfam07210: Protein of unknown function (DUF1416). This family consists of several hypothetical bacterial proteins of around 100 residues in length. Members of this family appear to be Actinomycete specific. The function of this family is unknown. 47095 pfam07211: Protein of unknown function (DUF1417). This family consists of several hypothetical bacterial and phage proteins of around 180 residues in length. The function of this family is unknown. 47096 pfam07212: Hyaluronidase protein (HylP). 
This family consists of several phage associated hyaluronidase proteins (EC:3.2.1.35) which seem to be specific to Streptococcus pyogenes and Streptococcus pyogenes bacteriophages. The substrate of hyaluronidase is hyaluronic acid, a sugar polymer composed of alternating N-acetylglucosamine and glucuronic acid residues. Hyaluronic acid is found in the ground substance of human connective tissue and the vitreous of the eye and also is the sole component of the capsule of group A streptococci. The capsule has been shown to be an important virulence factor of this organism by virtue of its ability to resist phagocytosis. Production by S. pyogenes of both a hyaluronic acid capsule and hyaluronidase enzymatic activity capable of destroying the capsule is an interesting, yet-unexplained, phenomenon. 47097 pfam07213: DAP10 membrane protein. This family consists of several mammalian DAP10 membrane proteins. In activated mouse natural killer (NK) cells, the NKG2D receptor associates with two intracellular adaptors, DAP10 and DAP12, which trigger phosphatidyl inositol 3 kinase (PI3K) and Syk family protein tyrosine kinases, respectively. It has been suggested that the DAP10-PI3K pathway is sufficient to initiate NKG2D-mediated killing of target cells. 47098 pfam07214: Protein of unknown function (DUF1418). This family consists of several hypothetical Enterobacterial proteins of around 100 residues in length. Members of this family are often described as YbjC. In E. coli the ybjC gene is located downstream of nfsA (which encodes the major oxygen-insensitive nitroreductase). It is thought that nfsA and ybjC form an operon and that its promoter is a class I SoxS-dependent promoter. The function of this family is unknown. 47099 pfam07215: Protein of unknown function (DUF1419). This family consists of several bacterial proteins of around 110 residues in length. Members of this family seem to be specific to Agrobacterium species and to Rhizobium loti. 
The function of this family is unknown. 47100 pfam07216: LcrG protein. This family consists of several bacterial LcrG proteins. Yersiniae are equipped with the Yop virulon, an apparatus that allows extracellular bacteria to deliver toxic Yop proteins inside the host cell cytosol in order to sabotage the communication networks of the host cell or even to cause cell death. LcrG is a component of the Yop virulon involved in the regulation of secretion of the Yops. 47101 pfam07217: Heterokaryon incompatibility protein Het-C. In filamentous fungi, het loci (for heterokaryon incompatibility) are believed to regulate self/nonself-recognition during vegetative growth. As filamentous fungi grow, hyphal fusion occurs within an individual colony to form a network. Hyphal fusion can occur also between different individuals to form a heterokaryon, in which genetically distinct nuclei occupy a common cytoplasm. However, heterokaryotic cells are viable only if the individuals involved have identical alleles at all het loci. 47102 pfam07218: Rhoptry-associated protein 1 (RAP-1). This family consists of several rhoptry-associated protein 1 (RAP-1) sequences which appear to be specific to Plasmodium falciparum. 47103 pfam07219: HemY protein N-terminus. This family represents the N-terminus (approximately 150 residues) of bacterial HemY porphyrin biosynthesis proteins. This is a membrane protein involved in a late step of protoheme IX synthesis. 47104 pfam07220: Protein of unknown function (DUF1420). This family consists of several hypothetical putative lipoproteins which seem to be found specifically in the bacterium Leptospira interrogans. Members of this family are typically around 670 residues in length and their function is unknown. 47105 pfam07221: N-acylglucosamine 2-epimerase (GlcNAc 2-epimerase). This family contains a number of eukaryotic and bacterial N-acylglucosamine 2-epimerase (GlcNAc 2-epimerase) enzymes (EC:5.3.1.8) approximately 500 residues long. 
This converts N-acyl-D-glucosamine to N-acyl-D-mannosamine. 47106 pfam07222: Proacrosin binding protein sp32. This family consists of several mammalian specific proacrosin binding protein sp32 sequences. sp32 is a sperm specific protein which is known to bind with 55- and 53-kDa proacrosins and the 49-kDa acrosin intermediate. The exact function of sp32 is unclear; it is thought, however, that the binding of sp32 to proacrosin may be involved in packaging the acrosin zymogen into the acrosomal matrix. 47107 pfam07223: Protein of unknown function (DUF1421). This family represents a conserved region approximately 350 residues long within a number of plant proteins of unknown function. 47108 pfam07224: Chlorophyllase. This family consists of several plant specific Chlorophyllase proteins (EC:3.1.1.14). Chlorophyllase (Chlase) is the first enzyme involved in chlorophyll (Chl) degradation and catalyses the hydrolysis of an ester bond to yield chlorophyllide and phytol. 47109 pfam07225: NADH-ubiquinone oxidoreductase B15 subunit (NDUFB4). This family consists of several NADH-ubiquinone oxidoreductase B15 subunit proteins (EC:1.6.5.3). 47110 pfam07226: Protein of unknown function (DUF1422). This family consists of several hypothetical bacterial proteins of around 120 residues in length. The function of this family is unknown. 47111 pfam07227: Protein of unknown function (DUF1423). This family represents a conserved region approximately 500 residues long within a number of Arabidopsis thaliana proteins of unknown function. 47112 pfam07228: Stage II sporulation protein E (SpoIIE). This family contains a number of bacterial stage II sporulation E proteins (EC:3.1.3.16). These are required for formation of a normal polar septum during sporulation. The N-terminal region is hydrophobic and is expected to contain up to 12 membrane-spanning segments. 47113 pfam07229: VirE2. 
This family consists of several VirE2 proteins which seem to be specific to Agrobacterium tumefaciens and Rhizobium etli. VirE2 is known to interact, via its C terminus, with VirD4. Agrobacterium tumefaciens transfers oncogenic DNA and effector proteins to plant cells during the course of infection. Substrate translocation across the bacterial cell envelope is mediated by a type IV secretion (TFS) system composed of the VirB proteins, as well as VirD4, a member of a large family of inner membrane proteins implicated in the coupling of DNA transfer intermediates to the secretion machine. VirE2 is therefore thought to be a protein substrate of a type IV secretion system which is recruited to a member of the coupling protein superfamily. 47114 pfam07230: Bacteriophage T4-like capsid assembly protein (Gp20). This family consists of several bacteriophage T4-like capsid assembly (or portal) proteins. The exact mechanism by which the double-stranded (ds) DNA bacteriophages incorporate the portal protein at a unique vertex of the icosahedral capsid is unknown. In phage T4, there is evidence that this vertex, constituted by 12 subunits of gp20, acts as an initiator for the assembly of the major capsid protein and the scaffolding proteins into a prolate icosahedron of precise dimensions. The regulation of portal protein gene expression is an important regulator of prohead assembly in bacteriophage T4. 47115 pfam07231: Hs1pro-1 N-terminus. This family represents the N-terminus (approximately 180 residues) of plant Hs1pro-1, which is believed to confer resistance to nematodes. 47116 pfam07232: Protein of unknown function (DUF1424). This family consists of several archaeal proteins of around 320 residues in length. Members of this family seem to be found exclusively in Halobacterium and Haloferax species. The function of this family is unknown. 47117 pfam07233: Protein of unknown function (DUF1425). 
This family consists of several hypothetical bacterial proteins of around 125 residues in length. Several members of this family are described as putative lipoproteins and are often known as YcfL. The function of this family is unknown. 47118 pfam07234: Protein of unknown function (DUF1426). This family consists of several Banana bunchy top virus proteins of around 120 residues in length. One member is annotated as a movement protein whereas most other family members are hypothetical. The function of this family is unknown. 47119 pfam07235: Protein of unknown function (DUF1427). This family consists of several bacterial proteins of around 100 residues in length. The function of this family is unknown. 47120 pfam07236: Phytoreovirus S7 protein. This family consists of several Phytoreovirus S7 proteins which are thought to be viral core proteins. 47121 pfam07237: Protein of unknown function (DUF1428). This family consists of several hypothetical bacterial sequences and one archaeal sequence of around 120 residues in length. The function of this family is unknown. 47122 pfam07238: Type IV pilus assembly protein PilZ. This family consists of several bacterial type IV pilus assembly (PilZ) proteins. PilZ is thought to have a cytoplasmic location and be essential for type 4 fimbrial biogenesis but its exact function is unknown. 47123 pfam07239: Outer membrane protein OpcA. This family consists of several Neisseria species specific OpcA outer membrane proteins. Opc (formerly called 5C) is one of the major outer membrane proteins and has been shown to play an important role in meningococcal adhesion and invasion of both epithelial and endothelial cells. 47124 pfam07240: Stress-inducible humoral factor Turandot. This family consists of several Drosophila species specific Turandot proteins. The Turandot A (TotA) gene encodes a humoral factor, which is secreted from the fat body and accumulates in the body fluids. 
TotA is strongly induced upon bacterial challenge, as well as by other types of stress such as high temperature, mechanical pressure, dehydration, UV irradiation, and oxidative agents. It is also upregulated during metamorphosis and at high age. Flies that overexpress TotA show prolonged survival and retain normal activity at otherwise lethal temperatures. Although TotA is only induced by severe stress, it responds to a much wider range of stimuli than heat shock genes such as hsp70 or immune genes such as Cecropin A1. 47125 pfam07241: Protein of unknown function (DUF1429). This family consists of several hypothetical bacterial proteins of around 90 residues in length. The function of this family is unknown. 47126 pfam07242: Protein of unknown function (DUF1430). This family represents the C-terminus (approximately 120 residues) of a number of hypothetical bacterial proteins of unknown function. These are possibly membrane proteins involved in immunity. 47127 pfam07243: Phlebovirus glycoprotein G1. This family consists of several Phlebovirus glycoprotein G1 sequences. Members of the Bunyaviridae family acquire an envelope by budding through the lipid bilayer of the Golgi complex. The budding compartment is thought to be determined by the accumulation of the two heterodimeric membrane glycoproteins G1 and G2 in the Golgi. 47128 pfam07244: Surface antigen variable number repeat. This family is found primarily in bacterial surface antigens, normally as variable number repeats at the N-terminus. The C-terminus of these proteins is normally represented by pfam01103. There may also be a relationship to pfam03865 (personal obs: C Yeats). The alignment centres on a -GY- or -GF- motif. Some members of this family are found in the mitochondria. It is predicted to have a mixed alpha/beta secondary structure. 47129 pfam07245: Phlebovirus glycoprotein G2. This family consists of several Phlebovirus glycoprotein G2 sequences. 
Members of the Bunyaviridae family acquire an envelope by budding through the lipid bilayer of the Golgi complex. The budding compartment is thought to be determined by the accumulation of the two heterodimeric membrane glycoproteins G1 and G2 in the Golgi. 47130 pfam07246: Phlebovirus nonstructural protein NS-M. This family consists of several Phlebovirus nonstructural NS-M proteins which represent the N-terminal region of the M polyprotein precursor. The function of this family is unknown. 47131 pfam07247: Alcohol acetyltransferase. This family contains a number of alcohol acetyltransferase (EC:2.3.1.84) enzymes approximately 500 residues long that seem to be restricted to Saccharomyces. These catalyse the esterification of isoamyl alcohol by acetyl coenzyme A. 47132 pfam07248: Protein of unknown function (DUF1431). This family contains a number of Drosophila melanogaster proteins of unknown function. These contain several conserved cysteine residues. 47133 pfam07249: Cerato-platanin. This family contains a number of fungal cerato-platanin phytotoxic proteins approximately 150 residues long. Cerato-platanin contains four cysteine residues that form two disulphide bonds. 47134 pfam07250: Glyoxal oxidase N-terminus. This family represents the N-terminus (approximately 300 residues) of a number of plant and fungal glyoxal oxidase enzymes. Glyoxal oxidase catalyses the oxidation of aldehydes to carboxylic acids, coupled with reduction of dioxygen to hydrogen peroxide. It is an essential component of the extracellular lignin degradation pathways of the wood-rot fungus Phanerochaete chrysosporium. 47135 pfam07251: Protein of unknown function (DUF1432). This family contains a number of hypothetical bacterial proteins of unknown function that are approximately 300 residues long. 47136 pfam07252: Protein of unknown function (DUF1433). This family contains a number of hypothetical bacterial proteins of unknown function approximately 100 residues in length. 
47137 pfam07253: Gypsy protein. This family consists of several Gypsy/Env proteins from Drosophila and Ceratitis fruit fly species. Gypsy is an endogenous retrovirus of Drosophila melanogaster. Phylogenetic studies suggest that occasional horizontal transfer events of gypsy occur between Drosophila species. gypsy possesses infective properties associated with the products of the envelope gene that might be at the origin of these interspecies transfers. 47138 pfam07254: Protein of unknown function (DUF1434). This family consists of several hypothetical bacterial proteins of around 135 residues in length. Members of this family all appear to be Enterobacterial proteins. The function of this family is unknown. 47139 pfam07255: Benyvirus 14KDa protein. This family consists of several Benyvirus specific 14KDa proteins of around 125 residues in length. Members of this family contain 9 conserved cysteine residues. The function of this family is unknown. 47140 pfam07256: Protein of unknown function (DUF1435). This family consists of several hypothetical Enterobacterial proteins of around 80 residues in length. The function of this family is unknown. 47141 pfam07257: Interleukin 23 alpha subunit p19 (IL23A). This family consists of several interleukin 23 alpha subunit p19 (IL23A) proteins. p19 shows no biological activity by itself; instead, it combines with the p40 subunit of IL-12 to form a novel, biologically active, composite cytokine IL-23. Activated dendritic cells secrete detectable levels of this complex. IL-23 binds to IL-12R beta 1 but fails to engage IL-12R beta 2; nonetheless, IL-23 activates Stat4 in PHA blast T cells. IL-23 induces strong proliferation of mouse memory (CD4(+)CD45Rb(low)) T cells, a unique activity of IL-23 as IL-12 has no effect on this cell population. Similar to IL-12, human IL-23 stimulates IFN-gamma production and proliferation in PHA blast T cells, as well as in CD45RO (memory) T cells. 47142 pfam07258: HCaRG protein. 
This family consists of several mammalian HCaRG (hypertension-related, calcium-regulated gene) proteins. HCaRG is negatively regulated by extracellular calcium concentration, and its basal mRNA levels are higher in hypertensive animals. HCaRG is a nuclear protein potentially involved in the control of cell proliferation. 47143 pfam07259: ProSAAS precursor. This family consists of several mammalian proSAAS precursor proteins. ProSAAS mRNA is expressed primarily in brain and other neuroendocrine tissues (pituitary, adrenal, pancreas); within brain, the mRNA is broadly distributed among neurons. ProSAAS is thought to be an endogenous inhibitor of prohormone convertase 1, and it may function as a neuropeptide. N-terminal fragments of proSAAS in intracellular Pick Bodies (PBs) may cause a functional disturbance of neurons in Pick's disease. 47144 pfam07260: Progressive ankylosis protein (ANKH). This family consists of several progressive ankylosis protein (ANK or ANKH) sequences. The ANK protein spans the outer cell membrane and shuttles inorganic pyrophosphate (PPi), a major inhibitor of physiologic and pathologic calcification, bone mineralisation and bone resorption. Mutations in ANK are thought to give rise to Craniometaphyseal dysplasia (CMD) which is a rare skeletal disorder characterised by progressive thickening and increased mineral density of craniofacial bones and abnormally developed metaphyses in long bones. 47145 pfam07261: Replication initiation and membrane attachment protein (DnaB). This family consists of several bacterial replication initiation and membrane attachment (DnaB) proteins. The DnaB protein is essential for both replication initiation and membrane attachment of the origin region of the chromosome and plasmid pUB110 in Bacillus subtilis. 
It is known that there are two different classes (DnaBI and DnaBII) in the DnaB mutants; DnaBI is essential for both chromosome and pUB110 replication, whereas DnaBII is necessary only for chromosome replication. 47146 pfam07262: Protein of unknown function (DUF1436). This family consists of several hypothetical bacterial proteins of around 160 residues in length. The function of this family is unknown. 47147 pfam07263: Dentin matrix protein 1 (DMP1). This family consists of several mammalian dentin matrix protein 1 (DMP1) sequences. The dentin matrix acidic phosphoprotein 1 (DMP1) gene has been mapped to human chromosome 4q21. DMP1 is a bone and teeth specific protein initially identified from mineralised dentin. DMP1 is primarily localised in the nuclear compartment of undifferentiated osteoblasts. In the nucleus, DMP1 acts as a transcriptional component for activation of osteoblast-specific genes like osteocalcin. During the early phase of osteoblast maturation, Ca(2+) surges into the nucleus from the cytoplasm, triggering the phosphorylation of DMP1 by a nuclear isoform of casein kinase II. This phosphorylated DMP1 is then exported out into the extracellular matrix, where it regulates nucleation of hydroxyapatite. DMP1 is a unique molecule that initiates osteoblast differentiation by transcription in the nucleus and orchestrates mineralised matrix formation extracellularly, at later stages of osteoblast maturation. The DMP1 gene has been found to be ectopically expressed in lung cancer although the reason for this is unknown. 47148 pfam07264: Etoposide-induced protein 2.4 (EI24). This family contains a number of eukaryotic etoposide-induced 2.4 (EI24) proteins approximately 350 residues long. In cells treated with the cytotoxic drug etoposide, EI24 is induced by p53. It has been suggested to play an important role in negative cell growth control. 47149 pfam07265: Tapetum specific protein TAP35/TAP44. 
This family consists of several plant tapetum specific proteins. Members of this family are found in Arabidopsis thaliana, Brassica napus and Sinapis alba. Members of this family may be involved in sporopollenin formation and/or deposition. 47150 pfam07266: Protein of unknown function (DUF1437). This family consists of several hypothetical bacterial proteins of around 235 residues in length. Members of this family are often referred to as YjiH proteins but their function is unknown. 47151 pfam07267: Nucleopolyhedrovirus capsid protein P87. This family consists of several Nucleopolyhedrovirus capsid protein P87 sequences. P87 is expressed late in infection and concentrated in infected cell nuclei. 47152 pfam07268: Exported protein precursor (EppA/BapA). This family consists of a number of exported protein precursor (EppA and BapA) sequences which seem to be specific to Borrelia burgdorferi (Lyme disease spirochete). bapA gene sequences are quite stable but the encoded proteins do not provoke a strong immune response in most individuals. Conversely, EppA proteins are much more antigenic but are more variable in sequence. It is thought that BapA and EppA play important roles during the Borrelia burgdorferi infectious cycle. 47153 pfam07269: VirB7 protein. This family consists of several VirB7 proteins from Agrobacterium and Rhizobium species. The virulence genes of the Agrobacterium tumefaciens Ti plasmid are grouped into six transcription units and direct the transfer of T-DNA into plant cells. VirB is the largest vir operon from the Ti plasmid pTiA6NC. It is thought that VirB proteins are involved in the formation of a transmembrane structure which mediates the passage of the transferred T-DNA molecule through the bacterial and plant cell membranes. 47154 pfam07270: Protein of unknown function (DUF1438). This family consists of several hypothetical proteins of around 170 residues in length which appear to be mouse specific. The function of this family is unknown.
47155 pfam07271: Cytadhesin P30/P32. This family consists of several Mycoplasma species specific Cytadhesin P32 and P30 proteins. P30 has been found to be membrane associated and localised on the tip organelle. It is thought that it is important in cytadherence and virulence. 47156 pfam07272: Orthoreovirus P17 protein. This family consists of several Orthoreovirus P17 proteins. P17 is specified by ORF2 of the S1 gene and represents a nonstructural protein which associates with cell membranes. 47157 pfam07273: Protein of unknown function (DUF1439). This family consists of several hypothetical bacterial proteins of around 190 residues in length. Several members of this family are annotated as being putative lipoproteins and are often known as YceB. The function of this family is unknown. 47158 pfam07274: Protein of unknown function (DUF1440). This family contains a number of bacterial proteins of unknown function approximately 180 residues long. These are possibly integral membrane proteins. 47159 pfam07275: Antirestriction protein (ArdA). This family consists of several bacterial antirestriction (ArdA) proteins. ArdA functions in bacterial conjugation to allow an unmodified plasmid to evade restriction in the recipient bacterium and yet acquire cognate modification. 47160 pfam07276: Apopolysialoglycoprotein (PSGP). This family represents a series of 13 residue repeats found in the apopolysialoglycoprotein of Oncorhynchus mykiss (Rainbow trout) and Oncorhynchus masou (Cherry salmon). Polysialoglycoprotein (PSGP) of unfertilised eggs of rainbow trout consists of tandem repeats of a glycotridecapeptide, Asp-Asp-Ala-Thr*-Ser*-Glu-Ala-Ala-Thr*-Gly-Pro-Ser-Gly (* denotes the attachment site of a polysialoglycan chain). In response to egg activation, PSGP is discharged by exocytosis into the space between the vitelline envelope and the plasma membrane, i.e.
the perivitelline space, where the 200-kDa PSGP molecules undergo rapid and dramatic depolymerisation by proteolysis into glycotridecapeptides. 47161 pfam07277: SapC. This family contains a number of bacterial SapC proteins approximately 250 residues long. In Campylobacter fetus, SapC forms part of a paracrystalline surface layer (S-layer) that confers serum resistance. 47162 pfam07278: Protein of unknown function (DUF1441). This family consists of several hypothetical Enterobacterial proteins of around 160 residues in length. The function of this family is unknown. 47163 pfam07279: Protein of unknown function (DUF1442). This family consists of several hypothetical Arabidopsis thaliana proteins of around 225 residues in length. The function of this family is unknown. 47164 pfam07280: Protein of unknown function (DUF1443). This family consists of several Baculovirus proteins of around 55 residues in length. The function of this family is unknown. 47165 pfam07281: Insulin-induced protein (INSIG). This family contains a number of eukaryotic Insulin-induced proteins (INSIG-1 and INSIG-2) approximately 200 residues long. INSIG-1 and INSIG-2 are found in the endoplasmic reticulum and bind the sterol-sensing domain of SREBP cleavage-activating protein (SCAP), preventing it from escorting SREBPs to the Golgi. Their combined action permits feedback regulation of cholesterol synthesis over a wide range of sterol concentrations. 47166 pfam07282: Putative transposase DNA-binding domain. This putative domain is found at the C-terminus of a large number of transposase proteins. This domain contains four conserved cysteines suggestive of a zinc binding domain. Given the need for transposases to bind DNA as well as the large number of DNA-binding zinc fingers we hypothesise this domain is DNA-binding. 47167 pfam07283: Conjugal transfer protein TrbH. This family contains TrbH, a bacterial conjugal transfer protein approximately 150 residues long. 
This contains a putative membrane lipoprotein lipid attachment site. 47168 pfam07284: 2-vinyl bacteriochlorophyllide hydratase (BCHF). This family contains the bacterial enzyme 2-vinyl bacteriochlorophyllide hydratase (EC:4.2.1.-) (approximately 150 residues long). This is involved in the light-independent bacteriochlorophyll biosynthesis pathway by adding water across the 2-vinyl group. 47169 pfam07285: Protein of unknown function (DUF1444). This family contains several hypothetical bacterial proteins of unknown function that are approximately 250 residues long. 47170 pfam07286: Protein of unknown function (DUF1445). This family represents a conserved region approximately 150 residues long within a number of hypothetical bacterial and eukaryotic proteins of unknown function. 47171 pfam07287: Protein of unknown function (DUF1446). This family consists of several bacterial and plant proteins of around 400 residues in length. The function of this family is unknown. 47172 pfam07288: Protein of unknown function (DUF1447). This family consists of several bacterial proteins of around 70 residues in length. The function of this family is unknown. 47173 pfam07289: Protein of unknown function (DUF1448). This family consists of several eukaryotic proteins of around 375 residues in length. The function of this family is unknown. 47174 pfam07290: Protein of unknown function (DUF1449). This family consists of several bacterial proteins of around 210 residues in length. The function of this family is unknown. 47175 pfam07291: Methylamine utilisation protein MauE. This family consists of several bacterial methylamine utilisation MauE proteins. Synthesis of enzymes involved in methylamine oxidation via methylamine dehydrogenase (MADH) is encoded by genes present in the mau cluster. 
MauE and MauD are specifically involved in the processing, transport, and/or maturation of the beta-subunit, and the absence of either of these proteins leads to production of a non-functional beta-subunit which becomes rapidly degraded. 47176 pfam07292: Nmi/IFP 35 domain (NID). This family represents a domain of approximately 90 residues that is tandemly repeated within interferon-induced 35 kDa protein (IFP 35) and the homologous N-myc-interactor (Nmi). This domain mediates Nmi-Nmi protein interactions and subcellular localisation. 47177 pfam07293: Protein of unknown function (DUF1450). This family consists of several hypothetical bacterial proteins of around 80 residues in length. Members of this family contain four highly conserved cysteine residues. The function of this family is unknown. 47178 pfam07294: Fibroin P25. This family consists of several insect fibroin P25 proteins. Silk fibroin produced by the silkworm Bombyx mori consists of a heavy chain, a light chain, and a glycoprotein, P25. The heavy and light chains are linked by a disulfide bond, and P25 associates with disulfide-linked heavy and light chains by noncovalent interactions. P25 plays an important role in maintaining integrity of the complex. 47179 pfam07295: Protein of unknown function (DUF1451). This family consists of several hypothetical bacterial proteins of around 160 residues in length. Members of this family contain four highly conserved cysteine residues toward the C-terminal region of the protein. The function of this family is unknown. 47180 pfam07296: TraP protein. This family consists of several bacterial conjugative transfer TraP proteins from Escherichia coli and Salmonella typhimurium. TraP appears to play a minor role in conjugation and may interact with TraB, which varies in sequence along with TraP, in order to stabilise the proposed transmembrane complex formed by the tra operon products.
47181 pfam07297: Dolichol phosphate-mannose biosynthesis regulatory protein (DPM2). This family consists of several eukaryotic dolichol phosphate-mannose biosynthesis regulatory (DPM2) proteins. Biosynthesis of glycosylphosphatidylinositol and N-glycan precursor is dependent upon a mannosyl donor, dolichol phosphate-mannose (DPM). DPM2, an 84 amino acid membrane protein expressed in the endoplasmic reticulum (ER), makes a complex with DPM1 that is essential for the ER localisation and stable expression of DPM1. Moreover, DPM2 enhances binding of dolichol phosphate, a substrate of DPM synthase. Biosynthesis of DPM in mammalian cells is regulated by DPM2. 47182 pfam07298: NnrU protein. This family consists of several plant and bacterial NnrU proteins. NnrU is thought to be involved in the reduction of nitric oxide. The exact function of NnrU is unclear. It is thought however that NnrU and perhaps NnrT are required for expression of both nirK and nor. 47183 pfam07299: Fibronectin-binding protein (FBP). This family consists of several bacterial fibronectin-binding proteins which are thought to be involved in virulence in Listeria species. 47184 pfam07300: Protein of unknown function (DUF1452). This family consists of several hypothetical bacterial proteins of around 120 residues in length. Members of this family seem to be found exclusively in Rhizobium, Agrobacterium and Pseudomonas species. The function of this family is unknown. 47185 pfam07301: Protein of unknown function (DUF1453). This family consists of several hypothetical bacterial proteins of around 150 residues in length. The function of this family is unknown. Members of this family seem to be found exclusively in the Order Bacillales. 47186 pfam07302: AroM protein. This family consists of several bacterial and archaeal AroM proteins. In Escherichia coli the aroM gene is cotranscribed with aroL. The function of this family is unknown. 47187 pfam07303: Occludin and RNA polymerase II elongation factor ELL. 
This family represents a conserved region approximately 100 residues long within eukaryotic occludin proteins and the RNA polymerase II elongation factor ELL. Occludin is an integral membrane protein that localises to tight junctions, while ELL is an elongation factor that can increase the catalytic rate of RNA polymerase II transcription by suppressing transient pausing by polymerase at multiple sites along the DNA. 47188 pfam07304: Steroid receptor RNA activator (SRA1). This family consists of several hypothetical mammalian steroid receptor RNA activator proteins. SRA-RNAs likely to encode stable proteins are widely expressed in breast cancer cell lines. SRA-RNA is a steroid receptor co-activator which acts as a functional RNA and is classified as belonging to the growing family of functional non-coding RNAs. 47189 pfam07305: Protein of unknown function (DUF1454). This family consists of several Enterobacterial sequences of around 200 residues in length which are often known as YiiQ proteins. The function of this family is unknown. 47190 pfam07306: Protein of unknown function (DUF1455). This family consists of several hypothetical putative outer membrane proteins which appear to be specific to Anaplasma marginale and Anaplasma ovis. 47191 pfam07307: Heptaprenyl diphosphate synthase (HEPPP synthase) subunit 1. This family contains subunit 1 of bacterial heptaprenyl diphosphate synthase (HEPPP synthase) (EC:2.5.1.30) (approximately 230 residues long). The enzyme consists of two subunits, both of which are required for catalysis of heptaprenyl diphosphate synthesis. 47192 pfam07308: Protein of unknown function (DUF1456). This family consists of several hypothetical bacterial proteins of around 150 residues in length. The function of this family is unknown. 47193 pfam07309: Flagellar protein FlaF. This family consists of several bacterial FlaF flagellar proteins.
FlaF and FlaG are trans-acting, regulatory factors that modulate flagellin synthesis during flagellum biogenesis. 47194 pfam07310: Protein of unknown function (DUF1457). This family contains a number of hypothetical bacterial proteins of unknown function approximately 200 residues long. 47195 pfam07311: Protein of unknown function (DUF1458). This family consists of several hypothetical bacterial proteins as well as one archaeal sequence. Members of this family are typically of around 70 residues in length. The function of this family is unknown. 47196 pfam07312: Protein of unknown function (DUF1459). This family consists of several hypothetical Caenorhabditis elegans proteins of around 85 residues in length. The function of this family is unknown. 47197 pfam07313: Protein of unknown function (DUF1460). This family consists of several hypothetical bacterial proteins of around 260 residues in length. The function of this family is unknown. 47198 pfam07314: Protein of unknown function (DUF1461). This family contains a number of hypothetical bacterial proteins of unknown function approximately 200 residues long. These are possibly integral membrane proteins. 47199 pfam07315: Protein of unknown function (DUF1462). This family consists of several hypothetical bacterial proteins of around 100 residues in length. The function of this family is unknown. 47200 pfam07316: Protein of unknown function (DUF1463). This family consists of several hypothetical bacterial proteins of around 140 residues in length. Members of this family seem to be found exclusively in Borrelia burgdorferi (Lyme disease spirochete). The function of this family is unknown. 47201 pfam07317: YcgR protein. This family consists of several hypothetical YcgR proteins. YcgR may be involved in the flagellar motor function and may be a new member of the flagellar regulon. 47202 pfam07318: Protein of unknown function (DUF1464). 
This family consists of several hypothetical archaeal proteins of around 350 residues in length. The function of this family is unknown. 47203 pfam07319: Primosomal protein DnaI N-terminus. This family represents the N-terminus (approximately 120 residues) of bacterial primosomal DnaI proteins, although one family member appears to be of viral origin. DnaI is one of the components of the Bacillus subtilis replication restart primosome, and is required for the DnaB75-dependent loading of the DnaC helicase. 47204 pfam07320: Harpin-induced protein 1 (Hin1). This family contains a number of plant harpin-induced 1 (Hin1) proteins, which are involved in the plant hypersensitive response (HR). 47205 pfam07321: Type III secretion protein YscO. This family contains the bacterial type III secretion protein YscO, which is approximately 150 residues long. YscO has been shown to be required for high-level expression and secretion of the anti-host proteins V antigen and Yops in Yersinia pestis. 47206 pfam07322: Seadornavirus Vp10. This family consists of several Seadornavirus Vp10 proteins found in the Banna and Kadipiro viruses. Members of this family are typically around 240 residues in length. The function of this family is unknown. 47207 pfam07323: Protein of unknown function (DUF1465). This family consists of several hypothetical bacterial proteins of around 180 residues in length. The function of this family is unknown. 47208 pfam07324: DiGeorge syndrome critical region 6 (DGCR6) protein. This family contains DiGeorge syndrome critical region 6 (DGCR6) proteins (approximately 200 residues long) of a number of vertebrates. DGCR6 is a candidate for involvement in the DiGeorge syndrome pathology by playing a role in neural crest cell migration into the third and fourth pharyngeal pouches, the structures from which derive the organs affected in DiGeorge syndrome. Also found in this family is the Drosophila melanogaster gonadal protein gdl.
47209 pfam07325: Curtovirus V2 protein. This family consists of several Curtovirus V2 proteins. The exact function of V2 is unclear but it is known that the protein is required for a successful host infection process. 47210 pfam07326: Protein of unknown function (DUF1466). This family consists of several hypothetical mammalian proteins of around 240 residues in length. 47211 pfam07327: Neuroparsin. This family consists of several locust specific neuroparsin proteins. Neuroparsins are produced by the A1 type of protocerebral median neurosecretory cells of the PI-CC system and display pleiotropic activities: inhibition of the effect of juvenile hormone, stimulation of fluid reabsorption of isolated recta, induction of an increase in hemolymph lipid and trehalose levels, and neurotrophic effects. 47212 pfam07328: T-DNA border endonuclease VirD1. This family consists of several T-DNA border endonuclease VirD1 proteins which appear to be found exclusively in Agrobacterium species. Agrobacterium, a plant pathogen, is capable of stably transforming the plant cell with a segment of its own DNA called T-DNA (transferred DNA). This process depends, among other factors, on the specialised bacterial virulence proteins VirD1 and VirD2 that excise the T-DNA from its adjacent sequences. VirD1 is thought to interact with VirD2 in this process. 47213 pfam07329: YscB protein. This family consists of several bacterial Yop proteins translocation protein B (YscB) sequences. Yersinia pestis, the causative agent of plague, exports a set of virulence proteins called Yops upon contact with eukaryotic cells. YscB along with YopN, TyeA, SycN and LcrG proteins are necessary to block Yop secretion in the presence of calcium and before contact with a eukaryotic cell. YscB is thought to be a chaperone for Yop proteins. 47214 pfam07330: Protein of unknown function (DUF1467). This family consists of several bacterial proteins of around 90 residues in length. The function of this family is unknown.
47215 pfam07331: Protein of unknown function (DUF1468). This family consists of several hypothetical bacterial proteins of around 150 residues in length. The function of this family is unknown. 47216 pfam07332: Protein of unknown function (DUF1469). This family consists of several hypothetical bacterial proteins of around 140 residues in length. Members of the family seem to be found exclusively in Actinomycetes. The function of this family is unknown. 47217 pfam07333: S locus-related glycoprotein 1 binding pollen coat protein (SLR1-BP). This family consists of a number of cysteine rich SLR1 binding pollen coat like proteins. Adhesion of pollen grains to the stigmatic surface is a critical step during sexual reproduction in plants. In Brassica, S locus-related glycoprotein 1 (SLR1), a stigma-specific protein belonging to the S gene family of proteins, has been shown to be involved in this step. SLR1-BP specifically binds SLR1 with high affinity. The SLR1-BP gene is specifically expressed in pollen at late stages of development and is a member of the class A pollen coat protein (PCP) family, which includes PCP-A1, an SLG (S locus glycoprotein)-binding protein. 47218 pfam07334: Interferon-induced 35 kDa protein (IFP 35) N-terminus. This family represents the N-terminus of interferon-induced 35 kDa protein (IFP 35) (approximately 80 residues long), which contains a leucine zipper motif in an alpha helical configuration. This family also includes N-myc-interactor (Nmi), a homologous interferon-induced protein. 47219 pfam07335: Fungal chitosanase. This family consists of several fungal chitosanase proteins. Chitin, xylan, 6-O-sulphated chitosan and O-carboxymethyl chitin are indigestible by chitosanase. 47220 pfam07336: Protein of unknown function (DUF1470). This family consists of several hypothetical bacterial proteins of around 180 residues in length. Members of this family are found in Streptomyces, Rhizobium, Ralstonia, Agrobacterium and Bradyrhizobium species. 
The function of this family is unknown. 47221 pfam07337: DC-EC Repeat. This repeat is found in the CagY proteins - part of the CAG pathogenicity island - and involved in delivery of the protein CagA into host cells. It forms part of a surface needle structure, and this repeat may form an alpha-helical rod structure. A conserved -DC- and -EC- can be seen regularly spaced in the alignment. 47222 pfam07338: Protein of unknown function (DUF1471). This family consists of several hypothetical Enterobacterial proteins of around 90 residues in length. Some members of this family are annotated as ydgH precursors and contain two copies of this region, one at the N-terminus and the other at the C-terminus. The function of this family is unknown. 47223 pfam07339: Protein of unknown function (DUF1472). This family consists of several Enterobacterial proteins of around 125 residues in length and contains 6 highly conserved cysteine residues. The function of this family is unknown. 47224 pfam07340: Cytomegalovirus IE1 protein. Expression from a human cytomegalovirus early promoter (E1.7) has been shown to be activated in trans by the IE2 gene product. Although the IE1 gene product alone had no effect on this early viral promoter, maximal early promoter activity was detected when both IE1 and IE2 gene products were present. The IE1 protein from cytomegalovirus is also known as UL123. 47225 pfam07341: Protein of unknown function (DUF1473). This family consists of several hypothetical bacterial proteins of around 150 residues in length. Members of this family seem to be found exclusively in Borrelia burgdorferi (Lyme disease spirochete). The function of this family is unknown. 47226 pfam07342: Protein of unknown function (DUF1474). This family consists of several bacterial proteins of around 100 residues in length. Members of this family seem to be found exclusively in Staphylococcus aureus. The function of this family is unknown.
47227 pfam07343: Protein of unknown function (DUF1475). This family consists of several hypothetical plant proteins of around 250 residues in length. Members of this family seem to be found exclusively in Arabidopsis thaliana. The function of this family is unknown. 47228 pfam07344: Amastin surface glycoprotein. This family contains the eukaryotic surface glycoprotein amastin (approximately 180 residues long). In Trypanosoma cruzi, amastin is particularly abundant during the amastigote stage. 47229 pfam07345: Protein of unknown function (DUF1476). This family consists of several hypothetical bacterial proteins of around 100 residues in length. Members of this family are found in Bradyrhizobium, Rhizobium, Brucella and Caulobacter species. The function of this family is unknown. 47230 pfam07346: Protein of unknown function (DUF1477). This family consists of several hypothetical Nucleopolyhedrovirus proteins of around 100 residues in length. The function of this family is unknown. 47231 pfam07347: NADH:ubiquinone oxidoreductase subunit B14.5a (Complex I-B14.5a). This family contains the eukaryotic NADH:ubiquinone oxidoreductase subunit B14.5a (Complex I-B14.5a) (EC:1.6.5.3). This is approximately 100 residues long, and forms part of a multiprotein complex that resides on the inner mitochondrial membrane. The main function of the complex is the transport of electrons from NADH to ubiquinone, accompanied by translocation of protons from the mitochondrial matrix to the intermembrane space. 47232 pfam07348: Syd protein. This family contains a number of bacterial Syd proteins approximately 180 residues long. It has been suggested that Syd is loosely associated with the cytoplasmic surface of the cytoplasmic membrane, and that interaction with SecY may be involved in this membrane association. 47233 pfam07349: Protein of unknown function (DUF1478). This family consists of several hypothetical Sapovirus proteins of around 165 residues in length.
The function of this family is unknown. 47234 pfam07350: Protein of unknown function (DUF1479). This family consists of several hypothetical Enterobacterial proteins, of around 420 residues in length. Members of this family are often known as YbiU. The function of this family is unknown. 47235 pfam07351: Protein of unknown function (DUF1480). This family consists of several hypothetical Enterobacterial proteins of around 80 residues in length. The function of this family is unknown. 47236 pfam07352: Bacteriophage Mu Gam like protein. This family consists of bacterial and phage Gam proteins. The gam gene of bacteriophage Mu encodes a protein which protects linear double stranded DNA from exonuclease degradation in vitro and in vivo. 47237 pfam07353: Uroplakin II. This family contains uroplakin II, which is approximately 180 residues long and seems to be restricted to mammals. Uroplakin II is an integral membrane protein, and is one of the components of the apical plaques of mammalian urothelium formed by the asymmetric unit membrane - this is believed to play a role in strengthening the urothelial apical surface to prevent the cells from rupturing during bladder distension. 47238 pfam07354: Zona-pellucida-binding protein (Sp38). This family contains a number of zona-pellucida-binding proteins that seem to be restricted to mammals. These are sperm proteins that bind to the 90-kDa family of zona pellucida glycoproteins in a calcium-dependent manner. These represent some of the specific molecules that mediate the first steps of gamete interaction, allowing fertilisation to occur. 47239 pfam07355: Glycine/sarcosine/betaine reductase selenoprotein B (GRDB). This family represents a conserved region approximately 350 residues long within the selenoprotein B component of the bacterial glycine, sarcosine and betaine reductase complexes. 47240 pfam07356: Protein of unknown function (DUF1481). 
This family consists of several hypothetical bacterial proteins of around 230 residues in length. Members of this family are often referred to as YjaH and are found in the Orders Vibrionales and Enterobacteriales. The function of this family is unknown. 47241 pfam07357: Dinitrogenase reductase ADP-ribosyltransferase (DRAT). This family consists of several bacterial dinitrogenase reductase ADP-ribosyltransferase (DRAT) proteins. Members of this family seem to be specific to Rhodospirillum, Rhodobacter and Azospirillum species. Dinitrogenase reductase ADP-ribosyl transferase (DRAT) carries out the transfer of the ADP-ribose from NAD to the Arg-101 residue of one subunit of the dinitrogenase reductase homodimer, resulting in inactivation of that enzyme. Dinitrogenase reductase-activating glycohydrolase (DRAG) removes the ADP-ribose group attached to dinitrogenase reductase, thus restoring nitrogenase activity. The DRAT-DRAG system negatively regulates nitrogenase activity in response to exogenous NH4+ or energy limitation in the form of a shift to darkness or to anaerobic conditions. 47242 pfam07358: Protein of unknown function (DUF1482). This family consists of several Enterobacterial proteins of around 60 residues in length. The function of this family is unknown. 47243 pfam07359: Liver-expressed antimicrobial peptide 2 precursor (LEAP-2). This family consists of several mammalian liver-expressed antimicrobial peptide 2 (LEAP-2) sequences. LEAP-2 is a cysteine-rich, and cationic protein. LEAP-2 contains a core structure with two disulfide bonds formed by cysteine residues in relative 1-3 and 2-4 positions. LEAP-2 is synthesised as a 77-residue precursor, which is predominantly expressed in the liver and highly conserved among mammals. The largest native LEAP-2 form of 40 amino acid residues is generated from the precursor at a putative cleavage site for a furin-like endoprotease. 
In contrast to smaller LEAP-2 variants, this peptide exhibits dose-dependent antimicrobial activity against selected microbial model organisms. The exact function of this family is unclear. 47244 pfam07360: Protein of unknown function (DUF1483). This family consists of several bacterial and phage proteins of around 410 residues in length. Bacterial members of this family seem to be found exclusively in Streptococcus species. The function of this family is unknown. 47245 pfam07361: Cytochrome b562. This family contains the bacterial cytochrome b562. This forms a four-helix bundle that non-covalently binds a single heme prosthetic group... 47246 pfam07362: Post-segregation antitoxin CcdA. This family consists of several Enterobacterial post-segregation antitoxin CcdA proteins. The F plasmid-carried bacterial toxin, the CcdB protein, is known to act on DNA gyrase in two different ways. CcdB poisons the gyrase-DNA complex, blocking the passage of polymerases and leading to double-strand breakage of the DNA. Alternatively, in cells that overexpress CcdB, the A subunit of DNA gyrase (GyrA) has been found as an inactive complex with CcdB. Both poisoning and inactivation can be prevented and reversed in the presence of the F plasmid-encoded antidote, the CcdA protein. 47247 pfam07363: Protein of unknown function (DUF1484). This family consists of several hypothetical bacterial proteins of around 110 residues in length. Members of this family appear to be found exclusively in Ralstonia solanacearum. The function of this family is unknown. 47248 pfam07364: Protein of unknown function (DUF1485). This family consists of several hypothetical bacterial proteins of around 300 residues in length. Members of this family all appear to be in the Phylum Proteobacteria. The function of this family is unknown. 47249 pfam07365: Alpha conotoxin precursor. This family consists of several alpha conotoxin precursor proteins from a number of Conus species. 
The alpha-conotoxins are small peptide neurotoxins from the venom of fish-hunting cone snails which block nicotinic acetylcholine receptors (nAChRs). 47250 pfam07366: Protein of unknown function (DUF1486). This family consists of several hypothetical bacterial proteins of around 125 residues in length. The function of this family is unknown. 47251 pfam07367: Fungal fruit body lectin. This family consists of several fungal fruit body lectin proteins. Fruit body lectins are thought to have insecticidal activity and may also function in capturing nematodes. 47252 pfam07368: Protein of unknown function (DUF1487). This family consists of several uncharacterised proteins from Drosophila melanogaster. The function of this family is unknown. 47253 pfam07369: Protein of unknown function (DUF1488). This family consists of several hypothetical bacterial proteins of around 85 residues in length. The function of this family is unknown. 47254 pfam07370: Protein of unknown function (DUF1489). This family consists of several hypothetical bacterial proteins of around 150 residues in length. Members of this family seem to be found exclusively in the Class Alphaproteobacteria. The function of this family is unknown. 47255 pfam07371: Protein of unknown function (DUF1490). This family consists of several hypothetical bacterial proteins of around 90 residues in length. Members of the family seem to be found exclusively in Mycobacterium species. The function of this family is unknown. 47256 pfam07372: Protein of unknown function (DUF1491). This family consists of several bacterial proteins of around 115 residues in length. Members of this family seem to be found exclusively in the Class Alphaproteobacteria. The function of this family is unknown. 47257 pfam07373: CAMP factor (Cfa). This family consists of several bacterial CAMP factor (Cfa) proteins which seem to be specific to Streptococcus species. 
The CAMP reaction is a synergistic lysis of erythrocytes by the interaction of an extracellular protein (CAMP factor) produced by some streptococcal species with the Staphylococcus aureus sphingomyelinase C (beta-toxin). 47258 pfam07374: Protein of unknown function (DUF1492). This family consists of several hypothetical, highly conserved Streptococcal and related phage proteins of around 100 residues in length. The function of this family is unknown. 47259 pfam07375: Tenuivirus PV2 protein. This family consists of several Tenuivirus PV2 proteins. PV2 is thought to be a membrane associated protein. The function of this family is unclear. 47260 pfam07376: Prosystemin. This family consists of several plant specific prosystemin proteins. Prosystemin is the precursor protein of the 18 amino acid wound signal systemin which activates systemic defence in plant leaves against insect herbivores. 47261 pfam07377: Protein of unknown function (DUF1493). This family consists of several bacterial proteins of around 115 residues in length. Members of this family seem to be found exclusively in Salmonella and Yersinia species and several have been described as being putative cytoplasmic proteins. The function of this family is unknown. 47262 pfam07378: Flagellar protein FlbT. This family consists of several FlbT proteins. FlbT is a post-transcriptional regulator of flagellin. FlbT associates with the 5' untranslated region (UTR) of fljK (25 kDa flagellin) mRNA, and this association requires a predicted loop structure in the transcript. Mutations within this loop abolish FlbT association and result in increased mRNA stability. It is therefore thought that FlbT promotes the degradation of flagellin mRNA by associating with the 5' UTR. 47263 pfam07379: Protein of unknown function (DUF1494). This family consists of several bacterial proteins of around 175 residues in length. Members of this family seem to be found exclusively in Chlamydia species. 
The function of this family is unknown. 47264 pfam07380: Pneumovirus M2 protein. This family consists of several Pneumovirus M2 proteins. The M2-1 protein of respiratory syncytial virus (RSV) is a transcription processivity factor that is essential for virus replication. 47265 pfam07381: Protein of unknown function (DUF1495). This family consists of several hypothetical archaeal proteins of around 110 residues in length. The function of this family is unknown, although one sequence is described as a putative HTH transcription regulator. 47266 pfam07382: Histone H1-like nucleoprotein HC2. This family contains the bacterial histone H1-like nucleoprotein HC2 (approximately 200 residues long), which seems to be found mostly in Chlamydia. HC2 functions in DNA condensation, although it has been suggested that it also has other roles. 47267 pfam07383: Protein of unknown function (DUF1496). This family consists of several bacterial proteins of around 90 residues in length. Members of this family seem to be found exclusively in the Orders Vibrionales and Enterobacteriales. The function of this family is unknown. 47268 pfam07384: Protein of unknown function (DUF1497). This family consists of several phage and bacterial proteins of around 59 residues in length. Members of this family seem to be found exclusively in Lactococcus lactis and the bacteriophages that infect this organism. The function of this family is unknown. 47269 pfam07385: Protein of unknown function (DUF1498). This family consists of several hypothetical bacterial proteins of around 225 residues in length. The function of this family is unknown. 47270 pfam07386: Protein of unknown function (DUF1499). This family consists of several hypothetical bacterial and plant proteins of around 125 residues in length. The function of this family is unknown. 47271 pfam07387: Seadornavirus VP7. This family consists of several Seadornavirus specific VP7 proteins of around 305 residues in length. 
The function of this family is unknown. 47272 pfam07388: Alpha-2,8-polysialyltransferase (POLYST). This family contains the bacterial enzyme alpha-2,8-polysialyltransferase (EC:2.4.99.-) (approximately 500 residues long). This catalyses the polycondensation of alpha-2,8-linked sialic acid required for the synthesis of polysialic acid (PSA). 47273 pfam07389: Protein of unknown function (DUF1500). This family consists of several Orthopoxvirus specific proteins of around 100 residues in length. The function of this family is unknown. 47274 pfam07390: Mycoplasma P30 protein. This family consists of several P30 proteins which seem to be specific to Mycoplasma agalactiae. P30 is a 30-kDa immunodominant antigen and is known to be a transmembrane protein. 47275 pfam07391: NPR nonapeptide repeat. This nine-residue repeat is called NPR, after NonaPeptide Repeat. It is found in two malarial proteins and has the consensus EEhhEEhhP where h stands for a hydrophobic amino acid. 47276 pfam07392: Cyclin-dependent kinase inhibitor 2a p19Arf N-terminus. This family represents the N-terminus (approximately 50 residues) of cyclin-dependent kinase inhibitor 2a p19Arf, which seems to be restricted to mammals. This is a tumour-suppressor protein that has been shown to inhibit the growth of human tumour cells lacking functional p53 by inducing a transient G2 arrest and subsequently apoptosis. 47277 pfam07393: Exocyst complex component Sec10. This family contains the Sec10 component (approximately 650 residues long) of the eukaryotic exocyst complex, which specifically affects the synthesis and delivery of secretory and basolateral plasma membrane proteins. 47278 pfam07394: Protein of unknown function (DUF1501). This family contains a number of hypothetical bacterial proteins of unknown function approximately 400 residues long. 47279 pfam07395: Mig-14. This family contains a number of bacterial mig-14 proteins (approximately 270 residues long). 
In Salmonella, mig-14 contributes to resistance to antimicrobial peptides, although the mechanism is not fully understood. 47280 pfam07396: Phosphate-selective porin O and P. This family represents a conserved region approximately 400 residues long within the bacterial phosphate-selective porins O and P. These are anion-specific porins, the binding site of which has a higher affinity for phosphate than chloride ions. Porin O has a higher affinity for polyphosphates, while porin P has a higher affinity for orthophosphate. In P. aeruginosa, porin O was found to be expressed only under phosphate-starvation conditions during the stationary growth phase. 47281 pfam07397: Repeat of unknown function (DUF1502). This family consists of a number of repeats of around 34 residues in length. Members of this family seem to be found exclusively in three hypothetical Murid herpesvirus 4 proteins. The function of this family is unknown. 47282 pfam07398: Protein of unknown function (DUF1503). This family consists of several hypothetical bacterial proteins of around 250 residues in length. Members of this family seem to be found exclusively in Streptomyces coelicolor and Mycobacterium tuberculosis. The function of this family is unknown. 47283 pfam07399: Protein of unknown function (DUF1504). This family consists of several hypothetical bacterial proteins of around 440 residues in length. The function of this family is unknown. 47284 pfam07400: Interleukin 11. This family contains interleukin 11 (approximately 200 residues long). This is a secreted protein that stimulates megakaryocytopoiesis, resulting in increased production of platelets, as well as activating osteoclasts, inhibiting epithelial cell proliferation and apoptosis, and inhibiting macrophage mediator production. These functions may be particularly important in mediating the hematopoietic, osseous and mucosal protective effects of interleukin 11. Family members seem to be restricted to mammals. 
47285 pfam07401: Bovine Lentivirus VIF protein. This family consists of several Lentivirus viral infectivity factor (VIF) proteins. VIF is known to be essential for the ability of cell-free virus preparations to infect cells. Members of this family are specific to Bovine immunodeficiency virus (BIV) and Jembrana disease virus which also infects cattle. 47286 pfam07402: Human herpesvirus U26 protein. This family consists of several Human herpesvirus U26 proteins of around 300 residues in length. The function of this family is unknown. 47287 pfam07403: Protein of unknown function (DUF1505). This family consists of several uncharacterised Caenorhabditis elegans proteins of around 115 residues in length. Members of this family contain 6 highly conserved cysteine residues. The function of this family is unknown. 47288 pfam07404: Telomere-binding protein beta subunit (TEBP beta). This family consists of several telomere-binding protein beta subunits which appear to be specific to the family Oxytrichidae. Telomeres are specialised protein-DNA complexes that compose the ends of eukaryotic chromosomes. Telomeres protect chromosome termini from degradation and recombination and act together with telomerase to ensure complete genome replication. TEBP beta forms a complex with TEBP alpha and this complex is able to recognise and bind ssDNA to form a sequence-specific, telomeric nucleoprotein complex that caps the very 3' ends of chromosomes. 47289 pfam07405: Protein of unknown function (DUF1506). This family consists of several bacterial proteins of around 130 residues in length. Members of this family seem to be specific to Borrelia burgdorferi (Lyme disease spirochete). The function of this family is unknown. 47290 pfam07406: NICE-3 protein. This family consists of several eukaryotic NICE-3 and related proteins. 
The gene coding for NICE-3 is part of the epidermal differentiation complex (EDC) which comprises a large number of genes that are of crucial importance for the maturation of the human epidermis. The function of NICE-3 is unknown. 47291 pfam07407: Seadornavirus VP6 protein. This family consists of several VP6 proteins from the Banna virus as well as a related protein VP5 from the Kadipiro virus. Members of this family are typically of around 420 residues in length. The function of this family is unknown. 47292 pfam07408: Protein of unknown function (DUF1507). This family consists of several hypothetical bacterial proteins of around 90 residues in length. The function of this family is unknown. 47293 pfam07409: Phage protein GP46. This family contains GP46 phage proteins (approximately 120 residues long). 47294 pfam07410: Streptococcus thermophilus bacteriophage Gp111 protein. This family consists of several Streptococcus thermophilus bacteriophage Gp111 proteins of around 110 residues in length. The function of this family is unknown. 47295 pfam07411: Domain of unknown function (DUF1508). This family represents a series of bacterial domains of unknown function of around 50 residues in length. Members of this family are often found as tandem repeats and in some cases represent the whole protein. All member proteins are described as being hypothetical. 47296 pfam07412: Geminin. This family contains the eukaryotic protein geminin (approximately 200 residues long). Geminin inhibits DNA replication by preventing the incorporation of MCM complex into prereplication complex, and is degraded during the mitotic phase of the cell cycle. It has been proposed that geminin inhibits DNA replication during S, G2, and M phases and that geminin destruction at the metaphase-anaphase transition permits replication in the succeeding cell cycle. 47297 pfam07413: Betaherpesvirus immediate-early glycoprotein UL37. 
This family consists of several Betaherpesvirus immediate-early glycoprotein UL37 sequences. The human cytomegalovirus (HCMV) UL37 immediate-early regulatory protein is a type I integral membrane N-glycoprotein which traffics through the ER and the Golgi network. 47298 pfam07414: Yersiniabactin synthetase thiazolinyl reductase component YbtU. This family represents the thiazolinyl reductase component YbtU (approximately 350 residues long) of the bacterial four-protein yersiniabactin synthetase complex. Yersiniabactin is a virulence factor secreted by Yersinia pestis in iron-deficient microenvironments, in order to scavenge ferric ions. 47299 pfam07415: Gammaherpesvirus latent membrane protein (LMP2) protein. This family consists of several Gammaherpesvirus latent membrane protein (LMP2) proteins. Epstein-Barr virus is a human Gammaherpesvirus that infects and establishes latency in B lymphocytes in vivo. The latent membrane protein 2 (LMP2) gene is expressed in latently infected B cells and encodes two protein isoforms, LMP2A and LMP2B, that are identical except for an additional N-terminal 119 aa cytoplasmic domain which is present in the LMP2A isoform. LMP2A is thought to play a key role in either the establishment or the maintenance of latency and/or the reactivation of productive infection from the latent state. The significance of LMP2B and its role in pathogenesis remain unclear. 47300 pfam07416: Crinivirus P26 protein. This family consists of several Crinivirus P26 proteins which seem to be found exclusively in the Lettuce infectious yellows virus. The function of this family is unknown. 47301 pfam07417: Transcriptional regulator Crl. This family contains the bacterial transcriptional regulator Crl (approximately 130 residues long). This is a transcriptional regulator of the csgA curlin subunit gene for curli fibres that are found on the surface of certain bacteria. 47302 pfam07418: Acidic phosphoprotein precursor PCEMA1. 
This family consists of several acidic phosphoprotein precursor PCEMA1 sequences which appear to be found exclusively in Plasmodium chabaudi. PCEMA1 is an antigen that is associated with the membrane of the infected erythrocyte throughout the entire intraerythrocytic cycle. The exact function of this family is unclear. 47303 pfam07419: PilM. This family contains the bacterial protein PilM (approximately 150 residues long). PilM is an inner membrane protein that has been predicted to function as a component of the pilin transport apparatus and thin-pilus basal body. 47304 pfam07420: Protein of unknown function (DUF1509). This family consists of several uncharacterised viral proteins from the Marek's disease-like viruses. Members of this family are typically around 400 residues in length. The function of this family is unknown. 47305 pfam07421: Neurotensin/neuromedin N precursor. This family contains the precursor of bacterial neurotensin/neuromedin N (approximately 170 residues long). This is the common precursor of two biologically active related peptides, neurotensin and neuromedin N. It undergoes tissue-specific processing leading to the formation in some tissues and cancer cell lines of large peptides ending with the neurotensin or neuromedin N sequence. 47306 pfam07422: Sexual stage antigen s48/45. This family contains sexual stage s48/45 antigens from Plasmodium (approximately 450 residues long). These are surface proteins expressed by Plasmodium male and female gametes that have been shown to play a conserved and important role in fertilisation. 47307 pfam07423: Protein of unknown function (DUF1510). This family consists of several hypothetical bacterial proteins of around 200 residues in length. The function of this family is unknown. 47308 pfam07424: TrbM. This family contains the bacterial protein TrbM (approximately 180 residues long). 
In Comamonas testosteroni T-2, TrbM is derived from the IncP1beta plasmid pTSA, which encodes the widespread genes for p-toluenesulfonate (TSA) degradation. 47309 pfam07425: Pardaxin. This family consists of several Pardaxin proteins. Pardaxin, a 33-amino-acid pore-forming polypeptide toxin isolated from the Red Sea Moses sole Pardachirus marmoratus, has a helix-hinge-helix structure. This is a common structural motif found both in antibacterial peptides that can act selectively on bacterial membranes (e.g., cecropin), and in cytotoxic peptides that can lyse both mammalian and bacterial cells (e.g., melittin). Pardaxin possesses a high antibacterial activity with a significantly reduced haemolytic activity towards human red blood cells compared with melittin. Pardaxin has also been found to have a shark repellent action. 47310 pfam07426: Dynactin subunit p22. This family contains p22, the smallest subunit of dynactin, a complex that binds to cytoplasmic dynein and is a required activator for cytoplasmic dynein-mediated vesicular transport. Dynactin localises to the cleavage furrow and to the midbodies of dividing cells, suggesting that it may function in cytokinesis. Family members are approximately 170 residues long and seem to be restricted to mammals. 47311 pfam07427: Protein of unknown function (DUF1511). This family represents a conserved region approximately 130 residues long within a number of hypothetical archaeal proteins of unknown function. Some family members contain more than one copy of this region. 47312 pfam07428: 15-O-acetyltransferase Tri3. This family represents a conserved region approximately 400 residues long within 15-O-acetyltransferase (Tri3), which seems to be restricted to ascomycete fungi. In Fusarium sporotrichioides, this is required for acetylation of the C-15 hydroxyl group of trichothecenes in the biosynthesis of T-2 toxin. 47313 pfam07429: 4-alpha-L-fucosyltransferase (Fuc4NAc transferase). 
This family contains the bacterial enzyme 4-alpha-L-fucosyltransferase (Fuc4NAc transferase) (EC 2.4.1.-) (approximately 360 residues long). This catalyses the synthesis of Fuc4NAc-ManNAcA-GlcNAc-PP-Und (lipid III) as part of the biosynthetic pathway of enterobacterial common antigen (ECA), a polysaccharide comprised of the trisaccharide repeat unit Fuc4NAc-ManNAcA-GlcNAc. 47314 pfam07430: Phloem filament protein PP1. This family represents a conserved region approximately 200 residues long, four copies of which are found within the plant phloem filament protein PP1. This is one of the constituents of the proteinaceous filaments found in the sieve elements of Cucurbita phloem. 47315 pfam07431: Protein of unknown function (DUF1512). This family consists of several archaeal proteins of around 370 residues in length. The function of this family is unknown. 47316 pfam07432: Histone H1-like protein Hc1. This family consists of several bacterial histone H1-like Hc1 proteins which appear to be specific to Chlamydia species. Chlamydiae are prokaryotic obligate intracellular parasites that undergo a biphasic life cycle involving an infectious, extracellular form known as elementary bodies and an intracellular, replicating form termed reticulate bodies. The gene coding for Hc1 is expressed only during the late stages of the chlamydial life cycle concomitant with the reorganisation of chlamydial reticulate bodies into elementary bodies, suggesting that the Hc1 protein plays a role in the condensation of chlamydial chromatin during intracellular differentiation. 47317 pfam07433: Protein of unknown function (DUF1513). This family consists of several bacterial proteins of around 360 residues in length. The function of this family is unknown. 47318 pfam07434: CblD like pilus biogenesis initiator. This family consists of several minor pilin proteins including CblD from Burkholderia cepacia which is known to be the initiator of pilus biogenesis. 
The family also contains a variety of Enterobacterial minor pilin proteins. 47319 pfam07435: YycH protein. This family contains the bacterial protein YycH (approximately 450 residues long). The function of this protein is not known. 47320 pfam07436: Curtovirus V3 protein. This family consists of several Curtovirus V3 proteins of around 90 residues in length. The function of this family is unknown. 47321 pfam07437: YfaZ precursor. This family contains the precursor of the bacterial protein YfaZ (approximately 180 residues long). Many members of this family are hypothetical proteins. 47322 pfam07438: Protein of unknown function (DUF1514). This family consists of several Staphylococcus aureus and related bacteriophage proteins of around 65 residues in length. The function of this family is unknown. 47323 pfam07439: Protein of unknown function (DUF1515). This family consists of several hypothetical bacterial proteins of around 130 residues in length. Members of this family seem to be found exclusively in Rhizobium species. The function of this family is unknown. 47324 pfam07440: Caerin 1 protein. This family consists of several caerin 1 proteins from Litoria species. The caerin 1 peptides are among the most powerful of the broad-spectrum antibiotic amphibian peptides. 47325 pfam07441: SigmaK-factor processing regulatory protein BofA. This family contains the sigmaK-factor processing regulatory protein BofA (Bypass-of-forespore protein A) (approximately 80 residues long). During sporulation in Bacillus subtilis, transcription is controlled in the developing sporangium by a cascade of sporulation-specific transcription factors (sigma factors). Following engulfment, processing of sigmaK is inhibited by BofA. It has been suggested that this effect is exerted by alteration of the level of the SpoIVFA protein. 47326 pfam07442: Ponericin. 
This family contains a number of ponericin peptides (approximately 30 residues long) from the venom of the predatory ant Pachycondyla goeldii. These peptides exhibit antibacterial and insecticidal properties, and may adopt an amphipathic alpha-helical structure in polar environments such as cell membranes. 47327 pfam07443: HepA-related protein (HARP). This family represents a conserved region approximately 60 residues long within eukaryotic HepA-related protein (HARP). This exhibits single-stranded DNA-dependent ATPase activity, and is ubiquitously expressed in human and mouse tissues. Family members may contain more than one copy of this region. 47328 pfam07444: Ycf66 protein N-terminus. This family represents the N-terminus (approximately 80 residues) of Ycf66, a protein that seems to be restricted to eukaryotes that contain chloroplasts and to cyanobacteria. 47329 pfam07445: Primosomal replication protein priB and priC. This family contains the bacterial primosomal replication proteins priB and priC (approximately 180 residues long). In Escherichia coli, these function in the assembly of the primosome. 47330 pfam07446: GumN protein. This family contains the bacterial protein GumN (approximately 330 residues long). Note that many members of this family are hypothetical proteins. 47331 pfam07447: Matrix protein VP40. This family contains viral VP40 matrix proteins that seem to be restricted to the Filoviridae. These play an important role in the assembly process of virus particles by interacting with cellular factors, cellular membranes, and the ribonucleoprotein particle complex. It has been shown that the N-terminal region of VP40 folds into a mixture of hexameric and octameric states - these may have distinct roles. 47332 pfam07448: Secreted phosphoprotein 24 (Spp-24). This family represents a conserved region approximately 140 residues long within secreted phosphoprotein 24 (Spp-24), which seems to be restricted to vertebrates. 
This is a non-collagenous protein found in bone that is related in sequence to the cystatin family of thiol protease inhibitors. This suggests that Spp-24 could function to modulate the thiol protease activities known to be involved in bone turnover. It is also possible that the intact form of Spp-24 found in bone could be a precursor to a biologically active peptide that coordinates an aspect of bone turnover. 47333 pfam07449: Hydrogenase-1 expression protein HyaE. This family contains bacterial hydrogenase-1 expression proteins approximately 120 residues long. This includes the E. coli protein HyaE, and the homologous proteins HoxO of R. eutropha and HupG of R. leguminosarum. Deletion of the hoxO gene in R. eutropha led to complete loss of the uptake [NiFe] hydrogenase activity, suggesting that it has a critical role in hydrogenase assembly. 47334 pfam07450: Formate hydrogenlyase maturation protein HycH. This family contains the bacterial formate hydrogenlyase maturation protein HycH, which is approximately 140 residues long. This may be required for the conversion of a precursor form of the large subunit of hydrogenlyase 3 into a mature form. 47335 pfam07451: Stage V sporulation protein AD (SpoVAD). This family contains the bacterial stage V sporulation protein AD (SpoVAD), which is approximately 340 residues long. This is one of six proteins encoded by the spoVA operon, which is transcribed exclusively in the forespore at about the time of dipicolinic acid (DPA) synthesis in the mother cell. The functions of the proteins encoded by the spoVA operon are unknown, but it has been suggested they are involved in DPA transport during sporulation. 47336 pfam07452: CHRD domain. CHRD is a novel domain identified in chordin, an inhibitor of bone morphogenetic proteins. This family includes bacterial homologues. 
It is anticipated to have an immunoglobulin-like beta-barrel structure based on limited similarity to superoxide dismutases but, as yet, no clear functional prediction can be made. 47337 pfam07453: NUMOD1 domain. 47338 pfam07454: Stage II sporulation protein P (SpoIIP). This family contains the bacterial stage II sporulation protein P (SpoIIP) (approximately 350 residues long). It has been shown that a block in polar cytokinesis in Bacillus subtilis is mediated partly by transcription of spoIID, spoIIM and spoIIP. This inhibition of polar division is involved in the locking in of asymmetry after the formation of a polar septum during sporulation. 47339 pfam07455: Phage polarity suppression protein (Psu). This family contains a number of phage polarity suppression proteins (Psu) (approximately 190 residues long). The Psu protein of bacteriophage P4 causes suppression of transcriptional polarity in Escherichia coli by overcoming Rho termination factor activity. 47340 pfam07456: Heptaprenyl diphosphate synthase component I. This family contains component I of bacterial heptaprenyl diphosphate synthase (EC:2.5.1.30) (approximately 170 residues long). This is one of the two dissociable subunits that form the enzyme, both of which are required for the catalysis of the biosynthesis of the side chain of menaquinone-7. 47341 pfam07457: Protein of unknown function (DUF1516). This family contains a number of hypothetical bacterial proteins of unknown function approximately 120 residues long. 47342 pfam07458: Sperm protein associated with nucleus, mapped to X chromosome (SPAN-X). This family contains human sperm proteins associated with the nucleus and mapped to the X chromosome (SPAN-X) (approximately 100 residues long). SPAN-X proteins are cancer-testis antigens (CTAs), and thus represent potential targets for cancer immunotherapy because they are widely distributed in tumours but not in normal tissues, except testes. They are highly insoluble, acidic, and polymorphic. 
47343 pfam07459: CTX phage RstB protein. This family contains a number of RstB proteins approximately 120 residues long, including RstB1 and RstB2, from the Vibrio cholerae phage CTX. Functional analyses indicate that rstB2 is required for integration of the CTXphi phage into the V. cholerae chromosome. 47344 pfam07460: NUMOD3 motif. NUMOD3 is a DNA-binding motif found in homing endonucleases and related proteins. 47345 pfam07461: Nicotine adenine dinucleotide glycohydrolase (NADase). This family consists of several bacterial nicotine adenine dinucleotide glycohydrolase (NGA) proteins which appear to be specific to Streptococcus pyogenes. NAD glycohydrolase (NADase) is a potential virulence factor. Streptococcal NADase may contribute to virulence by its ability to cleave beta-NAD at the ribose-nicotinamide bond, depleting intracellular NAD pools and producing the potent vasoactive compound nicotinamide. 47346 pfam07462: Merozoite surface protein 1 (MSP1) C-terminus. This family represents the C-terminal region of merozoite surface protein 1 (MSP1), which is found in a number of Plasmodium species. MSP-1 is a 200-kDa protein expressed on the surface of the P. vivax merozoite. MSP-1 of Plasmodium species is synthesised as a high-molecular-weight precursor and then processed into several fragments. At the time of red cell invasion by the merozoite, only the 19-kDa C-terminal fragment (MSP-119), which contains two epidermal growth factor-like domains, remains on the surface. Antibodies against MSP-119 inhibit merozoite entry into red cells, and immunisation with MSP-119 protects monkeys from challenging infections. Hence, MSP-119 is considered a promising vaccine candidate. 47347 pfam07463: NUMOD4 motif. NUMOD4 is a putative DNA-binding motif found in homing endonucleases and related proteins. 47348 pfam07464: Apolipophorin-III precursor (apoLp-III). This family consists of several insect apolipoprotein-III sequences. 
Exchangeable apolipoproteins constitute a functionally important family of proteins that play critical roles in lipid transport and lipoprotein metabolism. Apolipophorin III (apoLp-III) is a prototypical exchangeable apolipoprotein found in many insect species that functions in transport of diacylglycerol (DAG) from the fat body lipid storage depot to flight muscles in the adult life stage. 47349 pfam07465: Photosystem I protein M (PsaM). This family consists of several plant and cyanobacterial photosystem I protein M (PsaM) sequences. PsaM forms part of the photosystem I complex and its binding is stabilised by PsaI. 47350 pfam07466: Protein of unknown function (DUF1517). This family consists of several hypothetical glycine rich plant and bacterial proteins of around 300 residues in length. The function of this family is unknown. 47351 pfam07467: Beta-lactamase inhibitor (BLIP). The structure of BLIP reveals two structural domains, which form a polar, concave surface that docks onto a predominantly polar, convex protrusion on beta-lactamase. The ability of BLIP to adapt to a variety of class A beta-lactamases is thought to be due to flexibility between these two domains. 47352 pfam07468: Agglutinin. 47353 pfam07469: Domain of unknown function (DUF1518). This domain, which is usually found tandemly repeated, is found in various receptor co-activating proteins. 47354 pfam07470: Glycosyl Hydrolase Family 88. Unsaturated glucuronyl hydrolase catalyses the hydrolytic release of unsaturated glucuronic acids from oligosaccharides (EC:3.2.1.-) produced by the reactions of polysaccharide lyases. 47355 pfam07471: Phage DNA packaging protein Nu1. Terminase, the DNA packaging enzyme of bacteriophage lambda, is a heteromultimer composed of subunits Nu1 and A. The smaller Nu1 terminase subunit has a low-affinity ATPase stimulated by non-specific DNA. 47356 pfam07472: Fucose-binding lectin II (PA-IIL). 
In Pseudomonas aeruginosa the fucose-binding lectin II (PA-IIL) contributes to the pathogenic virulence of the bacterium. PA-IIL functions as a tetramer when binding fucose. Each monomer is comprised of a nine-stranded, antiparallel beta-sandwich arrangement and contains two calcium cations that mediate the binding of fucose in a recognition mode unique among carbohydrate-protein interactions. 47357 pfam07473: Spasmodic peptide gm9a. This family consists of several spasmodic peptide gm9a sequences. Conotoxin gm9a is a putative 27-residue polypeptide encoded by Conus gloriamaris and is known to be a homologue of the "spasmodic peptide", tx9a, isolated from the venom of the mollusk-hunting cone shell Conus textile. Upon injection of this venom component, normal mice are converted into behavioural phenocopies of a well-known mutant, the spasmodic mouse. 47358 smart00002: Myelin proteolipid protein (PLP or lipophilin); . 47359 smart00003: Neurohypophysial hormones; Vasopressin/oxytocin gene family. . 47360 smart00004: Domain found in Notch and Lin-12; The Notch protein is essential for the proper differentiation of the Drosophila ectoderm. This protein contains 3 NL domains. . 47361 smart00005: DEATH domain, found in proteins involved in cell death (apoptosis). Alpha-helical domain present in a variety of proteins with apoptotic functions. Some (but not all) of these domains form homotypic and heterotypic dimers. . 47362 smart00006: amyloid A4; amyloid A4 precursor of Alzheimer's disease . 47363 smart00008: Domain present in hormone receptors; . 47364 smart00013: Leucine rich repeat N-terminal domain; . 47365 smart00014: Acid phosphatase homologues; . 47366 smart00015: Short calmodulin-binding motif containing conserved Ile and Gln residues. Calmodulin-binding motif. . 47367 smart00017: Osteopontin; Osteopontin is an acidic phosphorylated glycoprotein of about 40 Kd which is abundant in the mineral matrix of bones and which binds tightly to hydroxyapatite. 
It is suggested that osteopontin might function as a cell attachment factor and could play a key role in the adhesion of osteoclasts to the mineral matrix of bone . 47368 smart00018: P or trefoil or TFF domain; Proposed role in renewal and pathology of mucous epithelia. . 47369 smart00019: Pulmonary surfactant proteins; Pulmonary surfactant associated proteins promote alveolar stability by lowering the surface tension at the air-liquid interface in the peripheral air spaces. SP-C, a component of surfactant, is a highly hydrophobic peptide of 35 amino acid residues which is processed from a larger precursor protein. SP-C is post-translationally modified by the covalent attachment of two palmitoyl groups on two adjacent cysteines . 47370 smart00020: Trypsin-like serine protease; Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues. . 47371 smart00021: Domain present in Dishevelled and axin; Domain of unknown function. . 47372 smart00022: Cytoplasmic phospholipase A2, catalytic subunit; Cytosolic phospholipases A2 hydrolyse arachidonyl phospholipids. Family includes phospholipases B isoforms. . 47373 smart00023: Colipase; Colipase is a protein that functions as a cofactor for pancreatic lipase, with which it forms a stoichiometric complex. It also binds to the bile-salt covered triacylglycerol interface thus allowing the enzyme to anchor itself to the water-lipid interface. Colipase is a small protein of approximately 100 amino-acid residues with five conserved disulfide bonds. . 47374 smart00025: Pumilio-like repeats; Pumilio-like repeats that bind RNA. . 47375 smart00026: Ependymins; Ependymins are the predominant proteins in the cerebrospinal fluid (CSF) of teleost fish. They have been implicated in the neurochemistry of memory and neuronal regeneration. 
They are glycoproteins of about 200 amino acids that can bind calcium. Four cysteines are conserved that probably form disulfide bonds. . 47376 smart00027: Eps15 homology domain; Pair of EF hand motifs that recognise proteins containing Asn-Pro-Phe (NPF) sequences. . 47377 smart00028: Tetratricopeptide repeats; Repeats present in 4 or more copies in proteins. Contain a minimum of 34 amino acids each and self-associate via a "knobs and holes" mechanism. . 47378 smart00029: gastrin / cholecystokinin / caerulein family; This family gathers small proteins of about 100-130 amino acids that act as hormones, among them gastrin, cholecystokinin and preprocaerulein which stimulate gastric, biliary, and pancreatic secretion and smooth muscle contraction. . 47379 smart00030: CLUSTERIN Beta chain; . 47380 smart00031: Death effector domain; . 47381 smart00032: Domain abundant in complement control proteins; SUSHI repeat; short complement-like repeat (SCR); The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. A missense mutation in the seventh CCP domain causes deficiency of the b subunit of factor XIII. . 47382 smart00033: Calponin homology domain; Actin binding domains present in duplicate at the N-termini of spectrin-like proteins (including dystrophin, alpha-actinin). These domains cross-link actin filaments into bundles and networks. A calponin homology domain is predicted in yeast Cdc24p. . 47383 smart00034: C-type lectin (CTL) or carbohydrate-recognition domain (CRD); Many of these domains function as calcium-dependent carbohydrate binding modules. . 47384 smart00035: CLUSTERIN alpha chain; . 47385 smart00036: Domain found in NIK1-like kinases, mouse citron and yeast ROM1, ROM2; Unpublished observations. . 
47386 smart00037: Connexin homologues; Connexin channels participate in the regulation of signaling between developing and differentiated cell types. . 47387 smart00038: Fibrillar collagens C-terminal domain; Found at C-termini of fibrillar collagens: Ephydatia muelleri procollagen EMF1alpha, vertebrate collagens alpha(1)III, alpha(1)II, alpha(2)V etc. . 47388 smart00039: corticotropin-releasing factor; . 47389 smart00040: Granulocyte-macrophage colony-stimulating factor (GM-CSF); GM-CSF stimulates the development of and the cytotoxic activity of white blood cells. . 47390 smart00041: C-terminal cystine knot-like domain (CTCK); The structures of transforming growth factor-beta (TGFbeta), nerve growth factor (NGF), platelet-derived growth factor (PDGF) and gonadotropin all form 2 highly twisted antiparallel pairs of beta-strands and contain three disulphide bonds. The domain is non-globular and little is conserved among these presumed homologues except for their cysteine residues. CT domains are predicted to form homodimers. . 47391 smart00042: Domain first found in C1r, C1s, uEGF, and bone morphogenetic protein. This domain is found mostly among developmentally-regulated proteins. Spermadhesins contain only this domain. . 47392 smart00043: Cystatin-like domain; Cystatins are a family of cysteine protease inhibitors that occur mainly as single domain proteins. However some extracellular proteins such as kininogen, His-rich glycoprotein and fetuin also contain these domains. . 47393 smart00044: Adenylyl- / guanylyl cyclase, catalytic domain; Present in two copies in mammalian adenylyl cyclases. Eubacterial homologues are known. Two residues (Asn, Arg) are thought to be involved in catalysis. These cyclases have important roles in a diverse range of cellular processes. . 47394 smart00045: Diacylglycerol kinase accessory domain (presumed); Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. 
DAG can be produced from the hydrolysis of phosphatidylinositol 4,5-bisphosphate (PIP2) by a phosphoinositide-specific phospholipase C and by the degradation of phosphatidylcholine (PC) by a phospholipase C or the concerted actions of phospholipase D and phosphatidate phosphohydrolase. This domain might either be an accessory domain or else contribute to the catalytic domain. Bacterial homologues are known. . 47395 smart00046: Diacylglycerol kinase catalytic domain (presumed); Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. DAG can be produced from the hydrolysis of phosphatidylinositol 4,5-bisphosphate (PIP2) by a phosphoinositide-specific phospholipase C and by the degradation of phosphatidylcholine (PC) by a phospholipase C or the concerted actions of phospholipase D and phosphatidate phosphohydrolase. This domain is presumed to be the catalytic domain. Bacterial homologues are known. . 47396 smart00047: Lysozyme subfamily 2; Eubacterial enzymes distantly related to eukaryotic lysozymes. . 47397 smart00048: Defensin/corticostatin family; Cysteine-rich domains that lyse bacteria, fungi and enveloped viruses by forming multimeric membrane-spanning channels. . 47398 smart00049: Domain found in Dishevelled, Egl-10, and Pleckstrin; Domain of unknown function present in signalling proteins that contain PH, rasGEF, rhoGEF, rhoGAP, RGS, PDZ domains. DEP domain in Drosophila dishevelled is essential to rescue planar polarity defects and induce JNK signalling (Cell 94, 109-118). . 47399 smart00050: Homologues of snake disintegrins ; Snake disintegrins inhibit the binding of ligands to integrin receptors. They contain a 'RGD' sequence, identical to the recognition site of many adhesion proteins. Molecules containing both disintegrin and metalloprotease domains are known as ADAMs. . 47400 smart00051: delta serrate ligand; . 47401 smart00052: Domain of Unknown Function 2; Domain apparently occurring exclusively in eubacteria. 
Unknown function. . 47402 smart00053: Dynamin, GTPase; Large GTPases that mediate vesicle trafficking. Dynamin participates in the endocytic uptake of receptors, associated ligands, and plasma membrane following an exocytic event. . 47403 smart00054: EF-hand, calcium binding motif; EF-hands are calcium-binding motifs that occur at least in pairs. Links between disease states and genes encoding EF-hands, particularly the S100 subclass, are emerging. Each motif consists of a 12 residue loop flanked on either side by a 12 residue alpha-helix. EF-hands undergo a conformational change upon binding calcium ions. . 47404 smart00055: Fes/CIP4 homology domain; Alignment extended from original report. Highly alpha-helical. Also known as the RAEYL motif or the S. pombe Cdc15 N-terminal domain. . 47405 smart00057: factor I membrane attack complex; . 47406 smart00058: Fibronectin type 1 domain; One of three types of internal repeat within the plasma protein, fibronectin. Found also in coagulation factor XII, HGF activator and tissue-type plasminogen activator. In t-PA and fibronectin, this domain type contributes to fibrin-binding. . 47407 smart00059: Fibronectin type 2 domain; One of three types of internal repeat within the plasma protein, fibronectin. Also occurs in coagulation factor XII, 2 type IV collagenases, PDC-109, and cation-independent mannose-6-phosphate and secretory phospholipase A2 receptors. In fibronectin, PDC-109, and the collagenases, this domain contributes to collagen-binding function. . 47408 smart00060: Fibronectin type 3 domain; One of three types of internal repeat within the plasma protein, fibronectin. The tenth fibronectin type III repeat contains a RGD cell recognition sequence in a flexible loop between 2 strands. Type III modules are present in both extracellular and intracellular proteins. . 47409 smart00061: meprin and TRAF homology; . 
47410 smart00062: Bacterial periplasmic substrate-binding proteins; bacterial proteins, eukaryotic ones are in PBPe . 47411 smart00063: Frizzled; Drosophila melanogaster frizzled mediates signalling that polarises a precursor cell along the anteroposterior axis. Homologues of the N-terminal region of frizzled exist either as transmembrane or secreted molecules. Frizzled homologues are reported to be receptors for the Wnt growth factors. (Not yet in MEDLINE: the FRI domain occurs in several receptor tyrosine kinases [Xu, Y.K. and Nusse, Curr. Biol. 8 R405-R406 (1998); Masiakowski, P. and Yanopoulos, G.D., Curr. Biol. 8, R407 (1998)].) . 47412 smart00064: Protein present in Fab1, YOTB, Vac1, and EEA1; Zinc-binding domain, possibly involved in endosomal targeting. Recent data indicate that these domains bind PtdIns(3)P. . 47413 smart00065: Domain present in phytochromes and cGMP-specific phosphodiesterases. Mutations within these domains in PDE6B result in autosomal recessive inheritance of retinitis pigmentosa. . 47414 smart00066: GAL4-like Zn(II)2Cys6 (or C6 zinc) binuclear cluster DNA-binding domain; Gal4 is a positive regulator for the gene expression of the galactose-induced genes of S. cerevisiae. Is present only in fungi. . 47415 smart00067: Glycoprotein hormone alpha chain homologues. Also called gonadotropins. Glycoprotein hormones consist of two glycosylated chains (alpha and beta) of similar topology. . 47416 smart00068: Glycoprotein hormone beta chain homologues. Also called gonadotropins. Glycoprotein hormones consist of two glycosylated chains (alpha and beta) of similar topology. . 47417 smart00069: Domain containing Gla (gamma-carboxyglutamate) residues. A hyaluronan-binding domain found in proteins associated with the extracellular matrix, cell adhesion and cell migration. . 47418 smart00070: Glucagon like hormones; . 
47419 smart00071: Galanin; Galanin is a neuropeptide that controls various biological activities: it regulates the release of growth hormone, inhibits the release of insulin and somatostatin, contracts smooth muscle of the gastrointestinal and genitourinary tract and may be involved in the control of adrenal secretion . 47420 smart00072: Guanylate kinase homologues. Active enzymes catalyze ATP-dependent phosphorylation of GMP to GDP. Structure resembles that of adenylate kinase. So-called membrane-associated guanylate kinase homologues (MAGUKs) do not possess guanylate kinase activities; instead at least some possess protein-binding functions. . 47421 smart00073: Histidine Phosphotransfer domain; Contains an active histidine residue that mediates phosphotransfer reactions. Domain detected only in eubacteria. This alignment is an extension to that shown in the Cell structure paper. . 47422 smart00075: Hydrophobins; . 47423 smart00076: Interferon alpha, beta and delta. Interferons produce antiviral and antiproliferative responses in cells. They are classified into five groups, all of them related except gamma-interferon. . 47424 smart00077: Immunoreceptor tyrosine-based activation motif; Motif that may be dually phosphorylated on tyrosine that links antigen receptors to downstream signalling machinery. . 47425 smart00078: Insulin / insulin-like growth factor / relaxin family. Family of proteins including insulin, relaxin, and IGFs. Insulin decreases blood glucose concentration. . 47427 smart00080: leukemia inhibitory factor ; OSM, Oncostatin M . 47428 smart00082: Leucine rich repeat C-terminal domain; . 47429 smart00084: Neuromedin U ; Neuromedin U (NmU) is a vertebrate peptide which stimulates uterine smooth muscle contraction and causes selective vasoconstriction. Like most other active peptides, it is proteolytically processed from a larger precursor protein. The mature peptides are 8 (NmU-8) to 25 (NmU-25) residues long and C-terminally amidated. 
The sequence of the C-terminal extremity of NmU is extremely well conserved in mammals, birds and amphibians. . 47430 smart00085: Phospholipase A2; . 47431 smart00086: Motif C-terminal to PAS motifs (likely to contribute to PAS structural domain); PAC motif occurs C-terminal to a subset of all known PAS motifs. It is proposed to contribute to the PAS domain fold. . 47432 smart00087: Parathyroid hormone; . 47433 smart00088: motif in proteasome subunits, Int-6, Nip-1 and TRIP-15; Also called the PCI (Proteasome, COP9, Initiation factor 3) domain. Unknown function. . 47434 smart00089: Repeats in polycystic kidney disease 1 (PKD1) and other proteins; Polycystic kidney disease 1 protein contains 14 repeats, present elsewhere such as in microbial collagenases. . 47435 smart00090: RIO-like kinase; . 47436 smart00091: PAS domain; PAS motifs appear in archaea, eubacteria and eukarya. Probably the most surprising identification of a PAS domain was that in EAG-like K+-channels. . 47437 smart00092: Pancreatic ribonuclease ; . 47438 smart00093: SERine Proteinase INhibitors; . 47439 smart00094: Transferrin; . 47440 smart00095: Transthyretin; . 47441 smart00096: Uteroglobin; . 47442 smart00097: found in Wnt-1; . 47443 smart00098: Alkaline phosphatase homologues; . 47444 smart00099: tob/btg1 family; The tob/btg1 is a family of proteins that inhibit cell proliferation. . 47445 smart00100: Cyclic nucleotide-monophosphate binding domain; Catabolite gene activator protein (CAP) is a prokaryotic homologue of eukaryotic cNMP-binding domains, present in ion channels, and cNMP-dependent kinases. . 47446 smart00101: 14-3-3 homologues; 14-3-3 homologues mediate signal transduction by binding to phosphoserine-containing proteins. They are involved in growth factor signalling and also interact with MEK kinases. . 47447 smart00102: Actin depolymerisation factor/cofilin -like domains; Severs actin filaments and binds to actin monomers. . 47448 smart00103: serum albumin; . 
47449 smart00104: Anaphylatoxin homologous domain; C3a, C4a and C5a anaphylatoxins are protein fragments generated enzymatically in serum during activation of complement molecules C3, C4, and C5. They induce smooth muscle contraction. These fragments are homologous to a three-fold repeat in fibulins. . 47450 smart00105: Putative GTP-ase activating proteins for the small GTPase, ARF; Putative zinc fingers with GTPase activating proteins (GAPs) towards the small GTPase, Arf. The GAP of ARD1 stimulates GTPase hydrolysis for ARD1 but not ARFs. . 47451 smart00107: Bruton's tyrosine kinase Cys-rich motif; Zinc-binding motif containing conserved cysteines and a histidine. Always found C-terminal to PH domains (but not all PH domains are followed by BTK motifs). The crystal structure shows this motif packs against the PH domain. The PH+Btk module pair has been called the Tec homology (TH) region. . 47452 smart00108: Bulb-type mannose-specific lectin; . 47453 smart00109: Protein kinase C conserved region 1 (C1) domains (Cysteine-rich domains); Some bind phorbol esters and diacylglycerol. Some bind RasGTP. Zinc-binding domains. . 47454 smart00110: Complement component C1q domain. Globular domain found in many collagens and eponymously in complement C1q. When part of full length proteins these domains form a 'bouquet' due to the multimerization of heterotrimers. The C1q fold is similar to that of tumour necrosis factor. . 47455 smart00111: C-terminal tandem repeated domain in type 4 procollagens; Duplicated domain in C-terminus of type 4 collagens. Mutations in alpha-5 collagen IV are associated with X-linked Alport syndrome. . 47456 smart00112: Cadherin repeats. Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. Cadherin domains occur as repeats in the extracellular regions which are thought to mediate cell-cell contact when bound to calcium. . 
47457 smart00113: calcitonin; This family is formed by calcitonin, the calcitonin gene-related peptide, and amylin. They are short polypeptide hormones. . 47458 smart00114: Caspase recruitment domain; Motif contained in proteins involved in apoptotic signalling. Mediates homodimerisation. Structure consists of six antiparallel helices arranged in a topology homologous to the DEATH and the DED domain. . 47459 smart00115: Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine aspartases that mediate programmed cell death (apoptosis). Caspases are synthesised as zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologues. . 47460 smart00116: Domain in cystathionine beta-synthase and other proteins. Domain present in all 3 forms of cellular life. Present in two copies in inosine monophosphate dehydrogenase, of which one is disordered in the crystal structure. A number of disease states are associated with CBS-containing proteins including homocystinuria, Becker's and Thomsen disease. . 47461 smart00119: Domain Homologous to E6-AP Carboxyl Terminus with ; E3 ubiquitin-protein ligases. Can bind to E2 enzymes. . 47462 smart00120: Hemopexin-like repeats. Hemopexin is a heme-binding protein that transports heme to the liver. Hemopexin-like repeats occur in vitronectin and some members of the matrix metalloproteinase family (matrixins). The HX repeats of some matrixins bind tissue inhibitor of metalloproteinases (TIMPs). . 47463 smart00121: Insulin growth factor-binding protein homologues; High affinity binding partners of insulin-like growth factors. . 47464 smart00125: Interleukin-1 homologues ; Cytokines with various biological functions. Interleukin 1 alpha and beta are also known as hematopoietin and catabolin. . 
47465 smart00126: Interleukin-6 homologues; Family includes granulocyte colony-stimulating factor (G-CSF) and myelomonocytic growth factor (MGF). IL-6 is also known as B-cell stimulatory factor 2. . 47466 smart00127: Interleukin-7 and interleukin-9 family. IL-7 is a cytokine that acts as a growth factor for early lymphoid cells of both B- and T-cell lineages. IL-9 is a multifunctional cytokine; although it was originally described as a T-cell growth factor, its function in the T-cell response remains unclear. . 47467 smart00128: Inositol polyphosphate phosphatase, catalytic domain homologues; Mg(2+)-dependent/Li(+)-sensitive enzymes. . 47468 smart00129: Kinesin motor, catalytic domain. ATPase. Microtubule-dependent molecular motors that play important roles in intracellular transport of organelles and in cell division. . 47469 smart00130: Kringle domain; Named after a Danish pastry. Found in several serine proteases and in ROR-like receptors. Can occur in up to 38 copies (in apolipoprotein(a)). Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides. . 47470 smart00131: BPTI/Kunitz family of serine protease inhibitors. Serine protease inhibitors. One member of the family is encoded by an alternatively-spliced form of Alzheimer's amyloid beta-protein. . 47471 smart00132: Zinc-binding domain present in Lin-11, Isl-1, Mec-3. Zinc-binding domain family. Some LIM domains bind protein partners via tyrosine-containing motifs. LIM domains are found in many key regulators of developmental pathways. . 47472 smart00133: Extension to Ser/Thr-type protein kinases; . 47473 smart00134: Ly-6 antigen / uPA receptor -like domain; Three-fold repeated domain in urokinase-type plasminogen activator receptor; occurs singly in other GPI-linked cell-surface glycoproteins (Ly-6 family, CD59, thymocyte B cell antigen, Sgp-2). Topology of these domains is similar to that of snake venom neurotoxins. . 
47474 smart00135: Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. . 47475 smart00136: Laminin N-terminal domain (domain VI); N-terminal domain of laminins and laminin-related proteins such as Unc-6/netrins. . 47476 smart00137: Domain in meprin, A5, receptor protein tyrosine phosphatase mu (and others); Likely to have an adhesive function. Mutations in the meprin MAM domain affect noncovalent associations within meprin oligomers. In receptor tyrosine phosphatase mu-like molecules the MAM domain is important for homophilic cell-cell interactions. . 47477 smart00138: Methyltransferase, chemotaxis proteins ; Methylates methyl-accepting chemotaxis proteins to form gamma-glutamyl methyl ester residues. . 47478 smart00139: Domain in Myosin and Kinesin Tails; Domain present twice in myosin-VIIa, and also present in 3 other myosins. . 47479 smart00140: Nerve growth factor (NGF or beta-NGF); NGF is important for the development and maintenance of the sympathetic and sensory nervous systems. . 47480 smart00141: Platelet-derived and vascular endothelial growth factors (PDGF, VEGF) family; Platelet-derived growth factor is a potent activator for cells of mesenchymal origin. PDGF-A and PDGF-B form AA and BB homodimers and an AB heterodimer. Members of the VEGF family are homologues of PDGF. . 47481 smart00142: Phosphoinositide 3-kinase, region postulated to contain C2 domain; Outlier of C2 family. . 47482 smart00143: PI3-kinase family, p85-binding domain; Region of p110 PI3K that binds the p85 subunit. . 47483 smart00144: PI3-kinase family, Ras-binding domain; Certain members of the PI3K family possess Ras-binding domains in their N-termini. These regions show some similarity (although not highly significant similarity) to Ras-binding RA domains (unpublished observation). . 
47484 smart00145: Phosphoinositide 3-kinase family, accessory domain (PIK domain); PIK domain is conserved in all PI3 and PI4-kinases. Its role is unclear but it has been suggested to be involved in substrate presentation. . 47485 smart00146: Phosphoinositide 3-kinase, catalytic domain; Phosphoinositide 3-kinase isoforms participate in a variety of processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, and apoptosis. These homologues may be either lipid kinases and/or protein kinases: the former phosphorylate the 3-position in the inositol ring of inositol phospholipids. The ataxia telangiectasia-mutated gene product, the targets of rapamycin (TOR) and the DNA-dependent kinase have not been found to possess lipid kinase activity. Some of this family possess PI-4 kinase activities. . 47486 smart00147: Guanine nucleotide exchange factor for Ras-like small GTPases; . 47487 smart00148: Phospholipase C, catalytic domain (part); domain X; Phosphoinositide-specific phospholipases C. These enzymes contain 2 regions (X and Y) which together form a TIM barrel-like structure containing the active site residues. Phospholipase C enzymes (PI-PLC) act as signal transducers that generate two second messengers, inositol-1,4,5-trisphosphate and diacylglycerol. The bacterial enzyme appears to be a homologue of the mammalian PLCs. . 47488 smart00149: Phospholipase C, catalytic domain (part); domain Y; Phosphoinositide-specific phospholipases C. These enzymes contain 2 regions (X and Y) which together form a TIM barrel-like structure containing the active site residues. Phospholipase C enzymes (PI-PLC) act as signal transducers that generate two second messengers, inositol-1,4,5-trisphosphate and diacylglycerol. The bacterial enzyme appears to be a homologue of the mammalian PLCs. . 47489 smart00150: Spectrin repeats; . 47490 smart00151: SWI complex, BAF60b domains; . 47491 smart00152: Thymosin beta actin-binding motif. . 
47492 smart00153: Villin headpiece domain; . 47493 smart00154: AN1-like Zinc finger; Zinc finger at the C-terminus of An1, a ubiquitin-like protein in Xenopus laevis. . 47494 smart00155: Phospholipase D. Active site motifs. Phosphatidylcholine-hydrolyzing phospholipase D (PLD) isoforms are activated by ADP-ribosylation factors (ARFs). PLD produces phosphatidic acid from phosphatidylcholine, which may be essential for the formation of certain types of transport vesicles or may be constitutive vesicular transport to signal transduction pathways. PC-hydrolysing PLD is a homologue of cardiolipin synthase, phosphatidylserine synthase, bacterial PLDs, and viral proteins. Each of these appears to possess a domain duplication which is apparent by the presence of two motifs containing well-conserved histidine, lysine, aspartic acid, and/or asparagine residues which may contribute to the active site. An E. coli endonuclease (nuc) and similar proteins appear to be PLD homologues but possess only one of these motifs. The profile contained here represents only the putative active site regions, since an accurate multiple alignment of the repeat units has not been achieved. . 47495 smart00156: Protein phosphatase 2A homologues, catalytic domain. Large family of serine/threonine phosphatases, that includes PP1, PP2A and PP2B (calcineurin) family members. . 47496 smart00157: Major prion protein; The prion protein is a major component of scrapie-associated fibrils in Creutzfeldt-Jakob disease, kuru, Gerstmann-Straussler syndrome and bovine spongiform encephalopathy. . 47497 smart00159: Pentraxin / C-reactive protein / pentaxin family; This family forms a discoid pentameric structure. Human serum amyloid P demonstrates calcium-mediated ligand-binding. . 47498 smart00160: Ran-binding domain; Domain of approximately 150 residues that stabilises the GTP-bound form of Ran (the Ras-like nuclear small GTPase). . 
47499 smart00162: Saposin/surfactant protein-B A-type DOMAIN; Present as four and three degenerate copies, respectively, in prosaposin and surfactant protein B. Single copies in acid sphingomyelinase, NK-lysin amoebapores and granulysin. Putative phospholipid membrane binding domains. . 47500 smart00164: Domain in Tre-2, BUB2p, and Cdc16p. Probable Rab-GAPs. Widespread domain present in Gyp6 and Gyp7, thereby giving rise to the notion that it performs a GTP-activator activity on Rab-like GTPases. . 47501 smart00165: Ubiquitin associated domain; Present in Rad23, SNF1-like kinases. The newly-found UBA in p62 is known to bind ubiquitin. . 47502 smart00166: Domain present in ubiquitin-regulatory proteins; Present in FAF1 and Shp1p. . 47503 smart00167: Domain present in VPS9; Domain present in yeast vacuolar sorting protein 9 and other proteins. . 47504 smart00173: Ras subfamily of RAS small GTPases; Similar in fold and function to the bacterial EF-Tu GTPase. p21Ras couples receptor Tyr kinases and G protein receptors to protein kinase cascades . 47505 smart00174: Rho (Ras homology) subfamily of Ras-like small GTPases; Members of this subfamily of Ras-like small GTPases include Cdc42 and Rac, as well as Rho isoforms. . 47506 smart00175: Rab subfamily of small GTPases; Rab GTPases are implicated in vesicle trafficking. . 47507 smart00176: Ran (Ras-related nuclear proteins) /TC4 subfamily of small GTPases; Ran is involved in the active transport of proteins through nuclear pores. . 47508 smart00177: ARF-like small GTPases; ARF, ADP-ribosylation factor; Ras homologues involved in vesicular transport. Activator of phospholipase D isoforms. Unlike Ras proteins they lack cysteine residues at their C-termini and therefore are unlikely to be prenylated. ARFs are N-terminally myristoylated. Contains ATP/GTP-binding motif (P-loop). . 
47509 smart00178: Sar1p-like members of the Ras-family of small GTPases; Yeast SAR1 is an essential gene required for transport of secretory proteins from the endoplasmic reticulum to the Golgi apparatus. . 47510 smart00179: Calcium-binding EGF-like domain; . 47511 smart00180: Laminin-type epidermal growth factor-like domain; . 47512 smart00181: Epidermal growth factor-like domain. . 47513 smart00182: Cullin; . 47514 smart00183: Natriuretic peptide; Atrial natriuretic peptides are vertebrate hormones important in the overall control of cardiovascular homeostasis and sodium and water balance in general. . 47515 smart00184: Ring finger; E3 ubiquitin-protein ligase activity is intrinsic to the RING domain of c-Cbl and is likely to be a general function of this domain; Various RING fingers exhibit binding activity towards E2 ubiquitin-conjugating enzymes (Ubc's) . 47516 smart00185: Armadillo/beta-catenin-like repeats; Approx. 40 amino acid repeat. Tandem repeats form superhelix of helices that is proposed to mediate interaction of beta-catenin with its ligands. Involved in transducing the Wingless/Wnt signal. In plakoglobin arm repeats bind alpha-catenin and N-cadherin. . 47517 smart00186: Fibrinogen-related domains (FReDs); Domain present at the C-termini of fibrinogen beta and gamma chains, and a variety of fibrinogen-related proteins, including tenascin and Drosophila scabrous. . 47518 smart00187: Integrin beta subunits (N-terminal portion of extracellular region); Portion of beta integrins that lies N-terminal to their EGF-like repeats. Integrins are cell adhesion molecules that mediate cell-extracellular matrix and cell-cell interactions. They contain both alpha and beta subunits. Beta integrins are proposed to have a von Willebrand factor type-A "insert" or "I"-like domain (although this remains to be confirmed). . 
47519 smart00188: Interleukin-10 family; Interleukin-10 inhibits the synthesis of a number of cytokines, including IFN-gamma, IL-2, IL-3, TNF and GM-CSF produced by activated macrophages and by helper T cells. . 47520 smart00189: Interleukin-2 family; Interleukin-2 is a cytokine produced by T-helper cells in response to antigenic or mitogenic stimulation. This protein is required for T-cell proliferation and other activities crucial to the regulation of the immune response. . 47521 smart00190: Interleukins 4 and 13; Interleukins-4 and -13 are cytokines involved in inflammatory and immune responses. IL-4 stimulates B and T cells. . 47522 smart00191: Integrin alpha (beta-propeller repeats). Integrins are cell adhesion molecules that mediate cell-extracellular matrix and cell-cell interactions. They contain both alpha and beta subunits. Alpha integrins are proposed to contain a domain containing a 7-fold repeat that adopts a beta-propeller fold. Some of these domains contain an inserted von Willebrand factor type-A domain. Some repeats contain putative calcium-binding sites. The 7-fold repeat domain is homologous to a similar domain in phosphatidylinositol-glycan-specific phospholipase D. . 47523 smart00192: Low-density lipoprotein receptor domain class A; Cysteine-rich repeat in the low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. The N-terminal type A repeats in LDL receptor bind the lipoproteins. Other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement. Mutations in the LDL receptor gene cause familial hypercholesterolemia. . 47524 smart00193: Pleiotrophin / midkine family; Heparin-binding domain family. . 47525 smart00194: Protein tyrosine phosphatase, catalytic domain; . 
47526 smart00195: Dual specificity phosphatase, catalytic domain; . 47527 smart00197: Serum amyloid A proteins; Serum amyloid A proteins are induced during the acute-phase response. Secondary amyloidosis is characterised by the extracellular accumulation in tissues of SAA proteins. SAA proteins are apolipoproteins. . 47528 smart00198: SCP / Tpx-1 / Ag5 / PR-1 / Sc7 family of extracellular domains. Human glioma pathogenesis-related protein GliPR and the plant pathogenesis-related protein represent functional links between plant defense systems and human immune system. This family has no known function. . 47529 smart00199: Intercrine alpha family (small cytokine C-X-C) (chemokine CXC). Family of cytokines involved in cell-specific chemotaxis, mediation of cell growth, and the inflammatory response. . 47530 smart00200: Domain found in sea urchin sperm protein, enterokinase, agrin; Proposed function of regulating or binding carbohydrate sidechains. . 47531 smart00201: Somatomedin B -like domains; Somatomedin-B is a peptide, proteolytically excised from vitronectin, that is a growth hormone-dependent serum factor with protease-inhibiting activity. . 47532 smart00202: Scavenger receptor Cys-rich; The sea urchin egg peptide speract contains 4 repeats of SR domains that contain 6 conserved cysteines. May bind bacterial antigens in the protein MARCO. . 47533 smart00203: Tachykinin family; Tachykinins are a group of biologically active peptides which excite neurons, evoke behavioral responses, are potent vasodilators and contract (directly or indirectly) many smooth muscles. These peptides are synthesized as longer precursors and then processed to peptides from ten to twelve residues long. . 47534 smart00204: Transforming growth factor-beta (TGF-beta) family; Family members are active as disulphide-linked homo- or heterodimers. TGFB is a multifunctional peptide that controls proliferation, differentiation, and other functions in many cell types. . 
47535 smart00205: Thaumatin family; The thaumatin family gathers proteins related to plant pathogenesis. The thaumatin family includes very basic members with extracellular and vacuolar localization. Thaumatin itself is a potent sweet-tasting protein. Several members of this family display significant in vitro activity of inhibiting hyphal growth or spore germination of various fungi probably by a membrane permeabilizing mechanism. . 47536 smart00206: Tissue inhibitor of metalloproteinase family. Form complexes with metalloproteinases, such as collagenases, and irreversibly inactivate them. . 47537 smart00207: Tumour necrosis factor family. Family of cytokines that form homotrimeric or heterotrimeric complexes. TNF mediates mature T-cell receptor-induced apoptosis through the p75 TNF receptor. . 47538 smart00208: Tumor necrosis factor receptor / nerve growth factor receptor repeats. Repeats in growth factor receptors that are involved in growth factor binding. TNF/TNFR . 47539 smart00209: Thrombospondin type 1 repeats; Type 1 repeats in thrombospondin-1 bind and activate TGF-beta. . 47540 smart00210: Thrombospondin N-terminal -like domains. Heparin-binding and cell adhesion domain of thrombospondin . 47541 smart00211: Thyroglobulin type I repeats. The N-terminal region of human thyroglobulin contains 11 type-1 repeats. TY repeats are proposed to be inhibitors of cysteine proteases and binding partners of heparin. . 47542 smart00212: Ubiquitin-conjugating enzyme E2, catalytic domain homologues; Proteins destined for proteasome-mediated degradation may be ubiquitinated. Ubiquitination follows conjugation of ubiquitin to a conserved cysteine residue of UBC homologues. This pathway functions in regulating many fundamental processes required for cell viability. TSG101 is one of several UBC homologues that lacks this active site cysteine. . 
47543 smart00213: Ubiquitin homologues; Ubiquitin-mediated proteolysis is involved in the regulated turnover of proteins required for controlling cell cycle progression . 47544 smart00214: von Willebrand factor (vWF) type C domain; . 47545 smart00215: von Willebrand factor (vWF) type C domain; . 47546 smart00216: von Willebrand factor (vWF) type D domain; Von Willebrand factor contains several type D domains: D1 and D2 are present within the N-terminal propeptide whereas the remaining D domains are required for multimerisation. . 47547 smart00217: Four-disulfide core domains; . 47548 smart00218: Domain present in ZO-1 and Unc5-like netrin receptors; Domain of unknown function. . 47549 smart00219: Tyrosine kinase, catalytic domain; Phosphotransferases. Tyrosine-specific kinase subfamily. . 47550 smart00220: Serine/Threonine protein kinases, catalytic domain; Phosphotransferases. Serine or threonine-specific kinase subfamily. . 47551 smart00222: Sec7 domain; Domain named after the S. cerevisiae SEC7 gene product, which is required for proper protein transport through the Golgi. The domain facilitates guanine nucleotide exchange on the small GTPases, ARFs (ADP ribosylation factors). . 47552 smart00223: APPLE domain; Four-fold repeat in plasma kallikrein and coagulation factor XI. Factor XI apple 3 mediates binding to platelets. Factor XI apple 1 binds high-molecular-mass kininogen. Apple 4 in factor XI mediates dimer formation and binds to factor XIIa. Mutations in apple 4 cause factor XI deficiency, an inherited bleeding disorder. . 47553 smart00224: G protein gamma subunit-like motifs; . 47554 smart00225: Broad-Complex, Tramtrack and Bric a brac; Domain in Broad-Complex, Tramtrack and Bric a brac. Also known as POZ (poxvirus and zinc finger) domain. Known to be a protein-protein interaction motif found at the N-termini of several C2H2-type transcription factors as well as Shaw-type potassium channels. 
Known structure reveals a tightly intertwined dimer formed via interactions between N-terminal strand and helix structures. However, in a subset of BTB/POZ domains, these two secondary structures appear to be missing. Be aware SMART predicts BTB/POZ domains without the beta1- and alpha1-secondary structures. . 47555 smart00226: Low molecular weight phosphatase family; . 47556 smart00227: The Nebulin repeat is present also in Las1. Tandem arrays of these repeats are known to bind actin. . 47557 smart00228: Domain present in PSD-95, Dlg, and ZO-1/2. Also called DHR (Dlg homologous region) or GLGF (relatively well conserved tetrapeptide in these domains). Some PDZs have been shown to bind C-terminal polypeptides; others appear to bind internal (non-C-terminal) polypeptides. Different PDZs possess different binding specificities. . 47558 smart00229: Guanine nucleotide exchange factor for Ras-like GTPases; N-terminal motif; A subset of guanine nucleotide exchange factors for Ras-like small GTPases appears to possess this domain N-terminal to the RasGef (Cdc25-like) domain. The recent crystal structure of Sos shows that this domain is alpha-helical and plays a "purely structural role" (Nature 394, 337-343). . 47559 smart00230: Calpain-like thiol protease family. Calpain-like thiol protease family (peptidase family C2). Calcium activated neutral protease (large subunit). . 47560 smart00231: Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes. . 47561 smart00232: JAB/MPN domain; Domain in Jun kinase activation domain binding protein and proteasomal subunits. Domain at Mpr1p and Pad1p N-termini. Domain of unknown function. . 47562 smart00233: Pleckstrin homology domain. Domain commonly found in eukaryotic signalling proteins. 
The domain family possesses multiple functions including the abilities to bind inositol phosphates, and various proteins. PH domains have been found to possess inserted domains (such as in PLC gamma, syntrophins) and to be inserted within other domains. Mutations in Bruton's tyrosine kinase (Btk) within its PH domain cause X-linked agammaglobulinaemia (XLA) in patients. Point mutations cluster into the positively charged end of the molecule around the predicted binding site for phosphatidylinositol lipids. . 47563 smart00234: in StAR and phosphatidylcholine transfer protein; putative lipid-binding domain in StAR and phosphatidylcholine transfer protein . 47564 smart00235: Zinc-dependent metalloprotease; Neutral zinc metallopeptidases. This alignment represents a subset of known subfamilies. Highest similarity occurs in the HExxH zinc-binding site/ active site. . 47565 smart00236: Fungal-type cellulose-binding domain; Small four-cysteine cellulose-binding domain of fungi . 47566 smart00237: Domains in Na-Ca exchangers and integrin-beta4; Domain in Na-Ca exchangers and integrin subunit beta4 (and some cyanobacterial proteins) . 47567 smart00238: Baculoviral inhibition of apoptosis protein repeat; Domain found in inhibitor of apoptosis proteins (IAPs) and other proteins. Acts as a direct inhibitor of caspase enzymes. . 47568 smart00239: Protein kinase C conserved region 2 (CalB); Ca2+-binding motif present in phospholipases, protein kinases C, and synaptotagmins (among others). Some do not appear to contain Ca2+-binding sites. Particular C2s appear to bind phospholipids, inositol polyphosphates, and intracellular proteins. Unusual occurrence in perforin. Synaptotagmin and PLC C2s are permuted in sequence with respect to N- and C-terminal beta strands. SMART detects C2 domains using one or both of two profiles. . 47569 smart00240: Forkhead associated domain; Found in eukaryotic and prokaryotic proteins. Putative nuclear signalling domain. . 
47570 smart00241: Zona pellucida (ZP) domain; ZP proteins are responsible for sperm-adhesion to the zona pellucida. ZP domains are also present in multidomain transmembrane proteins such as glycoprotein GP2, uromodulin and TGF-beta receptor type III (betaglycan). . 47571 smart00242: Myosin. Large ATPases. ATPase; molecular motor. Muscle contraction consists of a cyclical interaction between myosin and actin. The core of the myosin structure is similar in fold to that of kinesin. . 47572 smart00243: Growth-Arrest-Specific Protein 2 Domain; GROWTH-ARREST-SPECIFIC PROTEIN 2 Domain . 47573 smart00244: prohibitin homologues; prohibitin homologues . 47574 smart00245: tail specific protease; tail specific protease . 47575 smart00246: Wiskott Aldrich syndrome homology region 2; Wiskott Aldrich syndrome homology region 2 / actin-binding motif . 47576 smart00247: Beta/gamma crystallins; Beta/gamma crystallins . 47577 smart00248: ankyrin repeats; Ankyrin repeats are about 33 amino acids long and occur in at least four consecutive copies. They are involved in protein-protein interactions. The core of the repeat seems to be a helix-loop-helix structure. . 47578 smart00249: PHD zinc finger; . 47579 smart00250: Plectin repeat; . 47580 smart00251: SAM / Pointed domain; A subfamily of the SAM domain . 47581 smart00252: Src homology 2 domains; Src homology 2 domains bind phosphotyrosine-containing polypeptides via 2 surface pockets. Specificity is provided via interaction with residues that are distinct from the phosphotyrosine. Only a single occurrence of a SH2 domain has been found in S. cerevisiae. . 47582 smart00253: suppressors of cytokine signalling; suppressors of cytokine signalling . 47583 smart00254: ShK toxin domain; ShK toxin domain . 47584 smart00255: Toll - interleukin 1 - resistance; . 47585 smart00256: A Receptor for Ubiquitination Targets; . 47586 smart00257: Lysin motif; . 47587 smart00258: SAND domain; . 
47588 smart00259: A20-like zinc fingers; A20- (an inhibitor of cell death)-like zinc fingers. The zinc finger mediates self-association in A20. These fingers also mediate IL-1-induced NF-kappaB activation. . 47589 smart00260: Two component signalling adaptor domain; . 47590 smart00261: Furin-like repeats; . 47591 smart00262: Gelsolin homology domain; Gelsolin/severin/villin homology domain. Calcium-binding and actin-binding. Both intra- and extracellular domains. . 47592 smart00263: Alpha-lactalbumin / lysozyme C; . 47593 smart00264: BAG domains, present in regulator of Hsp70 proteins; BAG domains, present in Bcl-2-associated athanogene 1 and silencer of death domains . 47594 smart00265: BH4 Bcl-2 homology region 4; . 47595 smart00266: Domains present in proteins implicated in post-mortem DNA fragmentation; . 47596 smart00267: Domain of Unknown Function with GGDEF motif; Domain apparently occurring exclusively in eubacteria. Likely to participate in prokaryotic signalling processes. Unknown function. . 47597 smart00268: Actin; ACTIN subfamily of ACTIN/mreB/sugarkinase/Hsp70 superfamily . 47598 smart00269: Bowman-Birk type proteinase inhibitor; . 47599 smart00270: Chitin binding domain; . 47600 smart00271: DnaJ molecular chaperone homology domain; . 47601 smart00272: Endothelin; . 47602 smart00273: Epsin N-terminal homology (ENTH) domain; . 47603 smart00274: Follistatin-N-terminal domain-like; Follistatin-N-terminal domain-like, EGF-like. Region distinct from the kazal-like sequence . 47604 smart00275: G protein alpha subunit; Subunit of G proteins that contains the guanine nucleotide binding site . 47605 smart00276: Galectin; Galectin - galactose-binding lectin . 47606 smart00277: Granulin; . 47607 smart00278: Helix-hairpin-helix DNA-binding motif class 1; . 47608 smart00279: Helix-hairpin-helix class 2 (Pol1 family) motifs; . 47609 smart00280: Kazal type serine protease inhibitors; Kazal type serine protease inhibitors and follistatin-like domains. . 
47610 smart00281: Laminin B domain; . 47611 smart00282: Laminin G domain; . 47612 smart00283: Methyl-accepting chemotaxis-like domains (chemotaxis sensory transducer). Thought to undergo reversible methylation in response to attractants or repellants during bacterial chemotaxis. . 47613 smart00284: Olfactomedin-like domains; . 47614 smart00285: P21-Rho-binding domain; Small domains that bind Cdc42p- and/or Rho-like small GTPases. Also known as the Cdc42/Rac interactive binding (CRIB). . 47615 smart00286: Plant trypsin inhibitors; . 47616 smart00287: Bacterial SH3 domain homologues; . 47617 smart00288: Domain present in VPS-27, Hrs and STAM; Unpublished observations. Domain of unknown function. . 47618 smart00289: Worm-specific repeat type 1; Worm-specific repeat type 1. Cysteine-rich domain apparently unique (so far) to C. elegans. Often appears with KU domains. About 3 dozen worm proteins contain this domain. . 47619 smart00290: Ubiquitin Carboxyl-terminal Hydrolase-like zinc finger; . 47620 smart00291: Zinc-binding domain, present in Dystrophin, CREB-binding protein. Putative zinc-binding domain present in dystrophin-like proteins, and CREB-binding protein/p300 homologues. The ZZ in dystrophin appears to bind calmodulin. A missense mutation of one of the conserved cysteines in dystrophin results in a patient with Duchenne muscular dystrophy. . 47621 smart00292: breast cancer carboxy-terminal domain; . 47622 smart00293: domain with conserved PWWP motif; conservation of Pro-Trp-Trp-Pro residues . 47623 smart00294: putative band 4.1 homologues' binding motif; . 47624 smart00295: Band 4.1 homologues; Also known as ezrin/radixin/moesin (ERM) protein domains. Present in myosins, ezrin, radixin, moesin, protein tyrosine phosphatases. Plasma membrane-binding domain. These proteins play structural and regulatory roles in the assembly and stabilization of specialized plasma membrane domains. Some PDZ domain containing proteins bind one or more of this family. 
Now includes JAKs. . 47625 smart00297: bromo domain; . 47626 smart00298: Chromatin organization modifier domain; . 47627 smart00299: Clathrin heavy chain repeat homology; . 47628 smart00300: Chromo Shadow Domain; . 47629 smart00301: Doublesex DNA-binding motif; . 47630 smart00302: Dynamin GTPase effector domain; . 47631 smart00303: G-protein-coupled receptor proteolytic site domain; Present in latrophilin/CL-1, sea urchin REJ and polycystin. . 47632 smart00304: HAMP (Histidine kinases, Adenylyl cyclases, Methyl binding proteins, Phosphatases) domain; . 47633 smart00305: Hint (Hedgehog/Intein) domain C-terminal region; Hedgehog/Intein domain, C-terminal region. Domain has been split to accommodate large insertions of endonucleases. . 47634 smart00306: Hint (Hedgehog/Intein) domain N-terminal region; Hedgehog/Intein domain, N-terminal region. Domain has been split to accommodate large insertions of endonucleases. . 47635 smart00307: I/LWEQ domain; Thought to possess an F-actin binding function. . 47636 smart00308: Lipoxygenase homology 2 (beta barrel) domain; . 47637 smart00309: Pancreatic hormones / neuropeptide F / peptide YY family ; Pancreatic hormone is a regulator of pancreatic and gastrointestinal functions. . 47638 smart00310: Phosphotyrosine-binding domain (IRS1-like); . 47639 smart00311: PWI, domain in splicing factors; . 47640 smart00312: PhoX homologous domain, present in p47phox and p40phox. Eukaryotic domain of unknown function present in phox proteins, PLD isoforms, a PI3K isoform. . 47641 smart00313: Domain associated with PX domains; unpubl. observations . 47642 smart00314: Ras association (RalGDS/AF-6) domain; RasGTP effectors (in cases of AF6, canoe and RalGDS); putative RasGTP effectors in other cases. Kalhammer et al. have shown that not all RA domains bind RasGTP. Predicted structure similar to that determined, and that of the RasGTP-binding domain of Raf kinase. Predicted RA domains in PLC210 and nore1 found to bind RasGTP. 
Included outliers (Grb7, Grb14, adenylyl cyclases etc.) . 47643 smart00315: Regulator of G protein signalling domain; RGS family members are GTPase-activating proteins for heterotrimeric G-protein alpha-subunits. . 47644 smart00316: Ribosomal protein S1-like RNA-binding domain; . 47645 smart00317: SET (Su(var)3-9, Enhancer-of-zeste, Trithorax) domain; Putative methyl transferase, based on outlier plant homologues . 47646 smart00318: Staphylococcal nuclease homologues; . 47647 smart00319: Homologues of the ligand binding domain of Tar; Homologues of the ligand binding domain of the wild-type bacterial aspartate receptor, Tar. . 47648 smart00320: WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain. . 47649 smart00321: present in yeast cell wall integrity and stress response component proteins; Domain present in WSC proteins, polycystin and fungal exoglucanase . 47650 smart00322: K homology RNA-binding domain; . 47651 smart00323: GTPase-activator protein for Ras-like GTPases; All alpha-helical domain that accelerates the GTPase activity of Ras, thereby "switching" it into an "off" position. Improved domain limits from structure. . 47652 smart00324: GTPase-activator protein for Rho-like GTPases; GTPase activator proteins towards Rho/Rac/Cdc42-like small GTPases. Better domain limits and outliers. . 47653 smart00325: Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases. Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. Improved coverage. . 47654 smart00326: Src homology 3 domains; Src homology 3 (SH3) domains bind to target proteins through sequences containing proline and hydrophobic amino acids. Pro-containing polypeptides may bind to SH3 domains in 2 different binding orientations. . 
47655 smart00327: von Willebrand factor (vWF) type A domain; VWA domains in extracellular eukaryotic proteins mediate adhesion via metal ion-dependent adhesion sites (MIDAS). Intracellular VWA domains and homologues in prokaryotes have recently been identified. The proposed VWA domains in integrin beta subunits have recently been substantiated using sequence-based methods. . 47656 smart00328: BPI/LBP/CETP N-terminal domain; Bactericidal permeability-increasing protein (BPI) / Lipopolysaccharide-binding protein (LBP) / Cholesteryl ester transfer protein (CETP) N-terminal domain . 47657 smart00329: BPI/LBP/CETP C-terminal domain; Bactericidal permeability-increasing protein (BPI) / Lipopolysaccharide-binding protein (LBP) / Cholesteryl ester transfer protein (CETP) C-terminal domain . 47658 smart00330: Phosphatidylinositol phosphate kinases; . 47659 smart00331: Sigma factor PP2C-like phosphatases; . 47660 smart00332: Serine/threonine phosphatases, family 2C, catalytic domain; The protein architecture and deduced catalytic mechanism of PP2C phosphatases are similar to the PP1, PP2A, PP2B family of protein Ser/Thr phosphatases, with which PP2C shares no sequence similarity. . 47661 smart00333: Tudor domain; Domain of unknown function present in several RNA-binding proteins. 10 copies in the Drosophila Tudor protein. The initial proposal that the survival motor neuron gene product contains a Tudor domain is corroborated by more recent database search techniques such as PSI-BLAST (unpublished). . 47662 smart00335: Annexin repeats; . 47663 smart00336: B-Box-type zinc finger; . 47664 smart00337: BCL (B-Cell lymphoma); contains BH1, BH2 regions; (BH1, BH2, (BH3 (one helix only)) and not BH4(one helix only)). Involved in apoptosis regulation . 47665 smart00338: basic region leucine zipper; . 47666 smart00339: FORKHEAD; FORKHEAD, also known as a "winged helix" . 47667 smart00340: homeobox associated leucine zipper; . 
47668 smart00341: Helicase and RNase D C-terminal; Hypothetical role in nucleic acid binding. Mutations in the HRDC domain cause human disease. . 47669 smart00342: helix_turn_helix, arabinose operon control protein; . 47670 smart00343: zinc finger; . 47671 smart00344: helix_turn_helix ASNC type; AsnC: an autogenously regulated activator of asparagine synthetase A transcription in Escherichia coli . 47672 smart00345: helix_turn_helix gluconate operon transcriptional repressor; . 47673 smart00346: helix_turn_helix isocitrate lyase regulation; . 47674 smart00347: helix_turn_helix multiple antibiotic resistance protein; . 47675 smart00348: interferon regulatory factor; interferon regulatory factor, also known as tryptophan pentad repeat . 47676 smart00349: krueppel associated box; . 47677 smart00350: minichromosome maintenance proteins; . 47678 smart00351: Paired Box domain; . 47679 smart00352: Found in Pit-Oct-Unc transcription factors; . 47680 smart00353: helix loop helix domain; . 47681 smart00354: helix_turn_helix lactose operon repressor; . 47682 smart00355: zinc finger; . 47683 smart00356: zinc finger; . 47684 smart00357: Cold shock protein domain; RNA-binding domain that functions as a RNA-chaperone in bacteria and is involved in regulating translation in eukaryotes. Contains sub-family of RNA-binding domains in the Rho transcription termination factor. . 47685 smart00358: Double-stranded RNA binding motif; . 47686 smart00359: Putative RNA-binding Domain in PseudoUridine synthase and Archaeosine transglycosylase; . 47687 smart00360: RNA recognition motif; . 47688 smart00361: RNA recognition motif; . 47689 smart00363: S4 RNA-binding domain; . 47690 smart00364: Leucine-rich repeats, bacterial type; . 47691 smart00365: Leucine-rich repeat, SDS22-like subfamily; . 47692 smart00367: Leucine-rich repeat - CC (cysteine-containing) subfamily; . 47693 smart00368: Leucine rich repeat, ribonuclease inhibitor type; . 
47694 smart00369: Leucine-rich repeats, typical (most populated) subfamily; . 47695 smart00380: DNA-binding domain in plant proteins such as APETALA2 and EREBPs; . 47696 smart00382: ATPases associated with a variety of cellular activities; AAA - ATPases associated with a variety of cellular activities. This profile/alignment only detects a fraction of this vast family. The poorly conserved N-terminal helix is missing from the alignment. . 47697 smart00384: DNA binding domain with preference for A/T rich regions; Small DNA-binding motif first described in the high mobility group non-histone chromosomal protein HMG-I(Y). . 47698 smart00385: domain present in cyclins, TFIIB and Retinoblastoma; A helical domain present in cyclins and TFIIB (twice) and Retinoblastoma (once). A protein recognition domain functioning in cell-cycle and transcription control. . 47699 smart00386: HAT (Half-A-TPR) repeats; Present in several RNA-binding proteins. Structurally and sequentially thought to be similar to TPRs. . 47700 smart00387: Histidine kinase-like ATPases; Histidine kinase-, DNA gyrase B-, phytochrome-like ATPases. . 47701 smart00388: His Kinase A (phosphoacceptor) domain; Dimerisation and phosphoacceptor domain of histidine kinases. . 47702 smart00389: Homeodomain; DNA-binding factors that are involved in the transcriptional regulation of key developmental processes . 47703 smart00390: LGN motif, putative GEFs specific for G-alpha GTPases; GEF specific for Galpha_i proteins . 47704 smart00391: Methyl-CpG binding domain; Methyl-CpG binding domain, also known as the TAM (TTF-IIP5, ARBP, MeCP1) domain . 47705 smart00392: Profilin; Binds actin monomers, membrane polyphosphoinositides and poly-L-proline. . 47706 smart00393: Putative single-stranded nucleic acids-binding domain; . 47707 smart00394: RIIalpha, Regulatory subunit portion of type II PKA R-subunit; RIIalpha, Regulatory subunit portion of type II PKA R-subunit. 
Contains dimerisation interface and binding site for A-kinase-anchoring proteins (AKAPs). . 47708 smart00396: Putative zinc finger in N-recognin, a recognition component of the N-end rule pathway; Domain is involved in recognition of N-end rule substrates in yeast Ubr1p . 47709 smart00397: Helical region found in SNAREs; All alpha-helical motifs that form twisted and parallel four-helix bundles in target soluble N-ethylmaleimide-sensitive factor (NSF) attachment protein (SNAP) receptor proteins. This motif is found in "Q-SNAREs". . 47710 smart00398: high mobility group; . 47711 smart00399: c4 zinc finger in nuclear hormone receptors; . 47712 smart00400: zinc finger; . 47713 smart00401: zinc finger binding to DNA consensus sequence [AT]GATA[AG] ; . 47714 smart00404: Protein tyrosine phosphatase, catalytic domain motif; . 47715 smart00406: Immunoglobulin V-Type; . 47716 smart00407: Immunoglobulin C-Type; . 47717 smart00408: Immunoglobulin C-2 Type; . 47718 smart00409: Immunoglobulin; . 47719 smart00411: bacterial (prokaryotic) histone like domain; . 47720 smart00412: Copper-Fist; binds DNA only in presence of copper or silver . 47721 smart00413: erythroblast transformation specific domain; variation of the helix-turn-helix motif . 47722 smart00414: Histone 2A; . 47723 smart00415: heat shock factor; . 47724 smart00417: Histone H4; . 47725 smart00418: helix_turn_helix, Arsenical Resistance Operon Repressor; . 47726 smart00419: helix_turn_helix, cAMP Regulatory protein; . 47727 smart00420: helix_turn_helix, Deoxyribose operon repressor; . 47728 smart00421: helix_turn_helix, Lux Regulon; lux regulon (activates the bioluminescence operon) . 47729 smart00422: helix_turn_helix, mercury resistance; . 47730 smart00423: domain found in Plexins, Semaphorins and Integrins; . 47731 smart00424: STE like transcription factors; . 47732 smart00425: Domain first found in the mice T locus (Brachyury) protein; . 47733 smart00426: TEA domain; . 47734 smart00427: Histone H2B; . 
47735 smart00428: Histone H3; . 47736 smart00429: ig-like, plexins, transcription factors; . 47737 smart00430: Ligand binding domain of hormone receptors; . 47738 smart00431: leucine rich region; . 47739 smart00432: MADS domain. 47740 smart00433: TopoisomeraseII; Eukaryotic DNA topoisomerase II, GyrB, ParE . 47741 smart00434: DNA Topoisomerase IV; Bacterial DNA topoisomerase IV, GyrA, ParC . 47742 smart00435: DNA Topoisomerase I (eukaryota); DNA Topoisomerase I (eukaryota), DNA topoisomerase V, Vaccinia virus topoisomerase, Variola virus topoisomerase, Shope fibroma virus topoisomerase . 47743 smart00436: Bacterial DNA topoisomerase I ATP-binding domain; Extension of TOPRIM in Bacterial DNA topoisomerase I and III, Eukaryotic DNA topoisomerase III, reverse gyrase beta subunit . 47744 smart00437: Bacterial DNA topoisomerase I DNA-binding domain; Bacterial DNA topoisomerase I and III, Eukaryotic DNA topoisomerase III, reverse gyrase alpha subunit . 47745 smart00438: Repressor of transcription . 47746 smart00439: Bromo adjacent homology domain; . 47747 smart00440: C2C2 Zinc finger; Nucleic-acid-binding motif in transcriptional elongation factor TFIIS and RNA polymerases. . 47748 smart00441: Contains two conserved F residues; A novel motif that often accompanies WW domains. Often contains two conserved Phe (F) residues. . 47749 smart00442: Acidic and basic fibroblast growth factor family. Mitogens that stimulate growth or differentiation of cells of mesodermal or neuroectodermal origin. The family plays essential roles in patterning and differentiation during vertebrate embryogenesis, and has neurotrophic activities. . 47750 smart00443: glycine rich nucleic binding domain; A predicted glycine rich nucleic binding domain found in the splicing factor 45, SON DNA binding protein and D-type retrovirus polyproteins. . 47751 smart00444: Contains conserved Gly-Tyr-Phe residues; Proline-binding domain in CD2-binding protein. Contains conserved Gly-Tyr-Phe residues. . 
47752 smart00445: Link (Hyaluronan-binding); . 47753 smart00446: occurring C-terminal to leucine-rich repeats; A motif occurring C-terminal to leucine-rich repeats in "sds22-like" and "typical" LRR-containing proteins. . 47754 smart00448: cheY-homologous receiver domain; CheY regulates the clockwise rotation of E. coli flagellar motors. This domain contains a phosphoacceptor site that is phosphorylated by histidine kinase homologues. . 47755 smart00449: Domain in SPla and the RYanodine Receptor. ; Domain of unknown function. Distant homologues are domains in butyrophilin/marenostrin/pyrin homologues. . 47756 smart00450: Rhodanese Homology Domain; An alpha beta fold found duplicated in the Rhodanese protein. The cysteine-containing enzymatically active version of the domain is also found in the CDC25 class of protein phosphatases and a variety of proteins such as sulfide dehydrogenases and stress proteins such as senescence-specific protein 1 in plants, PspE and GlpE in bacteria and cyanide and arsenate resistance proteins. Inactive versions with a loss of the cysteine are also seen in Dual specificity phosphatases, ubiquitin hydrolases from yeast and in sulfuryltransferases. These are likely to play a role in protein interactions. . 47757 smart00451: U1-like zinc finger; Family of C2H2-type zinc fingers, present in matrin, U1 small nuclear ribonucleoprotein C and other RNA-binding proteins. . 47758 smart00452: Soybean trypsin inhibitor (Kunitz) family of protease inhibitors; . 47759 smart00453: Worm-specific (usually) N-terminal domain; . 47760 smart00454: Sterile alpha motif. Widespread domain in signalling and nuclear proteins. In EPH-related tyrosine kinases, appears to mediate cell-cell initiated signal transduction via the binding of SH2-containing proteins to a conserved tyrosine that is phosphorylated. In many cases mediates homodimerisation. . 47761 smart00455: Raf-like Ras-binding domain; . 
47762 smart00456: Domain with 2 conserved Trp (W) residues; Also known as the WWP or rsp5 domain. Binds proline-rich polypeptides. . 47763 smart00457: membrane-attack complex / perforin; . 47764 smart00458: Ricin-type beta-trefoil; Carbohydrate-binding domain formed from presumed gene triplication. . 47765 smart00459: Sorbin homologous domain; First found in the peptide hormone sorbin and later in the ponsin/ArgBP2/vinexin family of proteins. . 47766 smart00460: Transglutaminase/protease-like homologues; Transglutaminases are enzymes that establish covalent links between proteins. A subset of transglutaminase homologues appear to catalyse the reverse reaction, the hydrolysis of peptide bonds. Proteins with this domain are both extracellular and intracellular, and it is likely that the eukaryotic intracellular proteins are involved in signalling events. . 47767 smart00461: WASP homology region 1; Region of the Wiskott-Aldrich syndrome protein (WASp) that contains point mutations in the majority of patients with WAS. Unknown function. Ena-like WH1 domains bind polyproline-containing peptides, and Homer also contains a WH1 domain. . 47768 smart00462: Phosphotyrosine-binding domain, phosphotyrosine-interaction (PI) domain; PTB/PI domain structure similar to those of pleckstrin homology (PH) and IRS-1-like PTB domains. . 47769 smart00463: Small MutS-related domain; . 47771 smart00465: GIY-YIG type nucleases (URI domain); . 47772 smart00466: SET and RING finger associated domain. Domain of unknown function in SET domain containing proteins and in Deinococcus radiodurans DRA1533. . 47773 smart00467: GS motif; An approx. 30 amino acid motif that precedes the kinase domain in types I and II TGF beta receptors. Mutation of two or more of the serines or threonines in the TTSGSGSG sequence of TGF-beta type I receptor impairs phosphorylation and signaling activity. .
47774 smart00468: N-terminal to some SET domains; A Cys-rich putative Zn2+-binding domain that occurs N-terminal to some SET domains. Function is unknown. Unpublished. . 47775 smart00469: Wnt-inhibitory factor-1 like domain; Occurs as extracellular domain in metazoan Ryk receptor tyrosine kinases. C. elegans Ryk is required for cell-cuticle recognition. WIF-1 binds to Wnt and inhibits its activity. . 47776 smart00470: ParB-like nuclease domain; Plasmid RK2 ParB preferentially cleaves single-stranded DNA. ParB also nicks supercoiled plasmid DNA preferably at sites with potential single-stranded character, like AT-rich regions and sequences that can form cruciform structures. ParB also exhibits 5'-->3' exonuclease activity. . 47777 smart00471: Metal dependent phosphohydrolases with conserved 'HD' motif. Includes eukaryotic cyclic nucleotide phosphodiesterases (PDEc). This profile/HMM does not detect HD homologues in bacterial glycine aminoacyl-tRNA synthetases (beta subunit). . 47778 smart00472: Domain in ryanodine and inositol trisphosphate receptors and protein O-mannosyltransferases; Ponting (2000) Trends in Biochem. Sci, in press . 47779 smart00473: divergent subfamily of APPLE domains; Apple-like domains present in Plasminogen, C. elegans hypothetical ORFs and the extracellular portion of plant receptor-like protein kinases. Predicted to possess protein- and/or carbohydrate-binding functions. . 47780 smart00474: 3'-5' exonuclease; 3'-5' exonuclease proofreading domain present in DNA polymerase I, Werner syndrome helicase, RNase D and other enzymes . 47781 smart00475: 5'-3' exonuclease; . 47782 smart00476: deoxyribonuclease I; Deoxyribonuclease I catalyzes the endonucleolytic cleavage of double-stranded DNA. The enzyme is secreted outside the cell and also involved in apoptosis in the nucleus. . 47783 smart00477: DNA/RNA non-specific endonuclease; prokaryotic and eukaryotic double- and single-stranded DNA and RNA endonucleases also present in phosphodiesterases .
47784 smart00478: endonuclease III; includes endonuclease III (DNA-(apurinic or apyrimidinic site) lyase), alkylbase DNA glycosidases (AlkA-family) and other DNA glycosidases . 47785 smart00479: exonuclease domain in DNA-polymerase alpha and epsilon chain, ribonuclease T and other exonucleases . 47786 smart00480: DNA polymerase III beta subunit; . 47787 smart00481: DNA polymerase alpha chain like domain; DNA polymerase alpha chain like domain, incl. family of hypothetical proteins . 47788 smart00482: DNA polymerase A domain; . 47789 smart00483: DNA polymerase X family; includes vertebrate polymerase beta and terminal deoxynucleotidyltransferases . 47790 smart00484: Xeroderma pigmentosum G I-region; domain in nucleases . 47791 smart00485: Xeroderma pigmentosum G N-region; domain in nucleases . 47792 smart00486: DNA polymerase type-B family; DNA polymerase alpha, delta, epsilon and zeta chain (eukaryota), DNA polymerases in archaea, DNA polymerase II in E. coli, mitochondrial DNA polymerases and virus DNA polymerases . 47793 smart00487: DEAD-like helicases superfamily; . 47794 smart00488: DEAD-like helicases superfamily; . 47795 smart00490: helicase superfamily c-terminal domain; . 47796 smart00491: helicase superfamily c-terminal domain; . 47797 smart00493: topoisomerases, DnaG-type primases, OLD family nucleases and RecR proteins . 47798 smart00494: Chitin-binding domain type 2; . 47799 smart00495: Chitin-binding domain type 3; . 47800 smart00496: Intron-encoded nuclease repeat 2; Short helical motif of unknown function (unpublished results). . 47801 smart00497: Intron encoded nuclease repeat motif; Repeat of unknown function, but possibly DNA-binding via helix-turn-helix motif (Ponting, unpublished). . 47802 smart00498: Formin Homology 2 Domain; FH proteins control rearrangements of the actin cytoskeleton, especially in the context of cytokinesis and cell polarisation.
Members of this family have been found to interact with Rho-GTPases, profilin and other actin-associated proteins. These interactions are mediated by the proline-rich FH1 domain, usually located in front of FH2 (but not listed in SMART). Despite this cytosolic function, vertebrate formins have been assigned functions within the nucleus. A set of Formin-Binding Proteins (FBPs) has been shown to bind FH1 with their WW domain. . 47803 smart00499: Plant lipid transfer protein / seed storage protein / trypsin-alpha amylase inhibitor domain family; . 47804 smart00500: Splicing Factor Motif, present in Prp18 and Pr04; . 47805 smart00501: BRIGHT, ARID (A/T-rich interaction domain) domain; DNA-binding domain containing a helix-turn-helix structure . 47806 smart00502: B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains . 47807 smart00503: Syntaxin N-terminal domain; Three-helix domain that (in Sso1p) slows the rate of its reaction with the SNAP-25 homologue Sec9p . 47808 smart00504: Modified RING finger domain; Modified RING finger domain, without the full complement of Zn2+-binding ligands. Probable involvement in E2-dependent ubiquitination. . 47809 smart00505: Knottins; Knottins, representing plant lectins/antimicrobial peptides, plant proteinase/amylase inhibitors, plant gamma-thionins and arthropod defensins. . 47810 smart00506: Appr-1"-p processing enzyme; Function determined by Martzen et al. Extended family detected by reciprocal PSI-BLAST searches (unpublished results, and Pehrson & Fuji). . 47811 smart00507: HNH nucleases; . 47812 smart00508: Cysteine-rich motif following a subset of SET domains; . 47813 smart00509: Domain in the N-terminus of transcription elongation factor S-II (and elsewhere) ; . 47814 smart00510: Domain in the central regions of transcription elongation factor S-II (and elsewhere); . 47815 smart00511: Orange domain; This domain confers specificity among members of the Hairy/E(SPL) family. .
47816 smart00512: Found in Skp1 protein family; Family of Skp1 (kinetochore protein required for cell cycle progression) and elongin C (subunit of RNA polymerase II transcription factor SIII) homologues. . 47817 smart00513: Putative DNA-binding (bihelical) motif predicted to be involved in chromosomal organisation; . 47818 smart00515: Domain at the C-termini of GCD6, eIF-2B epsilon, eIF-4 gamma and eIF-5; . 47819 smart00516: Domain in homologues of a S. cerevisiae phosphatidylinositol transfer protein (Sec14p); Domain in homologues of a S. cerevisiae phosphatidylinositol transfer protein (Sec14p) and in RhoGAPs, RhoGEFs and the RasGEF, neurofibromin (NF1). Lipid-binding domain. The SEC14 domain of Dbl is known to associate with G protein beta/gamma subunits. . 47820 smart00517: C-terminal domain of Poly(A)-binding protein. Present also in Drosophila hyperplastic discs protein. Involved in homodimerisation (either directly or indirectly) . 47821 smart00518: AP endonuclease family 2; These endonucleases play a role in DNA repair. Cleave phosphodiester bonds at apurinic or apyrimidinic sites . 47822 smart00520: Basic domain in HLH proteins of MYOD family; . 47823 smart00521: CCAAT-Binding transcription Factor; . 47824 smart00523: Domain A in dwarfin family proteins; . 47825 smart00524: Domain B in dwarfin family proteins; . 47826 smart00525: iron-sulphur binding domain in DNA-(apurinic or apyrimidinic site) lyase (subfamily of ENDO3) . 47827 smart00526: Domain in histone families 1 and 5; . 47828 smart00527: domain in high mobility group proteins HMG14 and HMG 17; . 47829 smart00528: Domain in histone-like proteins of HNS family; . 47830 smart00529: Helix-turn-helix diphtheria tox regulatory element; iron dependent repressor . 47831 smart00530: Helix-turn-helix XRE-family like proteins ; . 47832 smart00531: Transcription initiation factor IIE; . 47833 smart00532: Ligase N family; . 47834 smart00533: DNA-binding domain of DNA mismatch repair MUTS family; .
47835 smart00534: ATPase domain of DNA mismatch repair MUTS family; . 47836 smart00535: Ribonuclease III family; . 47837 smart00536: domain in Ataxins and HMG containing proteins; unknown function . 47838 smart00537: Domain in the Doublecortin (DCX) gene product; Tandemly-repeated domain in doublin, the Doublecortin gene product. Proposed to bind tubulin. Doublecortin (DCX) is mutated in human X-linked neuronal migration defects. . 47839 smart00538: A domain found in a protein subunit of human RNase MRP and RNase P ribonucleoprotein complexes and archaeal proteins. . 47840 smart00539: Extracellular domain of unknown function in nidogen (entactin) and hypothetical proteins. . 47841 smart00540: in nuclear membrane-associated proteins; LEM, domain in nuclear membrane-associated proteins, including lamina-associated polypeptide 2 and emerin. . 47842 smart00541: "FY-rich" domain, N-terminal region; is sometimes closely juxtaposed with the C-terminal region (FYRC), but sometimes is far distant. Unknown function, but occurs frequently in chromatin-associated proteins. . 47843 smart00542: "FY-rich" domain, C-terminal region; is sometimes closely juxtaposed with the N-terminal region (FYRN), but sometimes is far distant. Unknown function, but occurs frequently in chromatin-associated proteins. . 47844 smart00543: Middle domain of eukaryotic initiation factor 4G (eIF4G); Also occurs in NMD2p and CBP80. The domain is rich in alpha-helices and may contain multiple alpha-helical repeats. In eIF4G, this domain binds eIF4A, eIF3, RNA and DNA. Ponting (TiBS) "Novel eIF4G domain homologues" (in press) . 47845 smart00544: Domain in DAP-5, eIF4G, MA-3 and other proteins. Highly alpha-helical. May contain repeats and/or regions similar to MIF4G domains. Ponting (TiBS) "Novel eIF4G domain homologues" (in press) .
47846 smart00545: Small domain found in the jumonji family of transcription factors; To date, this domain always co-occurs with the JmjC domain (although the reverse is not true). . 47847 smart00546: Domain that may be involved in binding ubiquitin-conjugating enzymes (UBCs); CUE domains also occur in two proteins of the IL-1 signal transduction pathway, tollip and TAB2. Ponting (Biochem. J.) "Proteins of the Endoplasmic reticulum" (in press) . 47848 smart00547: Zinc finger domain; Zinc finger domain in Ran-binding proteins (RanBPs), and other proteins. In RanBPs, this domain binds RanGDP. . 47849 smart00548: Motif in Iroquois-class homeodomain proteins (only). Unknown function. . 47850 smart00549: TAF homology; Domain in Drosophila nervy, CBFA2T1, human TAF105, human TAF130, and Drosophila TAF110. Also known as nervy homology region 1 (NHR1). . 47851 smart00550: Z-DNA-binding domain in adenosine deaminases. Helix-turn-helix-containing domain. Also known as Zab. . 47852 smart00551: TAZ zinc finger, present in p300 and CBP ; . 47853 smart00552: tRNA-specific and double-stranded RNA adenosine deaminase (RNA-specific editase); . 47854 smart00553: Domain present in Saccharomyces cerevisiae Shp1, Drosophila melanogaster eyes closed gene (eyc), and vertebrate p47. . 47855 smart00554: Four repeated domains in the Fasciclin I family of proteins, present in many other contexts. . 47856 smart00555: Helical motif in the GIT family of ADP-ribosylation factor GTPase-activating proteins; Helical motif in the GIT family of ADP-ribosylation factor GTPase-activating proteins, and in yeast Spa2p and Sph1p (CPP; unpublished results). In p95-APP1 the N-terminal GIT motif might be involved in binding PIX. . 47857 smart00557: Filamin-type immunoglobulin domains; These form a rod-like structure in the actin-binding cytoskeleton protein, filamin. The C-terminal repeats of filamin bind beta1-integrin (CD29). .
47858 smart00558: A domain family that is part of the cupin metalloenzyme superfamily. Probable enzymes, but of unknown functions, that regulate chromatin reorganisation processes (Clissold and Ponting, in press). . 47859 smart00559: Ku70 and Ku80 are 70kDa and 80kDa subunits of the Lupus Ku autoantigen; This is a single stranded DNA- and ATP-dependent helicase that has a role in chromosome translocation. This is a domain of unknown function C-terminal to its von Willebrand factor A domain, that also occurs in bacterial hypothetical proteins. . 47860 smart00560: LamG-like jellyroll fold domain; . 47861 smart00561: Present in Drosophila Scm, l(3)mbt, and vertebrate SCML2; Present in Drosophila Scm, l(3)mbt, and vertebrate SCML2. These proteins are involved in transcriptional regulation. . 47862 smart00562: These are enzymes that catalyze non-substrate-specific conversions of nucleoside diphosphates to nucleoside triphosphates. These enzymes play important roles in bacterial growth, signal transduction and pathogenicity. . 47863 smart00563: Phosphate acyltransferases; Function in phospholipid biosynthesis and have either glycerolphosphate, 1-acylglycerolphosphate, or 2-acylglycerolphosphoethanolamine acyltransferase activities. Tafazzin, the product of the gene mutated in patients with Barth syndrome, is a member of this family. . 47864 smart00564: beta-propeller repeat; Beta-propeller repeat occurring in enzymes with pyrrolo-quinoline quinone (PQQ) as cofactor, in Ire1p-like Ser/Thr kinases, and in prokaryotic dehydrogenases. . 47865 smart00566: Domain in SOG (short gastrulation protein) and chordin; . 47866 smart00567: E-Z type HEAT repeats; Present in subunits of cyanobacterial phycocyanin lyase, and other proteins. Probable scaffolding role. . 47867 smart00568: domain in glucosyltransferases, myotubularins and other putative membrane-associated proteins; . 47868 smart00569: domain in receptor targeting proteins Lin-2 and Lin-7; .
47869 smart00570: associated with SET domains; subdomain of PRESET . 47870 smart00571: domain in different transcription and chromosome remodeling factors; . 47871 smart00572: domain in DSRM or ZnF_C2H2 domain containing proteins; . 47872 smart00573: domain in helicases and associated with SANT domains; . 47873 smart00574: domain associated with HOX domains; . 47874 smart00575: plant mutator transposase zinc finger; . 47875 smart00576: Bromodomain transcription factors and PHD domain containing proteins; subdomain of archaeal histone-like transcription factors . 47876 smart00577: catalytic domain of ctd-like phosphatases; . 47877 smart00579: domain in FBox and BRCT domain containing plant proteins; . 47878 smart00580: domain in protein kinases, N-glycanases and other nuclear proteins; . 47879 smart00581: proline-rich domain in spliceosome associated proteins; . 47880 smart00582: domain present in proteins, which are involved in regulation of nuclear pre-mRNA . 47881 smart00583: domain in SET and PHD domain containing proteins and protein kinases . 47882 smart00584: domain in TBC and LysM domain containing proteins; . 47883 smart00586: Zinc finger in DBF-like proteins; . 47884 smart00587: ZnF_C4 and HLH domain containing kinases domain; subfamily of choline kinases . 47885 smart00588: domain in neuralized proteins . 47886 smart00589: associated with SPRY domains . 47887 smart00591: domain in RING finger and WD repeat containing proteins and DEXDc-like helicases subfamily related to the UBCc domain . 47888 smart00592: domain in transcription and CHROMO domain helicases; . 47889 smart00593: domain involved in Ras-like GTPase signaling . 47890 smart00594: UAS domain. 47891 smart00595: subfamily of SANT domain . 47892 smart00596: PRE_C2H2 domain. 47893 smart00597: zinc finger in transposases and transcription factors . 47894 smart00602: VPS10 domain. 47895 smart00603: LCCL domain. 47896 smart00604: MD domain. 47897 smart00605: CW domain.
47898 smart00606: Cellulose Binding Domain Type IV; . 47899 smart00607: eel-Fucolectin Tachylectin-4 Pentraxin-1 Domain; . 47900 smart00608: ADAM Cysteine-Rich Domain; . 47901 smart00609: Vault protein Inter-alpha-Trypsin domain; . 47902 smart00611: Domain of unknown function in Sec63p, Brr2p and other proteins. . 47903 smart00612: Kelch domain. 47904 smart00613: domain present in PNGases and other hypothetical proteins; present in several copies in proteins with unknown function in C. elegans . 47905 smart00614: BED zinc finger; DNA-binding domain in chromatin-boundary-element-binding proteins and transposases . 47906 smart00615: Ephrin receptor ligand binding domain; . 47907 smart00630: semaphorin domain; . 47908 smart00631: Zn_pept domain. 47909 smart00632: Aamy_C domain. 47910 smart00633: Glycosyl hydrolase family 10; . 47911 smart00634: Bacterial Ig-like domain (group 1); . 47912 smart00635: Bacterial Ig-like domain 2; . 47913 smart00636: Glyco_18 domain. 47914 smart00637: CBD_II domain. 47915 smart00638: Lipoprotein N-terminal Domain; . 47916 smart00639: Paramecium Surface Antigen Repeat; . 47917 smart00640: Glycosyl hydrolases family 32; . 47918 smart00641: Glycosyl hydrolases family 25; . 47920 smart00643: Netrin C-terminal Domain; . 47921 smart00644: Ami_2 domain. 47922 smart00645: Papain family cysteine protease; . 47923 smart00646: Ami_3 domain. 47924 smart00647: In Between Ring fingers; the domain occurs between pairs of RING fingers . 47925 smart00648: Suppressor-of-White-APricot splicing regulator; domain present in regulators which are responsible for pre-mRNA splicing processes . 47926 smart00649: Ribosomal protein L11/L12; . 47927 smart00650: Ribosomal RNA adenine dimethylases; . 47928 smart00651: snRNP Sm proteins; small nuclear ribonucleoprotein particles (snRNPs) involved in pre-mRNA splicing . 47929 smart00652: eukaryotic translation initiation factor 1A; . 47930 smart00653: domain present in translation initiation factor eIF2B and eIF5; .
47931 smart00654: translation initiation factor 6; . 47932 smart00656: Amb_all domain. 47933 smart00657: DNA-directed RNA-polymerase II subunit ; . 47934 smart00658: RNA polymerase subunit 8; subunit of RNA polymerase I, II and III . 47935 smart00659: RNA polymerase subunit CX; present in RNA polymerase I, II and III . 47936 smart00661: RNA polymerase subunit 9; . 47937 smart00662: RNA polymerases D; DNA-directed RNA polymerase subunit D and bacterial alpha chain . 47938 smart00663: RNA polymerase I subunit A N-terminus; . 47939 smart00664: Possible catecholamine-binding domain present in a variety of eukaryotic proteins. A predominantly beta-sheet domain present as a regulatory N-terminal domain in dopamine beta-hydroxylase, mono-oxygenase X and SDR2. Its function remains unknown at present (Ponting, Human Molecular Genetics, in press). . 47940 smart00665: Cytochrome b-561 / ferric reductase transmembrane domain. Cytochrome b-561 recycles ascorbate for the generation of norepinephrine by dopamine-beta-hydroxylase in the chromaffin vesicles of the adrenal gland. It is a transmembrane heme protein with the two heme groups being bound to conserved histidine residues. A cytochrome b-561 homologue, termed Dcytb, is an iron-regulated ferric reductase in the duodenal mucosa. Other homologues of these are also likely to be ferric reductases. SDR2 is proposed to be important in regulating the metabolism of iron in the onset of neurodegenerative disorders. . 47941 smart00666: PB1 domain ; Phox and Bem1p domain, present in many eukaryotic cytoplasmic signalling proteins. The domain adopts a beta-grasp fold, similar to that found in ubiquitin and Ras-binding domains. A motif, variously termed OPR, PC and AID, represents the most conserved region of the majority of PB1 domains, and is necessary for PB1 domain function. This function is the formation of PB1 domain heterodimers, although not all PB1 domain pairs associate. . 
47942 smart00667: Lissencephaly type-1-like homology motif; Alpha-helical motif present in Lis1, treacle, Nopp140, some katanin p60 subunits, muskelin, tonneau, LEUNIG and numerous WD40 repeat-containing proteins. It is suggested that LisH motifs contribute to the regulation of microtubule dynamics, either by mediating dimerisation, or else by binding cytoplasmic dynein heavy chain or microtubules directly. . 47943 smart00668: C-terminal to LisH motif. ; Alpha-helical motif of unknown function. . 47944 smart00670: Large family of predicted nucleotide-binding domains; From similarities to 5'-exonucleases, these domains are predicted to be RNases. PINc domains in nematode SMG-5 and yeast NMD4p are predicted to be involved in RNAi. . 47945 smart00671: Sel1-like repeats. These represent a subfamily of TPR (tetratricopeptide repeat) sequences. . 47946 smart00672: Putative lipopolysaccharide-modifying enzyme. . 47947 smart00673: Domain in CAPs (cyclase-associated proteins) and X-linked retinitis pigmentosa 2 gene product. . 47948 smart00674: Putative DNA-binding domain in centromere protein B, mouse jerky and transposases. . 47949 smart00675: Domains in hypothetical proteins in Drosophila including 2 in CG15241 and CG9329. . 47950 smart00676: Domains in hypothetical proteins in Drosophila, C. elegans and mammals. Occurs singly in some nucleoside diphosphate kinases. . 47951 smart00678: Domain in Deltex and TRIP12 homologues. Possibly involved in regulation of ubiquitin-mediated proteolysis. . 47952 smart00679: Repeated motif present between transmembrane helices in cystinosin, yeast ERS1p, mannose-P-dolichol utilization defect 1, and other hypothetical proteins. Function unknown, but likely to be associated with the glycosylation machinery. . 47953 smart00680: Clip or disulphide knot domain; Present in horseshoe crab proclotting enzyme N-terminal domain, Drosophila Easter and silkworm prophenoloxidase-activating enzyme. . 
47954 smart00682: G2 nidogen domain and fibulin; . 47955 smart00683: Repeats in sea squirt COS41.4, worm R01H10.6, fly CG1126 etc. . 47956 smart00684: Tandem repeat in fly CG14066 (La related protein), human KIAA0731 and worm R144.7. Unknown function. . 47957 smart00685: Repeats in fly CG4713, worm Y37H9A.3 and human FLJ20241. . 47958 smart00686: Domain present in fly proteins (CG14681, CG12492, CG6217), worm H06A10.1 and Arabidopsis thaliana MBG8.9. . 47959 smart00688: Domain of unknown function in Drosophila CG15332, CG15333 and CG18293; . 47960 smart00689: Cysteine-rich domain currently specific to Drosophila. . 47961 smart00690: Domain of unknown function, currently peculiar to Drosophila. . 47962 smart00692: Zinc finger domain in CG10631, C. elegans LIN-15B and human P52rIPK. . 47963 smart00693: Dysferlin domain, N-terminal region. Domain of unknown function present in yeast peroxisomal proteins, dysferlin, myoferlin and hypothetical proteins. Due to an insertion of a dysferlin domain within a second dysferlin domain we have chosen to predict these domains in two parts: the N-terminal region and the C-terminal region. . 47964 smart00694: Dysferlin domain, C-terminal region. Domain of unknown function present in yeast peroxisomal proteins, dysferlin, myoferlin and hypothetical proteins. Due to an insertion of a dysferlin domain within a second dysferlin domain we have chosen to predict these domains in two parts: the N-terminal region and the C-terminal region. . 47965 smart00695: Domain in ubiquitin-specific proteases. . 47966 smart00696: Repeats found in Drosophila proteins. . 47967 smart00697: Repeats found in several Drosophila proteins. . 47968 smart00698: Possible plasma membrane-binding motif in junctophilins, PIP-5-kinases and protein kinases. . 47969 smart00700: Juvenile hormone binding protein domains in insects. The juvenile hormone exerts pleiotropic functions during insect life cycles and its binding proteins regulate these functions. . 
47970 smart00701: Animal peptidoglycan recognition proteins homologous to Bacteriophage T3 lysozyme. The bacteriophage molecule, but not its moth homologue, has been shown to have N-acetylmuramoyl-L-alanine amidase activity. One member of this family, Tag7, is a cytokine. . 47971 smart00702: Prolyl 4-hydroxylase alpha subunit homologues. Mammalian enzymes catalyse hydroxylation of collagen, for example. Prokaryotic enzymes might catalyse hydroxylation of antibiotic peptides. These are 2-oxoglutarate-dependent dioxygenases, requiring 2-oxoglutarate and dioxygen as cosubstrates and ferrous iron as a cofactor. . 47972 smart00703: N-terminal domain in C. elegans NRF-6 (Nose Resistant to Fluoxetine-4) and NDG-4 (resistant to nordihydroguaiaretic acid-4). Also present in several other worm and fly proteins. . 47973 smart00704: CDGSH-type zinc finger. Function unknown. . 47974 smart00705: Repeats in THEG (testicular haploid expressed gene) and several fly proteins. . 47975 smart00706: Beta propeller repeats in Physarum polycephalum tectonins, Limulus lectin L-6 and animal hypothetical proteins. . 47976 smart00707: Repeat in Drosophila CG10860, human KIAA0680 and C. elegans F26H9.2; . 47977 smart00708: Insect pheromone/odorant binding protein domains. . 47978 smart00709: Duplicated domain in the epidermal growth factor- and elongation factor-1alpha-binding protein Zpr1. Also present in archaeal proteins. . 47979 smart00710: Parallel beta-helix repeats; The tertiary structures of pectate lyases and rhamnogalacturonase A show a stack of parallel beta strands that are coiled into a large helix. Each coil of the helix represents a structural repeat that, in some homologues, can be recognised from sequence information alone. Conservation of asparagines might be connected with asparagine-ladders that contribute to the stability of the fold. Proteins containing these repeats most often are enzymes with polysaccharide substrates. . 
47980 smart00711: Short repeats in human TONDU, fly vestigial and other proteins. Unknown function. . 47981 smart00712: DNA/RNA-binding repeats in PUR-alpha/beta/gamma and in hypothetical proteins from spirochetes and the Bacteroides-Cytophaga-Flexibacter bacteria. . 47982 smart00713: Motif of unknown function with conserved Gly, Tyr, Arg tripeptide in Drosophila proteins. . 47983 smart00714: Possible membrane-associated motif in LPS-induced tumor necrosis factor alpha factor (LITAF), also known as PIG7, and other animal proteins. . 47984 smart00715: Domain in the RNA-binding Lupus La protein; unknown function; . 47985 smart00717: SANT SWI3, ADA2, N-CoR and TFIIIB'' DNA-binding domains; . 47986 smart00718: DM4/DM12 family of domains in Drosophila melanogaster proteins of unknown function. . 47987 smart00719: Short conserved domain in transcriptional regulators. Plus3 domains occur in the Saccharomyces cerevisiae Rtf1p protein, which interacts with Spt6p, and in parsley CIP, which interacts with the bZIP protein CPRF1. . 47988 smart00720: calpain_III domain. 47989 smart00721: BAR domain. 47990 smart00722: Domain present in carbohydrate binding proteins and sugar hydrolases; . 47991 smart00723: Adhesion-associated domain present in MUC4 and other proteins; . 47992 smart00724: TRAM, LAG1 and CLN8 homology domains. Protein domain with at least 5 transmembrane alpha-helices. Lag1p and Lac1p are essential for acyl-CoA-dependent ceramide synthesis, TRAM is a subunit of the translocon and the CLN8 gene is mutated in Northern epilepsy syndrome. The family may possess multiple functions such as lipid trafficking, metabolism, or sensing. Trh homologues possess additional homeobox domains. . 47993 smart00725: NEAr Transporter domain; . 47994 smart00726: Ubiquitin-interacting motif. Present in proteasome subunit S5a and other ubiquitin-associated proteins. . 47995 smart00727: Heat shock chaperonin-binding motif. .
47996 smart00728: Clostridial hydrophobic, with a conserved W residue, domain. . 47997 smart00729: Elongator protein 3, MiaB family, Radical SAM; This superfamily contains MoaA, NifB, PqqE, coproporphyrinogen III oxidase, biotin synthase and MiaB families, and includes a representative in the eukaryotic elongator subunit, Elp-3. Some members of the family are methyltransferases. . 47998 smart00730: Presenilin, signal peptide peptidase, family; Presenilin 1 and presenilin 2 are polytopic membrane proteins, whose genes are mutated in some individuals with Alzheimer's disease. Distant homologues, present in eukaryotes and archaea, also contain conserved aspartic acid residues which are predicted to contribute to catalysis. At least one member of this family has been shown to possess signal peptide peptidase activity. . 47999 smart00731: SprT homologues. Predicted to have roles in transcription elongation. Contains a conserved HExxH motif, indicating a metalloprotease function. . 48000 smart00732: Likely ribonuclease with RNase H fold. YqgF proteins are likely to function as an alternative to RuvC in most bacteria, and could be the principal Holliday junction resolvases in low-GC Gram-positive bacteria. In Spt6p orthologues, the catalytic residues are substituted, indicating that they lack enzymatic functions. . 48001 smart00733: Mitochondrial termination factor repeats; Human mitochondrial termination factor is a DNA-binding protein that acts as a transcription termination factor. Six repeats occur in human mTERF, that also are present in numerous plant proteins. . 48002 smart00734: Rad18-like CCHC zinc finger; Yeast Rad18p functions with Rad5p in error-free post-replicative DNA repair. This zinc finger is likely to bind nucleic-acids. . 48003 smart00736: Dystroglycan-type cadherin-like domains. Cadherin-homologous domains present in metazoan dystroglycans and alpha/epsilon sarcoglycans, yeast Axl2p and in a very large protein from magnetotactic bacteria.
Likely to bind calcium ions. . 48004 smart00737: Domain involved in innate immunity and lipid metabolism. ML (MD-2-related lipid-recognition) is a novel domain identified in MD-1, MD-2, GM2A, Npc2 and multiple proteins of unknown function in plants, animals and fungi. These single-domain proteins were predicted to form a beta-rich fold containing multiple strands, and to mediate diverse biological functions through interacting with specific lipids. . 48005 smart00738: In Spt5p, this domain may confer affinity for Spt4p. It possesses an RNP-like fold. . 48006 smart00739: KOW (Kyprides, Ouzounis, Woese) motif. Motif in ribosomal proteins, NusG, Spt5p, KIN17 and T54. . 48007 smart00740: PASTA domain. 48008 smart00741: Saposin (B) Domains; Present in multiple copies in prosaposin and in pulmonary surfactant-associated protein B. In plant aspartic proteinases, a saposin domain is circularly permuted. This causes the prediction algorithm to predict two such domains, where only one is truly present. . 48009 smart00742: Rho effector or protein kinase C-related kinase homology region 1 homologues; Alpha-helical domain found in vertebrate PRK1 and yeast PKC1 protein kinases C. The HR1 in rhophilin binds RhoGTP; those in PRK1 bind RhoA and RhoB. Also called RBD - Rho-binding domain . 48010 smart00743: Tudor-like domain present in plant sequences. Domain in plant sequences with possible chromatin-associated functions. . 48011 smart00744: The RING-variant domain is a C4HC3 zinc-finger like motif found in a number of cellular and viral proteins. Some of these proteins have been shown both in vivo and in vitro to have ubiquitin E3 ligase activity. The RING-variant domain is reminiscent of both the RING and the PHD domains and may represent an evolutionary intermediate. To describe this domain the term PHD/LAP domain has been used in the past.
Extended description: The RING-variant (RINGv) domain contains a C4HC3 zinc-finger-like motif similar to the PHD domain, while some of the spacing between the Cys/His residues follows a pattern somewhat closer to that found in the RING domain. The RINGv domain, similar to the RING, PHD and LIM domains, is thought to bind two zinc ions co-ordinated by the highly conserved Cys and His residues. RING variant domain: C-x(2)-C-x(10-45)-C-x(1)-C-x(7)-H-x(2)-C-x(11-25)-C-x(2)-C. As opposed to a PHD: C-x(1-2)-C-x(7-13)-C-x(2-4)-C-x(4-5)-H-x(2)-C-x(10-21)-C-x(2)-C. Classical RING domain: C-x(2)-C-x(9-39)-C-x(1-3)-H-x(2-3)-C-x(2)-C-x(4-48)-C-x(2)-C. . 48012 smart00745: Microtubule Interacting and Trafficking molecule domain; . 48013 smart00746: metallochaperone-like domain; . 48014 smart00747: eight cysteine-containing domain present in fungal extracellular membrane proteins . 48015 smart00748: Higher Eukaryotes and Prokaryotes Nucleotide-binding domain; . 48016 smart00749: bacterial OsmY and nodulation domain; . 48017 smart00750: kinase non-catalytic C-lobe domain; It is an interaction domain identified as being similar to the C-terminal protein kinase catalytic fold (C lobe). Its presence at the N terminus of signalling proteins and the absence of the active-site residues in the catalytic and activation loops suggest that it folds independently and is likely to be non-catalytic. The occurrence of KIND only in metazoa implies that it has evolved from the catalytic protein kinase domain into an interaction domain, possibly by keeping the substrate-binding features . 48018 smart00751: domain in transcription factors and synapse-associated proteins; . 48019 smart00752: Horizontally Transferred TransMembrane Domain; Sequence analysis of vitamin K dependent gamma-carboxylases (VKGC) revealed the presence of a novel domain, HTTM (Horizontally Transferred TransMembrane) in its N-terminus. In contrast to most known domains, HTTM contains four transmembrane regions. 
Its occurrence in eukaryotes, bacteria and archaea is more likely caused by horizontal gene transfer than by early invention. The conservation of VKGC catalytic sites indicates an enzymatic function also for the other family members. . 48020 cd00421: Intradiol dioxygenases catalyze the critical ring-cleavage step in the conversion of catecholate derivatives to citric acid cycle intermediates. This family contains catechol 1,2-dioxygenases and protocatechuate 3,4-dioxygenases which are mononuclear non-heme iron enzymes that catalyze the oxygenation of catecholates to aliphatic acids via the cleavage of aromatic rings. The members are intradiol-cleaving enzymes which break the catechol C1-C2 bond and utilize Fe3+, as opposed to the extradiol-cleaving enzymes which break the C2-C3 or C1-C6 bond and utilize Fe2+ and Mn+. Catechol 1,2-dioxygenases are mostly homodimers with one catalytic ferric ion per monomer. Protocatechuate 3,4-dioxygenases form more diverse oligomers. 48021 cd03457: Intradiol dioxygenase subgroup. Intradiol dioxygenases catalyze the critical ring-cleavage step in the conversion of catecholate derivatives to citric acid cycle intermediates. They break the catechol C1-C2 bond and utilize Fe3+, as opposed to the extradiol-cleaving enzymes which break the C2-C3 or C1-C6 bond and utilize Fe2+ and Mn+. The family contains catechol 1,2-dioxygenases and protocatechuate 3,4-dioxygenases. The specific function of this subgroup is unknown. 48022 cd03458: Catechol intradiol dioxygenases can be divided into several subgroups according to their substrate specificity for catechol, chlorocatechols and hydroxyquinols. Almost all members of this family are homodimers containing one ferric ion (Fe3+) per monomer. They belong to the intradiol dioxygenase family, a family of mononuclear non-heme iron intradiol-cleaving enzymes that catalyze the oxygenation of catecholates to aliphatic acids via the cleavage of aromatic rings. 
48023 cd03459: 4-PCD, Protocatechuate 3,4-dioxygenase (3,4-PCD) catalyzes the oxidative ring cleavage of 3,4-dihydroxybenzoate to produce beta-carboxy-cis,cis-muconate. 3,4-PCDs are large aggregates of 12 protomers, each composed of an alpha- and beta-subunit and an Fe3+ ion bound in the beta-subunit at the alpha-beta-subunit interface. 3,4-PCD is a member of the aromatic dioxygenases which are non-heme iron intradiol-cleaving enzymes that break the C1-C2 bond and utilize Fe3+. 48024 cd03460: 2-CTD, Catechol 1,2 dioxygenase (1,2-CTD) catalyzes an intradiol cleavage reaction of catechol to form cis,cis-muconate. 1,2-CTDs are homodimers with one catalytic non-heme ferric ion per monomer. They belong to the aromatic dioxygenase family, a family of mononuclear non-heme iron intradiol-cleaving enzymes that catalyze the oxygenation of catecholates to aliphatic acids via the cleavage of aromatic rings. 48025 cd03461: 2-HQD, Hydroxyquinol 1,2-dioxygenase (1,2-HQD) catalyzes the ring cleavage of hydroxyquinol (1,2,4-trihydroxybenzene), an intermediate in the degradation of a large variety of aromatic compounds including some polychloro- and nitroaromatic pollutants, to form 3-hydroxy-cis,cis-muconates. 1,2-HQD belongs to the aromatic dioxygenase family, a family of mononuclear non-heme intradiol-cleaving enzymes. 48026 cd03462: 2-CCD, chlorocatechol 1,2-dioxygenases (1,2-CCDs) (type II enzymes) are homodimeric intradiol dioxygenases that degrade chlorocatechols via the addition of molecular oxygen and the subsequent cleavage between two adjacent hydroxyl groups. This reaction is part of the modified ortho-cleavage pathway which is a central oxidative bacterial pathway that channels chlorocatechols, derived from the degradation of chlorinated benzoic acids, phenoxyacetic acids, phenols, benzenes, and other aromatics into the energy-generating tricarboxylic acid pathway. 48027 cd03463: 4-PCD_alpha, Protocatechuate 3,4-dioxygenase (3,4-PCD), alpha subunit. 
3,4-PCD catalyzes the oxidative ring cleavage of 3,4-dihydroxybenzoate to produce beta-carboxy-cis,cis-muconate. 3,4-PCDs are large aggregates of 12 protomers, each composed of an alpha- and beta-subunit and an Fe3+ ion bound in the beta-subunit at the alpha-subunit-beta-subunit interface. 3,4-PCD is a member of the aromatic dioxygenases which are non-heme iron intradiol-cleaving enzymes that break the C1-C2 bond and utilize Fe3+. 48028 cd03464: 4-PCD_beta, Protocatechuate 3,4-dioxygenase (3,4-PCD), beta subunit. 3,4-PCD catalyzes the oxidative ring cleavage of 3,4-dihydroxybenzoate to produce beta-carboxy-cis,cis-muconate. 3,4-PCDs are large aggregates of 12 protomers, each composed of an alpha- and beta-subunit and an Fe3+ ion bound in the beta-subunit at the alpha-subunit-beta-subunit interface. 3,4-PCD is a member of the aromatic dioxygenases which are non-heme iron intradiol-cleaving enzymes that break the C1-C2 bond and utilize Fe3+. 48029 cd00493: FabA/Z, beta-hydroxyacyl-acyl carrier protein (ACP)-dehydratases: One of several distinct enzyme types of the dissociative, type II, fatty acid synthase system (found in bacteria and plants) required to complete successive cycles of fatty acid elongation. The third step of the elongation cycle, the dehydration of beta-hydroxyacyl-ACP to trans-2-acyl-ACP, is catalyzed by FabA or FabZ. FabA is bifunctional and catalyzes an additional isomerization reaction of trans-2-acyl-ACP to cis-3-acyl-ACP, an essential reaction to unsaturated fatty acid synthesis. FabZ is the primary dehydratase that participates in the elongation cycles of saturated as well as unsaturated fatty acid biosynthesis, whereas FabA is more active in the dehydration of beta-hydroxydecanoyl-ACP. The FabA structure is homodimeric with two independent active sites located at the dimer interface. 
48030 cd00556: Thioesterase II (TEII) is thought to regenerate misprimed nonribosomal peptide synthetases (NRPSs) as well as modular polyketide synthases (PKSs) by hydrolyzing acetyl groups bound to the peptidyl carrier protein (PCP) and acyl carrier protein (ACP) domains, respectively. TEII has two tandem asymmetric hot dog folds that are structurally similar to one found in PaaI thioesterase, 4-hydroxybenzoyl-CoA thioesterase (4HBT) and beta-hydroxydecanoyl-ACP dehydratase and thus, the TEII monomer is equivalent to the homodimeric form of the latter three enzymes. Human TEII is expressed in T cells and has been shown to bind the product of the HIV-1 Nef gene. 48031 cd00586: 4-hydroxybenzoyl-CoA thioesterase (4HBT). Catalyzes the final step in the 4-chlorobenzoate degradation pathway in which 4-chlorobenzoate is converted to 4-hydroxybenzoate in certain soil-dwelling bacteria. 4HBT forms a homotetramer with four active sites. There is no evidence to suggest that 4HBT is related to the type I thioesterases functioning in primary or secondary metabolic pathways. Each subunit of the 4HBT tetramer adopts a so-called hot-dog fold similar to those of beta-hydroxydecanoyl-ACP dehydratase, (R)-specific enoyl-CoA hydratase, and type II thioesterase (TEII). 48032 cd01287: FabA, beta-hydroxydecanoyl-acyl carrier protein (ACP)-dehydratase: Bacterial protein of the type II, fatty acid synthase system that binds ACP and catalyzes both dehydration and isomerization reactions, apparently in the same active site. The FabA structure is a homodimer with two independent active sites located at the dimer interface. Each active site is tunnel-shaped and completely inaccessible to solvent. No metal ions or cofactors are required for ligand binding or catalysis. 
48033 cd01288: FabZ is a 17kD beta-hydroxyacyl-acyl carrier protein (ACP) dehydratase that primarily catalyzes the dehydration of beta-hydroxyacyl-ACP to trans-2-acyl-ACP, the third step in the elongation phase of the bacterial/ plastid, type II, fatty-acid biosynthesis pathway. 48034 cd01289: Domain of unknown function, appears to be related to a diverse group of beta-hydroxydecanoyl ACP dehydratases (FabA) and beta-hydroxyacyl ACP dehydratases (FabZ). This group appears to lack the conserved active site histidine of FabA and FabZ. 48035 cd03440: The hotdog fold was initially identified in the E. coli FabA (beta-hydroxydecanoyl-acyl carrier protein (ACP)-dehydratase) structure and subsequently in 4HBT (4-hydroxybenzoyl-CoA thioesterase) from Pseudomonas. A number of other seemingly unrelated proteins also share the hotdog fold. These proteins have related, but distinct, catalytic activities that include metabolic roles such as thioester hydrolysis in fatty acid metabolism, and degradation of phenylacetic acid and the environmental pollutant 4-chlorobenzoate. This superfamily also includes the PaaI-like protein FapR, a non-catalytic bacterial homolog involved in transcriptional regulation of fatty acid biosynthesis. 48036 cd03441: (R)-hydratase [(R)-specific enoyl-CoA hydratase]. Catalyzes the hydration of trans-2-enoyl CoA to (R)-3-hydroxyacyl-CoA as part of the PHA (polyhydroxyalkanoate) biosynthetic pathway. The structure of the monomer includes a five-strand antiparallel beta-sheet wrapped around a central alpha helix, referred to as a hot dog fold. The active site lies within a substrate-binding tunnel formed by the homodimer. Other enzymes with this fold include MaoC dehydratase, Hydratase-Dehydrogenase-Epimerase protein (HDE), and the fatty acid synthase beta subunit. 48037 cd03442: Brown fat-inducible thioesterase (BFIT). Brain acyl-CoA hydrolase (BACH). 
These enzymes deacylate long-chain fatty acids by hydrolyzing acyl-CoA thioesters to free fatty acids and CoA-SH. Eukaryotic members of this family are expressed in brain, testis, and brown adipose tissues. The archaeal and eukaryotic members of this family have two tandem copies of the conserved hot dog fold, while most bacterial members have only one copy. 48038 cd03443: PaaI_thioesterase is a tetrameric acyl-CoA thioesterase with a hot dog fold and one of several proteins responsible for phenylacetic acid (PA) degradation in bacteria. Although orthologs of PaaI exist in archaea and eukaryotes, their function has not been determined. Sequence similarity between PaaI, E. coli medium chain acyl-CoA thioesterase II, and human thioesterase III suggests they all belong to the same thioesterase superfamily. The conserved fold present in these thioesterases is referred to as an asymmetric hot dog fold, similar to those of 4-hydroxybenzoyl-CoA thioesterase (4HBT) and the beta-hydroxydecanoyl-ACP dehydratases (FabA/FabZ). 48039 cd03444: Thioesterase II (TEII) is thought to regenerate misprimed nonribosomal peptide synthetases (NRPSs) as well as modular polyketide synthases (PKSs) by hydrolyzing acetyl groups bound to the peptidyl carrier protein (PCP) and acyl carrier protein (ACP) domains, respectively. TEII has two tandem asymmetric hot dog folds that are structurally similar to one found in PaaI thioesterase, 4-hydroxybenzoyl-CoA thioesterase (4HBT) and beta-hydroxydecanoyl-ACP dehydratase and thus, the TEII monomer is equivalent to the homodimeric form of the latter three enzymes. Human TEII is expressed in T cells and has been shown to bind the product of the HIV-1 Nef gene. 48040 cd03445: Thioesterase II (TEII) is thought to regenerate misprimed nonribosomal peptide synthetases (NRPSs) as well as modular polyketide synthases (PKSs) by hydrolyzing acetyl groups bound to the peptidyl carrier protein (PCP) and acyl carrier protein (ACP) domains, respectively. 
TEII has two tandem asymmetric hot dog folds that are structurally similar to one found in PaaI thioesterase, 4-hydroxybenzoyl-CoA thioesterase (4HBT) and beta-hydroxydecanoyl-ACP dehydratase and thus, the TEII monomer is equivalent to the homodimeric form of the latter three enzymes. Human TEII is expressed in T cells and has been shown to bind the product of the HIV-1 Nef gene. 48041 cd03446: MoaC_like Similar to the MaoC (monoamine oxidase C) dehydratase regulatory protein but without the N-terminal PutA domain. This protein family has a hot-dog fold similar to that of (R)-specific enoyl-CoA hydratase, the peroxisomal Hydratase-Dehydrogenase-Epimerase (HDE) protein, and the fatty acid synthase beta subunit. 48042 cd03447: FAS_MaoC, the MaoC-like hot dog fold of the fatty acid synthase, beta subunit. Other enzymes with this fold include MaoC dehydratase, Hydratase-Dehydrogenase-Epimerase protein (HDE), and 17-beta-hydroxysteroid dehydrogenase (HSD). 48043 cd03448: HDE_HSD The R-hydratase-like hot dog fold of the 17-beta-hydroxysteroid dehydrogenase (HSD), and Hydratase-Dehydrogenase-Epimerase (HDE) proteins. Other enzymes with this fold include MaoC dehydratase, and the fatty acid synthase beta subunit. 48044 cd03449: (R)-hydratase [(R)-specific enoyl-CoA hydratase] catalyzes the hydration of trans-2-enoyl CoA to (R)-3-hydroxyacyl-CoA as part of the PHA (polyhydroxyalkanoate) biosynthetic pathway. (R)-hydratase contains a hot-dog fold similar to those of thioesterase II, and beta-hydroxydecanoyl-ACP dehydratase, MaoC dehydratase, Hydratase-Dehydrogenase-Epimerase protein (HDE), and the fatty acid synthase beta subunit. The active site lies within a substrate-binding tunnel formed by the (R)-hydratase homodimer. A subset of the bacterial (R)-hydratases contain a C-terminal phosphotransacetylase (PTA) domain. 
48045 cd03450: NodN (nodulation factor N) contains a single hot dog fold similar to those of the peroxisomal Hydratase-Dehydrogenase-Epimerase (HDE) protein, and the fatty acid synthase beta subunit. Rhizobium and related species form nodules on the roots of their legume hosts, a symbiotic process that requires production of Nod factors, which are signal molecules involved in root hair deformation and meristematic cell division. The nodulation gene products, including NodN, are involved in producing the Nod factors; however, the role played by NodN is unclear. 48046 cd03451: FkbR2 is a Streptomyces hygroscopicus protein with a hot dog fold that belongs to a conserved family of proteins found in prokaryotes and archaea but not in eukaryotes. FkbR2 has sequence similarity to (R)-specific enoyl-CoA hydratase, the peroxisomal Hydratase-Dehydrogenase-Epimerase (HDE) protein, and the fatty acid synthase beta subunit. The function of FkbR2 is unknown. 48047 cd03452: MaoC_C The C-terminal hot dog fold of the MaoC (monoamine oxidase C) dehydratase regulatory protein. Orthologs of MaoC include PaaZ [Escherichia coli] and PaaN [Pseudomonas putida], which are putative ring-opening enzymes involved in phenylacetic acid degradation. The C-terminal domain of MaoC has sequence similarity to (R)-specific enoyl-CoA hydratase, Hydratase-Dehydrogenase-Epimerase (HDE) protein, and the fatty acid synthase beta subunit. MaoC also has an N-terminal PutA domain like that found in the E. coli PutA proline dehydrogenase and other members of the aldehyde dehydrogenase family. 48048 cd03453: SAV4209_like. Similar in sequence to the Streptomyces avermitilis SAV4209 protein, with a hot dog fold that is similar to those of (R)-specific enoyl-CoA hydratase, the peroxisomal Hydratase-Dehydrogenase-Epimerase (HDE) protein, and the fatty acid synthase beta subunit. 48049 cd03454: YdeM is a Bacillus subtilis protein that belongs to a family of prokaryotic proteins of unknown function. 
YdeM has sequence similarity to the hot-dog fold of (R)-specific enoyl-CoA hydratase. Other enzymes with this fold include the peroxisomal Hydratase-Dehydrogenase-Epimerase (HDE) protein, and the fatty acid synthase beta subunit. 48050 cd03455: SAV4209 is a Streptomyces avermitilis protein with a hot dog fold that is similar to those of (R)-specific enoyl-CoA hydratase, the peroxisomal Hydratase-Dehydrogenase-Epimerase (HDE) protein, and the fatty acid synthase beta subunit. The alpha- and gamma-proteobacterial members of this CD have, in addition to a hot dog fold, an N-terminal extension. 48051 cd00546: Quinol:fumarate reductase (QFR) Type D subfamily, 15kD hydrophobic subunit C; QFR couples the reduction of fumarate to succinate to the oxidation of quinol to quinone, the opposite reaction to that catalyzed by the related protein, succinate:quinone oxidoreductase (SQR). QFRs oxidize low potential quinols such as menaquinol and are involved in anaerobic respiration with fumarate as the terminal electron acceptor. SQR and QFR share a common subunit arrangement, composed of a flavoprotein catalytic subunit, an iron-sulfur protein and one or two hydrophobic transmembrane subunits. Members of this subfamily are classified as Type D as they contain two transmembrane subunits (C and D) and no heme groups. The structural arrangement allows efficient electron transfer between the catalytic subunit, through iron-sulfur centers, and the transmembrane subunit containing the electron donor (quinol). The quinone binding site resides in the transmembrane subunits. 48052 cd00547: Quinol:fumarate reductase (QFR) Type D subfamily, 13kD hydrophobic subunit D; QFR couples the reduction of fumarate to succinate to the oxidation of quinol to quinone, the opposite reaction to that catalyzed by the related protein, succinate:quinone oxidoreductase (SQR). 
QFRs oxidize low potential quinols such as menaquinol and are involved in anaerobic respiration with fumarate as the terminal electron acceptor. SQR and QFR share a common subunit arrangement, composed of a flavoprotein catalytic subunit, an iron-sulfur protein and one or two hydrophobic transmembrane subunits. Members of this subfamily are classified as Type D as they contain two transmembrane subunits (C and D) and no heme groups. The structural arrangement allows efficient electron transfer between the catalytic subunit, through iron-sulfur centers, and the transmembrane subunit containing the electron donor (quinol). The quinone binding site resides in the transmembrane subunits. 48053 cd00581: Quinol:fumarate reductase (QFR) Type B subfamily, transmembrane subunit; QFR couples the reduction of fumarate to succinate to the oxidation of quinol to quinone, the opposite reaction to that catalyzed by the related protein, succinate:quinone oxidoreductase (SQR). QFRs oxidize low potential quinols such as menaquinol and rhodoquinol and are involved in anaerobic respiration with fumarate as the terminal electron acceptor. SQR and QFR share a common subunit arrangement, composed of a flavoprotein catalytic subunit, an iron-sulfur protein and one or two hydrophobic transmembrane subunits. Members of this subfamily are classified as Type B as they contain one transmembrane subunit and two heme groups. The heme and quinone binding sites reside in the transmembrane subunit. The structural arrangement allows efficient electron transfer between the catalytic subunit, through iron-sulfur centers, and the transmembrane subunit containing the electron donor (quinol). The Type B enzyme from Desulfovibrio gigas is capable of fumarate reduction and succinate oxidation. 
48054 cd03493: Succinate:quinone oxidoreductase (SQR) and Quinol:fumarate reductase (QFR) family, transmembrane subunits; SQR catalyzes the oxidation of succinate to fumarate coupled to the reduction of quinone to quinol, while QFR catalyzes the reverse reaction. SQR, also called succinate dehydrogenase or Complex II, is part of the citric acid cycle and the aerobic respiratory chain, while QFR is involved in anaerobic respiration with fumarate as the terminal electron acceptor. SQRs may reduce either high or low potential quinones while QFRs oxidize only low potential quinols. SQR and QFR share a common subunit arrangement, composed of a flavoprotein catalytic subunit, an iron-sulfur protein and one or two hydrophobic transmembrane subunits. The structural arrangement allows efficient electron transfer between the catalytic subunit, through iron-sulfur centers, and the transmembrane subunit(s) containing the electron donor/acceptor (quinol or quinone). The reversible reduction of quinone is an essential feature of respiration, allowing the transfer of electrons between respiratory complexes. SQRs and QFRs can be classified into five types (A-E) according to the number of their hydrophobic subunits and heme groups. This classification is consistent with the characteristics and phylogeny of the catalytic and iron-sulfur subunits. Type E proteins, e.g. non-classical archaeal SQRs, contain atypical transmembrane subunits and are not included in this hierarchy. The heme and quinone binding sites reside in the transmembrane subunits. Although succinate oxidation and fumarate reduction are carried out by separate enzymes in most organisms, some bifunctional enzymes that exhibit both SQR and QFR activities exist. 48055 cd03494: Succinate:quinone oxidoreductase (SQR) Type C subfamily, Succinate dehydrogenase D (SdhD) subunit; SQR catalyzes the oxidation of succinate to fumarate coupled to the reduction of quinone to quinol. E. 
coli SQR, a member of this subfamily, reduces the high potential quinone, ubiquinone. SQR is also called succinate dehydrogenase or Complex II, and is part of the citric acid cycle and the aerobic respiratory chain. SQR is composed of a flavoprotein catalytic subunit, an iron-sulfur protein and one or two hydrophobic transmembrane subunits. Members of this subfamily are classified as Type C SQRs because they contain two transmembrane subunits and one heme group. SdhD and SdhC are the two transmembrane proteins of bacterial SQRs. They contain heme and quinone binding sites. The two-electron oxidation of succinate in the flavoprotein active site is coupled to the two-electron reduction of quinone in the membrane anchor subunits via electron transport through FAD and three iron-sulfur centers. The reversible reduction of quinone is an essential feature of respiration, allowing transfer of electrons between respiratory complexes. 48056 cd03495: Succinate:quinone oxidoreductase (SQR) Type C subfamily, Succinate dehydrogenase D (SdhD) subunit-like; composed of predominantly uncharacterized bacterial proteins with similarity to the E. coli SdhD subunit. One characterized protein is the respiratory Complex II SdhD subunit of the only eukaryotic member, Reclinomonas americana. SQR catalyzes the oxidation of succinate to fumarate coupled to the reduction of quinone to quinol. It is also called succinate dehydrogenase or Complex II, and is part of the citric acid cycle and the aerobic respiratory chain. SQR is composed of a flavoprotein catalytic subunit, an iron-sulfur protein and one or two hydrophobic transmembrane subunits. E. coli SQR is classified as a Type C SQR because it contains two transmembrane subunits and one heme group. The SdhD and SdhC subunits are membrane anchor subunits containing heme and quinone binding sites. 
The two-electron oxidation of succinate in the flavoprotein active site is coupled to the two-electron reduction of quinone in the membrane anchor subunits via electron transport through FAD and three iron-sulfur centers. The reversible reduction of quinone is an essential feature of respiration, allowing transfer of electrons between respiratory complexes. 48057 cd03496: SQR catalyzes the oxidation of succinate to fumarate coupled to the reduction of quinone to quinol. Eukaryotic SQRs reduce high potential quinones such as ubiquinone. SQR is also called succinate dehydrogenase or Complex II, and is part of the citric acid cycle and the aerobic respiratory chain. SQR is composed of a flavoprotein catalytic subunit, an iron-sulfur protein and one or two hydrophobic transmembrane subunits. Members of this subfamily are classified as Type C SQRs because they contain two transmembrane subunits and one heme group. CybS and CybL are the two transmembrane proteins of eukaryotic SQRs. They contain heme and quinone binding sites. CybS is the eukaryotic homolog of the bacterial SdhD subunit. The two-electron oxidation of succinate in the flavoprotein active site is coupled to the two-electron reduction of quinone in the transmembrane subunits via electron transport through FAD and three iron-sulfur centers. The reversible reduction of quinone is an essential feature of respiration, allowing transfer of electrons between respiratory complexes. Mutations in human Complex II result in various physiological disorders including hereditary paraganglioma and pheochromocytoma tumors. The gene encoding for the SdhD subunit is classified as a tumor suppressor gene. 48058 cd03497: Succinate:quinone oxidoreductase (SQR) Type B subfamily 1, transmembrane subunit; composed of proteins similar to Bacillus subtilis SQR. SQR catalyzes the oxidation of succinate to fumarate coupled to the reduction of quinone to quinol. Bacillus subtilis SQR reduces low potential quinones such as menaquinone. 
SQR is also called succinate dehydrogenase (Sdh) or Complex II and is part of the citric acid cycle and the aerobic respiratory chain. SQR is composed of a flavoprotein catalytic subunit, an iron-sulfur protein and one or two hydrophobic transmembrane subunits. Members of this subfamily are classified as Type B as they contain one transmembrane subunit and two heme groups. The heme and quinone binding sites reside on the transmembrane subunit. The transmembrane subunit of Bacillus subtilis SQR is also called Sdh cytochrome b558 subunit. The structural arrangement allows efficient electron transfer between the catalytic subunit, through iron-sulfur centers, and the transmembrane subunit containing the electron acceptor (quinone). The reversible reduction of quinone is an essential feature of respiration, allowing transfer of electrons between respiratory complexes. 48059 cd03498: Succinate:quinone oxidoreductase (SQR)-like Type B subfamily 2, transmembrane subunit; composed of proteins with similarity to the SQRs of Geobacter metallireducens and Corynebacterium glutamicum. SQR catalyzes the oxidation of succinate to fumarate coupled to the reduction of quinone to quinol. C. glutamicum SQR reduces low potential quinones such as menaquinone. SQR is also called succinate dehydrogenase (Sdh) or Complex II and is part of the citric acid cycle and the aerobic respiratory chain. SQR is composed of a flavoprotein catalytic subunit, an iron-sulfur protein and one or two hydrophobic transmembrane subunits. Members of this subfamily are classified as Type B as they contain one transmembrane subunit and two heme groups. The heme and quinone binding sites reside in the transmembrane subunit. The transmembrane subunit of members of this subfamily is also called Sdh cytochrome b558 subunit based on the Bacillus subtilis protein. 
The structural arrangement allows efficient electron transfer between the catalytic subunit, through iron-sulfur centers, and the transmembrane subunit containing the electron acceptor (quinone). The reversible reduction of quinone is an essential feature of respiration, allowing transfer of electrons between respiratory complexes. Proteins in this subfamily from G. metallireducens and G. sulfurreducens are bifunctional enzymes with SQR and QFR activities. 48060 cd03499: Succinate:quinone oxidoreductase (SQR) Type C subfamily, Succinate dehydrogenase C (SdhC) subunit; composed of bacterial SdhC and eukaryotic large cytochrome b binding (CybL) proteins. SQR catalyzes the oxidation of succinate to fumarate coupled to the reduction of quinone to quinol. Members of this family reduce high potential quinones such as ubiquinone. SQR is also called succinate dehydrogenase or Complex II, and is part of the citric acid cycle and the aerobic respiratory chain. SQR is composed of a flavoprotein catalytic subunit, an iron-sulfur protein and one or two hydrophobic transmembrane subunits. Proteins in this subfamily are classified as Type C SQRs because they contain two transmembrane subunits and one heme group. The heme and quinone binding sites reside in the transmembrane subunits. The SdhC or CybL protein is one of the two transmembrane subunits of bacterial and eukaryotic SQRs. The two-electron oxidation of succinate in the flavoprotein active site is coupled to the two-electron reduction of quinone in the membrane anchor subunits via electron transport through FAD and three iron-sulfur centers. The reversible reduction of quinone is an essential feature of respiration, allowing transfer of electrons between respiratory complexes. 48061 cd03500: Succinate:quinone oxidoreductase (SQR) Type A subfamily, Succinate dehydrogenase D (SdhD)-like subunit; SQR catalyzes the oxidation of succinate to fumarate coupled to the reduction of quinone to quinol. 
Members of this subfamily reduce low potential quinones such as menaquinone and thermoplasmaquinone. SQR is also called succinate dehydrogenase or Complex II, and is part of the citric acid cycle and the aerobic respiratory chain. SQR is composed of a flavoprotein catalytic subunit, an iron-sulfur protein and one or two hydrophobic transmembrane subunits. Members of this subfamily are similar to the Thermoplasma acidophilum SQR and are classified as Type A because they contain two transmembrane subunits as well as two heme groups. Although there are no structures available for this subfamily, the presence of two hemes has been proven spectroscopically for T. acidophilum. The two membrane anchor subunits are similar to the SdhD and SdhC subunits of bacterial SQRs, which contain heme and quinone binding sites. The two-electron oxidation of succinate in the flavoprotein active site is coupled to the two-electron reduction of quinone in the membrane anchor subunits via electron transport through FAD and three iron-sulfur centers. The reversible reduction of quinone is an essential feature of respiration, allowing transfer of electrons between respiratory complexes. 48062 cd03501: Succinate:quinone oxidoreductase (SQR) Type A subfamily, Succinate dehydrogenase C (SdhC)-like subunit; SQR catalyzes the oxidation of succinate to fumarate coupled to the reduction of quinone to quinol. Members of this subfamily reduce low potential quinones such as menaquinone and thermoplasmaquinone. SQR is also called succinate dehydrogenase or Complex II, and is part of the citric acid cycle and the aerobic respiratory chain. SQR is composed of a flavoprotein catalytic subunit, an iron-sulfur protein and one or two hydrophobic transmembrane subunits. Members of this subfamily are similar to the Thermoplasma acidophilum SQR and are classified as Type A because they contain two transmembrane subunits as well as two heme groups. 
Although there are no structures available for this subfamily, the presence of two hemes has been proven spectroscopically for T. acidophilum. The two membrane anchor subunits are similar to the SdhD and SdhC subunits of bacterial SQRs, which contain heme and quinone binding sites. The two-electron oxidation of succinate in the flavoprotein active site is coupled to the two-electron reduction of quinone in the membrane anchor subunits via electron transport through FAD and three iron-sulfur centers. The reversible reduction of quinone is an essential feature of respiration, allowing transfer of electrons between respiratory complexes. 48063 cd03526: Succinate:quinone oxidoreductase (SQR) and Quinol:fumarate reductase (QFR) Type B subfamily, transmembrane subunit; SQR catalyzes the oxidation of succinate to fumarate coupled to the reduction of quinone to quinol, while QFR catalyzes the reverse reaction. SQR, also called succinate dehydrogenase or Complex II, is part of the citric acid cycle and the aerobic respiratory chain, while QFR is involved in anaerobic respiration with fumarate as the terminal electron acceptor. SQR and QFR share a common subunit arrangement, composed of a flavoprotein catalytic subunit, an iron-sulfur protein and one or two hydrophobic transmembrane subunits. Type B proteins contain one transmembrane subunit and two heme groups. The heme and quinone binding sites reside in the transmembrane subunits. The structural arrangement allows efficient electron transfer between the catalytic subunit, through iron-sulfur centers, and the transmembrane subunit containing the electron donor/acceptor (quinol or quinone). The reversible reduction of quinone is an essential feature of respiration, allowing the transfer of electrons between respiratory complexes. 
48064 cd00660: Topoisomer_IB_N: N-terminal DNA binding fragment found in eukaryotic DNA topoisomerase (topo) IB proteins similar to the monomeric yeast and human topo I and heterodimeric topo I from Leishmania donovani. Topo I enzymes are divided into: topo type IA (bacterial) and type IB (eukaryotic). Topo I relaxes superhelical tension in duplex DNA by creating a single-strand nick; the broken strand can then rotate around the unbroken strand to remove DNA supercoils, and the nick is religated, liberating topo I. These enzymes regulate the topological changes that accompany DNA replication, transcription and other nuclear processes. Human topo I is the target of a diverse set of anticancer drugs including camptothecins (CPTs). CPTs bind to the topo I-DNA complex and inhibit religation of the single-strand nick, resulting in the accumulation of topo I-DNA adducts. In addition to differences in structure and some biochemical properties, Trypanosomatid parasite topos I differ from human topo I in their sensitivity to CPTs and other classical topo I inhibitors. Trypanosomatid topos I play putative roles in organizing the kinetoplast DNA network unique to these parasites. This family may represent more than one structural domain. 48065 cd03488: Topoisomer_IB_N_htopoI_like: N-terminal DNA binding fragment found in eukaryotic DNA topoisomerase (topo) IB proteins similar to the monomeric yeast and human topo I. Topo I enzymes are divided into: topo type IA (bacterial) and type IB (eukaryotic). Topo I relaxes superhelical tension in duplex DNA by creating a single-strand nick; the broken strand can then rotate around the unbroken strand to remove DNA supercoils, and the nick is religated, liberating topo I. These enzymes regulate the topological changes that accompany DNA replication, transcription and other nuclear processes. Human topo I is the target of a diverse set of anticancer drugs including camptothecins (CPTs).
CPTs bind to the topo I-DNA complex and inhibit religation of the single-strand nick, resulting in the accumulation of topo I-DNA adducts. This family may represent more than one structural domain. 48066 cd03489: Topoisomer_IB_N_LdtopoI_like: N-terminal DNA binding fragment found in eukaryotic DNA topoisomerase (topo) IB proteins similar to the heterodimeric topo I from Leishmania donovani. Topo I enzymes are divided into: topo type IA (bacterial) and type IB (eukaryotic). Topo I relaxes superhelical tension in duplex DNA by creating a single-strand nick; the broken strand can then rotate around the unbroken strand to remove DNA supercoils, and the nick is religated, liberating topo I. These enzymes regulate the topological changes that accompany DNA replication, transcription and other nuclear processes. Human topo I is the target of a diverse set of anticancer drugs including camptothecins (CPTs). CPTs bind to the topo I-DNA complex and inhibit religation of the single-strand nick, resulting in the accumulation of topo I-DNA adducts. In addition to differences in structure and some biochemical properties, Trypanosomatid parasite topos I differ from human topo I in their sensitivity to CPTs and other classical topo I inhibitors. Trypanosomatid topos I play putative roles in organizing the kinetoplast DNA network unique to these parasites. This family may represent more than one structural domain. 48067 cd03490: Topoisomer_IB_N_1: A subgroup of the N-terminal DNA binding fragment found in eukaryotic DNA topoisomerase (topo) IB. Topo IB proteins include the monomeric yeast and human topo I and heterodimeric topo I from Leishmania donovani. Topo I enzymes are divided into: topo type IA (bacterial) and type IB (eukaryotic). Topo I relaxes superhelical tension in duplex DNA by creating a single-strand nick; the broken strand can then rotate around the unbroken strand to remove DNA supercoils, and the nick is religated, liberating topo I.
These enzymes regulate the topological changes that accompany DNA replication, transcription and other nuclear processes. Human topo I is the target of a diverse set of anticancer drugs including camptothecins (CPTs). CPTs bind to the topo I-DNA complex and inhibit religation of the single-strand nick, resulting in the accumulation of topo I-DNA adducts. In addition to differences in structure and some biochemical properties, Trypanosomatid parasite topos I differ from human topo I in their sensitivity to CPTs and other classical topo I inhibitors. Trypanosomatid topos I have putative roles in organizing the kinetoplast DNA network unique to these parasites. This family may represent more than one structural domain. 48068 cd00316: The nitrogenase enzyme system catalyzes the ATP-dependent reduction of dinitrogen to ammonia. This group contains both alpha and beta subunits of component 1 of the three known genetically distinct types of nitrogenase systems: a molybdenum-dependent nitrogenase (Mo-nitrogenase), a vanadium-dependent nitrogenase (V-nitrogenase), and an iron-only nitrogenase (Fe-nitrogenase), as well as both subunits of protochlorophyllide (Pchlide) reductase and chlorophyllide (chlide) reductase. The nitrogenase systems consist of component 1 (MoFe protein, VFe protein or FeFe protein, respectively) and component 2 (Fe protein). The most widespread and best characterized nitrogenase is the Mo-nitrogenase. MoFe is an alpha2beta2 tetramer; the alternative nitrogenases are alpha2beta2delta2 hexamers whose alpha and beta subunits are similar to the alpha and beta subunits of MoFe. For MoFe, each alphabeta pair contains one P-cluster (at the alphabeta interface) and one molecule of iron molybdenum cofactor (FeMoco) contained within the alpha subunit. The Fe protein contains a single [4Fe-4S] cluster from which electrons are transferred to the P-cluster of MoFe and, in turn, to FeMoco at the site of substrate reduction.
The V-nitrogenase requires an iron-vanadium cofactor (FeVco); the iron-only nitrogenase requires an iron-only cofactor (FeFeco). These cofactors are analogous to FeMoco. The V-nitrogenase has P clusters identical to those of MoFe. Pchlide reductase and chlide reductase participate in the Mg-branch of the tetrapyrrole biosynthetic pathway. Pchlide reductase catalyzes the reduction of the D-ring of Pchlide during the synthesis of chlorophylls (Chl) and bacteriochlorophylls (BChl). Chlide-a reductase catalyzes the reduction of the B-ring of Chlide-a during the synthesis of BChl-a. The Pchlide reductase NB complex is an N2B2 heterotetramer resembling the nitrogenase MoFe protein; the N and B proteins are homologous to the MoFe alpha and beta subunits, respectively. The NB complex may serve as a catalytic site for Pchlide reduction and the ZY complex as a site of chlide reduction, similar to MoFe for nitrogen reduction. 48069 cd01965: Nitrogenase_MoFe_beta_like: Nitrogenase MoFe protein, beta subunit_like. The nitrogenase enzyme catalyzes the ATP-dependent reduction of dinitrogen (N2) to ammonia. This group contains the beta subunits of component 1 of the three known genetically distinct types of nitrogenase systems: a molybdenum-dependent nitrogenase (Mo-nitrogenase), a vanadium-dependent nitrogenase (V-nitrogenase), and an iron-only nitrogenase (Fe-nitrogenase). These nitrogenase systems consist of component 1 (MoFe protein, VFe protein or FeFe protein, respectively) and component 2 (Fe protein). The most widespread and best characterized of these systems is the Mo-nitrogenase. MoFe is an alpha2beta2 tetramer; the alternative nitrogenases are alpha2beta2delta2 hexamers having alpha and beta subunits similar to the alpha and beta subunits of MoFe. For MoFe, each alphabeta pair contains one P-cluster (at the alphabeta interface) and one molecule of iron molybdenum cofactor (FeMoco) contained within the alpha subunit.
The Fe protein contains a single [4Fe-4S] cluster from which electrons are transferred to the P-cluster of MoFe and, in turn, to FeMoco, the site of substrate reduction. The V-nitrogenase requires an iron-vanadium cofactor (FeVco); the iron-only nitrogenase requires an iron-only cofactor (FeFeco). These cofactors are analogous to FeMoco. The V-nitrogenase has P clusters identical to those of MoFe. In addition to N2, nitrogenase also catalyzes the reduction of a variety of other substrates such as acetylene. The V-nitrogenase differs from the Mo-nitrogenase in that it produces free hydrazine as a minor product during N2 reduction and ethane as a minor product during acetylene reduction. 48070 cd01966: Nitrogenase_nifN1: A subgroup of the NifN subunit of the NifEN complex: NifN forms an alpha2beta2 tetramer with NifE. NifN and NifE are structurally homologous to nitrogenase MoFe protein beta and alpha subunits, respectively. NifEN participates in the synthesis of the iron-molybdenum cofactor (FeMoco) of the MoFe protein. NifB-co (an iron- and sulfur-containing precursor of FeMoco) from NifB is transferred to the NifEN complex, where it is further processed to FeMoco. The NifEN-bound precursor of FeMoco has been identified as a molybdenum-free, iron- and sulfur-containing analog of FeMoco. It has been suggested that this NifEN-bound precursor also acts as a cofactor precursor in nitrogenase systems which require a cofactor other than FeMoco, i.e. iron-vanadium cofactor (FeVco) or iron-only cofactor (FeFeco). 48071 cd01967: Nitrogenase_MoFe_alpha_like: Nitrogenase MoFe protein, alpha subunit_like. The nitrogenase enzyme catalyzes the ATP-dependent reduction of dinitrogen to ammonia. Three genetically distinct types of nitrogenase systems are known to exist: a molybdenum-dependent nitrogenase (Mo-nitrogenase), a vanadium-dependent nitrogenase (V-nitrogenase), and an iron-only nitrogenase (Fe-nitrogenase).
These nitrogenase systems consist of component 1 (MoFe protein, VFe protein or FeFe protein, respectively) and component 2 (Fe protein). This group contains the alpha subunit of component 1 of all three different forms. The most widespread and best characterized of these systems is the Mo-nitrogenase. MoFe is an alpha2beta2 tetramer; the alternative nitrogenases are alpha2beta2delta2 hexamers having alpha and beta subunits similar to the alpha and beta subunits of MoFe. The role of the delta subunit is unknown. For MoFe, each alphabeta pair of subunits contains one P-cluster (located at the alphabeta interface) and one molecule of iron molybdenum cofactor (FeMoco) contained within the alpha subunit. The Fe protein is a homodimer which contains a single [4Fe-4S] cluster from which electrons are transferred to the P-cluster of MoFe and, in turn, to FeMoco, the site of substrate reduction. The V-nitrogenase requires an iron-vanadium cofactor (FeVco); the iron-only nitrogenase requires an iron-only cofactor (FeFeco). These cofactors are analogous to FeMoco. The V-nitrogenase has P clusters identical to those of MoFe. In addition to N2, nitrogenase also catalyzes the reduction of a variety of other substrates such as acetylene. The V-nitrogenase differs from the Mo-nitrogenase in that it produces free hydrazine as a minor product during dinitrogen reduction and ethane as a minor product during acetylene reduction. 48072 cd01968: Nitrogenase_NifE_I: a subgroup of the NifE subunit of the NifEN complex: NifE forms an alpha2beta2 tetramer with NifN. NifE and NifN are structurally homologous to nitrogenase MoFe protein alpha and beta subunits, respectively. NifEN participates in the synthesis of the iron-molybdenum cofactor (FeMoco) of the MoFe protein. NifB-co (an iron- and sulfur-containing precursor of FeMoco) from NifB is transferred to the NifEN complex, where it is further processed to FeMoco.
The NifEN-bound precursor of FeMoco has been identified as a molybdenum-free, iron- and sulfur-containing analog of FeMoco. It has been suggested that this NifEN-bound precursor also acts as a cofactor precursor in nitrogenase systems which require a cofactor other than FeMoco, i.e. iron-vanadium cofactor (FeVco) or iron-only cofactor (FeFeco). 48073 cd01971: Nitrogenase_vnfN_like: VnfN subunit of the VnfEN complex-like. In addition to VnfN, this group contains a subset of the beta subunit of the nitrogenase MoFe protein and NifN-like proteins. The nitrogenase enzyme system catalyzes the ATP-dependent reduction of dinitrogen to ammonia. NifEN participates in the synthesis of the iron-molybdenum cofactor (FeMoco) of the MoFe protein of the molybdenum (Mo)-nitrogenase. NifB-co (an iron- and sulfur-containing precursor of FeMoco) from NifB is transferred to NifEN, where it is further processed to FeMoco. VnfEN may similarly be a scaffolding protein for the iron-vanadium cofactor (FeVco) of the vanadium-dependent (V)-nitrogenase. NifE and NifN are essential for the Mo-nitrogenase, whereas VnfE and VnfN are not essential for the V-nitrogenase. NifE and NifN can substitute when the vnfEN genes are inactivated. 48074 cd01972: Nitrogenase_VnfE_like: VnfE subunit of the VnfEN complex_like. In addition to VnfE, this group contains a subset of the alpha subunit of the nitrogenase MoFe protein and NifE-like proteins. The nitrogenase enzyme system catalyzes the ATP-dependent reduction of dinitrogen to ammonia. NifEN participates in the synthesis of the iron-molybdenum cofactor (FeMoco) of the MoFe protein of the molybdenum (Mo)-nitrogenase. NifB-co (an iron- and sulfur-containing precursor of FeMoco) from NifB is transferred to NifEN, where it is further processed to FeMoco. VnfEN may similarly be a scaffolding protein for the iron-vanadium cofactor (FeVco) of the vanadium-dependent (V)-nitrogenase.
NifE and NifN are essential for the Mo-nitrogenase, whereas VnfE and VnfN are not essential for the V-nitrogenase. NifE and NifN can substitute when the vnfEN genes are inactivated. 48075 cd01973: Nitrogenase_VFe_beta -like: Nitrogenase VFe protein, beta subunit like. This group contains proteins similar to the beta subunits of the VFe protein of the vanadium-dependent (V-) nitrogenase. Nitrogenase catalyzes the ATP-dependent reduction of dinitrogen (N2) to ammonia. In addition to V-nitrogenase there is a molybdenum (Mo)-dependent nitrogenase and an iron-only (Fe-) nitrogenase. The Mo-nitrogenase is the most widespread and best characterized of these systems. These systems consist of component 1 (VFe protein, FeFe protein or MoFe protein, respectively) and component 2 (Fe protein). MoFe is an alpha2beta2 tetramer; V- and Fe-nitrogenases are alpha2beta2delta2 hexamers. The alpha and beta subunits of VFe and FeFe are similar to the alpha and beta subunits of MoFe. For MoFe, each alphabeta pair contains one P-cluster (at the alphabeta interface) and one molecule of iron molybdenum cofactor (FeMoco) contained within the alpha subunit. The Fe protein, which has a practically identical structure in all three systems, contains a single [4Fe-4S] cluster. Electrons are transferred from the [4Fe-4S] cluster of the Fe protein to the P-cluster of MoFe and, in turn, to FeMoco, the site of substrate reduction. The V-nitrogenase requires an iron-vanadium cofactor (FeVco); the iron-only nitrogenase requires an iron-only cofactor (FeFeco). These cofactors are analogous to FeMoco. The V-nitrogenase has P clusters identical to those of MoFe. In addition to N2, nitrogenase also catalyzes the reduction of a variety of other substrates such as acetylene. The V-nitrogenase differs from the Mo-nitrogenase in that it produces free hydrazine as a minor product during dinitrogen reduction and ethane as a minor product during acetylene reduction.
48076 cd01974: Nitrogenase_MoFe_beta: Nitrogenase MoFe protein, beta subunit. The nitrogenase enzyme catalyzes the ATP-dependent reduction of dinitrogen to ammonia. The molybdenum (Mo-) nitrogenase is the most widespread and best characterized of the nitrogenase systems. Mo-nitrogenase consists of the MoFe protein (component 1) and the Fe protein (component 2). MoFe is an alpha2beta2 tetramer. This group contains the beta subunit of the MoFe protein. Each alphabeta pair of MoFe contains one P-cluster (at the alphabeta interface) and one molecule of iron molybdenum cofactor (FeMoco) contained within the alpha subunit. The Fe protein contains a single [4Fe-4S] cluster. Electrons are transferred from the [4Fe-4S] cluster of the Fe protein to the P-cluster of MoFe and, in turn, to FeMoco, the site of substrate reduction. 48077 cd01976: Nitrogenase_MoFe_alpha_II: Nitrogenase MoFe protein, alpha subunit. A group of proteins similar to the alpha subunit of the MoFe protein of the molybdenum (Mo-) nitrogenase. The nitrogenase enzyme catalyzes the ATP-dependent reduction of dinitrogen to ammonia. The Mo-nitrogenase is the most widespread and best characterized of the nitrogenase systems. Mo-nitrogenase consists of the MoFe protein (component 1) and the Fe protein (component 2). MoFe is an alpha2beta2 tetramer. Each alphabeta pair of MoFe contains one P-cluster (at the alphabeta interface) and one molecule of iron molybdenum cofactor (FeMoco) contained within the alpha subunit. The Fe protein contains a single [4Fe-4S] cluster. Electrons are transferred from the [4Fe-4S] cluster of the Fe protein to the P-cluster of MoFe and, in turn, to FeMoco, the site of substrate reduction. 48078 cd01977: Nitrogenase_VFe_alpha -like: Nitrogenase VFe protein, alpha subunit like.
This group contains proteins similar to the alpha subunits of the VFe protein of the vanadium-dependent (V-) nitrogenase and the FeFe protein of the iron-only (Fe-) nitrogenase. Nitrogenase catalyzes the ATP-dependent reduction of dinitrogen (N2) to ammonia. In addition to V- and Fe-nitrogenases there is a molybdenum (Mo)-dependent nitrogenase, which is the most widespread and best characterized of these systems. These systems consist of component 1 (VFe protein, FeFe protein or MoFe protein, respectively) and component 2 (Fe protein). MoFe is an alpha2beta2 tetramer; V- and Fe-nitrogenases are alpha2beta2delta2 hexamers. The alpha and beta subunits of VFe and FeFe are similar to the alpha and beta subunits of MoFe. For MoFe, each alphabeta pair contains one P-cluster (at the alphabeta interface) and one molecule of iron molybdenum cofactor (FeMoco) contained within the alpha subunit. The Fe protein, which has a practically identical structure in all three systems, contains a single [4Fe-4S] cluster. Electrons are transferred from the [4Fe-4S] cluster of the Fe protein to the P-cluster of MoFe and, in turn, to FeMoco, the site of substrate reduction. The V-nitrogenase requires an iron-vanadium cofactor (FeVco); the iron-only nitrogenase requires an iron-only cofactor (FeFeco). These cofactors are analogous to FeMoco. The V-nitrogenase has P clusters identical to those of MoFe. In addition to N2, nitrogenase also catalyzes the reduction of a variety of other substrates such as acetylene. The V-nitrogenase differs from the Mo-nitrogenase in that it produces free hydrazine as a minor product during dinitrogen reduction and ethane as a minor product during acetylene reduction. 48079 cd01979: Pchlide_reductase_N: N protein of the NB protein complex of Protochlorophyllide (Pchlide)_reductase. Pchlide reductase catalyzes the reductive formation of chlorophyllide (chlide) from protochlorophyllide (pchlide) during biosynthesis of chlorophylls and bacteriochlorophylls.
This group contains both the light-independent Pchlide reductase (DPOR) and light-dependent Pchlide reductase (LPOR). Angiosperms contain only LPOR; cyanobacteria, algae and gymnosperms contain both DPOR and LPOR; primitive anoxygenic photosynthetic bacteria contain only DPOR. NB is structurally similar to the MoFe protein of nitrogenase, forming an N2B2 heterotetramer. N and B are homologous to the MoFe alpha and beta subunits, respectively. Also in common with nitrogenase, in vitro DPOR activity requires ATP hydrolysis and dithionite or ferredoxin as electron donor. The NB protein complex may serve as a catalytic site for Pchlide reduction, similar to MoFe for nitrogen reduction. 48080 cd01980: Chlide_reductase_Y: Y subunit of chlorophyllide (chlide) reductase (BchY). Chlide reductase participates in photosynthetic pigment synthesis, playing a role in the conversion of chlorophylls (Chl) into bacteriochlorophylls (BChl). Chlide reductase catalyzes the reduction of the B-ring of the tetrapyrrole. Chlide reductase is a three-subunit enzyme (subunits are designated BchX, BchY and BchZ). The similarity between these three subunits and the subunits for nitrogenase suggests that BchX serves as an electron donor for the BchY-BchZ catalytic subunits. 48081 cd01981: Pchlide_reductase_B: B protein of the NB protein complex of Protochlorophyllide (Pchlide)_reductase. Pchlide reductase catalyzes the reductive formation of chlorophyllide (chlide) from protochlorophyllide (pchlide) during biosynthesis of chlorophylls and bacteriochlorophylls. This group contains both the light-independent Pchlide reductase (DPOR) and light-dependent Pchlide reductase (LPOR). Angiosperms contain only LPOR; cyanobacteria, algae and gymnosperms contain both DPOR and LPOR; primitive anoxygenic photosynthetic bacteria contain only DPOR. NB is structurally similar to the MoFe protein of nitrogenase, forming an N2B2 heterotetramer. N and B are homologous to the MoFe alpha and beta subunits, respectively.
Also in common with nitrogenase, in vitro DPOR activity requires ATP hydrolysis and dithionite or ferredoxin as electron donor. The NB protein complex may serve as a catalytic site for Pchlide reduction, similar to MoFe for nitrogen reduction. 48082 cd01982: Chlide_reductase_Z: Z subunit of chlorophyllide (chlide) reductase (BchZ). Chlide reductase participates in photosynthetic pigment synthesis, playing a role in the conversion of chlorophylls (Chl) into bacteriochlorophylls (BChl). Chlide reductase catalyzes the reduction of the B-ring of the tetrapyrrole. Chlide reductase is a three-subunit enzyme (subunits are designated BchX, BchY and BchZ). The similarity between these three subunits and the subunits for nitrogenase suggests that BchX serves as an electron donor for the BchY-BchZ catalytic subunits. 48083 cd03466: Nitrogenase_nifN_2: A subgroup of the NifN subunit of the NifEN complex: NifN forms an alpha2beta2 tetramer with NifE. NifN and NifE are structurally homologous to nitrogenase MoFe protein beta and alpha subunits, respectively. NifEN participates in the synthesis of the iron-molybdenum cofactor (FeMoco) of the MoFe protein. NifB-co (an iron- and sulfur-containing precursor of FeMoco) from NifB is transferred to the NifEN complex, where it is further processed to FeMoco. The NifEN-bound precursor of FeMoco has been identified as a molybdenum-free, iron- and sulfur-containing analog of FeMoco. It has been suggested that this NifEN-bound precursor also acts as a cofactor precursor in nitrogenase systems which require a cofactor other than FeMoco, i.e. iron-vanadium cofactor (FeVco) or iron-only cofactor (FeFeco). This group also contains the Clostridium fused NifN-NifB protein.
48084 cd01610: PAP2_like proteins, a super-family of histidine phosphatases and vanadium haloperoxidases, includes type 2 phosphatidic acid phosphatase or lipid phosphate phosphatase (LPP), Glucose-6-phosphatase, Phosphatidylglycerophosphatase B and bacterial acid phosphatase, vanadium chloroperoxidases, vanadium bromoperoxidases, and several other mostly uncharacterized subfamilies. Several members of this superfamily have been predicted to be transmembrane proteins. 48085 cd03380: PAP2_like_1 proteins, a sub-family of PAP2, containing bacterial acid phosphatase, vanadium chloroperoxidases and vanadium bromoperoxidases. 48086 cd03381: PAP2_like proteins, glucose-6-phosphatase subfamily. Glucose-6-phosphatase converts glucose-6-phosphate into free glucose and is active in the lumen of the endoplasmic reticulum, where it is bound to the membrane. The generation of free glucose is an important control point in metabolism, and stands at the end of gluconeogenesis and the release of glucose from glycogen. Deficiency of glucose-6-phosphatase leads to von Gierke's disease. 48087 cd03382: PAP2_like proteins, dolichyldiphosphatase subfamily. Dolichyldiphosphatase is a membrane-associated protein located in the endoplasmic reticulum and hydrolyzes dolichyl pyrophosphate, as well as dolichylmonophosphate at a low rate. The enzyme is necessary for maintaining proper levels of dolichol-linked oligosaccharides and protein N-glycosylation, and might play a role in re-utilization of the glycosyl carrier lipid for additional rounds of lipid intermediate biosynthesis after its release during protein N-glycosylation reactions. 48088 cd03383: PAP2_like proteins, diacylglycerol_kinase like sub-family. In some prokaryotes, PAP2_like phosphatase domains appear fused to E. coli DAGK-like trans-membrane diacylglycerol kinase domains. The cellular function of these architectures remains to be determined. 48089 cd03384: PAP2, wunen subfamily. 
Most likely a family of membrane associated phosphatidic acid phosphatases. Wunen is a Drosophila protein expressed in the central nervous system, which provides repellent activity towards primordial germ cells (PGCs), controls the survival of PGCs and is essential in the migration process of these cells towards the somatic gonadal precursors. 48090 cd03385: PAP2_like proteins, BcrC_like subfamily. Several members of this family have been annotated as bacitracin transport permeases, as it was suspected that they form the permease component of an ABC transporter system. It was shown, however, that BcrC from Bacillus subtilis possesses undecaprenyl pyrophosphate (UPP) phosphatase activity, and it is hypothesized that it competes with bacitracin for UPP, increasing the cell's resistance to bacitracin. 48091 cd03386: PAP2_like proteins, Aur1_like subfamily. Yeast Aur1p or Ipc1p is necessary for the addition of inositol phosphate to ceramide, an essential step in yeast sphingolipid synthesis, and is the target of several antifungal compounds such as aureobasidin. 48092 cd03388: PAP2_like proteins, sphingosine-1-phosphatase subfamily. Sphingosine-1-phosphatase is an intracellular enzyme located in the endoplasmic reticulum, which regulates the level of sphingosine-1-phosphate (S1P), a bioactive lipid. S1P acts as a second messenger in the cell, and extracellularly by binding to G-protein coupled receptors of the endothelial differentiation gene family. 48093 cd03389: PAP2_like proteins, Lipid A 1-phosphatase subfamily. Lipid A 1-phosphatase, or LpxE, from Francisella novicida selectively dephosphorylates lipid A at the 1-position. Lipid A is the membrane-anchor component of lipopolysaccharides (LPS), the major constituents of the outer membrane in many gram-negative bacteria. 48094 cd03390: PAP2, subfamily similar to human phosphatidic_acid_phosphatase_type_2_domain_containing_1. Most likely membrane-associated phosphatidic acid phosphatases.
Plant members of this group are constitutively expressed in many tissues and exhibit both diacylglycerol pyrophosphate phosphatase activity and phosphatidate (PA) phosphatase activity; they may have a more generic housekeeping role in lipid metabolism. 48095 cd03391: PAP2, subfamily similar to human phosphatidic_acid_phosphatase_type_2_domain_containing_2. PAP2 is a super-family of phosphatases and haloperoxidases. This subgroup, which is specific to eukaryota, lacks functional characterization and may act as a membrane-associated phosphatidic acid phosphatase. 48096 cd03392: PAP2_like_2 proteins. PAP2 is a super-family of phosphatases and haloperoxidases. This subgroup, which is specific to bacteria, lacks functional characterization and may act as a membrane-associated lipid phosphatase. 48097 cd03393: PAP2_like_3 proteins. PAP2 is a super-family of phosphatases and haloperoxidases. This subgroup, which is specific to bacteria and archaea, lacks functional characterization and may act as a membrane-associated lipid phosphatase. 48098 cd03394: PAP2_like_5 proteins. PAP2 is a super-family of phosphatases and haloperoxidases. This subgroup, which is specific to bacteria, lacks functional characterization and may act as a membrane-associated lipid phosphatase. 48099 cd03395: PAP2_like_4 proteins. PAP2 is a super-family of phosphatases and haloperoxidases. This subgroup, which is specific to bacteria, lacks functional characterization and may act as a membrane-associated lipid phosphatase. 48100 cd03396: PAP2_like_6 proteins. PAP2 is a super-family of phosphatases and haloperoxidases. This subgroup, which mainly contains bacterial proteins, lacks functional characterization and may act as a membrane-associated lipid phosphatase. 48101 cd03397: PAP2, bacterial acid phosphatase or class A non-specific acid phosphatases. These enzymes catalyze phosphomonoester hydrolysis, with optimal activity in low pH conditions.
They are secreted into the periplasmic space, and their physiological role remains to be determined. 48102 cd03398: PAP2, haloperoxidase_like subfamily. Haloperoxidases catalyze the oxidation of halides such as bromide or chloride by hydrogen peroxide, which results in subsequent halogenation of organic substrates, or halide-assisted disproportionation of hydrogen peroxide forming dioxygen. They are likely to participate in the biosynthesis of halogenated natural products, such as volatile halogenated hydrocarbons, chiral halogenated terpenes, acetogenins and indoles. 48103 cd00299: Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. 
In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins and stringent starvation protein A. 48104 cd03177: GST_C family, Class Delta and Epsilon subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. The class Delta and Epsilon subfamily is made up primarily of insect GSTs, which play major roles in insecticide resistance by facilitating reductive dehydrochlorination of insecticides or conjugating them with GSH to produce water-soluble metabolites that are easily excreted. They are also implicated in protection against cellular damage by oxidative stress. 48105 cd03178: GST_C family, Ure2p-like subfamily; composed of the Saccharomyces cerevisiae Ure2p and related GSTs. Ure2p is a regulator for nitrogen catabolism in yeast. It represses the expression of several gene products involved in the use of poor nitrogen sources when rich sources are available. A transmissible conformational change of Ure2p results in a prion called [Ure3], an inactive, self-propagating and infectious amyloid. Ure2p displays a GST fold containing an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain. 
The N-terminal thioredoxin-fold domain is sufficient to induce the [Ure3] phenotype and is also called the prion domain of Ure2p. In addition to its role in nitrogen regulation, Ure2p confers protection to cells against heavy metal ion and oxidant toxicity, and shows glutathione (GSH) peroxidase activity. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of GSH with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST active site is located in a cleft between the N- and C-terminal domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. 48106 cd03179: GST_C family, unknown subfamily 1; composed of uncharacterized bacterial proteins, with similarity to GSTs. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. 48107 cd03180: GST_C family, unknown subfamily 2; composed of uncharacterized bacterial proteins, with similarity to GSTs. 
GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. 48108 cd03181: GST_C family, Gamma subunit of Elongation Factor 1B (EF1Bgamma) subfamily; EF1Bgamma is part of the eukaryotic translation elongation factor-1 (EF1) complex which plays a central role in the elongation cycle during protein biosynthesis. EF1 consists of two functionally distinct units, EF1A and EF1B. EF1A catalyzes the GTP-dependent binding of aminoacyl-tRNA to the ribosomal A site concomitant with the hydrolysis of GTP. The resulting inactive EF1A:GDP complex is recycled to the active GTP form by the guanine-nucleotide exchange factor EF1B, a complex composed of at least two subunits, alpha and gamma. Metazoan EF1B contains a third subunit, beta. The EF1B gamma subunit contains a GST fold consisting of an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain. The GST-like domain of EF1Bgamma is believed to mediate the dimerization of the EF1 complex, which in yeast is a dimer of the heterotrimer EF1A:EF1Balpha:EF1Bgamma. In addition to its role in protein biosynthesis, EF1Bgamma may also display other functions. The recombinant rice protein has been shown to possess GSH conjugating activity. 
The yeast EF1Bgamma binds membranes in a calcium dependent manner and is also part of a complex that binds to the msrA (methionine sulfoxide reductase) promoter suggesting a function in the regulation of its gene expression. Also included in this subfamily is the GST_C-like domain at the N-terminus of human valyl-tRNA synthetase and its homologs from zebrafish and Xenopus. Although not included in the alignment, the GST_C-like domain containing a deletion present in some aminoacyl-tRNA synthetases is recognized by this model. This domain will be represented in the future by a deletion model of GST_C. 48109 cd03182: GST_C family, Saccharomyces cerevisiae GTT2-like subfamily; composed of predominantly uncharacterized proteins with similarity to the S. cerevisiae GST protein, GTT2. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. GTT2, a homodimer, exhibits GST activity with standard substrates. Strains with deleted GTT2 genes are viable but exhibit increased sensitivity to heat shock. 48110 cd03183: GST_C family, Class Theta subfamily; composed of eukaryotic class Theta GSTs and bacterial dichloromethane (DCM) dehalogenase. 
GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Mammalian class Theta GSTs show poor GSH conjugating activity towards the standard substrates, CDNB and ethacrynic acid, differentiating them from other mammalian GSTs. GSTT1-1 shows catalytic activity similar to that of bacterial DCM dehalogenase, catalyzing the GSH-dependent hydrolytic dehalogenation of dihalomethanes. This is an essential process in methylotrophic bacteria to enable them to use chloromethane and DCM as sole carbon and energy sources. The presence of polymorphisms in human GSTT1-1 and its relationship to the onset of diseases including cancer is the subject of many studies. Human GSTT2-2 exhibits a highly specific sulfatase activity, catalyzing the cleavage of sulfate ions from aralkyl sulfate esters, but not from the aryl or alkyl sulfate esters. 48111 cd03184: GST_C family, Class Omega subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. 
Class Omega GSTs show little or no GSH-conjugating activity towards standard GST substrates. Instead, they catalyze the GSH dependent reduction of protein disulfides, dehydroascorbate and monomethylarsonate, activities which are more characteristic of glutaredoxins. They contain a conserved cysteine equivalent to the first cysteine in the CXXC motif of glutaredoxins, which is a redox active residue capable of reducing GSH mixed disulfides in a monothiol mechanism. Polymorphisms of the class Omega GST genes may be associated with the development of some types of cancer and the age-at-onset of both Alzheimer's and Parkinson's diseases. 48112 cd03185: GST_C family, Class Tau subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. The plant-specific class Tau GST subfamily has undergone extensive gene duplication. The Arabidopsis and Oryza genomes contain 28 and 40 Tau GSTs, respectively. They are primarily responsible for herbicide detoxification together with class Phi GSTs, showing class specificity in substrate preference. Tau enzymes are highly efficient in detoxifying diphenylether and aryloxyphenoxypropionate herbicides. In addition, Tau GSTs play important roles in intracellular signalling, biosynthesis of anthocyanin, responses to soil stresses and responses to auxin and cytokinin hormones. 
48113 cd03186: GST_C family, Stringent starvation protein A (SspA) subfamily; SspA is an RNA polymerase (RNAP)-associated protein required for the lytic development of phage P1 and for stationary phase-induced acid tolerance of E. coli. It is implicated in survival during nutrient starvation. SspA adopts the GST fold with an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, but it does not bind glutathione (GSH) and lacks GST activity. SspA is highly conserved among gram-negative bacteria. Related proteins found in Neisseria (called RegF), Francisella and Vibrio regulate the expression of virulence factors necessary for pathogenesis. 48114 cd03187: GST_C family, Class Phi subfamily; composed of plant-specific class Phi GSTs and related fungal and bacterial proteins. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. The class Phi GST subfamily has experienced extensive gene duplication. The Arabidopsis and Oryza genomes contain 13 and 16 Phi GSTs, respectively. They are primarily responsible for herbicide detoxification together with class Tau GSTs, showing class specificity in substrate preference. Phi enzymes are highly reactive toward chloroacetanilide and thiocarbamate herbicides. Some Phi GSTs have other functions including transport of flavonoid pigments to the vacuole, shoot regeneration and GSH peroxidase activity. 
48115 cd03188: GST_C family, Class Beta subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Unlike mammalian GSTs which detoxify a broad range of compounds, the bacterial class Beta GSTs exhibit limited GSH conjugating activity with a narrow range of substrates. In addition to GSH conjugation, they also bind antibiotics and reduce the antimicrobial activity of beta-lactam drugs. The structure of the Proteus mirabilis enzyme reveals that the cysteine in the active site forms a covalent bond with GSH. 48116 cd03189: GST_C family, Saccharomyces cerevisiae GTT1-like subfamily; composed of predominantly uncharacterized proteins with similarity to the S. cerevisiae GST protein, GTT1, and the Schizosaccharomyces pombe GST-III. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. 
GTT1, a homodimer, exhibits GST activity with standard substrates and associates with the endoplasmic reticulum. Its expression is induced after diauxic shift and remains high throughout the stationary phase. S. pombe GST-III is implicated in the detoxification of various metals. 48117 cd03190: GST_C family, ECM4-like subfamily; composed of predominantly uncharacterized and taxonomically diverse proteins with similarity to the translation product of the Saccharomyces cerevisiae gene ECM4. ECM4, a gene of unknown function, is involved in cell surface biosynthesis and architecture. S. cerevisiae ECM4 mutants show increased amounts of the cell wall hexose, N-acetylglucosamine. More recently, global gene expression analysis shows that ECM4 is upregulated during genotoxic conditions and together with the expression profiles of 18 other genes could potentially differentiate between genotoxic and cytotoxic insults in yeast. 48118 cd03191: GST_C family, Class Zeta subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Class Zeta GSTs, also known as maleylacetoacetate (MAA) isomerases, catalyze the isomerization of MAA to fumarylacetoacetate, the penultimate step in tyrosine/phenylalanine catabolism, using GSH as a cofactor. They show little GSH-conjugating activity towards traditional GST substrates, but display modest GSH peroxidase activity. 
They are also implicated in the detoxification of the carcinogen dichloroacetic acid by catalyzing its dechlorination to glyoxylic acid. 48119 cd03192: GST_C family, Class Sigma_like; composed of GSTs belonging to class Sigma and similar proteins, including GSTs from class Mu, Pi, and Alpha. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Vertebrate class Sigma GSTs are characterized as GSH-dependent hematopoietic prostaglandin (PG) D synthases and are responsible for the production of PGD2 by catalyzing the isomerization of PGH2. The functions of PGD2 include the maintenance of body temperature, inhibition of platelet aggregation, bronchoconstriction, vasodilation, and mediation of allergy and inflammation. Other class Sigma members include the class II insect GSTs, S-crystallins from cephalopods, nematode-specific GSTs, and 28-kDa GSTs from parasitic flatworms. Drosophila GST2 is associated with indirect flight muscle and exhibits preference for catalyzing GSH conjugation to lipid peroxidation products, indicating an anti-oxidant role. S-crystallin constitutes the major lens protein in cephalopod eyes and is responsible for lens transparency and proper refractive index. The 28-kDa GST from Schistosoma is a multifunctional enzyme, exhibiting GSH transferase, GSH peroxidase, and PGD2 synthase activities, and may play an important role in host-parasite interactions. 
Also included are novel GSTs from the fungus Cunninghamella elegans, designated as class Gamma, and from the protozoan Blepharisma japonicum, described as a light-inducible GST. 48121 cd03194: GST_C family, unknown subfamily 3; composed of uncharacterized proteins with similarity to GSTs. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. 48122 cd03195: GST_C family, unknown subfamily 4; composed of uncharacterized proteins with similarity to GSTs. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. 48123 cd03196: GST_C family, unknown subfamily 5; composed of uncharacterized bacterial proteins with similarity to GSTs. 
GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. 48124 cd03197: GST_C family, microsomal Prostaglandin E synthase Type 2 (mPGES2) subfamily; mPGES2 is a membrane-anchored dimeric protein containing a CXXC motif which catalyzes the isomerization of PGH2 to PGE2. Unlike cytosolic PGE synthase (cPGES) and microsomal PGES Type 1 (mPGES1), mPGES2 does not require glutathione (GSH) for its activity, although its catalytic rate is increased two- to four-fold in the presence of DTT, GSH, or other thiol compounds. PGE2 is widely distributed in various tissues and is implicated in the sleep/wake cycle, relaxation/contraction of smooth muscle, excretion of sodium ions, maintenance of body temperature, and mediation of inflammation. mPGES2 contains an N-terminal hydrophobic domain which is membrane associated and a C-terminal soluble domain with a GST-like structure. The C-terminus contains two structural domains: an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain. The GST active site is located in a cleft between the two domains. 48125 cd03198: GST_C family, Chloride Intracellular Channel (CLIC) subfamily; composed of CLIC1-5, p64, parchorin, and similar proteins. They are auto-inserting, self-assembling intracellular anion channels involved in a wide variety of functions including regulated secretion, cell division, and apoptosis. 
They can exist in both water-soluble and membrane-bound states and are found in various vesicles and membranes. Biochemical studies of the C. elegans homolog, EXC-4, show that the membrane localization domain is present in the N-terminal part of the protein. The structure of soluble human CLIC1 reveals that it is monomeric and adopts a fold similar to GSTs, containing an N-terminal domain with a thioredoxin fold and a C-terminal alpha helical domain. Upon oxidation, the N-terminal domain of CLIC1 undergoes a structural change to form a non-covalent dimer stabilized by the formation of an intramolecular disulfide bond between two cysteines that are far apart in the reduced form. The CLIC1 dimer bears no similarity to GST dimers. The redox-controlled structural rearrangement exposes a large hydrophobic surface, which is masked by dimerization in vitro. In vivo, this surface may represent the docking interface of CLIC1 in its membrane-bound state. The two cysteines in CLIC1 that form the disulfide bond in oxidizing conditions are essential for dimerization and chloride channel activity; however, in other subfamily members, the second cysteine is not conserved. 48126 cd03199: GST_C family, Glutaredoxin 2 (GRX2) subfamily; composed of bacterial proteins similar to E. coli GRX2, an atypical GRX with a molecular mass of about 24kD (most GRXs range from 9-12kD). GRX2 adopts a GST fold containing an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain. It contains a redox active CXXC motif located in the N-terminal domain, but is not able to reduce ribonucleotide reductase like other GRXs. However, it catalyzes GSH-dependent protein disulfide reduction of other substrates efficiently. GRX2 is thought to function primarily in catalyzing the reversible glutathionylation of proteins in cellular redox regulation including stress responses. 
48127 cd03200: GST_C family, JTV-1 subfamily; composed of uncharacterized proteins with similarity to the translation product of the human JTV-1 gene. Human JTV-1, a gene of unknown function, initiates within the human PMS2 gene promoter, but is transcribed from the opposite strand. PMS2 encodes a protein involved in DNA mismatch repair and is mutated in a subset of patients with hereditary nonpolyposis colon cancer. It is unknown whether the expression of JTV-1 affects that of PMS2, or vice versa, as a result of their juxtaposition. JTV-1 is up-regulated while PMS2 is down-regulated in tumor cell spheroids that show increased resistance to anticancer cytotoxic drugs compared with tumor cell monolayers, indicating that suppressed DNA mismatch repair may be a mechanism for multicellular resistance to alkylating agents. 48128 cd03201: GST_C family, Dehydroascorbate Reductase (DHAR) subfamily; composed of plant-specific DHARs, monomeric enzymes catalyzing the reduction of DHA into ascorbic acid (AsA) using glutathione as the reductant. DHAR allows plants to recycle oxidized AsA before it is lost. AsA serves as a cofactor of violaxanthin de-epoxidase in the xanthophyll cycle and as an antioxidant in the detoxification of reactive oxygen species. Because AsA is the major reductant in plants, DHAR serves to regulate their redox state. It has been suggested that a significant portion of DHAR activity is plastidic, acting to reduce the large amounts of ascorbate oxidized during hydrogen peroxide scavenging by ascorbate peroxidase. DHAR contains a conserved cysteine in its active site and in addition to its reductase activity, shows thiol transferase activity similar to glutaredoxins. 48129 cd03202: GST_C family, Beta etherase LigE subfamily; composed of proteins similar to Sphingomonas paucimobilis beta etherase, LigE, a GST-like protein that catalyzes the cleavage of the beta-aryl ether linkages present in low-molecular-weight lignins using GSH as the hydrogen donor. 
This reaction is an essential step in the degradation of lignin, a complex phenolic polymer that is the most abundant aromatic material in the biosphere. The beta etherase activity of LigE is enantioselective and it complements the activity of the other GST family beta etherase, LigF. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. 48130 cd03203: GST_C family, Class Lambda subfamily; composed of plant-specific class Lambda GSTs. GSTs are cytosolic, usually dimeric, proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. The class Lambda subfamily was recently discovered, together with dehydroascorbate reductases (DHARs), as two outlying groups of the GST superfamily in Arabidopsis thaliana, which contain conserved active site cysteines. Characterization of recombinant A. thaliana proteins shows that Lambda class GSTs are monomeric, similar to DHARs. They do not exhibit GSH conjugating or DHAR activities, but are active as thiol transferases, similar to glutaredoxins. Members of this subfamily were originally identified as encoded proteins of the In2-1 gene, which can be induced by treatment with herbicide safeners. 48131 cd03204: GST_C family, Ganglioside-induced differentiation-associated protein 1 (GDAP1) subfamily; GDAP1 was originally identified as a highly expressed gene at the differentiated stage of GD3 synthase-transfected cells. 
More recently, mutations in GDAP1 have been reported to cause both axonal and demyelinating autosomal-recessive Charcot-Marie-Tooth (CMT) type 4A neuropathy. CMT is characterized by slow and progressive weakness and atrophy of muscles. Sequence analysis of GDAP1 shows similarities and differences with GSTs: it appears to contain both the N-terminal thioredoxin-fold and C-terminal alpha helical domains of GSTs; however, unlike GSTs, it also contains additional C-terminal transmembrane domains. GDAP1 is mainly expressed in neuronal cells and is localized in the mitochondria through its transmembrane domains. It does not exhibit GST activity using standard substrates. 48132 cd03205: GST_C family, unknown subfamily 6; composed of uncharacterized bacterial proteins with similarity to GSTs. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. 48133 cd03206: GST_C family, unknown subfamily 7; composed of uncharacterized proteins with similarity to GSTs. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. 
The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. 48134 cd03207: GST_C family, unknown subfamily 8; composed of uncharacterized bacterial proteins with similarity to GSTs. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. 48135 cd03208: GST_C family, Class Alpha subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. The class Alpha subfamily is composed of vertebrate GSTs which can form homodimers and heterodimers. There are at least six types of class Alpha GST subunits in rats, four of which have human counterparts, resulting in many possible isoenzymes with different activities, tissue distribution and substrate specificities. 
Human GSTA1-1 and GSTA2-2 show high GSH peroxidase activity. GSTA3-3 catalyzes the isomerization of intermediates in steroid hormone biosynthesis. GSTA4-4 preferentially catalyzes the GSH conjugation of alkenals. 48136 cd03209: GST_C family, Class Mu subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. The class Mu subfamily is composed of eukaryotic GSTs. In rats, at least six distinct class Mu subunits have been identified, with homologous genes in humans for five of these subunits. Class Mu GSTs can form homodimers and heterodimers, giving a large number of possible isoenzymes, all with overlapping activities but different substrate specificities. They are the most abundant GSTs in human liver, skeletal muscle and brain, and are believed to provide protection against diseases including cancer and neurodegenerative disorders. Some isoenzymes have additional specific functions. Human GSTM1-1 acts as an endogenous inhibitor of ASK1 (apoptosis signal-regulating kinase 1), thereby suppressing ASK1-mediated cell death. Human GSTM2-2 and GSTM3-3 have been identified as prostaglandin E2 synthases in the brain and may play crucial roles in temperature and sleep-wake regulation. 
48137 cd03210: GST_C family, Class Pi subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Class Pi GST is a homodimeric eukaryotic protein. The human GSTP1 is mainly found in erythrocytes, kidney, placenta and fetal liver. It is involved in stress responses and in cellular proliferation pathways as an inhibitor of JNK (c-Jun N-terminal kinase). Following oxidative stress, monomeric GSTP1 dissociates from JNK and dimerizes, losing its ability to bind JNK and causing an increase in JNK activity, thereby promoting apoptosis. GSTP1 is expressed in various tumors and is the predominant GST in a wide range of cancer cells. It has been implicated in the development of multidrug-resistant tumors. 48138 cd03211: GST_C family, Metaxin subfamily, Metaxin 2; a metaxin 1 binding protein identified through a yeast two-hybrid system using metaxin 1 as the bait. Metaxin 2 shares sequence similarity with metaxin 1 but does not contain a C-terminal mitochondrial outer membrane signal-anchor domain. It associates with mitochondrial membranes through its interaction with metaxin 1, which is a component of the mitochondrial preprotein import complex of the outer membrane. The biological function of metaxin 2 is unknown. It is likely that it also plays a role in protein translocation into the mitochondria. However, this has not been experimentally validated. 
In a recent proteomics study, it has been shown that metaxin 2 is overexpressed in response to lipopolysaccharide-induced liver injury. 48139 cd03212: GST_C family, Metaxin subfamily, Metaxin 1-like proteins; composed of metaxins 1 and 3, and similar proteins. Mammalian metaxin (or metaxin 1) is a component of the preprotein import complex of the mitochondrial outer membrane. Metaxin extends to the cytosol and is anchored to the mitochondrial membrane through its C-terminal domain. In mice, metaxin is required for embryonic development. Like the murine gene, the human metaxin gene is located downstream of the glucocerebrosidase (GBA) pseudogene and is convergently transcribed. Inherited deficiency of GBA results in Gaucher disease, which presents many diverse clinical phenotypes. Alterations in the metaxin gene, in addition to GBA mutations, may be associated with Gaucher disease. Genome sequencing shows that a third metaxin gene also exists in zebrafish, Xenopus, chicken, and mammals. 48140 cd00465: The URO-D_CIMS_like protein superfamily includes bacterial and eukaryotic uroporphyrinogen decarboxylases (URO-D), coenzyme M methyltransferases and other putative bacterial methyltransferases, as well as cobalamine (B12) independent methionine synthases. Despite their sequence similarities, members of this family have clearly different functions. Uroporphyrinogen decarboxylase (URO-D) decarboxylates the four acetate side chains of uroporphyrinogen III (uro-III) to create coproporphyrinogen III, an important branching point of the tetrapyrrole biosynthetic pathway. The methyltransferases represented here are important for the ability of methanogenic organisms to use compounds other than carbon dioxide for reduction to methane, and methionine synthases transfer a methyl group from a folate cofactor to L-homocysteine in a reaction requiring zinc. 
48141 cd00717: Uroporphyrinogen decarboxylase (URO-D) is a dimeric cytosolic enzyme that decarboxylates the four acetate side chains of uroporphyrinogen III (uro-III) to create coproporphyrinogen III, without requiring any prosthetic groups or cofactors. This reaction is located at the branching point of the tetrapyrrole biosynthetic pathway, leading to the biosynthesis of heme, chlorophyll or bacteriochlorophyll. URO-D deficiency is responsible for the human genetic diseases familial porphyria cutanea tarda (fPCT) and hepatoerythropoietic porphyria (HEP). 48142 cd03307: MtaA_CmuA_like family. MtaA/CmuA, also MtsA, or methyltransferase 2 (MT2) MT2-A and MT2-M isozymes, are methylcobamide:Coenzyme M methyltransferases, which play a role in metabolic pathways of methane formation from various substrates, such as methylated amines and methanol. Coenzyme M, 2-mercaptoethylsulfonate or CoM, is methylated during methanogenesis in a reaction catalyzed by three proteins. A methyltransferase methylates the corrinoid cofactor, which is bound to a second polypeptide, a corrinoid protein. The methylated corrinoid protein then serves as a substrate for MT2-A and related enzymes, which methylate CoM. 48143 cd03308: CmuA_CmuC_like: uncharacterized protein family similar to uroporphyrinogen decarboxylase (URO-D) and the methyltransferases CmuA and CmuC. 48144 cd03309: CmuC_like. Proteins similar to the putative corrinoid methyltransferase CmuC. Its function has been inferred from sequence similarity to the methyltransferases CmuA and MtaA. Mutants of Methylobacterium sp. disrupted in cmuC and purU appear deficient in some step of chloromethane metabolism. 48145 cd03310: CIMS - Cobalamine-independent methionine synthase, or MetE. Many members have been characterized as 5-methyltetrahydropteroyltriglutamate-homocysteine methyltransferases, EC:2.1.1.14, mostly from bacteria and plants. 
This enzyme catalyses the last step in the production of methionine by transferring a methyl group from 5-methyltetrahydrofolate to L-homocysteine without using an intermediate methyl carrier. The active enzyme has a dual (beta-alpha)8-barrel structure, and this model covers both the N- and C-terminal barrels, and some single-barrel sequences, mostly from Archaea. It is assumed that the homologous N-terminal barrel has evolved from the C-terminus via gene duplication and has subsequently lost binding sites, and it seems as if the two barrels forming the active enzyme may sometimes reside on different polypeptides. The C-terminal domain incorporates the zinc ion, which binds and activates homocysteine. Side chains from both barrels contribute to the binding of the folate substrate. 48146 cd03311: CIMS - Cobalamine-independent methionine synthase, or MetE, C-terminal domain_like. Many members have been characterized as 5-methyltetrahydropteroyltriglutamate-homocysteine methyltransferases, EC:2.1.1.14, mostly from bacteria and plants. This enzyme catalyses the last step in the production of methionine by transferring a methyl group from 5-methyltetrahydrofolate to L-homocysteine without using an intermediate methyl carrier. The active enzyme has a dual (beta-alpha)8-barrel structure, and this model covers the C-terminal barrel, and a few single-barrel sequences most similar to the C-terminal barrel. It is assumed that the homologous N-terminal barrel has evolved from the C-terminus via gene duplication and has subsequently lost binding sites, and it seems as if the two barrels forming the active enzyme may sometimes reside on different polypeptides. The C-terminal domain incorporates the zinc ion, which binds and activates homocysteine. Side chains from both barrels contribute to the binding of the folate substrate. 48147 cd03312: CIMS - Cobalamine-independent methionine synthase, or MetE, N-terminal domain_like. 
Many members have been characterized as 5-methyltetrahydropteroyltriglutamate-homocysteine methyltransferases, EC:2.1.1.14, mostly from bacteria and plants. This enzyme catalyses the last step in the production of methionine by transferring a methyl group from 5-methyltetrahydrofolate to L-homocysteine without using an intermediate methyl carrier. The active enzyme has a dual (beta-alpha)8-barrel structure, and this model covers the N-terminal barrel, and a few single-barrel sequences most similar to the N-terminal barrel. It is assumed that the homologous N-terminal barrel has evolved from the C-terminus via gene duplication and has subsequently lost binding sites, and it seems as if the two barrels forming the active enzyme may sometimes reside on different polypeptides. The C-terminal domain incorporates the zinc ion, which binds and activates homocysteine. Side chains from both barrels contribute to the binding of the folate substrate. 48148 cd03465: The URO-D_like protein superfamily includes bacterial and eukaryotic uroporphyrinogen decarboxylases (URO-D), coenzyme M methyltransferases and other putative bacterial methyltransferases. Uroporphyrinogen decarboxylase (URO-D) decarboxylates the four acetate side chains of uroporphyrinogen III (uro-III) to create coproporphyrinogen III, an important branching point of the tetrapyrrole biosynthetic pathway. The methyltransferases represented here are important for the ability of methanogenic organisms to use compounds other than carbon dioxide for reduction to methane. 48151 cd03334: TCP-1 like domain of the eukaryotic phosphatidylinositol 3-phosphate (PtdIns3P) 5-kinase Fab1. Fab1p is important for vacuole size regulation, presumably by modulating PtdIns(3,5)P2 effector activity. In the human homolog p235/PIKfyve, deletion of this domain leads to loss of catalytic activity. However, no exact function for this domain has been defined. In general, chaperonins are involved in productive folding of proteins. 
48152 cd03335: TCP-1 (CTT or eukaryotic type II) chaperonin family, alpha subunit. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings. In contrast to bacterial group I chaperonins (GroEL), each ring of the eukaryotic cytosolic chaperonin (CTT) consists of eight different, but homologous subunits. Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. The best studied in vivo substrates of CTT are actin and tubulin. 48153 cd03336: TCP-1 (CTT or eukaryotic type II) chaperonin family, beta subunit. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings. In contrast to bacterial group I chaperonins (GroEL), each ring of the eukaryotic cytosolic chaperonin (CTT) consists of eight different, but homologous subunits. Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. The best studied in vivo substrates of CTT are actin and tubulin. 48155 cd03338: TCP-1 (CTT or eukaryotic type II) chaperonin family, delta subunit. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings. In contrast to bacterial group I chaperonins (GroEL), each ring of the eukaryotic cytosolic chaperonin (CTT) consists of eight different, but homologous subunits. Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. The best studied in vivo substrates of CTT are actin and tubulin. 48156 cd03339: TCP-1 (CTT or eukaryotic type II) chaperonin family, epsilon subunit. Chaperonins are involved in productive folding of proteins. 
They share a common general morphology, a double toroid of 2 stacked rings. In contrast to bacterial group I chaperonins (GroEL), each ring of the eukaryotic cytosolic chaperonin (CTT) consists of eight different, but homologous subunits. Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. The best studied in vivo substrates of CTT are actin and tubulin. 48157 cd03340: TCP-1 (CTT or eukaryotic type II) chaperonin family, eta subunit. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings. In contrast to bacterial group I chaperonins (GroEL), each ring of the eukaryotic cytosolic chaperonin (CTT) consists of eight different, but homologous subunits. Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. The best studied in vivo substrates of CTT are actin and tubulin. 48160 cd03343: cpn60 chaperonin family. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings. Archaeal cpn60 (thermosome), together with TF55 from thermophilic bacteria and the eukaryotic cytosol chaperonin (CTT), belong to the type II group of chaperonins. Cpn60 consists of two stacked octameric rings, which are composed of one or two different subunits. Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. 48161 cd03344: GroEL_like type I chaperonin. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings, each composed of 7-9 subunits. The symmetry of type I is seven-fold and they are found in eubacteria (GroEL) and in organelles of eubacterial descent (hsp60 and RBP). 
With the aid of cochaperonin GroES, GroEL encapsulates non-native substrate proteins inside the cavity of the GroEL-ES complex and promotes folding by using energy derived from ATP hydrolysis. 48162 cd00568: Thiamine pyrophosphate (TPP) enzyme family, TPP-binding module; found in many key metabolic enzymes which use TPP (also known as thiamine diphosphate) as a cofactor. These enzymes include, among others, the E1 components of the pyruvate, the acetoin and the branched chain alpha-keto acid dehydrogenase complexes. 48163 cd02000: Thiamine pyrophosphate (TPP) family, E1 of PDC_ADC_BCADC subfamily, TPP-binding module; composed of proteins similar to the E1 components of the human pyruvate dehydrogenase complex (PDC), the acetoin dehydrogenase complex (ADC) and the branched chain alpha-keto acid dehydrogenase/2-oxoisovalerate dehydrogenase complex (BCADC). PDC catalyzes the irreversible oxidative decarboxylation of pyruvate to produce acetyl-CoA in the bridging step between glycolysis and the citric acid cycle. ADC participates in the breakdown of acetoin while BCADC participates in the breakdown of branched chain amino acids. BCADC catalyzes the oxidative decarboxylation of 4-methyl-2-oxopentanoate, 3-methyl-2-oxopentanoate and 3-methyl-2-oxobutanoate (branched chain 2-oxo acids derived from the transamination of leucine, valine and isoleucine). 48164 cd02001: Thiamine pyrophosphate (TPP) family, ComE and PpyrDC subfamily, TPP-binding module; composed of proteins similar to sulfopyruvate decarboxylase beta subunit (ComE) and phosphonopyruvate decarboxylase (Ppyr decarboxylase). Methanococcus jannaschii sulfopyruvate decarboxylase (ComDE) is a dodecamer of six alpha (D) subunits and six beta (E) subunits, which catalyzes the decarboxylation of sulfopyruvic acid to sulfoacetaldehyde in the coenzyme M pathway. Ppyr decarboxylase is a homotrimeric enzyme which functions in the biosynthesis of C-P compounds such as bialaphos tripeptide in Streptomyces hygroscopicus. 
Ppyr decarboxylase and ComDE require TPP and divalent metal cation cofactors. 48165 cd02002: Thiamine pyrophosphate (TPP) family, BFDC subfamily, TPP-binding module; composed of proteins similar to Pseudomonas putida benzoylformate decarboxylase (BFDC). P. putida BFDC plays a role in the mandelate pathway, catalyzing the conversion of benzoylformate to benzaldehyde and carbon dioxide. This enzyme is dependent on TPP and a divalent metal cation as cofactors. 48166 cd02003: Thiamine pyrophosphate (TPP) family, IolD subfamily, TPP-binding module; composed of proteins similar to Rhizobium leguminosarum bv. viciae IolD. IolD plays an important role in myo-inositol catabolism. 48167 cd02004: Thiamine pyrophosphate (TPP) family, BZL_OCoD_HPCL subfamily, TPP-binding module; composed of proteins similar to benzaldehyde lyase (BZL), oxalyl-CoA decarboxylase (OCoD) and 2-hydroxyphytanoyl-CoA lyase (2-HPCL). Pseudomonas fluorescens biovar I BZL cleaves the acyloin linkage of benzoin producing 2 molecules of benzaldehyde and enabling the Pseudomonas to grow on benzoin as the sole carbon and energy source. OCoD has a role in the detoxification of oxalate, catalyzing the decarboxylation of oxalyl-CoA to formate. 2-HPCL is a peroxisomal enzyme which plays a role in the alpha-oxidation of 3-methyl-branched fatty acids, catalyzing the cleavage of 2-hydroxy-3-methylacyl-CoA into formyl-CoA and a 2-methyl-branched fatty aldehyde. All these enzymes depend on Mg2+ and TPP for activity. 48168 cd02005: Thiamine pyrophosphate (TPP) family, PDC_IPDC subfamily, TPP-binding module; composed of proteins similar to pyruvate decarboxylase (PDC) and indolepyruvate decarboxylase (IPDC). PDC, a key enzyme in alcoholic fermentation, catalyzes the conversion of pyruvate to acetaldehyde and CO2. It is able to utilize other 2-oxo acids as substrates. 
In plants and various plant-associated bacteria, IPDC plays a role in the indole-3-pyruvic acid (IPA) pathway, a tryptophan-dependent biosynthetic route to indole-3-acetic acid (IAA). IPDC catalyzes the decarboxylation of IPA to indole-3-acetaldehyde. Both PDC and IPDC depend on TPP and Mg2+ as cofactors. 48169 cd02006: Thiamine pyrophosphate (TPP) family, Gcl subfamily, TPP-binding module; composed of proteins similar to Escherichia coli glyoxylate carboligase (Gcl). E. coli glyoxylate carboligase plays a key role in glyoxylate metabolism, where it catalyzes the condensation of two molecules of glyoxylate to give tartronic semialdehyde and carbon dioxide. This enzyme requires TPP, magnesium ion and FAD as cofactors. 48171 cd02008: Thiamine pyrophosphate (TPP) family, IOR-alpha subfamily, TPP-binding module; composed of proteins similar to indolepyruvate ferredoxin oxidoreductase (IOR) alpha subunit. IOR catalyzes the oxidative decarboxylation of arylpyruvates, such as indolepyruvate or phenylpyruvate, which are generated by the transamination of aromatic amino acids, to the corresponding aryl acetyl-CoA. 48172 cd02009: Thiamine pyrophosphate (TPP) family, SHCHC synthase subfamily, TPP-binding module; composed of proteins similar to Escherichia coli 2-succinyl-6-hydroxyl-2,4-cyclohexadiene-1-carboxylic acid (SHCHC) synthase (also called MenD). SHCHC synthase plays a key role in the menaquinone biosynthetic pathway, converting isochorismate and 2-oxoglutarate to SHCHC, pyruvate and carbon dioxide. The enzyme requires TPP and a divalent metal cation for activity. 48173 cd02010: Thiamine pyrophosphate (TPP) family, Acetolactate synthase (ALS) subfamily, TPP-binding module; composed of proteins similar to Klebsiella pneumoniae ALS, a catabolic enzyme required for butanediol fermentation. ALS catalyzes the conversion of 2 molecules of pyruvate to acetolactate and carbon dioxide. ALS does not contain FAD, and requires TPP and a divalent metal cation for activity. 
48174 cd02011: Thiamine pyrophosphate (TPP) family, Phosphoketolase (PK) subfamily, TPP-binding module; PK catalyzes the conversion of D-xylulose 5-phosphate and phosphate to acetyl phosphate, D-glyceraldehyde-3-phosphate and H2O. This enzyme requires divalent magnesium ions and TPP for activity. 48175 cd02012: Thiamine pyrophosphate (TPP) family, Transketolase (TK) subfamily, TPP-binding module; TK catalyzes the transfer of a two-carbon unit from ketose phosphates to aldose phosphates. In heterotrophic organisms, TK provides a link between glycolysis and the pentose phosphate pathway and provides precursors for nucleotide, aromatic amino acid and vitamin biosynthesis. In addition, the enzyme plays a central role in the Calvin cycle in plants. Typically, TKs are homodimers. They require TPP and divalent cations, such as magnesium ions, for activity. 48176 cd02013: Thiamine pyrophosphate (TPP) family, Xsc-like subfamily, TPP-binding module; composed of proteins similar to Alcaligenes defragrans sulfoacetaldehyde acetyltransferase (Xsc). Xsc plays a key role in the degradation of taurine, catalyzing the desulfonation of 2-sulfoacetaldehyde into sulfite and acetyl phosphate. This enzyme requires TPP and divalent metal ions for activity. 48177 cd02014: Thiamine pyrophosphate (TPP) family, Pyruvate oxidase (POX) subfamily, TPP-binding module; composed of proteins similar to Lactobacillus plantarum POX, which plays a key role in controlling acetate production under aerobic conditions. POX decarboxylates pyruvate, producing hydrogen peroxide and the energy-storage metabolite acetylphosphate. It requires FAD in addition to TPP and a divalent cation as cofactors. 48178 cd02015: Thiamine pyrophosphate (TPP) family, Acetohydroxyacid synthase (AHAS) subfamily, TPP-binding module; composed of proteins similar to the large catalytic subunit of AHAS. AHAS catalyzes the condensation of two molecules of pyruvate to give the acetohydroxyacid, 2-acetolactate. 
2-Acetolactate is the precursor of the branched chain amino acids, valine and leucine. AHAS also catalyzes the condensation of pyruvate and 2-ketobutyrate to form 2-aceto-2-hydroxybutyrate in isoleucine biosynthesis. In addition to requiring TPP and a divalent metal ion as cofactors, AHAS requires FAD. 48179 cd02016: Thiamine pyrophosphate (TPP) family, E1 of OGDC-like subfamily, TPP-binding module; composed of proteins similar to the E1 component of the 2-oxoglutarate dehydrogenase multienzyme complex (OGDC). OGDC catalyzes the oxidative decarboxylation of 2-oxoglutarate to succinyl-CoA and carbon dioxide, a key reaction of the tricarboxylic acid cycle. 48180 cd02017: Thiamine pyrophosphate (TPP) family, E1 of E. coli PDC-like subfamily, TPP-binding module; composed of proteins similar to the E1 component of the Escherichia coli pyruvate dehydrogenase multienzyme complex (PDC). PDC catalyzes the oxidative decarboxylation of pyruvate and the subsequent acetylation of coenzyme A to acetyl-CoA. The E1 component of PDC catalyzes the first step of the multistep process, using TPP and a divalent cation as cofactors. E. coli PDC is a homodimeric enzyme. 48181 cd02018: Thiamine pyrophosphate (TPP) family, Pyruvate ferredoxin/flavodoxin oxidoreductase (PFOR) subfamily, TPP-binding module; PFOR catalyzes the oxidative decarboxylation of pyruvate to form acetyl-CoA, a crucial step in many metabolic pathways. Archaea, anaerobic bacteria and eukaryotes that lack mitochondria (and therefore pyruvate dehydrogenase) use PFOR to oxidatively decarboxylate pyruvate, with ferredoxin or flavodoxin as the electron acceptor. PFORs can be homodimeric, heterodimeric, or heterotetrameric, depending on the organism. These enzymes are dependent on TPP and a divalent metal cation as cofactors. 48182 cd03371: Thiamine pyrophosphate (TPP) family, PpyrDC subfamily, TPP-binding module; composed of proteins similar to phosphonopyruvate decarboxylase (PpyrDC) proteins. 
PpyrDC is a homotrimeric enzyme which functions in the biosynthesis of C-P compounds such as bialaphos tripeptide in Streptomyces hygroscopicus. These proteins require TPP and divalent metal cation cofactors. 48183 cd03372: Thiamine pyrophosphate (TPP) family, ComE subfamily, TPP-binding module; composed of proteins similar to Methanococcus jannaschii sulfopyruvate decarboxylase beta subunit (ComE). M. jannaschii sulfopyruvate decarboxylase (ComDE) is a dodecamer of six alpha (D) subunits and six beta (E) subunits, which catalyzes the decarboxylation of sulfopyruvic acid to sulfoacetaldehyde in the coenzyme M pathway. ComDE requires TPP and divalent metal cation cofactors. 48185 cd03376: Thiamine pyrophosphate (TPP) family, PFOR porB-like subfamily, TPP-binding module; composed of proteins similar to the beta subunit (porB) of the Helicobacter pylori four-subunit pyruvate ferredoxin oxidoreductase (PFOR), which are also found in archaea and some hyperthermophilic bacteria. PFOR catalyzes the oxidative decarboxylation of pyruvate to form acetyl-CoA, a crucial step in many metabolic pathways. Archaea, anaerobic bacteria and eukaryotes that lack mitochondria (and therefore pyruvate dehydrogenase) use PFOR to oxidatively decarboxylate pyruvate, with ferredoxin or flavodoxin as the electron acceptor. The 36-kDa porB subunit contains the binding sites for the cofactors, TPP and a divalent metal cation, which are required for activity. 48186 cd03377: Thiamine pyrophosphate (TPP) family, PFOR_PNO subfamily, TPP-binding module; composed of proteins similar to the single subunit pyruvate ferredoxin oxidoreductase (PFOR) of Desulfovibrio africanus, present in bacteria and amitochondriate eukaryotes. This subfamily also includes proteins characterized as pyruvate NADP+ oxidoreductase (PNO). These enzymes are dependent on TPP and a divalent metal cation as cofactors. 
PFOR and PNO catalyze the oxidative decarboxylation of pyruvate to form acetyl-CoA, a crucial step in many metabolic pathways. Archaea, anaerobic bacteria and eukaryotes that lack mitochondria (and therefore pyruvate dehydrogenase) use PFOR to oxidatively decarboxylate pyruvate, with ferredoxin or flavodoxin as the electron acceptor. The PFOR from the cyanobacterium Anabaena (NifJ) is required for the transfer of electrons from pyruvate to flavodoxin, which reduces nitrogenase. The facultative anaerobic mitochondrion of the photosynthetic protist Euglena gracilis oxidizes pyruvate with PNO. 48188 cd03313: Enolase: Enolases are homodimeric enzymes that catalyse the reversible dehydration of 2-phospho-D-glycerate to phosphoenolpyruvate as part of the glycolytic and gluconeogenesis pathways. The reaction is facilitated by the presence of metal ions. 48191 cd03316: Mandelate racemase (MR)-like subfamily of the enolase superfamily. Enzymes of this subgroup share three conserved carboxylate ligands for the essential divalent metal ion (usually Mg2+), two aspartates and a glutamate, and conserved catalytic residues, a Lys-X-Lys motif and a conserved histidine-aspartate dyad. Members of the MR subgroup are mandelate racemase, D-glucarate/L-idarate dehydratase (GlucD), D-altronate/D-mannonate dehydratase, D-galactonate dehydratase (GalD), D-gluconate dehydratase (GlcD), and L-rhamnonate dehydratase (RhamD). 48192 cd03317: N-acylamino acid racemase (NAAAR), an octameric enzyme that catalyzes the racemization of N-acylamino acids. NAAARs act on a broad range of N-acylamino acids rather than amino acids. Enantiopure amino acids are of industrial interest as chiral building blocks for antibiotics, herbicides, and drugs. 
NAAAR is a member of the enolase superfamily, characterized by the presence of an enolate anion intermediate which is generated by abstraction of the alpha-proton of the carboxylate substrate by an active site residue and is stabilized by coordination to the essential Mg2+ ion. 48193 cd03318: Muconate Lactonizing Enzyme (MLE), a homooctameric enzyme, catalyses the conversion of cis,cis-muconate (CCM) to muconolactone (ML) in the catechol branch of the beta-ketoadipate pathway. This pathway is used in soil microbes to break down lignin-derived aromatics, catechol and protocatechuate, to citric acid cycle intermediates. Some bacterial species are also capable of dehalogenating chloroaromatic compounds by the action of chloromuconate lactonizing enzymes (Cl-MLEs). MLEs are members of the enolase superfamily characterized by the presence of an enolate anion intermediate which is generated by abstraction of the alpha-proton of the carboxylate substrate by an active site residue and that is stabilized by coordination to the essential Mg2+ ion. 48194 cd03319: L-Ala-D/L-Glu epimerase catalyzes the epimerization of L-Ala-D/L-Glu and other dipeptides. The genomic context and the substrate specificity of characterized members of this family from E. coli and B. subtilis indicate a possible role in the metabolism of the murein peptide of peptidoglycan, of which L-Ala-D-Glu is a component. L-Ala-D/L-Glu epimerase is a member of the enolase superfamily, which is characterized by the presence of an enolate anion intermediate which is generated by abstraction of the alpha-proton of the carboxylate substrate by an active site residue and is stabilized by coordination to the essential Mg2+ ion. 48196 cd03321: Mandelate racemase (MR) catalyzes the Mg2+-dependent 1,1-proton transfer reaction that interconverts the enantiomers of mandelic acid. 
MR is the first enzyme in the bacterial pathway that converts mandelic acid to benzoic acid and allows this pathway to utilize either enantiomer of mandelate. MR belongs to the enolase superfamily of enzymes, characterized by the presence of an enolate anion intermediate which is generated by abstraction of the alpha-proton of the carboxylate substrate by an active site residue and is stabilized by coordination to the essential Mg2+ ion. 48198 cd03323: D-Glucarate dehydratase (GlucD) catalyzes the dehydration of both D-glucarate and L-idarate to form 5-keto-4-deoxy-D-glucarate (5-KDG), the initial reaction of the catabolic pathway for (D)-glucarate. GlucD belongs to the enolase superfamily of enzymes, characterized by the presence of an enolate anion intermediate which is generated by abstraction of the alpha-proton of the carboxylate substrate by an active site residue and that is stabilized by coordination to the essential Mg2+ ion. 48199 cd03324: Human rTS beta is encoded by the rTS gene which, through alternative RNA splicing, also encodes rTS alpha whose mRNA is complementary to thymidylate synthase mRNA. rTS beta expression is associated with the production of small molecules that appear to mediate the down-regulation of thymidylate synthase protein by a novel intercellular signaling mechanism. A member of this family, from Xanthomonas, has been characterized as an L-fuconate dehydratase. rTS beta belongs to the enolase superfamily of enzymes, characterized by the presence of an enolate anion intermediate which is generated by abstraction of the alpha-proton of the carboxylate substrate by an active site residue and is stabilized by coordination to the essential Mg2+ ion. 48200 cd03325: D-galactonate dehydratase catalyses the dehydration of galactonate to 2-keto-3-deoxygalactonate (KDGal), as part of the D-galactonate nonphosphorolytic catabolic Entner-Doudoroff pathway. 
D-galactonate dehydratase belongs to the enolase superfamily of enzymes, characterized by the presence of an enolate anion intermediate which is generated by abstraction of the alpha-proton of the carboxylate substrate by an active site residue and is stabilized by coordination to the essential Mg2+ ion. 48201 cd03326: Mandelate racemase (MR)-like subfamily of the enolase superfamily, subgroup 1. Enzymes of this subgroup share three conserved carboxylate ligands for the essential divalent metal ion (usually Mg2+), two aspartates and a glutamate, and conserved catalytic residues, a Lys-X-Lys motif and a conserved histidine-aspartate dyad. This subgroup's function is unknown. 48203 cd03328: Mandelate racemase (MR)-like subfamily of the enolase superfamily, subgroup 3. Enzymes of this subgroup share three conserved carboxylate ligands for the essential divalent metal ion (usually Mg2+), two aspartates and a glutamate, and conserved catalytic residues, a Lys-X-Lys motif and a conserved histidine-aspartate dyad. This subgroup's function is unknown. 48204 cd03329: Mandelate racemase (MR)-like subfamily of the enolase superfamily, subgroup 4. Enzymes of this subgroup share three conserved carboxylate ligands for the essential divalent metal ion (usually Mg2+), two aspartates and a glutamate, and conserved catalytic residues, a Lys-X-Lys motif and a conserved histidine-aspartate dyad. This subgroup's function is unknown. 48205 cd00291: SirA, YedF, and YeeD. Two-layered alpha/beta sandwich domain. SirA (also known as UvrY, and YhhP) belongs to a family of bacterial two-component response regulators that controls secondary metabolism and virulence. The other member of this two-component system is a sensor kinase called BarA which phosphorylates SirA. A variety of microorganisms have similar proteins, all of which contain a common CPxP sequence motif in the N-terminal region. YhhP is suggested to be important for normal cell division and growth in rich nutrient medium. 
Moreover, despite a low primary sequence similarity, the YccP structure closely resembles the non-homologous C-terminal RNA-binding domain of E. coli translation initiation factor IF3. The signature CPxP motif serves to stabilize the N-terminal helix as part of the N-capping box and might be important in mRNA-binding. 48206 cd03420: SirA_RHOD_Pry_redox. SirA-like domain located within a multidomain protein of unknown function. Other domains include RHOD (rhodanese homology domain), and Pry_redox (pyridine nucleotide-disulphide oxidoreductase) as well as a C-terminal domain that corresponds to COG2210. This fold is referred to as a two-layered alpha/beta sandwich, structurally similar to that of translation initiation factor 3. 48207 cd03421: SirA_like_N, a protein of unknown function with an N-terminal SirA-like domain. The SirA, YedF, YeeD protein family is present in bacteria as well as archaea. SirA (also known as UvrY, and YhhP) belongs to a family of two-component response regulators that controls secondary metabolism and virulence. The other member of this two-component system is a sensor kinase called BarA which phosphorylates SirA. A variety of microorganisms have similar proteins, all of which contain a common CPxP sequence motif in the N-terminal region. YhhP is suggested to be important for normal cell division and growth in rich nutrient medium. Moreover, despite a low primary sequence similarity, the YccP structure closely resembles the non-homologous C-terminal RNA-binding domain of E. coli translation initiation factor IF3. The signature CPxP motif serves to stabilize the N-terminal helix as part of the N-capping box and might be important in mRNA-binding. 48208 cd03422: YedF is a bacterial SirA-like protein of unknown function. SirA (also known as UvrY, and YhhP) belongs to a family of two-component response regulators that controls secondary metabolism and virulence. 
The other member of this two-component system is a sensor kinase called BarA which phosphorylates SirA. A variety of microorganisms have similar proteins, all of which contain a common CPxP sequence motif in the N-terminal region. YhhP is suggested to be important for normal cell division and growth in rich nutrient medium. Moreover, despite a low primary sequence similarity, the YccP structure closely resembles the non-homologous C-terminal RNA-binding domain of E. coli translation initiation factor IF3. The signature CPxP motif serves to stabilize the N-terminal helix as part of the N-capping box and might be important in mRNA-binding. 48209 cd03423: SirA (also known as UvrY, and YhhP) belongs to a family of two-component response regulators that controls secondary metabolism and virulence. The other member of this two-component system is a sensor kinase called BarA which phosphorylates SirA. A variety of microorganisms have similar proteins, all of which contain a common CPxP sequence motif in the N-terminal region. YhhP is thought to be important for normal cell division and growth in rich nutrient medium. Moreover, despite a low primary sequence similarity, the YccP structure closely resembles the non-homologous C-terminal RNA-binding domain of E. coli translation initiation factor IF3. The signature CPxP motif serves to stabilize the N-terminal helix as part of the N-capping box and might be important in mRNA-binding. 48210 cd02106: The band 7 domain of flotillin (reggie) like proteins. This group contains proteins similar to stomatin, prohibitin, flotillin, HflK/C and podocin. Many of these band 7 domain-containing proteins are lipid raft-associated. Individual proteins of this band 7 domain family may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Microdomains formed from flotillin proteins may in addition be dynamic units with their own regulatory functions. 
Flotillins have been implicated in signal transduction, vesicle trafficking, cytoskeleton rearrangement and are known to interact with a variety of proteins. Stomatin interacts with and regulates members of the degenerin/epithelial Na+ channel family in mechanosensory cells of Caenorhabditis elegans and vertebrate neurons and participates in trafficking of Glut1 glucose transporters. Prohibitin may act as a chaperone for the stabilization of mitochondrial proteins. Prokaryotic HflK/C plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection. Flotillins have been implicated in the progression of prion disease, in the pathogenesis of neurodegenerative diseases such as Parkinson's and Alzheimer's disease, and in cancer invasion and metastasis. Mutations in the podocin gene give rise to autosomal recessive steroid-resistant nephrotic syndrome. 48211 cd03399: Band_7_flotillin: a subgroup of the band 7 domain of flotillin (reggie) like proteins. This subgroup contains proteins similar to stomatin, prohibitin, flotillin, HflK/C and podocin. These two proteins are lipid raft-associated. Individual proteins of this band 7 domain family may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Microdomains formed from flotillin proteins may in addition be dynamic units with their own regulatory functions. Flotillins have been implicated in signal transduction, vesicle trafficking, and cytoskeleton rearrangement, and interact with a variety of proteins. Flotillins may play a role in the progression of prion disease, in the pathogenesis of neurodegenerative diseases such as Parkinson's and Alzheimer's disease, and in cancer invasion and metastasis. 48212 cd03400: A subgroup of the band 7 domain of flotillin (reggie) like proteins. This subgroup contains proteins similar to stomatin, prohibitin, flotillin, HflK/C and podocin. Many of these band 7 domain-containing proteins are lipid raft-associated. 
Individual proteins of this band 7 domain family may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Microdomains formed from flotillin proteins may in addition be dynamic units with their own regulatory functions. Flotillins have been implicated in signal transduction, vesicle trafficking, cytoskeleton rearrangement and are known to interact with a variety of proteins. Stomatin interacts with and regulates members of the degenerin/epithelial Na+ channel family in mechanosensory cells of Caenorhabditis elegans and vertebrate neurons and participates in trafficking of Glut1 glucose transporters. Prohibitin may act as a chaperone for the stabilization of mitochondrial proteins. Prokaryotic HflK/C plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection. Flotillins have been implicated in the progression of prion disease, in the pathogenesis of neurodegenerative diseases such as Parkinson's and Alzheimer's disease, and in cancer invasion and metastasis. Mutations in the podocin gene give rise to autosomal recessive steroid-resistant nephrotic syndrome. 48213 cd03401: Band_7_prohibitin. A subgroup of the band 7 domain of flotillin (reggie) like proteins. This subgroup includes proteins similar to prohibitin (a lipid raft-associated integral membrane protein). Individual proteins of this band 7 domain family may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. These microdomains, in addition to being stable scaffolds, may also be dynamic units with their own regulatory functions. Prohibitin is a mitochondrial inner-membrane protein which may act as a chaperone for the stabilization of mitochondrial proteins. Human prohibitin forms a hetero-oligomeric complex with Bap-37 (prohibitin 2, a band 7 domain carrying homologue). This complex may protect non-assembled membrane proteins against proteolysis by the m-AAA protease. 
Prohibitin and Bap-37 yeast homologues have been implicated in yeast longevity and in the maintenance of mitochondrial morphology. 48214 cd03402: A subgroup of the band 7 domain of flotillin (reggie) like proteins. This subgroup contains proteins similar to stomatin, prohibitin, flotillin, HflK/C and podocin. Many of these band 7 domain-containing proteins are lipid raft-associated. Individual proteins of this band 7 domain family may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Microdomains formed from flotillin proteins may in addition be dynamic units with their own regulatory functions. Flotillins have been implicated in signal transduction, vesicle trafficking, cytoskeleton rearrangement and are known to interact with a variety of proteins. Stomatin interacts with and regulates members of the degenerin/epithelial Na+ channel family in mechanosensory cells of Caenorhabditis elegans and vertebrate neurons and participates in trafficking of Glut1 glucose transporters. Prohibitin may act as a chaperone for the stabilization of mitochondrial proteins. Prokaryotic HflK/C plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection. Flotillins have been implicated in the progression of prion disease, in the pathogenesis of neurodegenerative diseases such as Parkinson's and Alzheimer's disease, and in cancer invasion and metastasis. Mutations in the podocin gene give rise to autosomal recessive steroid-resistant nephrotic syndrome. 48215 cd03403: Band_7_stomatin_like: A subgroup of the band 7 domain of flotillin (reggie) like proteins similar to stomatin and podocin (two lipid raft-associated integral membrane proteins). Individual proteins of this band 7 domain family may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Stomatin is widely expressed and is highly expressed in red blood cells. 
It localizes predominantly to the plasma membrane and to intracellular vesicles of the endocytic pathway, where it is present in higher order homo-oligomeric complexes (of between 9 and 12 monomers). Stomatin interacts with and regulates members of the degenerin/epithelial Na+ channel family in mechanosensory cells of Caenorhabditis elegans and vertebrate neurons, and is implicated in trafficking of Glut1 glucose transporters. Prohibitin is a mitochondrial inner-membrane protein hypothesized to act as a chaperone for the stabilization of mitochondrial proteins. Podocin localizes to the plasma membrane of podocyte foot processes and is found in higher order oligomers. Podocin plays a role in regulating glomerular permeability. Mutations in the podocin gene give rise to autosomal recessive steroid-resistant nephrotic syndrome. This group also contains proteins similar to three Caenorhabditis elegans proteins: UNC-1, UNC-24 and MEC-2. Mutations in the unc-1 and unc-24 genes result in abnormal motion and altered patterns of sensitivity to volatile anesthetics. MEC-2 and UNC-24 proteins interact with MEC-4 which is part of the degenerin channel complex required for response to gentle body touch. 48216 cd03404: Band_7_HflK: The band 7 domain of flotillin (reggie) like proteins. This group includes proteins similar to prokaryotic HflK (High frequency of lysogenization K). Although many members of the band 7 family are lipid raft associated, prokaryote plasma membranes lack cholesterol and are unlikely to have lipid raft domains. Individual proteins of this band 7 domain family may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Escherichia coli HflK is an integral membrane protein which may localize to the plasma membrane. HflK associates with another band 7 family member (HflC) to form an HflKC complex. HflKC interacts with FtsH in a large complex termed the FtsH holo-enzyme. 
FtsH is an AAA ATP-dependent protease which exerts progressive proteolysis against membrane-embedded and soluble substrate proteins. HflKC can modulate the activity of FtsH. HflKC plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection. 48217 cd03405: Band_7_HflC: The band 7 domain of flotillin (reggie) like proteins. This group includes proteins similar to prokaryotic HflC (High frequency of lysogenization C). Although many members of the band 7 family are lipid raft associated, prokaryote plasma membranes lack cholesterol and are unlikely to have lipid raft domains. Individual proteins of this band 7 domain family may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Escherichia coli HflC is an integral membrane protein which may localize to the plasma membrane. HflC associates with another band 7 family member (HflK) to form an HflKC complex. HflKC interacts with FtsH in a large complex termed the FtsH holo-enzyme. FtsH is an AAA ATP-dependent protease which exerts progressive proteolysis against membrane-embedded and soluble substrate proteins. HflKC can modulate the activity of FtsH. HflKC plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection. 48218 cd03406: A subgroup of the band 7 domain of flotillin (reggie) like proteins. This subgroup contains proteins similar to stomatin, prohibitin, flotillin, HflK/C and podocin. Many of these band 7 domain-containing proteins are lipid raft-associated. Individual proteins of this band 7 domain family may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Microdomains formed from flotillin proteins may in addition be dynamic units with their own regulatory functions. Flotillins have been implicated in signal transduction, vesicle trafficking, cytoskeleton rearrangement and are known to interact with a variety of proteins. 
Stomatin interacts with and regulates members of the degenerin/epithelial Na+ channel family in mechanosensory cells of Caenorhabditis elegans and vertebrate neurons and participates in trafficking of Glut1 glucose transporters. Prohibitin may act as a chaperone for the stabilization of mitochondrial proteins. Prokaryotic HflK/C plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection. Flotillins have been implicated in the progression of prion disease, in the pathogenesis of neurodegenerative diseases such as Parkinson's and Alzheimer's disease, and in cancer invasion and metastasis. Mutations in the podocin gene give rise to autosomal recessive steroid-resistant nephrotic syndrome. 48219 cd03407: A subgroup of the band 7 domain of flotillin (reggie) like proteins. This subgroup contains proteins similar to stomatin, prohibitin, flotillin, HflK/C and podocin. Many of these band 7 domain-containing proteins are lipid raft-associated. Individual proteins of this band 7 domain family may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Microdomains formed from flotillin proteins may in addition be dynamic units with their own regulatory functions. Flotillins have been implicated in signal transduction, vesicle trafficking, cytoskeleton rearrangement and are known to interact with a variety of proteins. Stomatin interacts with and regulates members of the degenerin/epithelial Na+ channel family in mechanosensory cells of Caenorhabditis elegans and vertebrate neurons and participates in trafficking of Glut1 glucose transporters. Prohibitin may act as a chaperone for the stabilization of mitochondrial proteins. Prokaryotic HflK/C plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection. 
Flotillins have been implicated in the progression of prion disease, in the pathogenesis of neurodegenerative diseases such as Parkinson's and Alzheimer's disease, and in cancer invasion and metastasis. Mutations in the podocin gene give rise to autosomal recessive steroid-resistant nephrotic syndrome. 48220 cd03408: A subgroup of the band 7 domain of flotillin (reggie) like proteins. This subgroup contains proteins similar to stomatin, prohibitin, flotillin, HflK/C and podocin. Many of these band 7 domain-containing proteins are lipid raft-associated. Individual proteins of this band 7 domain family may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Microdomains formed from flotillin proteins may in addition be dynamic units with their own regulatory functions. Flotillins have been implicated in signal transduction, vesicle trafficking, cytoskeleton rearrangement and are known to interact with a variety of proteins. Stomatin interacts with and regulates members of the degenerin/epithelial Na+ channel family in mechanosensory cells of Caenorhabditis elegans and vertebrate neurons and participates in trafficking of Glut1 glucose transporters. Prohibitin may act as a chaperone for the stabilization of mitochondrial proteins. Prokaryotic HflK/C plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection. Flotillins have been implicated in the progression of prion disease, in the pathogenesis of neurodegenerative diseases such as Parkinson's and Alzheimer's disease, and in cancer invasion and metastasis. Mutations in the podocin gene give rise to autosomal recessive steroid-resistant nephrotic syndrome. 
48222 cd00883: Carbonic anhydrases (CA) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism in which the nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide is followed by the regeneration of an active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. CAs are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionarily distinct families of CAs (the alpha-, beta-, and gamma-CAs) which show no significant sequence identity or structural similarity. Within the beta-CA family there are four evolutionarily distinct clades (A through D). The beta-CAs are multimeric enzymes (forming dimers, tetramers, hexamers and octamers) which are present in higher plants, algae, fungi, archaea and prokaryotes. 48223 cd00884: Carbonic anhydrases (CA) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism in which the nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide is followed by the regeneration of an active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. CAs are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionarily distinct families of CAs (the alpha-, beta-, and gamma-CAs) which show no significant sequence identity or structural similarity. Within the beta-CA family there are four evolutionarily distinct clades (A through D). The beta-CAs are multimeric enzymes (forming dimers, tetramers, hexamers and octamers) which are present in higher plants, algae, fungi, archaea and prokaryotes. 
48225 cd03379: Carbonic anhydrases (CA) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism in which the nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide is followed by the regeneration of an active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. CAs are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionarily distinct families of CAs (the alpha-, beta-, and gamma-CAs) which show no significant sequence identity or structural similarity. Within the beta-CA family there are four evolutionarily distinct clades (A through D). The beta-CAs are multimeric enzymes (forming dimers, tetramers, hexamers and octamers) which are present in higher plants, algae, fungi, archaea and prokaryotes. 48226 cd00188: Topoisomerase-primase domain. This is a nucleotidyl transferase/hydrolase domain found in type IA, type IIA and type IIB topoisomerases, bacterial DnaG-type primases, small primase-like proteins from bacteria and archaea, OLD family nucleases from bacteria and archaea, and bacterial DNA repair proteins of the RecR/M family. This domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD). This glutamate and two aspartates cluster together to form a highly acidic surface patch. The conserved glutamate may act as a general base in nucleotide polymerization by primases and in strand joining in topoisomerases, and as a general acid in strand cleavage by topoisomerases and nucleases. The DXD motif may co-ordinate Mg2+, a cofactor required for full catalytic function. 48227 cd00223: TOPRIM_TopoIIB_SPO: topoisomerase-primase (TOPRIM) nucleotidyl transferase/hydrolase domain of the type found in the type IIB family of DNA topoisomerases and Spo11. 
This subgroup contains proteins similar to Sulfolobus shibatae topoisomerase VI (TopoVI) and Saccharomyces cerevisiae meiotic recombination factor Spo11. Type II DNA topoisomerases catalyze the ATP-dependent transport of one DNA duplex through another, in the process generating transient double strand breaks via covalent attachments to both DNA strands at the 5' positions. TopoVI enzymes are heterotetramers found in archaea and plants. Spo11 plays a role in generating the double strand breaks that initiate homologous recombination during meiosis. S. shibatae TopoVI relaxes both positive and negative supercoils, and in addition has a strong decatenase activity. The TOPRIM domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD). For topoisomerases the conserved glutamate is believed to act as a general base in strand joining and as a general acid in strand cleavage. The DXD motif may co-ordinate Mg2+, a cofactor required for full catalytic function. 48228 cd01025: TOPRIM_recR: topoisomerase-primase (TOPRIM) nucleotidyl transferase/hydrolase domain of the type found in Escherichia coli RecR. RecR participates in the RecFOR pathway of homologous recombinational repair in prokaryotes. This pathway provides a single-stranded DNA molecule coated with RecA to allow invasion of a homologous molecule. The RecFOR system directs the loading of RecA onto gapped DNA coated with SSB protein. The TOPRIM domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD). In RecR sequences this glutamate in the first turn of the TOPRIM domain is semiconserved; the DXD motif is not conserved. 48229 cd01026: TOPRIM_OLD: topoisomerase-primase (TOPRIM) nucleotidyl transferase/hydrolase domain of the type found in bacterial and archaeal nucleases of the OLD (overcome lysogenization defect) family. 
The bacteriophage P2 OLD protein, which has DNase as well as RNase activity, consists of an N-terminal ABC-type ATPase domain and a C-terminal Toprim domain; the nuclease activity of OLD is stimulated by ATP, though the ATPase activity is not DNA-dependent. Functional details on OLD are scant and further experimentation is required to define the relationship between the ATPase and Toprim nuclease domains. The TOPRIM domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD). The conserved glutamate may act as a general acid in strand cleavage by nucleases. The DXD motif may co-ordinate Mg2+, a cofactor required for full catalytic function. 48230 cd01027: TOPRIM_ RNase M5_like: The topoisomerase-primase (TOPRIM) nucleotidyl transferase/hydrolase domain found in Ribonuclease M5 (RNase M5) and other small primase-like proteins from bacteria and archaea. RNase M5 catalyzes the maturation of 5S rRNA in low G+C Gram-positive bacteria. The TOPRIM domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD). The conserved glutamate may act as a general base in nucleotide polymerization by primases. The DXD motif may co-ordinate Mg2+, a cofactor required for full catalytic function. 48231 cd01028: TOPRIM_TopoIA: topoisomerase-primase (TOPRIM) nucleotidyl transferase/hydrolase domain of the type found in the type IA family of DNA topoisomerases (TopoIA). This subgroup contains proteins similar to the Type I DNA topoisomerases: E. coli topoisomerases I and III, eukaryotic topoisomerase III, and ATP-dependent reverse gyrase found in archaea and thermophilic bacteria. Type IA DNA topoisomerases remove (relax) negative supercoils in the DNA. These enzymes cleave one strand of the DNA duplex, covalently link to the 5' phosphoryl end of the DNA break and allow the other strand of the duplex to pass through the gap. 
Reverse gyrase is also able to insert positive supercoils in the presence of ATP and negative supercoils in the presence of AMPPNP. The TOPRIM domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD). For topoisomerases the conserved glutamate is believed to act as a general base in strand joining and as a general acid in strand cleavage. The DXD motif may co-ordinate Mg2+, a cofactor required for full catalytic function. 48232 cd01029: TOPRIM_primases: The topoisomerase-primase (TOPRIM) nucleotidyl transferase/hydrolase domain found in the active site regions of bacterial DnaG-type primases and their homologs. Primases synthesize RNA primers for the initiation of DNA replication. DnaG type primases are often closely associated with DNA helicases in primosome assemblies. The TOPRIM domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD). This glutamate and two aspartates cluster together to form a highly acidic surface patch. The conserved glutamate may act as a general base in nucleotide polymerization by primases. The DXD motif may co-ordinate Mg2+, a cofactor required for full catalytic function. The prototypical bacterial primase, Escherichia coli DnaG, is a single subunit enzyme. 48233 cd01030: TOPRIM_TopoIIA: topoisomerase-primase (TOPRIM) nucleotidyl transferase/hydrolase domain of the type found in proteins of the type IIA family of DNA topoisomerases similar to Saccharomyces cerevisiae Topoisomerase II. TopoIIA enzymes cut both strands of the duplex DNA to remove (relax) both positive and negative supercoils in DNA. These enzymes covalently attach to the 5' ends of the cut DNA, separate the free ends of the cleaved strands, pass another region of the duplex through this gap, then rejoin the ends. These proteins also catenate/decatenate duplex rings. 
The TOPRIM domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD). The conserved glutamate may act as a general base in strand joining and as a general acid in strand cleavage by topoisomerases. The DXD motif may co-ordinate Mg2+, a cofactor required for full catalytic function. 48234 cd03361: TopoIA_RevGyr: The topoisomerase-primase (TOPRIM) domain found in members of the type IA family of DNA topoisomerases (Topo IA) similar to the ATP-dependent reverse gyrase found in archaea and thermophilic bacteria. Type IA DNA topoisomerases remove (relax) negative supercoils in the DNA by cleaving one strand of the DNA duplex, covalently linking to the 5' phosphoryl end of the DNA break, and allowing the other strand of the duplex to pass through the gap. Reverse gyrase is also able to insert positive supercoils in the presence of ATP and negative supercoils in the presence of AMPPNP. The TOPRIM domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD). For topoisomerases the conserved glutamate is believed to act as a general base in strand joining and as a general acid in strand cleavage. The DXD motif may co-ordinate Mg2+, a cofactor required for full catalytic function. 48235 cd03362: TOPRIM_TopoIA_TopoIII: The topoisomerase-primase (TOPRIM) domain found in members of the type IA family of DNA topoisomerases (Topo IA) similar to topoisomerase III. Type IA DNA topoisomerases remove (relax) negative supercoils in the DNA by cleaving one strand of the DNA duplex, covalently linking to the 5' phosphoryl end of the DNA break, and allowing the other strand of the duplex to pass through the gap. The TOPRIM domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD). 
For topoisomerases the conserved glutamate is believed to act as a general base in strand joining and as a general acid in strand cleavage. The DXD motif may co-ordinate Mg2+, a cofactor required for full catalytic function. 48236 cd03363: TOPRIM_TopoIA_TopoI: The topoisomerase-primase (TOPRIM) domain found in members of the type IA family of DNA topoisomerases (Topo IA) similar to Escherichia coli DNA topoisomerase I. Type IA DNA topoisomerases remove (relax) negative supercoils in the DNA by cleaving one strand of the DNA duplex, covalently linking to the 5' phosphoryl end of the DNA break, and allowing the other strand of the duplex to pass through the gap. The TOPRIM domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD). For topoisomerases the conserved glutamate is believed to act as a general base in strand joining and as a general acid in strand cleavage. The DXD motif may co-ordinate Mg2+, a cofactor required for full catalytic function. 48237 cd03364: TOPRIM_DnaG_primases: The topoisomerase-primase (TOPRIM) nucleotidyl transferase/hydrolase domain found in the active site regions of proteins similar to Escherichia coli DnaG. Primases synthesize RNA primers for the initiation of DNA replication. DnaG type primases are often closely associated with DNA helicases in primosome assemblies. The TOPRIM domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD). This glutamate and two aspartates cluster together to form a highly acidic surface patch. The conserved glutamate may act as a general base in nucleotide polymerization by primases. The DXD motif may co-ordinate Mg2+, a cofactor required for full catalytic function. E. coli DnaG is a single subunit enzyme. 
48238 cd03365: TOPRIM_TopoIIA: topoisomerase-primase (TOPRIM) nucleotidyl transferase/hydrolase domain of the type found in proteins of the type IIA family of DNA topoisomerases similar to Saccharomyces cerevisiae Topoisomerase II. TopoIIA enzymes cut both strands of the duplex DNA to remove (relax) both positive and negative supercoils in DNA. These enzymes covalently attach to the 5' ends of the cut DNA, separate the free ends of the cleaved strands, pass another region of the duplex through this gap, then rejoin the ends. These proteins also catenate/decatenate duplex rings. The TOPRIM domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD). This glutamate and the two aspartates cluster together to form a highly acidic surface patch. The conserved glutamate may act as a general base in strand joining and as a general acid in strand cleavage by topoisomerases. The DxD motif may co-ordinate Mg2+, a cofactor required for full catalytic function. 48239 cd03366: TOPRIM_TopoIIA_GyrB: topoisomerase-primase (TOPRIM) nucleotidyl transferase/hydrolase domain of the type found in proteins of the type IIA family of DNA topoisomerases similar to the Escherichia coli GyrB subunit. TopoIIA enzymes cut both strands of the duplex DNA to remove (relax) both positive and negative supercoils in DNA. These enzymes covalently attach to the 5' ends of the cut DNA, separate the free ends of the cleaved strands, pass another region of the duplex through this gap, then rejoin the ends. These proteins also catenate/decatenate duplex rings. DNA gyrase is more effective at relaxing supercoils than decatenating DNA. DNA gyrase in addition inserts negative supercoils in the presence of ATP. The TOPRIM domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD). 
The conserved glutamate may act as a general base in strand joining and as a general acid in strand cleavage by topoisomerases. The DxD motif may co-ordinate Mg2+, a cofactor required for full catalytic function. 48240 cd00361: Biopterin-dependent aromatic amino acid hydroxylase; a family of non-heme, iron(II)-dependent enzymes that includes prokaryotic and eukaryotic phenylalanine-4-hydroxylase (PheOH), eukaryotic tyrosine hydroxylase (TyrOH) and eukaryotic tryptophan hydroxylase (TrpOH). PheOH converts L-phenylalanine to L-tyrosine, an important step in phenylalanine catabolism and neurotransmitter biosynthesis, and is linked to a severe variant of phenylketonuria in humans. TyrOH and TrpOH are involved in the biosynthesis of catecholamine and serotonin, respectively. The eukaryotic enzymes are all homotetramers. 48241 cd03345: Eukaryotic tyrosine hydroxylase (TyrOH); a member of the biopterin-dependent aromatic amino acid hydroxylase family of non-heme, iron(II)-dependent enzymes that also includes prokaryotic and eukaryotic phenylalanine-4-hydroxylase (PheOH) and eukaryotic tryptophan hydroxylase (TrpOH). TyrOH catalyzes the conversion of tyrosine to L-dihydroxyphenylalanine (L-DOPA), the rate-limiting step in the biosynthesis of the catecholamines dopamine, noradrenaline, and adrenaline. 48242 cd03346: Eukaryotic tryptophan hydroxylase (TrpOH); a member of the biopterin-dependent aromatic amino acid hydroxylase family of non-heme, iron(II)-dependent enzymes that also includes prokaryotic and eukaryotic phenylalanine-4-hydroxylase (PheOH) and eukaryotic tyrosine hydroxylase (TyrOH). TrpOH oxidizes L-tryptophan to 5-hydroxy-L-tryptophan, the rate-limiting step in the biosynthesis of serotonin (5-hydroxytryptamine), a widely distributed hormone and neurotransmitter. 
48243 cd03347: Eukaryotic phenylalanine-4-hydroxylase (eu_PheOH); a member of the biopterin-dependent aromatic amino acid hydroxylase family of non-heme, iron(II)-dependent enzymes that also includes prokaryotic phenylalanine-4-hydroxylase (pro_PheOH), eukaryotic tyrosine hydroxylase (TyrOH) and eukaryotic tryptophan hydroxylase (TrpOH). PheOH catalyzes the first and rate-limiting step in the metabolism of the amino acid L-phenylalanine (L-Phe), the hydroxylation of L-Phe to L-tyrosine (L-Tyr). It uses (6R)-L-erythro-5,6,7,8-tetrahydrobiopterin (BH4) as the physiological electron donor. The catalytic activity of the tetrameric enzyme is tightly regulated by the binding of L-Phe and BH4 as well as by phosphorylation. Mutations in the human enzyme are linked to a severe variant of phenylketonuria. 48244 cd03348: Prokaryotic phenylalanine-4-hydroxylase (pro_PheOH); a member of the biopterin-dependent aromatic amino acid hydroxylase family of non-heme, iron(II)-dependent enzymes that also includes the eukaryotic proteins, phenylalanine-4-hydroxylase (eu_PheOH), tyrosine hydroxylase (TyrOH) and tryptophan hydroxylase (TrpOH). PheOH catalyzes the hydroxylation of L-Phe to L-tyrosine (L-Tyr). It uses (6R)-L-erythro-5,6,7,8-tetrahydrobiopterin (BH4) as the physiological electron donor. 48336 cd00218: Beta1,3-glucuronyltransferase I (GlcAT-I) domain; GlcAT-I is a central enzyme in the initial steps of proteoglycan synthesis; GlcAT-I transfers a glucuronic acid moiety from the uridine diphosphate-glucuronic acid (UDP-GlcUA) to the common linkage region trisaccharide Gal-beta-(1-3)-Gal-beta-(1-4)-Xyl covalently bound to a Ser residue at the glycosaminylglycan attachment site of proteoglycans; the enzyme is an alpha/beta protein with two subdomains that constitute the donor and acceptor substrate binding site; the active site residues lie in a cleft extending across both subdomains in which the trisaccharide molecule is oriented perpendicular to the UDP. 
48337 cd00280: Telomeric Repeat binding Factor or TTAGGG Repeat binding Factor, central (dimerization) domain Homology; TRFH. Telomeres are protein/DNA complexes that make up the physical ends of eukaryotic linear chromosomes and are essential for chromosome stability, protecting the chromosome ends from degradation and end-to-end fusion. Proteins TRF1, TRF2 and Taz1 bind telomeric DNA and are also involved in recruiting interacting proteins, TIN2 and Rap1, to the telomeres. It has also been demonstrated that PARP1 associates with TRF2 and is capable of poly(ADP-ribosyl)ation of TRF2, which affects binding of TRF2 to telomeric DNA. TRF1, TRF2 and Taz1 proteins contain three functional domains: an N-terminal acidic domain, a central TRF-specific/dimerization domain, and a C-terminal DNA binding domain with a single Myb-like repeat. Homodimerization, a prerequisite to DNA binding, results in the juxtaposition of two Myb DNA binding domains. 48338 cd00305: Copper/zinc superoxide dismutase (SOD). Superoxide dismutases catalyse the conversion of superoxide radicals to molecular oxygen and hydrogen peroxide. Three evolutionarily distinct families of SODs are known, of which the copper/zinc-binding family is one. Defects in the human SOD1 gene cause familial amyotrophic lateral sclerosis (Lou Gehrig's disease). Cytoplasmic and periplasmic SODs exist as dimers, whereas chloroplastic and extracellular enzymes exist as tetramers. Structure supports independent functional evolution in prokaryotes (P-class) and eukaryotes (E-class) [PMID: 8176730]. 48339 cd00319: Ribosomal protein S12-like family; composed of prokaryotic 30S ribosomal protein S12, eukaryotic 40S ribosomal protein S23 and similar proteins. S12 and S23 are located at the interface of the large and small ribosomal subunits, adjacent to the decoding center. They play an important role in translocation during the peptide elongation step of protein synthesis. They are also involved in important RNA and protein interactions. 
Ribosomal protein S12 is essential for maintenance of a pretranslocation state and, together with S13, functions as a control element for the rRNA- and tRNA-driven movements of translocation. S23 interacts with domain III of the eukaryotic elongation factor 2 (eEF2), which catalyzes translocation. Mutations in S12 and S23 have been found to affect translational accuracy. Antibiotics such as streptomycin may also bind S12/S23 and cause the ribosome to misread the genetic code. 48340 cd03367: S12-like family, 40S ribosomal protein S23 subfamily; S23 is located at the interface of the large and small ribosomal subunits of eukaryotes, adjacent to the decoding center. It interacts with domain III of the eukaryotic elongation factor 2 (eEF2), which catalyzes the translocation of the growing peptidyl-tRNA to the P site to make room for the next aminoacyl-tRNA at the A (acceptor) site. Through its interaction with eEF2, S23 may play an important role in translocation. Also members of this subfamily are the archaeal 30S ribosomal S12 proteins. Prokaryotic S12 is essential for maintenance of a pretranslocation state and, together with S13, functions as control element for the rRNA- and tRNA-driven movements of translocation. S12 and S23 are also implicated in translation accuracy. Antibiotics such as streptomycin bind S12/S23 and cause the ribosome to misread the genetic code. 48341 cd03368: S12-like family, 30S ribosomal protein S12 subfamily; S12 is located at the interface of the large and small ribosomal subunits of prokaryotes, chloroplasts and mitochondria, where it plays an important role in both tRNA and ribosomal subunit interactions. S12 is essential for maintenance of a pretranslocation state and, together with S13, functions as a control element for the rRNA- and tRNA-driven movements of translocation. Antibiotics such as streptomycin bind S12 and cause the ribosome to misread the genetic code. 
48342 cd00320: Chaperonin 10 Kd subunit (cpn10 or GroES); Cpn10 cooperates with chaperonin 60 (cpn60 or GroEL), an ATPase, to assist the folding and assembly of proteins and is found in eubacterial cytosol, as well as in the matrix of mitochondria and chloroplasts. It forms heptameric rings with a dome-like structure, forming a lid to the large cavity of the tetradecameric cpn60 cylinder and thereby tightly regulating release and binding of proteins to the cpn60 surface. 48343 cd00336: Ribosomal protein L22/L17e. L22 (L17 in eukaryotes) is a core protein of the large ribosomal subunit. It is the only ribosomal protein that interacts with all six domains of 23S rRNA, and is one of the proteins important for directing the proper folding and stabilizing the conformation of 23S rRNA. L22 is the largest protein contributor to the surface of the polypeptide exit channel, the tunnel through which the polypeptide product passes. L22 is also one of six proteins located at the putative translocon binding site on the exterior surface of the ribosome. 48344 cd00433: Cytosol aminopeptidase family, N-terminal and catalytic domains. Family M17 contains zinc- and manganese-dependent exopeptidases ( EC 3.4.11.1), including leucine aminopeptidase. They catalyze removal of amino acids from the N-terminus of a protein and play a key role in protein degradation and in the metabolism of biologically active peptides. They do not contain HEXXH motif (which is used as one of the signature patterns to group the peptidase families) in the metal-binding site. The two associated zinc ions and the active site are entirely enclosed within the C-terminal catalytic domain in leucine aminopeptidase. The enzyme is a hexamer, with the catalytic domains clustered around the three-fold axis, and the two trimers related to one another by a two-fold rotation. The N-terminal domain is structurally similar to the ADP-ribose binding Macro domain. 
This family includes proteins from bacteria, archaea, animals and plants. 48345 cd00446: GrpE is the adenine nucleotide exchange factor of DnaK (Hsp70)-type ATPases. The GrpE dimer binds to the ATPase domain of Hsp70, catalyzing the dissociation of ADP, which enables rebinding of ATP, one step in the Hsp70 reaction cycle in protein folding. In eukaryotes, only the mitochondrial Hsp70, not the cytosolic form, is GrpE dependent. 48346 cd00462: Peptidyl-tRNA hydrolase (PTH) is a monomeric protein that cleaves the ester bond linking the nascent peptide and tRNA when peptidyl-tRNA is released prematurely from the ribosome. This ensures the recycling of peptidyl-tRNAs into tRNAs produced through abortion of translation and is essential for cell viability. This group also contains chloroplast RNA splicing 2 (CRS2), which is a closely related nuclear-encoded protein required for the splicing of nine group II introns in chloroplasts. 48347 cd02406: Chloroplast RNA splicing 2 (CRS2) is a nuclear-encoded protein required for the splicing of group II introns in the chloroplast. CRS2 forms stable complexes with two CRS2-associated factors, CAF1 and CAF2, which are required for the splicing of distinct subsets of CRS2-dependent introns. CRS2 is closely related to bacterial peptidyl-tRNA hydrolases (PTH). 48348 cd00488: PCD_DCoH: The bifunctional protein pterin-4alpha-carbinolamine dehydratase (PCD), also known as DCoH (dimerization cofactor of hepatocyte nuclear factor-1), is both a transcription activator and a metabolic enzyme. DCoH stimulates gene expression by associating with specific DNA binding proteins such as HNF-1alpha (hepatocyte nuclear factor-1) and Xenopus enhancer of rudimentary homologue (XERH). DCoH also catalyzes the dehydration of 4alpha-hydroxy-tetrahydrobiopterin (4alpha-OH-BH4) to quinonoid dihydrobiopterin, a precursor of the phenylalanine hydroxylase cofactor BH4 (tetrahydrobiopterin). 
The DCoH homodimer has a saddle-shaped structure similar to that of TBP (TATA binding protein). Two DCoH proteins have been identified in humans: DCoH1 and DCoH2. Mutations in human DCoH1 cause hyperphenylalaninemia. Loss of enzymic activity of DCoH in humans is associated with the depigmentation disorder vitiligo. DCoH1 has been reported to be overexpressed in colon cancer carcinomas and in malignant melanomas. 48349 cd00913: PCD_DCoH: The bifunctional protein pterin-4alpha-carbinolamine dehydratase (PCD), also known as DCoH (dimerization cofactor of hepatocyte nuclear factor-1), is both a transcription activator and a metabolic enzyme. DCoH stimulates gene expression by associating with specific DNA binding proteins such as HNF-1alpha (hepatocyte nuclear factor-1) and Xenopus enhancer of rudimentary homologue (XERH). DCoH also catalyzes the dehydration of 4alpha-hydroxy-tetrahydrobiopterin (4alpha-OH-BH4) to quinonoid dihydrobiopterin, a precursor of the phenylalanine hydroxylase cofactor BH4 (tetrahydrobiopterin). The DCoH homodimer has a saddle-shaped structure similar to that of TBP (TATA binding protein). 48350 cd00914: PCD_DCoH: The bifunctional protein pterin-4alpha-carbinolamine dehydratase (PCD), also known as DCoH (dimerization cofactor of hepatocyte nuclear factor-1), is both a transcription activator and a metabolic enzyme. DCoH stimulates gene expression by associating with specific DNA binding proteins such as HNF-1alpha (hepatocyte nuclear factor-1) and Xenopus enhancer of rudimentary homologue (XERH). DCoH also catalyzes the dehydration of 4alpha-hydroxy-tetrahydrobiopterin (4alpha-OH-BH4) to quinonoid dihydrobiopterin, a precursor of the phenylalanine hydroxylase cofactor BH4 (tetrahydrobiopterin). The DCoH homodimer has a saddle-shaped structure similar to that of TBP (TATA binding protein). Two DCoH proteins have been identified in humans: DCoH1 and DCoH2. Mutations in human DCoH1 cause hyperphenylalaninemia. 
Loss of enzymic activity of DCoH in humans is associated with the depigmentation disorder vitiligo. DCoH1 has been reported to be overexpressed in colon cancer carcinomas and in malignant melanomas. 48351 cd00577: Proliferating Cell Nuclear Antigen (PCNA) domain found in eukaryotes and archaea. These polymerase processivity factors play a role in DNA replication and repair. PCNA encircles duplex DNA in its central cavity, providing a DNA-bound platform for the attachment of the polymerase. The trimeric PCNA ring is structurally similar to the dimeric ring formed by the DNA polymerase processivity factors in bacteria (beta subunit of the DNA polymerase III holoenzyme) and in bacteriophages (catalytic subunits in T4 and RB69). This structural correspondence further substantiates the mechanistic connection between eukaryotic and prokaryotic DNA replication that has been suggested on biochemical grounds. PCNA also interacts with proteins involved in cell cycle processes such as DNA repair and apoptosis. Many of these proteins contain a highly conserved motif known as the PIP-box (PCNA interacting protein box), which contains the sequence Qxx[LIM]xxF[FY]. 48353 cd00353: Ribosomal protein S15 (prokaryotic)_S13 (eukaryotic) binds the central domain of 16S rRNA and is required for assembly of the small ribosomal subunit and for intersubunit association, thus representing a key element in the assembly of the whole ribosome. S15 also plays an important autoregulatory role by binding and preventing its own mRNA from being translated. S15 has a predominantly alpha-helical fold that is highly structured except for the N-terminal alpha helix. 48354 cd00677: S15/NS1/EPRS_RNA-binding domain. This short domain consists of a helix-turn-helix structure, which can bind to several types of RNA. 
It is found in the ribosomal protein S15, the influenza A viral nonstructural protein (NSA) and in several eukaryotic aminoacyl tRNA synthetases (aaRSs), where it occurs as a single or a repeated unit. It is involved in both protein-RNA interactions by binding tRNA and protein-protein interactions in the formation of tRNA-synthetases into multienzyme complexes. While this domain lacks significant sequence similarity between the subgroups in which it is found, they share similar electrostatic surface potentials and thus are likely to bind to RNA via the same mechanism. 48355 cd00935: GlyRS_RNA binding domain. This short RNA-binding domain is found at the N-terminus of GlyRS in several higher eukaryote aminoacyl-tRNA synthetases (aaRSs). This domain consists of a helix-turn-helix structure , which is similar to other RNA-binding proteins. It is involved in both protein-RNA interactions by binding tRNA and protein-protein interactions, which are important for the formation of aaRSs into multienzyme complexes. 48356 cd00936: WEPRS_RNA binding domain. This short RNA-binding domain is found in several higher eukaryote aminoacyl-tRNA synthetases (aaRSs). It is found in multiple copies in eukaryotic bifunctional glutamyl-prolyl-tRNA synthetases (EPRS) in a region that separates the N-terminal glutamyl-tRNA synthetase (GluRS) from the C-terminal prolyl-tRNA synthetase (ProRS). It is also found at the N-terminus of vertebrate tryptophanyl-tRNA synthetases (TrpRS). This domain consists of a helix-turn-helix structure, which is similar to other RNA-binding proteins. It is involved in both protein-RNA interactions by binding tRNA and protein-protein interactions, which are important for the formation of aaRSs into multienzyme complexes. 48357 cd00938: HisRS_RNA binding domain. This short RNA-binding domain is found at the N-terminus of HisRS in several higher eukaryote aminoacyl-tRNA synthetases (aaRSs). 
This domain consists of a helix-turn-helix structure, which is similar to other RNA-binding proteins. It is involved in both protein-RNA interactions by binding tRNA and protein-protein interactions, which are important for the formation of aaRSs into multienzyme complexes. 48358 cd00939: MetRS_RNA binding domain. This short RNA-binding domain is found at the C-terminus of MetRS in several higher eukaryote aminoacyl-tRNA synthetases (aaRSs). It is repeated in Drosophila MetRS. This domain consists of a helix-turn-helix structure, which is similar to other RNA-binding proteins. It is involved in both protein-RNA interactions by binding tRNA and protein-protein interactions, which are important for the formation of aaRSs into multienzyme complexes. 48359 cd01200: EPRS-like_RNA binding domain. This short RNA-binding domain is found in several higher eukaryote aminoacyl-tRNA synthetases (aaRSs). It is found in three copies in the mammalian bifunctional EPRS in a region that separates the N-terminal GluRS from the C-terminal ProRS. In the Drosophila EPRS, this domain is repeated six times. It is found at the N-terminus of TrpRS, HisRS and GlyRS and at the C-terminus of MetRS. This domain consists of a helix-turn-helix structure, which is similar to other RNA-binding proteins. It is involved in both protein-RNA interactions by binding tRNA and protein-protein interactions, which are important for the formation of aaRSs into multienzyme complexes. 48360 cd00477: Formyltetrahydrofolate synthetase (FTHFS) catalyzes the ATP-dependent activation of formate ion via its addition to the N10 position of tetrahydrofolate. FTHFS is a highly expressed key enzyme in both the Wood-Ljungdahl pathway of autotrophic CO2 fixation (acetogenesis) and the glycine synthase/reductase pathways of purinolysis. 
The key physiological role of this enzyme in acetogens is to catalyze the formylation of tetrahydrofolate, an initial step in the reduction of carbon dioxide and other one-carbon precursors to acetate. In purinolytic organisms, the enzymatic reaction is reversed, liberating formate from 10-formyltetrahydrofolate with concurrent production of ATP. 48362 cd02032: This family of proteins contains bchL and chlL. Protochlorophyllide reductase catalyzes the reductive formation of chlorophyllide from protochlorophyllide during biosynthesis of chlorophylls and bacteriochlorophylls. Three genes, bchL, bchN and bchB, are involved in light-independent protochlorophyllide reduction in bacteriochlorophyll biosynthesis. In cyanobacteria, algae, and gymnosperms, three similar genes, chlL, chlN and chlB are involved in protochlorophyllide reduction during chlorophylls biosynthesis. BchL/chlL, bchN/chlN and bchB/chlB exhibit significant sequence similarity to the nifH, nifD and nifK subunits of nitrogenase, respectively. Nitrogenase catalyzes the reductive formation of ammonia from dinitrogen. 48363 cd02033: Chlorophyllide reductase converts chlorophylls into bacteriochlorophylls by reducing the chlorin B-ring. This family contains the X subunit of this three-subunit enzyme. Sequence and structure similarity between bchX, protochlorophyllide reductase L subunit (bchL and chlL) and nitrogenase Fe protein (nifH gene) suggest their functional similarity. Members of the BchX family serve as the unique electron donors to their respective catalytic subunits (bchN-bchB, bchY-bchZ and nitrogenase component 1). Mechanistically, they hydrolyze ATP and transfer electrons through a Fe4-S4 cluster. 48369 cd02040: NifH gene encodes component II (iron protein) of nitrogenase. Nitrogenase is responsible for the biological nitrogen fixation, i.e. reduction of molecular nitrogen to ammonia. 
Nitrogenase consists of two oxygen-sensitive metallosulfur proteins: the molybdenum-iron (alternatively, vanadium-iron or iron-iron) protein (commonly referred to as component 1), and the iron protein (commonly referred to as component 2). The iron protein is a homodimer, with an Fe4S4 cluster bound between the subunits and two ATP-binding domains. It supplies energy by ATP hydrolysis, and transfers electrons from reduced ferredoxin or flavodoxin to component 1 for the reduction of molecular nitrogen to ammonia. 48371 cd02117: This family contains the NifH (iron protein) of nitrogenase, the L subunit (BchL/ChlL) of the protochlorophyllide reductase and the BchX subunit of the chlorophyllide reductase. Members of this family use energy from ATP hydrolysis and transfer electrons through a Fe4-S4 cluster to other subunits for reduction of substrate. 48376 cd03112: The function of this protein family is unknown. The amino acid sequence of the YjiA protein in E. coli contains several conserved motifs that characterize it as a P-loop GTPase. The YjiA gene is among the genes significantly induced in response to DNA damage caused by mitomycin. The YjiA gene is a homologue of the CobW gene, which encodes the cobalamin synthesis protein/P47K. 48377 cd03113: CTP synthetase (CTPs) is a two-domain protein, which consists of an N-terminal synthetase domain and C-terminal glutaminase domain. The enzymes hydrolyze the amide bond of glutamine to ammonia and glutamate at the glutaminase domains and transfer nascent ammonia to the acceptor substrate at the synthetase domain to form an aminated product. Glutaminase domains have evolved from the same ancestor, whereas the synthetase domains are evolutionarily unrelated and have different functions. This protein family is classified based on the N-terminal synthetase domain. 48378 cd03114: The function of this protein family is unknown. The protein sequences are similar to the ArgK protein in E. coli. 
ArgK protein is a membrane ATPase which is required for transporting arginine, ornithine and lysine into the cells by the arginine and ornithine (AO system) and lysine, arginine and ornithine (LAO) transport systems. 48379 cd03115: The signal recognition particle (SRP) mediates the transport to or across the plasma membrane in bacteria and the endoplasmic reticulum in eukaryotes. SRP recognizes N-terminal signal sequences of newly synthesized polypeptides at the ribosome. The SRP-polypeptide complex is then targeted to the membrane by an interaction between SRP and its cognate receptor (SR). In mammals, SRP consists of six protein subunits and a 7SL RNA. One of these subunits is a 54 kDa protein (SRP54), which is a GTP-binding protein that interacts with the signal sequence when it emerges from the ribosome. SRP54 is a multidomain protein that consists of an N-terminal domain, followed by a central G (GTPase) domain and a C-terminal M domain. 48380 cd03116: Molybdenum is an essential trace element in the form of molybdenum cofactor (Moco) which is associated with the metabolism of nitrogen, carbon and sulfur by redox active enzymes. In E. coli, the synthesis of Moco involves genes from several loci: moa, mob, mod, moe and mog. The mob locus contains mobA and mobB genes. MobB catalyzes the attachment of the guanine dinucleotide to molybdopterin. 48382 cd02135: Nitroreductase-like family which includes NADH oxidase and arsenite oxidase. NADH oxidase catalyses the oxidation of NAD(P)H and accepts a broad range of compounds, such as nitro compounds, as electron acceptors. Arsenite oxidase in a beta-proteobacterial strain is able to oxidize arsenite to arsenate. 48385 cd02138: Nitroreductase-like family 2. A subfamily of the nitroreductase family containing uncharacterized proteins that are similar to nitroreductase. Nitroreductase catalyzes the reduction of nitroaromatic compounds such as nitrotoluenes, nitrofurans and nitroimidazoles. 
This process requires NAD(P)H as electron donor in an obligatory two-electron transfer and uses FMN as cofactor. The enzyme is typically a homodimer. Members of this family are also called NADH dehydrogenase, oxygen-insensitive NAD(P)H nitrogenase or dihydropteridine reductase. 48386 cd02139: Nitroreductase-like family 3. A subfamily of the nitroreductase family containing uncharacterized proteins that are similar to nitroreductase. Nitroreductase catalyzes the reduction of nitroaromatic compounds such as nitrotoluenes, nitrofurans and nitroimidazoles. This process requires NAD(P)H as electron donor in an obligatory two-electron transfer and uses FMN as cofactor. The enzyme is typically a homodimer. Members of this family are also called NADH dehydrogenase, oxygen-insensitive NAD(P)H nitrogenase or dihydropteridine reductase. 48387 cd02140: Nitroreductase-like family 4. A subfamily of the nitroreductase family containing uncharacterized proteins that are similar to nitroreductase. Nitroreductase catalyzes the reduction of nitroaromatic compounds such as nitrotoluenes, nitrofurans and nitroimidazoles. This process requires NAD(P)H as electron donor in an obligatory two-electron transfer and uses FMN as cofactor. The enzyme is typically a homodimer. Members of this family are also called NADH dehydrogenase, oxygen-insensitive NAD(P)H nitrogenase or dihydropteridine reductase. 48388 cd02142: This family is the oxidase domain of NRPS (non-ribosomal peptide synthetase) and other proteins that modify polypeptides by cyclizing a thioester to form a ring. These include epoB, part of the epothilone biosynthesis pathway; tubD, part of the tubulysin biosynthesis pathway; mtsD, part of the myxothiazol biosynthesis pathway; indC, part of the indigoidine biosynthesis pathway; and tfxB, part of the trifolitoxin processing pathway. All are FMN-dependent and oxidize the product of the cyclization of thioesters in short polypeptides. 
48390 cd02144: Iodotyrosine dehalogenase catalyzes the removal of iodine from the 3, 5 positions of L-tyrosine in thyroid, liver and kidney, using NADPH as electron donor. This enzyme is a homolog of the nitroreductase family. These enzymes are usually homodimers. 48391 cd02145: Subfamily of the nitroreductase family that includes the BluB protein of Rhodobacter capsulatus, which is involved in the conversion of cobinamide to cobalamin in cobalamin (vitamin B12) biosynthesis. Nitroreductases typically reduce their substrates by using NAD(P)H as electron donor and often use FMN as a cofactor. 48392 cd02146: This family contains NADPH-dependent flavin reductase and oxygen-insensitive nitroreductase. These enzymes are homodimeric flavoproteins that contain one FMN per monomer as a cofactor. Flavin reductase catalyzes the reduction of flavin by using NADPH as an electron donor. Oxygen-insensitive nitroreductase, such as NfsA protein in Escherichia coli, catalyzes reduction of nitrocompounds using NADPH as electron donor. 48393 cd02148: Nitroreductase-like family 5. A subfamily of the nitroreductase family containing uncharacterized proteins that are similar to nitroreductase. Nitroreductase catalyzes the reduction of nitroaromatic compounds such as nitrotoluenes, nitrofurans and nitroimidazoles. This process requires NAD(P)H as electron donor in an obligatory two-electron transfer and uses FMN as cofactor. The enzyme is typically a homodimer. Members of this family are also called NADH dehydrogenase, oxygen-insensitive NAD(P)H nitrogenase or dihydropteridine reductase. 48395 cd02150: NAD(P)H:flavin oxidoreductase-like family 1. A subfamily of the nitroreductase family containing uncharacterized proteins that are similar to nitroreductase. Nitroreductase catalyzes the reduction of nitroaromatic compounds such as nitrotoluenes, nitrofurans and nitroimidazoles. This process requires NAD(P)H as electron donor in an obligatory two-electron transfer and uses FMN as cofactor. 
The enzyme is typically a homodimer. Members of this family are also called NADH dehydrogenase, oxygen-insensitive NAD(P)H nitrogenase or dihydropteridine reductase. 48396 cd02151: NAD(P)H:flavin oxidoreductase-like family 2. A subfamily of the nitroreductase family containing uncharacterized proteins that are similar to nitroreductase. Nitroreductase catalyzes the reduction of nitroaromatic compounds such as nitrotoluenes, nitrofurans and nitroimidazoles. This process requires NAD(P)H as electron donor in an obligatory two-electron transfer and uses FMN as cofactor. The enzyme is typically a homodimer. Members of this family are also called NADH dehydrogenase, oxygen-insensitive NAD(P)H nitrogenase or dihydropteridine reductase. 48398 cd02153: The tRNA binding domain is also known as the Myf domain in literature. This domain is found in a diverse collection of tRNA binding proteins, including prokaryotic phenylalanyl tRNA synthetases (PheRS), methionyl-tRNA synthetases (MetRS), human tyrosyl-tRNA synthetase(hTyrRS), Saccharomyces cerevisiae Arc1p, Thermus thermophilus CsaA, Aquifex aeolicus Trbp111, human p43 and human EMAP-II. PheRS, MetRS and hTyrRS aminoacylate their cognate tRNAs. Arc1p is a transactivator of yeast methionyl-tRNA and glutamyl-tRNA synthetases. The molecular chaperones Trbp111 and CsaA also contain this domain. CsaA has export related activities; Trbp111 is structure-specific recognizing the L-shape of the tRNA fold. This domain has general tRNA binding properties. In a subset of this family this domain has the added capability of a cytokine. For example the p43 component of the Human aminoacyl-tRNA synthetase complex is cleaved to release EMAP-II cytokine. EMAP-II has multiple activities during apoptosis, angiogenesis and inflammation and participates in malignant transformation. An EMAP-II-like cytokine is released from hTyrRS upon cleavage. The active cytokine heptapeptide locates to this domain. 
For homodimeric members of this group, which include CsaA, Trbp111 and Escherichia coli MetRS, this domain acts as a dimerization domain. 48399 cd02796: tRNA-binding-domain-containing prokaryotic phenylalanyl tRNA synthetase (PheRS) beta chain. PheRS aminoacylates phenylalanine transfer RNAs (tRNAphe). PheRSs belong structurally to class II aminoacyl tRNA synthetases (aaRSs) but, as they aminoacylate the 2'OH of the terminal ribose of tRNA, they belong functionally to class I aaRSs. This domain has general tRNA binding properties and is believed to direct tRNAphe to the active site of the enzyme. 48400 cd02798: tRNA-binding-domain-containing CsaA-like proteins. CsaA is a molecular chaperone with export related activities. CsaA has a putative tRNA binding activity. The functional unit of CsaA is a homodimer and this domain acts as a dimerization domain. 48401 cd02799: tRNA-binding-domain-containing EMAP2-like proteins. This family contains a diverse fraction of tRNA binding proteins, including Caenorhabditis elegans methionyl-tRNA synthetase (CeMetRS), human tyrosyl-tRNA synthetase (hTyrRS), Saccharomyces cerevisiae Arc1p, human p43 and EMAP2. CeMetRS and hTyrRS aminoacylate their cognate tRNAs. Arc1p is a transactivator of yeast methionyl-tRNA and glutamyl-tRNA synthetases. This domain has general tRNA binding properties. In a subset of this family this domain has the added capability of a cytokine. For example, the p43 component of the human aminoacyl-tRNA synthetase complex is cleaved to release EMAP-II cytokine. EMAP-II has multiple activities during apoptosis, angiogenesis and inflammation and participates in malignant transformation. An EMAP-II-like cytokine is also released from hTyrRS upon cleavage. The active cytokine heptapeptide locates to this domain. 48402 cd02800: tRNA-binding-domain-containing Escherichia coli methionyl-tRNA synthetase (EcMetRS)-like proteins. This family includes EcMetRS and Aquifex aeolicus Trbp111 (AaTrbp111). 
This domain has general tRNA binding properties. MetRS aminoacylates methionine transfer RNAs (tRNAmet). AaTrbp111 is a structure-specific molecular chaperone recognizing the L-shape of the tRNA fold. AaTrbp111 plays a role in nuclear trafficking of tRNAs. The functional unit of EcMetRS and AaTrbp111 is a homodimer, and this domain acts as the dimerization domain. 48403 cd02407: Peptidyl-tRNA hydrolase, type 2 (PTH2)_like. Peptidyl-tRNA hydrolase activity releases tRNA from the premature translation termination product peptidyl-tRNA. Two structurally different enzymes have been reported to encode such activity, Pth present in bacteria and eukaryotes and Pth2 present in archaea and eukaryotes. 48404 cd02429: Peptidyl-tRNA hydrolase, type 2 (PTH2)_like. Peptidyl-tRNA hydrolase activity releases tRNA from the premature translation termination product peptidyl-tRNA. Two structurally different enzymes have been reported to encode such activity, Pth present in bacteria and eukaryotes and Pth2 present in archaea and eukaryotes. There is no functional information for this eukaryote-specific subgroup. 48405 cd02430: Peptidyl-tRNA hydrolase, type 2 (PTH2). Peptidyl-tRNA hydrolase (PTH) activity releases tRNA from the premature translation termination product peptidyl-tRNA, therefore allowing the tRNA and peptide to be reused in protein synthesis. PTH2 is present in archaea and eukaryotes. 48406 cd02134: NusA_K homology RNA-binding domain (KH). NusA is an essential multifunctional transcription elongation factor that is universally conserved among prokaryotes and archaea. NusA anti-termination function plays an important role in the expression of ribosomal rrn operons. During transcription of many other genes, NusA-induced RNAP pausing provides a mechanism for synchronizing transcription and translation. The N-terminal RNAP-binding domain (NTD) is connected through a flexible hinge helix to three globular domains, S1, KH1 and KH2. 
The KH motif is a beta-alpha-alpha-beta-beta unit that folds into an alpha-beta structure with a three stranded beta-sheet interrupted by two contiguous helices. 48407 cd02409: KH-II (K homology RNA-binding domain, type II). KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins (e.g. ribosomal protein S3), transcription factors (e.g. NusA_K), and post-transcriptional modifiers of mRNA (e.g. hnRNP K). There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is a beta-alpha-alpha-beta-beta unit that folds into an alpha-beta structure with a three stranded beta-sheet interrupted by two contiguous helices. In addition to their KH core domain, KH-II proteins have an N-terminal alpha helical extension while KH-I proteins have a C-terminal alpha helical extension. 48408 cd02410: The archaeal cleavage and polyadenylation specificity factor (CPSF) contains an N-terminal K homology RNA-binding domain (KH). The archaeal CPSFs are predicted to be metal-dependent RNases belonging to the beta-CASP family, a subgroup of enzymes within the metallo-beta-lactamase fold. The KH motif is a beta-alpha-alpha-beta-beta unit that folds into an alpha-beta structure with a three stranded beta-sheet interrupted by two contiguous helices. In general, KH domains are known to bind single-stranded RNA or DNA and are found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. 48409 cd02411: K homology RNA-binding domain (KH) of the archaeal 30S small ribosomal subunit S3 protein. S3 is part of the head region of the 30S ribosomal subunit and is believed to interact with mRNA as it threads its way from the latch into the channel. The KH motif is a beta-alpha-alpha-beta-beta unit that folds into an alpha-beta structure with a three stranded beta-sheet interrupted by two contiguous helices. 
In general, KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. 48410 cd02412: K homology RNA-binding (KH) domain of the prokaryotic 30S small ribosomal subunit protein S3. S3 is part of the head region of the 30S ribosomal subunit and is believed to interact with mRNA as it threads its way from the latch into the channel. The KH motif is a beta-alpha-alpha-beta-beta unit that folds into an alpha-beta structure with a three stranded beta-sheet interrupted by two contiguous helices. In general, KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. 48411 cd02413: K homology RNA-binding (KH) domain of the eukaryotic 40S small ribosomal subunit protein S3. S3 is part of the head region of the 40S ribosomal subunit and is believed to interact with mRNA as it threads its way from the latch into the channel. The KH motif is a beta-alpha-alpha-beta-beta unit that folds into an alpha-beta structure with a three stranded beta-sheet interrupted by two contiguous helices. In general, KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. 48412 cd02414: jag_K homology RNA-binding domain. The KH domain is found in proteins homologous to the Bacillus subtilis protein Jag, which is associated with SpoIIIJ and is necessary for the third stage of sporulation. The KH motif is a beta-alpha-alpha-beta-beta unit that folds into an alpha-beta structure with a three stranded beta-sheet interrupted by two contiguous helices. In general, KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. 
48413 cd02685: MIT_C; domain found C-terminal to MIT (contained within Microtubule Interacting and Trafficking molecules) domains, as well as in some bacterial proteins. The function of this domain is unknown. 48414 cd03127: Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins. 48415 cd03151: Tetraspanin, extracellular domain or large extracellular loop (LEL), CD81_like subfamily. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". 
CD81, also referred to as Target for anti-proliferative antigen-1, TAPA-1, is found in virtually all tissues, may be involved in regulation of cell growth and has been described as a member of the CD19/CD21/Leu-13 signal transduction complex identified on B cells (the B-Cell co-receptor). 48416 cd03152: Tetraspanin, extracellular domain or large extracellular loop (LEL), CD9 family. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". CD9 is found in virtually all tissues and is potentially involved in developmental processes. It associates with the tetraspanins CD81 and CD63, as well as with some integrins, and has been shown to be involved in a variety of activation, adhesion, and cell motility functions, as well as cell-cell interactions, such as during fertilization. 48417 cd03153: Tetraspanin, extracellular domain or large extracellular loop (LEL), PHEMX_like family. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. 
Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". Phemx (pan hematopoietic expression) or TSSC6 may play a role in hematopoietic cell function. 48418 cd03154: Tetraspanin, extracellular domain or large extracellular loop (LEL), TM4SF3_like subfamily. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". This subfamily contains transmembrane 4 superfamily 3 (TM4SF3) or D6.1a and related proteins. D6.1a associates with alpha6beta4 integrin and supports cell motility; it has been ascribed a role in tumor progression and metastasis. 48419 cd03155: Tetraspanin, extracellular domain or large extracellular loop (LEL), CD151_Like family. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". 
CD151 strongly associates with integrins, especially alpha3beta1, alpha6beta1, alpha7beta1, and alpha6beta4; it may play roles in cell-cell adhesion, cell migration, platelet aggregation, and angiogenesis. For example, CD151 is involved in regulation of migration of neutrophils, endothelial cells, and various tumor cell lines; it associates specifically with laminin-binding integrins and strengthens alpha6beta1 integrin-mediated adhesion to laminin-1; CD151 also specifically attenuates adhesion-dependent activation of Ras and corresponding downstream effects, and is involved in epithelial cell-cell adhesion as a modulator of PKC- and Cdc42-dependent actin cytoskeletal reorganization. 48420 cd03156: Tetraspanin, extracellular domain or large extracellular loop (LEL), uroplakin_I_like family. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". Uroplakin Ia and Ib are components of the 16nm protein particles, which are packed hexagonally to form 2D crystals of asymmetric unit membranes, and cover the apical surface of mammalian urothelium, contributing to the urinary bladder's permeability barrier function. Uroplakins Ia and Ib are maturation facilitators. They trigger conformational changes in their single-transmembrane-domain binding partner proteins uroplakin II and IIIa, which in turn may lead to ER-exit, stabilization, and cell-surface expression. 
48421 cd03157: Tetraspanin, extracellular domain or large extracellular loop (LEL), TM4SF12_like family. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". This sub-family contains proteins similar to human transmembrane 4 superfamily member 12 (TM4SF12). 48422 cd03158: Tetraspanin, extracellular domain or large extracellular loop (LEL), penumbra_like family. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". Human Penumbra exhibits growth-suppressive activity in vitro and has been associated with myeloid malignancies. 48423 cd03159: Tetraspanin, extracellular domain or large extracellular loop (LEL), TM4SF9_like subfamily. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. 
Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". This subfamily contains transmembrane 4 superfamily 9 (TM4SF9) or Tetraspanin-5 and related proteins. TM4SF9 is strongly expressed within the central nervous system, and expression levels appear to correlate with differentiation status of particular neurons, hinting at a role in neuronal maturation. 48424 cd03160: Tetraspanin, extracellular domain or large extracellular loop (LEL), CD37_CD82_Like family. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". CD37 is a leukocyte-specific protein, and its restricted expression pattern suggests a role in the immune system. A regulatory role in T-cell proliferation has been suggested. CD82 is a metastasis suppressor implicated in biological processes ranging from fusion, adhesion, and migration to apoptosis and alterations of cell morphology. 48425 cd03161: Tetraspanin, extracellular domain or large extracellular loop (LEL), TM4SF2_6_like subfamily. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. 
This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". This subfamily contains transmembrane 4 superfamily 2 (TM4SF2) or Tspan-7, transmembrane 4 superfamily 6 (TM4SF6) or Tspan-6, and related proteins. TM4SF2 has been identified as involved in some forms of X-linked mental retardation. 48426 cd03162: Tetraspanin, extracellular domain or large extracellular loop (LEL), peripherin_like family. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". Peripherin, or RDS (retinal degeneration slow), is a glycoprotein expressed in vertebrate photoreceptors, located at the rim of the disc membranes of the photoreceptor outer segments. RDS is thought to play a major role in folding and stacking of the discs. Mutations in RDS have been linked to hereditary retinal dystrophies, which typically exhibit a wide phenotypic spectrum. 48427 cd03163: Tetraspanin, extracellular domain or large extracellular loop (LEL), TM4SF8_like subfamily. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. 
This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". This subfamily contains transmembrane 4 superfamily 8 (TM4SF8) or Tspan-3 and related proteins. Tspan-3 has been reported to form a complex with integrin beta1 and OSP/claudin-11, which may be involved in oligodendrocyte proliferation and migration. 48428 cd03164: Tetraspanin, extracellular domain or large extracellular loop (LEL), CD53_Like family. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". CD53 is a tetraspanin of the lymphoid-myeloid lineage and has been implicated in apoptosis protection. It associates with integrin alpha4beta1. Some of the cellular responses modulated by CD53 may be mediated by JNK activation and/or via the AKT pathway. 48429 cd03165: Tetraspanin, extracellular domain or large extracellular loop (LEL), NET-5_like family. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. 
Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". This sub-family contains proteins similar to human tetraspan NET-5. 48430 cd03166: Tetraspanin, extracellular domain or large extracellular loop (LEL), CD63 family. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". CD63 is present in platelets, neutrophils, and endothelial cells, amongst others. In platelets it associates with the integrin alphaIIbbeta3 and may modulate alphaIIbbeta3-dependent cytoskeletal reorganization. 48431 cd03167: Tetraspanin, extracellular domain or large extracellular loop (LEL), oculospanin_like family. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". 
This subfamily contains sequences similar to oculospanin, which is found to be expressed in retinal pigment epithelium, iris, ciliary body, and retinal ganglion cells. 48432 cd00480: Malate synthase catalyzes the Claisen condensation of glyoxylate and acetyl-CoA to malyl-CoA, which hydrolyzes to malate and CoA. This reaction is part of the glyoxylate cycle, which allows certain organisms, like plants and fungi, to derive their carbon requirements from two-carbon compounds, by bypassing the two carboxylation steps of the citric acid cycle. 48433 cd00727: Malate synthase A (MSA), present in some bacteria, plants and fungi. Prokaryotic MSAs tend to be monomeric, whereas eukaryotic enzymes are homomultimers. In general, malate synthase catalyzes the Claisen condensation of glyoxylate and acetyl-CoA to malyl-CoA, which hydrolyzes to malate and CoA. This reaction is part of the glyoxylate cycle, which allows certain organisms, like plants and fungi, to derive their carbon requirements from two-carbon compounds, by bypassing the two carboxylation steps of the citric acid cycle. 48434 cd00728: Malate synthase G (MSG), monomeric enzyme present in some bacteria. In general, malate synthase catalyzes the Claisen condensation of glyoxylate and acetyl-CoA to malyl-CoA, which hydrolyzes to malate and CoA. This reaction is part of the glyoxylate cycle, which allows certain organisms to derive their carbon requirements from two-carbon compounds, by bypassing the two carboxylation steps of the citric acid cycle. 48435 cd00542: Penicillin V acylase (PVA), also known as conjugated bile salt acid hydrolase (CBAH), catalyzes the hydrolysis of penicillin V to yield 6-amino penicillanic acid (6-APA), a key intermediate of semisynthetic penicillins. PVA has an N-terminal nucleophilic cysteine, as do other members of the Ntn hydrolase family to which PVA belongs. This nucleophilic cysteine is exposed by post-translational processing of the PVA precursor. 
PVA forms a homotetramer. 48436 cd01901: The Ntn hydrolases (N-terminal nucleophile) are a diverse superfamily of enzymes that are activated autocatalytically via an N-terminally located nucleophilic amino acid. The superfamily has a four-layered alpha, beta, beta, alpha core structure. This family of hydrolases includes penicillin acylase, the 20S proteasome alpha and beta subunits, and glutamate synthase. The mechanism of activation of these proteins is conserved, although they differ in their substrate specificities. All known members catalyze the hydrolysis of amide bonds in either proteins or small molecules, and each one of them is synthesized as a preprotein. For each, an autocatalytic endoproteolytic process generates a new N-terminal residue. This mature N-terminal residue is central to catalysis and acts as both a polarizing base and a nucleophile during the reaction. The N-terminal amino group acts as the proton acceptor and activates either the nucleophilic hydroxyl in a Ser or Thr residue or the nucleophilic thiol in a Cys residue. The position of the N-terminal nucleophile in the active site and the mechanism of catalysis are conserved in this family, despite considerable variation in the protein sequences. 48437 cd01902: Choloylglycine hydrolase (CGH) is a bile salt-modifying enzyme that hydrolyzes non-peptide carbon-nitrogen bonds in choloylglycine and choloyltaurine, both of which are present in bile. CGH is present in a number of probiotic microbial organisms that inhabit the gut. CGH has an N-terminal nucleophilic cysteine, as do other members of the Ntn hydrolase family to which CGH belongs. 48438 cd01903: AC_NAAA This conserved domain includes two closely related proteins, acid ceramidase (AC, also known as N-acylsphingosine amidohydrolase), and N-acylethanolamine-hydrolyzing acid amidase (NAAA). AC catalyzes the hydrolysis of ceramide to sphingosine and fatty acid. 
Ceramide is required for the biosynthesis of most sphingolipids and plays an important role in many signal transduction pathways by inducing apoptosis and/or arresting cell growth. An inherited deficiency of AC activity leads to the lysosomal storage disorder known as Farber disease. AC is considered a "rheostat" important for maintaining the proper intracellular levels of these lipids since hydrolysis of ceramide is the only source of sphingosine in cells. NAAA is a eukaryotic glycoprotein that hydrolyzes bioactive N-acylethanolamines, including anandamide (an endocannabinoid) and N-palmitoylethanolamine (an anti-inflammatory and neuroprotective substance), to fatty acids and ethanolamine at acidic pH. NAAA shows structural and functional similarity to acid ceramidase, but lacks the ceramide-hydrolyzing activity of AC. 48439 cd01906: proteasome_protease_HslV. This group contains the eukaryotic proteasome alpha and beta subunits and the prokaryotic protease hslV subunit. Proteasomes are large multimeric self-compartmentalizing proteases, involved in the clearance of misfolded proteins, the breakdown of regulatory proteins, and the processing of proteins such as the preparation of peptides for immune presentation. Two main proteasomal types are distinguished by their different tertiary structures: the eukaryotic/archaeal 20S proteasome and the prokaryotic proteasome-like heat shock protein encoded by heat shock locus V, hslV. The proteasome core particle is a highly conserved cylindrical structure made up of non-identical subunits that have their active sites on the inner walls of a large central cavity. The proteasome subunits of bacteria, archaea, and eukaryotes all share a conserved Ntn (N terminal nucleophile) hydrolase fold and a catalytic mechanism involving an N-terminal nucleophilic threonine that is exposed by post-translational processing of an inactive propeptide. 48440 cd01911: proteasome alpha subunit. 
The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 different alpha and 10 different beta proteasome subunit genes while archaea have one of each. 48441 cd01912: proteasome beta subunit. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each. 48442 cd01913: Protease HslV and the ATPase/chaperone HslU are part of an ATP-dependent proteolytic system that is the prokaryotic homolog of the proteasome. HslV is a dimer of hexamers (a dodecamer) that forms a central proteolytic chamber with active sites on the interior walls of the cavity. HslV shares significant sequence and structural similarity with the proteasomal beta-subunit and both are members of the Ntn-family of hydrolases. HslV has a nucleophilic threonine residue at its N-terminus that is exposed after processing of the propeptide and is directly involved in active site catalysis. 48447 cd03749: proteasome_alpha_type_1. 
The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each. 48448 cd03750: proteasome_alpha_type_2. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each. 48449 cd03751: proteasome_alpha_type_3. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each. 48450 cd03752: proteasome_alpha_type_4. 
The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each. 48451 cd03753: proteasome_alpha_type_5. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each. 48452 cd03754: proteasome_alpha_type_6. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each. 48453 cd03755: proteasome_alpha_type_7. 
The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each. 48454 cd03756: proteasome_alpha_archeal. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each. 48455 cd03757: proteasome beta type-1 subunit. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each. 48456 cd03758: proteasome beta type-2 subunit. 
The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each. 48457 cd03759: proteasome beta type-3 subunit. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each. 48458 cd03760: proteasome beta type-4 subunit. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each. 48459 cd03761: proteasome beta type-5 subunit. 
The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each. 48460 cd03762: proteasome beta type-6 subunit. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each. 48461 cd03763: proteasome beta type-7 subunit. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each. 48462 cd03764: Archeal proteasome, beta subunit. 
The 20S proteasome, multisubunit proteolytic complex, is the central enzyme for non-lysosomal protein degradation in both the cytosol and the nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are both members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each. 48463 cd03765: Bacterial proteasome, beta subunit. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each. 48464 cd00238: ERp29 and ERp38, C-terminal domain; composed of the protein disulfide isomerase (PDI)-like proteins ERp29 and ERp38. ERp29 (also called ERp28) is a ubiquitous endoplasmic reticulum (ER)-resident protein expressed in high levels in secretory cells. It contains a redox inactive TRX-like domain at the N-terminus. The expression profile of ERp29 suggests a role in secretory protein production, distinct from that of PDI. It has also been identified as a member of the thyroglobulin folding complex and is essential in regulating the secretion of thyroglobulin. The Drosophila homolog, Wind, is the product of windbeutel, an essential gene in the development of dorsal-ventral patterning. 
Wind is required for correct targeting of Pipe, a Golgi-resident type II transmembrane protein with homology to 2-O-sulfotransferase. ERp38 is a P5-like protein, first isolated from alfalfa (the cDNA clone was named G1), which contains two redox active TRX domains at the N-terminus, like human P5. However, unlike human P5, ERp38 also contains a C-terminal domain with homology to the C-terminal domain of ERp29. It may be a glucose-regulated protein. The function of the all-helical C-terminal domain of ERp29 and ERp38 remains unclear. The C-terminal domain of Wind is thought to provide a distinct site required for interaction with its substrate, Pipe. 48465 cd00329: MutL_Trans: transducer domain, having a ribosomal S5 domain 2-like fold, conserved in the C-terminal domain of type II DNA topoisomerases (Topo II) and DNA mismatch repair (MutL/MLH1/PMS2) proteins. This transducer domain is homologous to the second domain of the DNA gyrase B subunit, which is known to be important in nucleotide hydrolysis and the transduction of structural signals from ATP-binding site to the DNA breakage/reunion regions of the enzymes. The GyrB dimerizes in response to ATP binding, and is homologous to the N-terminal half of eukaryotic Topo II and the ATPase fragment of MutL. Type II DNA topoisomerases catalyze the ATP-dependent transport of one DNA duplex through another, in the process generating transient double strand breaks via covalent attachments to both DNA strands at the 5' positions. Included in this group are proteins similar to human MLH1 and PMS2. MLH1 forms a heterodimer with PMS2 which functions in meiosis and in DNA mismatch repair (MMR). Cells lacking either hMLH1 or hPMS2 have a strong mutator phenotype and display microsatellite instability (MSI). Mutation in hMLH1 accounts for a large fraction of Lynch syndrome (HNPCC) families. 
48466 cd00782: MutL_Trans: transducer domain, having a ribosomal S5 domain 2-like fold, conserved in the C-terminal domain of DNA mismatch repair (MutL/MLH1/PMS2) family. This transducer domain is homologous to the second domain of the DNA gyrase B subunit, which is known to be important in nucleotide hydrolysis and the transduction of structural signals from the ATP-binding site to the DNA breakage/reunion regions of the enzymes. Included in this group are proteins similar to human MLH1, hPMS2, hPMS1, hMLH3 and E. coli MutL. MLH1 forms heterodimers with PMS2, PMS1 and MLH3. These three complexes have distinct functions in meiosis. hMLH1-hPMS2 also participates in the repair of all DNA mismatch repair (MMR) substrates. Roles for hMLH1-hPMS1 or hMLH1-hMLH3 in MMR have not been established. Cells lacking either hMLH1 or hPMS2 have a strong mutator phenotype and display microsatellite instability (MSI). Mutation in hMLH1 causes predisposition to HNPCC, Muir-Torre syndrome and Turcot syndrome (HNPCC variant). Mutation in hPMS2 causes predisposition to HNPCC and Turcot syndrome. Mutation in hMLH1 accounts for a large fraction of HNPCC families. There is no convincing evidence to support hPMS1 having a role in HNPCC predisposition. It has been suggested that hMLH3 may be a low risk gene for colorectal cancer; however there is little evidence to support it having a role in classical HNPCC. It has been suggested that during initiation of DNA mismatch repair in E. coli, the mismatch recognition protein MutS recruits MutL in the presence of ATP. The MutS(ATP)-MutL ternary complex thus formed then recruits the latent endonuclease MutH. 48467 cd00822: TopoIIA_Trans_DNA_gyrase: Transducer domain, having a ribosomal S5 domain 2-like fold, of the type found in proteins of the type IIA family of DNA topoisomerases similar to the B subunits of E. coli DNA gyrase and E. coli Topoisomerase IV which are heterodimers composed of two subunits. 
The type IIA enzymes are the predominant form of topoisomerase and are found in some bacteriophages, viruses and archaea, and in all bacteria and eukaryotes. All type IIA topoisomerases are related to each other at amino acid sequence level, though their oligomeric organization sometimes differs. TopoIIA enzymes cut both strands of the duplex DNA to remove (relax) both positive and negative supercoils in DNA. These enzymes covalently attach to the 5' ends of the cut DNA, separate the free ends of the cleaved strands, pass another region of the duplex through this gap, then rejoin the ends. TopoIIA enzymes also catenate/decatenate duplex rings. E. coli DNA gyrase is a heterodimer composed of two subunits. E. coli DNA gyrase B subunit is known to be important in nucleotide hydrolysis and the transduction of structural signals from the ATP-binding site to the DNA breakage/reunion regions of the enzymes. 48468 cd00823: TopoIIB_Trans: Transducer domain, having a ribosomal S5 domain 2-like fold, of the type found in proteins of the type IIB family of DNA topoisomerases similar to Sulfolobus shibatae topoisomerase VI (topoVI). The sole representative of the Type IIB family is topo VI. Topo VI enzymes are heterotetramers found in archaea and plants. S. shibatae topoVI relaxes both positive and negative supercoils, and in addition has a strong decatenase activity. This transducer domain is homologous to the second domain of the DNA gyrase B subunit, which is known to be important in nucleotide hydrolysis and the transduction of structural signals from the ATP-binding site to the DNA breakage/reunion regions of the enzymes. 48469 cd03481: TopoIIA_Trans_ScTopoIIA: Transducer domain, having a ribosomal S5 domain 2-like fold, of the type found in proteins of the type IIA family of DNA topoisomerases similar to Saccharomyces cerevisiae Topo IIA. S. cerevisiae Topo IIA is a homodimer encoded by a single gene. 
The type IIA enzymes are the predominant form of topoisomerase and are found in some bacteriophages, viruses and archaea, and in all bacteria and eukaryotes. All type IIA topoisomerases are related to each other at amino acid sequence level, though their oligomeric organization sometimes differs. TopoIIA enzymes cut both strands of the duplex DNA to remove (relax) both positive and negative supercoils in DNA. These enzymes covalently attach to the 5' ends of the cut DNA, separate the free ends of the cleaved strands, pass another region of the duplex through this gap, then rejoin the ends. TopoIIA enzymes also catenate/ decatenate duplex rings. This transducer domain is homologous to the second domain of the DNA gyrase B subunit, which is known to be important in nucleotide hydrolysis and the transduction of structural signals from ATP-binding site to the DNA breakage/reunion regions of the enzymes. 48470 cd03482: MutL_Trans_MutL: transducer domain, having a ribosomal S5 domain 2-like fold, found in proteins similar to Escherichia coli MutL. EcMutL belongs to the DNA mismatch repair (MutL/MLH1/PMS2) family. This transducer domain is homologous to the second domain of the DNA gyrase B subunit, which is known to be important in nucleotide hydrolysis and the transduction of structural signals from the ATP-binding site to the DNA breakage/reunion regions of the enzymes. It has been suggested that during initiation of DNA mismatch repair in E. coli, the mismatch recognition protein MutS recruits MutL in the presence of ATP. The MutS(ATP)-MutL ternary complex formed, then recruits the latent endonuclease MutH. Prokaryotic MutS and MutL are homodimers. 48471 cd03483: MutL_Trans_MLH1: transducer domain, having a ribosomal S5 domain 2-like fold, found in proteins similar to yeast and human MLH1 (MutL homologue 1). 
This transducer domain is homologous to the second domain of the DNA gyrase B subunit, which is known to be important in nucleotide hydrolysis and the transduction of structural signals from the ATP-binding site to the DNA breakage/reunion regions of the enzymes. MLH1 forms heterodimers with PMS2, PMS1 and MLH3. These three complexes have distinct functions in meiosis. hMLH1-hPMS2 also participates in the repair of all DNA mismatch repair (MMR) substrates. Roles for hMLH1-hPMS1 or hMLH1-hMLH3 in MMR have not been established. Cells lacking hMLH1 have a strong mutator phenotype and display microsatellite instability (MSI). Mutation in hMLH1 causes predisposition to HNPCC, Muir-Torre syndrome and Turcot syndrome (HNPCC variant). Mutation in hMLH1 accounts for a large fraction of HNPCC families. 48472 cd03484: MutL_Trans_hPMS2_like: transducer domain, having a ribosomal S5 domain 2-like fold, found in proteins similar to human PMS2 (hPMS2). hPMS2 belongs to the DNA mismatch repair (MutL/MLH1/PMS2) family. This transducer domain is homologous to the second domain of the DNA gyrase B subunit, which is known to be important in nucleotide hydrolysis and the transduction of structural signals from the ATP-binding site to the DNA breakage/reunion regions of the enzymes. Included in this group are proteins similar to yeast PMS1. The yeast MLH1-PMS1 and the human MLH1-PMS2 heterodimers play a role in meiosis. hMLH1-hPMS2 also participates in the repair of all DNA mismatch repair (MMR) substrates. Cells lacking hPMS2 have a strong mutator phenotype and display microsatellite instability (MSI). Mutation in hPMS2 causes predisposition to HNPCC and Turcot syndrome. 48473 cd03485: MutL_Trans_hPMS1_like: transducer domain, having a ribosomal S5 domain 2-like fold, found in proteins similar to human PMS1 (hPMS1) and yeast MLH2. hPMS1 and yMLH2 are members of the DNA mismatch repair (MutL/MLH1/PMS2) family. 
This transducer domain is homologous to the second domain of the DNA gyrase B subunit, which is known to be important in nucleotide hydrolysis and the transduction of structural signals from ATP-binding site to the DNA breakage/reunion regions of the enzymes. PMS1 forms a heterodimer with MLH1. The MLH1-PMS1 complex functions in meiosis. Loss of yMLH2 results in a small but significant decrease in spore viability and a significant increase in gene conversion frequencies. A role for hMLH1-hPMS1 in DNA mismatch repair has not been established. Mutation in hMLH1 accounts for a large fraction of Lynch syndrome (HNPCC) families, however there is no convincing evidence to support hPMS1 having a role in HNPCC predisposition. 48474 cd03486: MutL_Trans_MLH3: transducer domain, having a ribosomal S5 domain 2-like fold, found in proteins similar to yeast and human MLH3 (MutL homologue 3). MLH3 belongs to the DNA mismatch repair (MutL/MLH1/PMS2) family. This transducer domain is homologous to the second domain of the DNA gyrase B subunit, which is known to be important in nucleotide hydrolysis and the transduction of structural signals from ATP-binding site to the DNA breakage/reunion regions of the enzymes. MLH1 forms heterodimers with MLH3. The MLH1-MLH3 complex plays a role in meiosis. A role for hMLH1-hMLH3 in DNA mismatch repair (MMR) has not been established. It has been suggested that hMLH3 may be a low risk gene for colorectal cancer; however there is little evidence to support it having a role in classical HNPCC. 48475 cd00352: Glutamine amidotransferases class-II (GATase). The glutaminase domain catalyzes an amide nitrogen transfer from glutamine to the appropriate substrate. In this process, glutamine is hydrolyzed to glutamic acid and ammonia. 
This domain is related to members of the Ntn (N-terminal nucleophile) hydrolase superfamily and is found at the N-terminus of enzymes such as glucosamine-fructose 6-phosphate synthase (GLMS or GFAT), glutamine phosphoribosylpyrophosphate (Prpp) amidotransferase (GPATase), asparagine synthetase B (AsnB), beta lactam synthetase (beta-LS) and glutamate synthase (GltS). GLMS catalyzes the formation of glucosamine 6-phosphate from fructose 6-phosphate and glutamine in amino sugar synthesis. GPATase catalyzes the first step in purine biosynthesis, an amide transfer from glutamine to PRPP, resulting in phosphoribosylamine, pyrophosphate and glutamate. Asparagine synthetase B synthesizes asparagine from aspartate and glutamine. Beta-LS catalyzes the formation of the beta-lactam ring in the beta-lactamase inhibitor clavulanic acid. GltS synthesizes L-glutamate from 2-oxoglutarate and L-glutamine. These enzymes are generally dimers, but GPATase also exists as a homotetramer. 48476 cd00712: Glutamine amidotransferases class-II (GATase) asparagine synthase_B type. Asparagine synthetase B catalyses the ATP-dependent conversion of aspartate to asparagine. This enzyme is a homodimer, with each monomer composed of a glutaminase domain and a synthetase domain. The N-terminal glutaminase domain hydrolyzes glutamine to glutamic acid and ammonia. 48477 cd00713: Glutamine amidotransferases class-II (Gn-AT), glutamate synthase (GltS)-type. GltS is a homodimer that synthesizes L-glutamate from 2-oxoglutarate and L-glutamine, an important step in ammonia assimilation in bacteria, cyanobacteria and plants. 
The N-terminal glutaminase domain catalyzes the hydrolysis of glutamine to glutamic acid and ammonia, and has a fold similar to that of other glutamine amidotransferases such as glucosamine-fructose 6-phosphate synthase (GLMS or GFAT), glutamine phosphoribosylpyrophosphate (Prpp) amidotransferase (GPATase), asparagine synthetase B (AsnB), and beta lactam synthetase (beta-LS), as well as the Ntn hydrolase folds of the proteasomal alpha and beta subunits. 48478 cd00714: Glutamine amidotransferases class-II (Gn-AT)_GFAT-type. This domain is found at the N-terminus of glucosamine-6P synthase (GlmS, or GFAT in humans). The glutaminase domain catalyzes amide nitrogen transfer from glutamine to the appropriate substrate. In this process, glutamine is hydrolyzed to glutamic acid and ammonia. In humans, GFAT catalyzes the first and rate-limiting step of hexosamine metabolism, the conversion of D-fructose-6P (Fru6P) into D-glucosamine-6P using L-glutamine as a nitrogen source. The end product of this pathway, UDP-N-acetyl glucosamine, is a major building block of the bacterial peptidoglycan and fungal chitin. 48479 cd00715: Glutamine amidotransferases class-II (Gn-AT)_GPAT-type. This domain is found at the N-terminus of glutamine phosphoribosylpyrophosphate (Prpp) amidotransferase (GPATase). The glutaminase domain catalyzes amide nitrogen transfer from glutamine to the appropriate substrate. In this process, glutamine is hydrolyzed to glutamic acid and ammonia. GPATase catalyzes the first step in purine biosynthesis, an amide transfer from glutamine to PRPP, resulting in phosphoribosylamine, pyrophosphate and glutamate. GPATase crystallizes as a homotetramer, but can also exist as a homodimer. 48481 cd01908: Glutamine amidotransferases class-II (Gn-AT)_YafJ-type. YafJ is a glutamine amidotransferase-like protein of unknown function found in prokaryotes, eukaryotes and archaea. 
YafJ has a conserved structural fold similar to those of other class II glutamine amidotransferases including glucosamine-fructose 6-phosphate synthase (GLMS or GFAT), glutamine phosphoribosylpyrophosphate (Prpp) amidotransferase (GPATase), asparagine synthetase B (AsnB), beta lactam synthetase (beta-LS) and glutamate synthase (GltS). The YafJ fold is also somewhat similar to the Ntn (N-terminal nucleophile) hydrolase fold of the proteasomal alpha and beta subunits. 48482 cd01909: Glutamine amidotransferases class-II (GATase) asparagine synthase_betaLS-type. Carbapenam synthetase (CarA) is an ATP/Mg2+-dependent enzyme that catalyzes the formation of the beta-lactam ring in (5R)-carbapenem-3-carboxylic acid biosynthesis. CarA is homologous to beta-lactam synthetase (beta-LS), which is involved in the biosynthesis of clavulanic acid, a clinically important beta-lactamase inhibitor. CarA and beta-LS each have two distinct domains, an N-terminal Ntn hydrolase domain and a C-terminal synthetase domain, a domain architecture similar to that of the class-B asparagine synthetases (AS-B's). The N-terminal domain of these enzymes hydrolyzes glutamine to glutamate and ammonia. CarA forms a homotetramer while betaLS forms a heterodimer. The N-terminal folds of CarA and beta-LS are similar to those of other class II glutamine amidotransferases including glucosamine-fructose 6-phosphate synthase (GLMS or GFAT), glutamine phosphoribosylpyrophosphate (Prpp) amidotransferase (GPATase), asparagine synthetase B (AsnB), and glutamate synthase (GltS). This fold is also somewhat similar to the Ntn (N-terminal nucleophile) hydrolase fold of the proteasomal alpha and beta subunits. 48483 cd01910: This domain is present in Wali7, a protein of unknown function, expressed in wheat and induced by aluminum. 
Wali7 has a single domain similar to the glutamine amidotransferase domain of glucosamine-fructose 6-phosphate synthase (GLMS or GFAT), glutamine phosphoribosylpyrophosphate (Prpp) amidotransferase (GPATase), asparagine synthetase B (AsnB), beta lactam synthetase (beta-LS) and glutamate synthase (GltS). The Wali7 domain is also somewhat similar to the Ntn hydrolase fold of the proteasomal alpha and beta subunits. 48484 cd03766: Gn_AT_II_novel. This asparagine synthase-related domain is present in eukaryotes but its function has not yet been determined. The glutaminase domain catalyzes an amide nitrogen transfer from glutamine to the appropriate substrate. In this process, glutamine is hydrolyzed to glutamic acid and ammonia. This domain is related to members of the Ntn (N-terminal nucleophile) hydrolase superfamily and is found at the N-terminus of enzymes such as glucosamine-fructose 6-phosphate synthase (GLMS or GFAT), glutamine phosphoribosylpyrophosphate (Prpp) amidotransferase (GPATase), asparagine synthetase B (AsnB), beta lactam synthetase (beta-LS) and glutamate synthase (GltS). GLMS catalyzes the formation of glucosamine 6-phosphate from fructose 6-phosphate and glutamine in amino sugar synthesis. GPATase catalyzes the first step in purine biosynthesis, an amide transfer from glutamine to PRPP, resulting in phosphoribosylamine, pyrophosphate and glutamate. Asparagine synthetase B synthesizes asparagine from aspartate and glutamine. Beta-LS catalyzes the formation of the beta-lactam ring in the beta-lactamase inhibitor clavulanic acid. GltS synthesizes L-glutamate from 2-oxoglutarate and L-glutamine. These enzymes are generally dimers, but GPATase also exists as a homotetramer. 48485 cd00424: Y-family of DNA polymerases. Pol_Y's can traverse replication-blocking DNA lesions, such as cyclobutane pyrimidine dimers resulting from UV damage, at the cost of an elevated error rate. The Y-family has no 3'->5' exonuclease activity. 
In addition to possessing a topology akin to a right hand, with "thumb", "fingers" and "palm" motifs, like polymerases from the A-, B-, C- and X-families, the Y-family has a unique "little finger" motif. Expression of Y-family polymerases is often induced by DNA damage. These polymerases are phylogenetically unrelated to classical DNA polymerases. 48486 cd01700: Pol V was discovered in Escherichia coli as UmuC and UmuD proteins induced by UV. This branch of DNA polymerases is mostly found in bacteria. Pol V enables DNA replication to bypass covalently linked cis-syn T-T photodimers and 6-4 T-T or T-C photoproducts, which would otherwise stall the DNA replication fork. 48487 cd01701: Pol_zeta, a member of the Y-family of DNA polymerases. Y-family polymerases can traverse normal replication-blocking DNA lesions, such as cyclobutane pyrimidine dimers resulting from UV damage, at a higher error rate. The Y-family has no 3'->5' exonuclease activity. Pol zeta has also been named Rev1; this subfamily is mainly present in eukaryotes. 48489 cd01703: Pol iota is a member of the DNA polymerase Y-family, and has also been called Rad30 homolog B. Unlike classic DNA polymerases, Y-family polymerases are induced by DNA damage. They can traverse normal replication-blocking DNA lesions. Unlike Pol eta, Pol iota is unable to replicate through a cis-syn T-T dimer. In human Pol iota, the base-pairing mode in the active site at the replicative end may be Hoogsteen instead of Watson-Crick. Human Pol iota can incorporate the correct nucleotide opposite a purine much more efficiently than opposite a pyrimidine. Pol iota prefers to insert Guanosine instead of Adenosine opposite Thymidine. 48490 cd03468: Pol_Y_like: a group of putative Y-family DNA polymerases. Y-family polymerases can traverse normal replication-blocking DNA lesions, such as cyclobutane pyrimidine dimers resulting from UV damage, at a higher error rate. The Y-family has no 3'->5' exonuclease activity. 
In addition to possessing a topology akin to a right hand, with "thumb", "fingers" and "palm" motifs, like polymerases from the A-, B-, C- and X-families, the Y-family has a unique "little finger" motif. Expression of Y-family polymerases is often induced by DNA damage. These polymerases are phylogenetically unrelated to classical DNA polymerases. 48491 cd03586: Pol_IV_kappa, a member of the Y-family of DNA polymerases. Pol_Y's can traverse replication-blocking DNA lesions, such as cyclobutane pyrimidine dimers resulting from UV damage, at the cost of an elevated error rate. The Y-family has no 3'->5' exonuclease activity. In addition to possessing a topology akin to a right hand, with "thumb", "fingers" and "palm" motifs, like polymerases from the A-, B-, C- and X-families, the Y-family has a unique "little finger" motif. Expression of Y-family polymerases is often induced by DNA damage. These polymerases are phylogenetically unrelated to classical DNA polymerases. Originally called the DinB family, they belong to the recently described Y-family of DNA polymerases. Pol IV is mostly found in bacteria and archaea. Although the structure of Pol IV is similar to that of Pol eta, it shows markedly different efficiencies and fidelities in bypassing various DNA lesions. All Pol IV-like polymerases studied to date are able to bypass an abasic site, and the resulting daughter strand is often 1 nt shorter than the template. They tend to slip along the DNA template and result in deletion mutations. Pol IV has a higher error rate than other members of the Y-family. Member of the DNA polymerase Y-family. Expression of Y-family polymerases is often induced by DNA damage. Y-family polymerases are characterized by low-fidelity replication on undamaged templates and the ability to carry out translesion DNA synthesis. 
Pol kappa is the eukaryotic DinB homologue and is able to bypass abasic and bulky DNA adduct lesions and make both base-substitution and frame-shift mutations. 48492 cd00340: Glutathione (GSH) peroxidase family; tetrameric selenoenzymes that catalyze the reduction of a variety of hydroperoxides including lipid peroxides, using GSH as a specific electron donor substrate. GSH peroxidase contains one selenocysteine residue per subunit, which is involved in catalysis. Different isoenzymes are known in mammals, which are involved in protection against reactive oxygen species, redox regulation of many metabolic processes, peroxynitrite scavenging, and modulation of inflammatory processes. 48493 cd00570: Glutathione S-transferase (GST) family, N-terminal domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK subfamily, a member of the DsbA family). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. 
In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxin 2 and stringent starvation protein A. 48494 cd01659: Thioredoxin (TRX) superfamily; a large, diverse group of proteins containing a TRX-fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include TRX, protein disulfide isomerase (PDI), tlpA-like, glutaredoxin, NrdH redoxin, and the bacterial Dsb (DsbA, DsbC, DsbG, DsbE, DsbDgamma) protein families. Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins and glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others. 48495 cd02066: Glutaredoxin (GRX) family; composed of GRX, approximately 10 kDa in size, and proteins containing a GRX or GRX-like domain. GRX is a glutathione (GSH) dependent reductase, catalyzing the disulfide reduction of target proteins such as ribonucleotide reductase. It contains a redox active CXXC motif in a TRX fold and uses a similar dithiol mechanism employed by TRXs for intramolecular disulfide bond reduction of protein substrates. Unlike TRX, GRX has preference for mixed GSH disulfide substrates, in which it uses a monothiol mechanism where only the N-terminal cysteine is required. The flow of reducing equivalents in the GRX system goes from NADPH -> GSH reductase -> GSH -> GRX -> protein substrates. 
By altering the redox state of target proteins, GRX is involved in many cellular functions including DNA synthesis, signal transduction and the defense against oxidative stress. Different classes are known including human GRX1 and GRX2, as well as E. coli GRX1 and GRX3, which are members of this family. E. coli GRX2, however, is a 24-kDa protein that belongs to the GSH S-transferase (GST) family. 48496 cd02947: TRX family; composed of two groups: Group I, which includes proteins that exclusively encode a TRX domain; and Group II, which are composed of fusion proteins of TRX and additional domains. Group I TRX is a small ancient protein that alters the redox state of target proteins via the reversible oxidation of an active site dithiol, present in a CXXC motif, partially exposed at the protein's surface. TRX reduces protein disulfide bonds, resulting in a disulfide bond at its active site. Oxidized TRX is converted to the active form by TRX reductase, using reducing equivalents derived from either NADPH or ferredoxins. By altering their redox state, TRX regulates the functions of at least 30 target proteins, some of which are enzymes and transcription factors. It also plays an important role in the defense against oxidative stress by directly reducing hydrogen peroxide and certain radicals, and by serving as a reductant for peroxiredoxins. At least two major types of functional TRXs have been reported in most organisms; in eukaryotes, they are located in the cytoplasm and the mitochondria. Higher plants contain more types (at least 20 TRX genes have been detected in the genome of Arabidopsis thaliana), two of which (types f and m) are located in the same compartment, the chloroplast. Also included in the alignment are TRX-like domains which show sequence homology to TRX but do not contain the redox active CXXC motif. 
Group II proteins, in addition to either a redox active TRX or a TRX-like domain, also contain additional domains, which may or may not possess homology to known proteins. 48497 cd02948: TRX domain, TRX and NDP-kinase (NDPK) fusion protein family; most members of this group are fusion proteins which contain one redox active TRX domain containing a CXXC motif and three NDPK domains, and are characterized as intermediate chains (ICs) of axonemal outer arm dynein. Dyneins are molecular motors that generate force against microtubules to produce cellular movement, and are divided into two classes: axonemal and cytoplasmic. They are supramolecular complexes consisting of three protein groups classified according to size: dynein heavy, intermediate and light chains. Axonemal dyneins form two structures, the inner and outer arms, which are attached to doublet microtubules throughout the cilia and flagella. The human homolog is the sperm-specific Sptrx-2, presumed to be a component of the human sperm axoneme architecture. Included in this group is another human protein, TRX-like protein 2, a smaller fusion protein containing one TRX and one NDPK domain, which is also associated with microtubular structures. The other members of this group are hypothetical insect proteins containing a TRX domain and outer arm dynein light chains (14 and 16kDa) of Chlamydomonas reinhardtii. Using standard assays, the fusion proteins have shown no TRX enzymatic activity. 48498 cd02949: TRX domain, novel NADPH thioredoxin reductase (NTR) family; composed of fusion proteins found only in oxygenic photosynthetic organisms containing both TRX and NTR domains. The TRX domain functions as a protein disulfide reductase via the reversible oxidation of an active center dithiol present in a CXXC motif, while the NTR domain functions as a reductant to oxidized TRX. The fusion protein is bifunctional, showing both TRX and NTR activities, but it is not an independent NTR/TRX system. 
In plants, the protein is found exclusively in shoots and mature leaves and is localized in the chloroplast. It is involved in plant protection against oxidative stress. 48499 cd02950: TRX-like protein A (TxlA) family; TxlA was originally isolated from the cyanobacterium Synechococcus. It is found only in oxygenic photosynthetic organisms. TRX is a small enzyme that participates in redox reactions, via the reversible oxidation of an active site dithiol present in a CXXC motif. Disruption of the txlA gene suggests that the protein is involved in the redox regulation of the structure and function of photosynthetic apparatus. The plant homolog (designated as HCF164) is localized in the chloroplast and is involved in the assembly of the cytochrome b6f complex, which takes a central position in photosynthetic electron transport. 48500 cd02951: SoxW family; SoxW is a bacterial periplasmic TRX, containing a redox active CXXC motif, encoded by a genetic locus (sox operon) involved in thiosulfate oxidation. Sulfur bacteria oxidize sulfur compounds to provide reducing equivalents for carbon dioxide fixation during autotrophic growth and the respiratory electron transport chain. It is unclear what the role of SoxW is, since it has been found to be dispensable in the oxidation of thiosulfate to sulfate. SoxW is specifically kept in the reduced state by SoxV, which is essential in thiosulfate oxidation. 48501 cd02952: Human TRX-related protein 14 (TRP14)-like family; composed of proteins similar to TRP14, a 14kD cytosolic protein that shows disulfide reductase activity in vitro with a different substrate specificity compared with another human cytosolic protein, TRX1. TRP14 catalyzes the reduction of small disulfide-containing peptides but does not reduce disulfides of ribonucleotide reductase, peroxiredoxin and methionine sulfoxide reductase, which are TRX1 substrates. TRP14 also plays a role in tumor necrosis factor (TNF)-alpha signaling pathways, distinct from that of TRX1. 
Its depletion promoted TNF-alpha induced activation of c-Jun N-terminal kinase and mitogen-activated protein kinases. 48502 cd02953: DsbD gamma family; DsbD gamma is the C-terminal periplasmic domain of the bacterial protein DsbD. It contains a CXXC motif in a TRX fold and shuttles the reducing potential from the membrane domain (DsbD beta) to the N-terminal periplasmic domain (DsbD alpha). DsbD beta, a transmembrane domain comprising eight helices, acquires its reducing potential from the cytoplasmic thioredoxin. DsbD alpha transfers the acquired reducing potential from DsbD gamma to target proteins such as the periplasmic protein disulphide isomerases, DsbC and DsbG. This flow of reducing potential from the cytoplasm through DsbD allows DsbC and DsbG to act as isomerases in the oxidizing environment of the bacterial periplasm. DsbD also transfers reducing potential from the cytoplasm to specific reductases in the periplasm which are involved in the maturation of cytochromes. 48503 cd02954: Dim1 family; Dim1 is also referred to as U5 small nuclear ribonucleoprotein particle (snRNP)-specific 15kD protein. It is a component of U5 snRNP, which pre-assembles with U4/U6 snRNPs to form a [U4/U6:U5] tri-snRNP complex required for pre-mRNA splicing. Dim1 interacts with multiple splicing-associated proteins, suggesting that it functions at multiple control points in the splicing of pre-mRNA as part of a large spliceosomal complex involving many protein-protein interactions. U5 snRNP contains seven core proteins (common to all snRNPs) and nine U5-specific proteins, one of which is Dim1. Dim1 adopts a thioredoxin fold but does not contain the redox active CXXC motif. It is essential for G2/M phase transition, as a consequence of its role in pre-mRNA splicing. 
48504 cd02955: TRX domain, SSP411 protein family; members of this family are highly conserved proteins present in eukaryotes, bacteria and archaea, about 600-800 amino acids in length, which contain a TRX domain with a redox active CXXC motif. The human/rat protein, called SSP411, is specifically expressed in the testis in an age-dependent manner. The SSP411 mRNA is increased during spermiogenesis and is localized in round and elongated spermatids, suggesting a function in fertility regulation. 48505 cd02956: ybbN protein family; ybbN is a hypothetical protein containing a redox-inactive TRX-like domain. Its gene has been sequenced from several gammaproteobacteria and actinobacteria. 48506 cd02957: Phosducin (Phd)-like family; composed of Phd and Phd-like proteins (PhLP), characterized as cytosolic regulators of G protein functions. Phd and PhLPs specifically bind G protein betagamma (Gbg)-subunits with high affinity, resulting in the solubilization of Gbg from the plasma membrane and impeding G protein-mediated signal transduction by inhibiting the formation of a functional G protein trimer (G protein alphabetagamma). Phd also inhibits the GTPase activity of G protein alpha. Phd can be phosphorylated by protein kinase A and G protein-coupled receptor kinase 2, leading to its inactivation. Phd was originally isolated from the retina, where it is highly expressed and has been implicated to play an important role in light adaptation. It is also found in the pineal gland, liver, spleen, striated muscle and the brain. The C-terminal domain of Phd adopts a thioredoxin fold, but it does not contain a CXXC motif. Phd interacts with G protein beta mostly through the N-terminal helical domain. Also included in this family is a PhLP characterized as a viral inhibitor of apoptosis (IAP)-associated factor, named VIAF, that functions in caspase activation during apoptosis. 48507 cd02958: UAS family; UAS is a domain of unknown function. 
Most members of this family are uncharacterized proteins with similarity to FAS-associated factor 1 (FAF1) and ETEA because of the presence of a UAS domain N-terminal to a ubiquitin-associated UBX domain. FAF1 is a longer protein, compared to the other members of this family, having additional N-terminal domains, a ubiquitin-associated UBA domain and a nuclear targeting domain. FAF1 is an apoptotic signaling molecule that acts downstream in the Fas signal transduction pathway. It interacts with the cytoplasmic domain of Fas, but not with a Fas mutant that is deficient in signal transduction. ETEA is the protein product of a highly expressed gene in T-cells and eosinophils of atopic dermatitis patients. The presence of the ubiquitin-associated UBX domain in the proteins of this family suggests the possibility of their involvement in ubiquitination. Recently, FAF1 has been shown to interact with valosin-containing protein (VCP), which is involved in the ubiquitin-proteasome pathway. Some members of this family are uncharacterized proteins containing only a UAS domain. 48508 cd02959: Endoplasmic reticulum protein 19 (ERp19) family; ERp19 is also known as ERp18, a protein located in the ER containing one redox active TRX domain. Denaturation studies indicate that the reduced form is more stable than the oxidized form, suggesting that the protein is involved in disulfide bond formation. In vitro, ERp19 has been shown to possess thiol-disulfide oxidase activity which is dependent on the presence of both active site cysteines. Although described as protein disulfide isomerase (PDI)-like, the protein does not complement for PDI activity. ERp19 shows a wide tissue distribution but is most abundant in liver, testis, heart and kidney. 48509 cd02960: Anterior Gradient (AGR) family; members of this family are similar to secreted proteins encoded by the cement gland-specific genes XAG-1 and XAG-2, expressed in the anterior region of dorsal ectoderm of Xenopus. 
They are implicated in the formation of the cement gland and the induction of forebrain fate. The human homologs, hAG-2 and hAG-3, are secreted proteins associated with estrogen-positive breast tumors. Yeast two-hybrid studies identified the metastasis-associated C4.4a protein and dystroglycan as binding partners, indicating possible roles in the development and progression of breast cancer. hAG-2 has also been implicated in prostate cancer. Its gene was cloned as an androgen-inducible gene and it was shown to be overexpressed in prostate cancer cells at the mRNA and protein levels. AGR proteins contain one conserved cysteine corresponding to the first cysteine in the CXXC motif of TRX. They show high sequence similarity to ERp19. 48510 cd02961: Protein Disulfide Isomerase (PDIa) family, redox active TRX domains; composed of eukaryotic proteins involved in oxidative protein folding in the endoplasmic reticulum (ER) by acting as catalysts and folding assistants. Members of this family include PDI and PDI-related proteins like ERp72, ERp57 (or ERp60), ERp44, P5, PDIR, ERp46 and the transmembrane PDIs. PDI, ERp57, ERp72, P5, PDIR and ERp46 are all oxidases, catalyzing the formation of disulfide bonds of newly synthesized polypeptides in the ER. They also exhibit reductase activity in acting as isomerases to correct any non-native disulfide bonds, as well as chaperone activity to prevent protein aggregation and facilitate the folding of newly synthesized proteins. These proteins usually contain multiple copies of a redox active TRX (a) domain containing a CXXC motif, and may also contain one or more redox inactive TRX-like (b) domains. Only one a domain is required for the oxidase function but multiple copies are necessary for the isomerase function. The different types of PDIs may show different substrate specificities and tissue-specific expression, or may be induced by stress. 
PDIs are in their reduced form at steady state and are oxidized to the active form by Ero1, which is localized in the ER through ERp44. Some members of this family also contain a DnaJ domain in addition to the redox active a domains; examples are ERdj5 and Pfj2. Also included in the family is the redox inactive N-terminal TRX-like domain of ERp29. 48511 cd02962: TMX2 family; composed of proteins similar to human TMX2, a 372-amino acid TRX-related transmembrane protein, identified and characterized through the cloning of its cDNA from a human fetal library. It contains a TRX domain but the redox active CXXC motif is replaced with SXXC. Sequence analysis predicts that TMX2 may be a Type I membrane protein, with its C-terminal half protruding on the luminal side of the endoplasmic reticulum (ER). In addition to the TRX domain, transmembrane region and ER-retention signal, TMX2 also contains a Myb DNA-binding domain repeat signature and a dileucine motif in the tail. 48512 cd02963: TRX domain, DnaJ domain containing protein family; composed of uncharacterized proteins of about 500-800 amino acids, containing an N-terminal DnaJ domain followed by one redox active TRX domain. DnaJ is a member of the 40 kDa heat-shock protein (Hsp40) family of molecular chaperones, which regulate the activity of Hsp70s. TRX is involved in the redox regulation of many protein substrates through the reduction of disulfide bonds. TRX has been implicated to catalyse the reduction of Hsp33, a chaperone holdase that binds to unfolded protein intermediates. The presence of DnaJ and TRX domains in members of this family suggests that they could be involved in a redox-regulated chaperone network. 48513 cd02964: Tryparedoxin (TryX)-like family; composed of TryX and related proteins including nucleoredoxin (NRX), rod-derived cone viability factor (RdCVF) and the nematode homolog described as a 16-kD class of TRX. 
Most members of this family, except RdCVF, are protein disulfide oxidoreductases containing an active site CXXC motif, similar to TRX. 48514 cd02965: HyaE family; HyaE is also called HupG and HoxO. They are proteins serving a critical role in the assembly of multimeric [NiFe] hydrogenases, the enzymes that catalyze the oxidation of molecular hydrogen to enable microorganisms to utilize hydrogen as the sole energy source. The E. coli HyaE protein is a chaperone that specifically interacts with the twin-arginine translocation (Tat) signal peptide of the [NiFe] hydrogenase-1 beta subunit precursor. Tat signal peptides target precursor proteins to the Tat protein export system, which facilitates the transport of fully folded proteins across the inner membrane. HyaE may be involved in regulating the traffic of [NiFe] hydrogenase-1 on the Tat transport pathway. 48515 cd02966: TlpA-like family; composed of TlpA, ResA, DsbE and similar proteins. TlpA, ResA and DsbE are bacterial protein disulfide reductases with important roles in cytochrome maturation. They are membrane-anchored proteins with a soluble TRX domain containing a CXXC motif located in the periplasm. The TRX domains of this family contain an insert, approximately 25 residues in length, which corresponds to an extra alpha helix and a beta strand when compared with TRX. TlpA catalyzes an essential reaction in the biogenesis of cytochrome aa3, while ResA and DsbE are essential proteins in cytochrome c maturation. Also included in this family are proteins containing a TlpA-like TRX domain with domain architectures similar to E. coli DipZ protein, and the N-terminal TRX domain of PilB protein from Neisseria which acts as a disulfide reductase that can recycle methionine sulfoxide reductases. 48516 cd02967: Methylamine utilization (mau) D family; mauD protein is the translation product of the mauD gene found in methylotrophic bacteria, which are able to use methylamine as a sole carbon source and a nitrogen source. 
mauD is an essential accessory protein for the biosynthesis of methylamine dehydrogenase (MADH), the enzyme that catalyzes the oxidation of methylamine and other primary amines. MADH possesses an alpha2beta2 subunit structure; the alpha subunit is also referred to as the large subunit. Each beta (small) subunit contains a tryptophan tryptophylquinone (TTQ) prosthetic group. Accessory proteins are essential for the proper transport of MADH to the periplasm, TTQ synthesis and the formation of several structural disulfide bonds. Bacterial mutants containing an insertion in the mauD gene were unable to grow on methylamine as a sole carbon source, were found to lack the MADH small subunit and had decreased amounts of the MADH large subunit. 48517 cd02968: SCO (an acronym for Synthesis of Cytochrome c Oxidase) family; composed of proteins similar to Sco1, a membrane-anchored protein possessing a soluble domain with a TRX fold. Members of this family are required for the proper assembly of cytochrome c oxidase (COX). They contain a metal binding motif, typically CXXXC, which is located in a flexible loop. COX, the terminal enzyme in the respiratory chain, is embedded in the inner mitochondrial membrane of all eukaryotes and in the plasma membrane of some prokaryotes. It is composed of two subunits, COX I and COX II. It has been proposed that Sco1 specifically delivers copper to the CuA site, a dinuclear copper center, of the COX II subunit. Mutations in human Sco1 and Sco2 cause fatal infantile hepatoencephalomyopathy and cardioencephalomyopathy, respectively. Both disorders are associated with severe COX deficiency in affected tissues. More recently, it has been argued that the redox sensitivity of the copper binding properties of Sco1 implies that it participates in signaling events rather than functioning as a chaperone that transfers copper to COX II. 48518 cd02969: Peroxiredoxin (PRX)-like 1 family; hypothetical proteins that show sequence similarity to PRXs. 
Members of this group contain a conserved cysteine that aligns to the first cysteine in the CXXC motif of TRX. This does not correspond to the peroxidatic cysteine found in PRXs, which aligns to the second cysteine in the CXXC motif of TRX. In addition, these proteins do not contain the other two conserved residues of the catalytic triad of PRX. PRXs confer a protective antioxidant role in cells through their peroxidase activity in which hydrogen peroxide, peroxynitrite, and organic hydroperoxides are reduced and detoxified using reducing equivalents derived from thioredoxin, glutathione, trypanothione or AhpF. 48519 cd02970: Peroxiredoxin (PRX)-like 2 family; hypothetical proteins that show sequence similarity to PRXs. Members of this group contain a CXXC motif, similar to TRX. The second cysteine in the motif corresponds to the peroxidatic cysteine of PRX; however, these proteins do not contain the other two residues of the catalytic triad of PRX. PRXs confer a protective antioxidant role in cells through their peroxidase activity in which hydrogen peroxide, peroxynitrite, and organic hydroperoxides are reduced and detoxified using reducing equivalents derived from thioredoxin, glutathione, trypanothione or AhpF. TRXs alter the redox state of target proteins by catalyzing the reduction of their disulfide bonds via the CXXC motif using reducing equivalents derived from either NADPH or ferredoxins. 48520 cd02971: Peroxiredoxin (PRX) family; composed of the different classes of PRXs including many proteins originally known as bacterioferritin comigratory proteins (BCP), based on their electrophoretic mobility before their function was identified. PRXs are thiol-specific antioxidant (TSA) proteins also known as TRX peroxidases and alkyl hydroperoxide reductase C22 (AhpC) proteins. 
They confer a protective antioxidant role in cells through their peroxidase activity in which hydrogen peroxide, peroxynitrite, and organic hydroperoxides are reduced and detoxified using reducing equivalents derived from TRX, glutathione, trypanothione or AhpF. They are distinct from other peroxidases in that they have no cofactors such as metals or prosthetic groups. The first step of catalysis, common to all PRXs, is the nucleophilic attack by the catalytic cysteine (also known as the peroxidatic cysteine) on the peroxide leading to cleavage of the oxygen-oxygen bond and the formation of a cysteine sulfenic acid intermediate. The second step of the reaction, the resolution of the intermediate, distinguishes the different types of PRXs. The presence or absence of a second cysteine (the resolving cysteine) classifies PRXs as either belonging to the 2-cys or 1-cys type. The resolving cysteine of 2-cys PRXs is either on the same chain (atypical) or on the second chain (typical) of a functional homodimer. Structural and motif analysis of this growing family supports the need for a new classification system. The peroxidase activity of PRXs is regulated in vivo by irreversible cysteine over-oxidation into a sulfinic acid, phosphorylation and limited proteolysis. 48522 cd02973: Thioredoxin (TRX)-Glutaredoxin (GRX)-like family; composed of archaeal and bacterial proteins that show similarity to both TRX and GRX, including the C-terminal TRX-fold subdomain of Pyrococcus furiosus protein disulfide oxidoreductase (PfPDO). All members contain a redox-active CXXC motif and may function as PDOs. The archaeal proteins Mj0307 and Mt807 show structures more similar to GRX, but activities more similar to TRX. Some members of the family are similar to PfPDO in that they contain a second CXXC motif located in a second TRX-fold subdomain at the N-terminus; the superimposable N- and C-terminal TRX subdomains form a compact structure. 
PfPDO is postulated to be the archaeal counterpart of bacterial DsbA and eukaryotic protein disulfide isomerase (PDI). The C-terminal CXXC motif of PfPDO is required for its oxidase, reductase and isomerase activities. Also included in the family is the C-terminal TRX-fold subdomain of the N-terminal domain (NTD) of bacterial AhpF, which has a fold similar to that of PfPDO with two TRX-fold subdomains but without the second CXXC motif. 48523 cd02974: Alkyl hydroperoxide reductase F subunit (AhpF) N-terminal domain (NTD) family, N-terminal TRX-fold subdomain; AhpF is a homodimeric flavoenzyme which catalyzes the NADH-dependent reduction of the peroxiredoxin AhpC, which in turn catalyzes the reduction of hydrogen peroxide and organic hydroperoxides. AhpF contains an NTD forming two contiguous TRX-fold subdomains similar to Pyrococcus furiosus protein disulfide oxidoreductase (PfPDO). It also contains a catalytic core similar to TRX reductase containing FAD and NADH binding domains with an active site disulfide. The proposed mechanism of action of AhpF is similar to a TRX/TRX reductase system. The flow of reducing equivalents goes from NADH -> catalytic core of AhpF -> NTD of AhpF -> AhpC -> peroxide substrates. The N-terminal TRX-fold subdomain of AhpF NTD is redox inactive, but is proposed to contain an important residue that aids in the catalytic function of the redox-active CXXC motif contained in the C-terminal TRX-fold subdomain. 48524 cd02975: Pyrococcus furiosus protein disulfide oxidoreductase (PfPDO)-like family, N-terminal TRX-fold subdomain; composed of proteins with similarity to PfPDO, a redox active thermostable protein believed to be the archaeal counterpart of bacterial DsbA and eukaryotic protein disulfide isomerase (PDI), which are both involved in oxidative protein folding. PfPDO contains two redox active CXXC motifs in two contiguous TRX-fold subdomains. 
The active site in the N-terminal TRX-fold subdomain is required for isomerase but not for reductase activity of PfPDO. The exclusive presence of PfPDO-like proteins in extremophiles may suggest that they have a special role in adaptation to extreme conditions. 48525 cd02976: NrdH-redoxin (NrdH) family; NrdH is a small monomeric protein with a conserved redox active CXXC motif within a TRX fold, characterized by a glutaredoxin (GRX)-like sequence and TRX-like activity profile. In vitro, it displays protein disulfide reductase activity that is dependent on TRX reductase, not glutathione (GSH). It is part of the NrdHIEF operon, where NrdEF codes for class Ib ribonucleotide reductase (RNR-Ib), an efficient enzyme at low oxygen levels. Under these conditions when GSH is mostly conjugated to spermidine, NrdH can still function and act as a hydrogen donor for RNR-Ib. It has been suggested that the NrdHEF system may be the oldest RNR reducing system, capable of functioning in a microaerophilic environment, where GSH was not yet available. NrdH from Corynebacterium ammoniagenes can form domain-swapped dimers, although it is unknown if this happens in vivo. Domain-swapped dimerization, which results in the blocking of the TRX reductase binding site, could be a mechanism for regulating the oxidation state of the protein. 48526 cd02977: Arsenate Reductase (ArsC) family; composed of TRX-fold arsenic reductases and similar proteins including the transcriptional regulator, Spx. ArsC catalyzes the reduction of arsenate [As(V)] to arsenite [As(III)], using reducing equivalents derived from glutathione (GSH) via glutaredoxin (GRX), through a single catalytic cysteine. This family of predominantly bacterial enzymes is unrelated to two other families of arsenate reductases which show similarity to low-molecular-weight acid phosphatases and phosphotyrosyl phosphatases. 
Spx is a general regulator that exerts negative and positive control over transcription initiation by binding to the C-terminal domain of the alpha subunit of RNA polymerase. 48527 cd02978: KaiB-like family; composed of the circadian clock proteins, KaiB and the N-terminal KaiB-like sensory domain of SasA. KaiB is an essential protein in maintaining circadian rhythm. It was originally discovered from the cyanobacterium Synechococcus as part of the circadian clock gene cluster, kaiABC. KaiB attenuates KaiA-enhanced KaiC autokinase activity by interacting with KaiA-KaiC complexes in a circadian fashion. KaiB is membrane-associated as well as cytosolic. The amount of membrane-associated protein peaks in the evening (at circadian time (CT) 12-16) while the cytosolic form peaks later (at CT 20). The rhythmic localization of KaiB may function in regulating the formation of Kai complexes. SasA is a sensory histidine kinase which associates with KaiC. Although it is not an essential oscillator component, it is important in enhancing kaiABC expression and in metabolic growth control under day/night cycle conditions. SasA contains an N-terminal sensory domain with a TRX fold which is involved in the SasA-KaiC interaction. This domain shows high sequence similarity with KaiB. However, the KaiB structure does not show a classical TRX fold. The N-terminal half of KaiB shares the same beta-alpha-beta topology as TRX, but the topology of its C-terminal half diverges. 48528 cd02979: FAD-dependent Phenol hydroxylase (PHOX) family, C-terminal TRX-fold domain; composed of proteins similar to PHOX from the aerobic topsoil yeast Trichosporon cutaneum. PHOX is a flavoprotein monooxygenase that catalyzes the hydroxylation of phenol and simple phenol derivatives in the ortho position with the consumption of NADPH and oxygen. This is the first step in the biodegradation and detoxification of phenolic compounds. PHOX contains three domains. 
The substrate and FAD/NAD(P) binding sites are contained in the first two domains, which adopt a complicated folding pattern. The third or C-terminal domain contains a TRX fold and is involved in dimerization. The functional unit of PHOX is a dimer, although active tetramers of the recombinant enzyme can be isolated when overproduced in bacteria. 48529 cd02980: Thioredoxin (TRX)-like [2Fe-2S] Ferredoxin (Fd) family; composed of [2Fe-2S] Fds with a TRX fold (TRX-like Fds) and proteins containing domains similar to TRX-like Fd including formate dehydrogenases, NAD-reducing hydrogenases and the subunit E of NADH:ubiquinone oxidoreductase (NuoE). TRX-like Fds are soluble low-potential electron carriers containing a single [2Fe-2S] cluster. The exact role of TRX-like Fd is still unclear. It has been suggested that it may be involved in nitrogen fixation. Its homologous domains in large redox enzymes (such as Nuo and hydrogenases) function as electron carriers. 48530 cd02981: Protein Disulfide Isomerase (PDIb) family, redox inactive TRX-like domain b; composed of eukaryotic proteins involved in oxidative protein folding in the endoplasmic reticulum (ER) by acting as catalysts and folding assistants. Members of this family include PDI, calsequestrin and other PDI-related proteins like ERp72, ERp57, ERp44 and PDIR. PDI, ERp57 (or ERp60), ERp72 and PDIR are all oxidases, catalyzing the formation of disulfide bonds of newly synthesized polypeptides in the ER. They also exhibit reductase activity in acting as isomerases to correct any non-native disulfide bonds, as well as chaperone activity to prevent protein aggregation and facilitate the folding of newly synthesized proteins. These proteins contain multiple copies of a redox active TRX (a) domain containing a CXXC motif, and one or more redox inactive TRX-like (b) domains. The molecular structure of PDI is abb'a'. 
Also included in this family is the PDI-related protein ERp27, which contains only redox-inactive TRX-like (b and b') domains. The redox inactive b domains are implicated in substrate recognition. 48531 cd02982: Protein Disulfide Isomerase (PDIb') family, redox inactive TRX-like domain b'; composed of eukaryotic proteins involved in oxidative protein folding in the endoplasmic reticulum (ER) by acting as catalysts and folding assistants. Members of this family include PDI, calsequestrin and other PDI-related proteins like ERp72, ERp57 (or ERp60), ERp44, P5 and PDIR. PDI, ERp57, ERp72, P5 and PDIR are all oxidases, catalyzing the formation of disulfide bonds of newly synthesized polypeptides in the ER. They also exhibit reductase activity in acting as isomerases to correct any non-native disulfide bonds, as well as chaperone activity to prevent protein aggregation and facilitate the folding of newly synthesized proteins. These proteins contain multiple copies of a redox active TRX (a) domain containing a CXXC motif, and one or more redox inactive TRX-like (b) domains. The molecular structure of PDI is abb'a'. Also included in this family is the PDI-related protein ERp27, which contains only redox-inactive TRX-like (b and b') domains. The redox inactive domains are implicated in substrate recognition with the b' domain serving as the primary substrate binding site. Only the b' domain is necessary for the binding of small peptide substrates. In addition to the b' domain, other domains are required for the binding of larger polypeptide substrates. The b' domain is also implicated in chaperone activity. 48532 cd02983: P5 family, C-terminal redox inactive TRX-like domain; P5 is a protein disulfide isomerase (PDI)-related protein with a domain structure of aa'b (where a and a' are redox active TRX domains and b is a redox inactive TRX-like domain). 
Like PDI, P5 is located in the endoplasmic reticulum (ER) and displays both isomerase and chaperone activities, which are independent of each other. Compared to PDI, the isomerase and chaperone activities of P5 are lower. The first cysteine in the CXXC motif of both redox active domains in P5 is necessary for isomerase activity. The P5 gene was first isolated as an amplified gene from a hydroxyurea-resistant hamster cell line. The zebrafish P5 homolog has been implicated in establishing left/right asymmetries in the embryonic midline. The C-terminal domain is likely involved in substrate binding, similar to the b and b' domains of PDI. 48533 cd02984: TRX domain, PICOT (for PKC-interacting cousin of TRX) subfamily; PICOT is a protein that interacts with protein kinase C (PKC) theta, a calcium independent PKC isoform selectively expressed in skeletal muscle and T lymphocytes. PICOT contains an N-terminal TRX-like domain, which does not contain the catalytic CXXC motif, followed by one to three glutaredoxin domains. The TRX-like domain is required for interaction with PKC theta. PICOT inhibits the activation of c-Jun N-terminal kinase and the transcription factors, AP-1 and NF-kB, induced by PKC theta or T-cell activating stimuli. 48534 cd02985: TRX family, chloroplastic drought-induced stress protein of 32 kD (CDSP32); CDSP32 is composed of two TRX domains, a C-terminal TRX domain which contains a redox active CXXC motif and an N-terminal TRX-like domain which contains an SXXS sequence instead of the redox active motif. CDSP32 is a stress-inducible TRX, i.e., it acts as a TRX by reducing protein disulfides and is induced by environmental and oxidative stress conditions. It plays a critical role in plastid defense against oxidative damage, a role related to its function as a physiological electron donor to BAS1, a plastidic 2-cys peroxiredoxin. 
Plants lacking CDSP32 exhibit decreased photosystem II photochemical efficiencies and chlorophyll retention compared to WT controls, as well as an increased proportion of BAS1 in its overoxidized monomeric form. 48535 cd02986: Dim1 family, Dim1-like protein (DLP) subfamily; DLP is a novel protein which shares 38% sequence identity to Dim1. Like Dim1, it is also implicated in pre-mRNA splicing and cell cycle progression. DLP is located in the nucleus and has been shown to interact with the U5 small nuclear ribonucleoprotein particle (snRNP)-specific 102kD protein (or Prp6). Dim1 protein, also known as U5 snRNP-specific 15kD protein is a component of U5 snRNP, which pre-assembles with U4/U6 snRNPs to form a [U4/U6:U5] tri-snRNP complex required for pre-mRNA splicing. Dim1 adopts a thioredoxin fold but does not contain the redox active CXXC motif. 48537 cd02988: Phosducin (Phd)-like family, Viral inhibitor of apoptosis (IAP)-associated factor (VIAF) subfamily; VIAF is a Phd-like protein that functions in caspase activation during apoptosis. It was identified as an IAP binding protein through a screen of a human B-cell library using a prototype IAP. VIAF lacks a consensus IAP binding motif and while it does not function as an IAP antagonist, it still plays a regulatory role in the complete activation of caspases. VIAF itself is a substrate for IAP-mediated ubiquitination, suggesting that it may be a target of IAPs in the prevention of cell death. The similarity of VIAF to Phd points to a potential role distinct from apoptosis regulation. Phd functions as a cytosolic regulator of G protein by specifically binding to G protein betagamma (Gbg)-subunits. The C-terminal domain of Phd adopts a thioredoxin fold, but it does not contain a CXXC motif. Phd interacts with G protein beta mostly through the N-terminal helical domain. 
48538 cd02989: Phosducin (Phd)-like family, Thioredoxin (TRX) domain containing protein 9 (TxnDC9) subfamily; composed of predominantly uncharacterized eukaryotic proteins, containing a TRX-like domain without the redox active CXXC motif. The gene name for the human protein is TxnDC9. The two characterized members are described as Phd-like proteins, PLP1 of Saccharomyces cerevisiae and PhLP3 of Dictyostelium discoideum. Gene disruption experiments show that both PLP1 and PhLP3 are non-essential proteins. Unlike Phd and most Phd-like proteins, members of this group do not contain the Phd N-terminal helical domain which is implicated in binding to the G protein betagamma subunit. 48539 cd02990: UAS family, FAS-associated factor 1 (FAF1) subfamily; FAF1 contains a UAS domain of unknown function N-terminal to a ubiquitin-associated UBX domain. FAF1 also contains ubiquitin-associated UBA and nuclear targeting domains, N-terminal to the UAS domain. FAF1 is an apoptotic signaling molecule that acts downstream in the Fas signal transduction pathway. It interacts with the cytoplasmic domain of Fas, but not with a Fas mutant that is deficient in signal transduction. It is widely expressed in adult and embryonic tissues, and in tumor cell lines, and is localized not only in the cytoplasm where it interacts with Fas, but also in the nucleus. FAF1 contains phosphorylation sites for protein kinase CK2 within the nuclear targeting domain. Phosphorylation influences nuclear localization of FAF1 but does not affect its potentiation of Fas-induced apoptosis. Other functions have also been attributed to FAF1. It inhibits nuclear factor-kB (NF-kB) by interfering with the nuclear translocation of the p65 subunit. FAF1 also interacts with valosin-containing protein (VCP), which is involved in the ubiquitin-proteasome pathway. 
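The entries in this listing run together on long lines, each beginning with a numeric accession followed by a `cd` identifier and a colon. A minimal Python sketch for splitting such text back into individual records follows. It assumes every record starts with the pattern `digits cdNNNNN: ` seen above; the function name `split_records` and the shortened sample text are illustrative only and are not part of any database tooling.

```python
import re

# Assumed record header: a numeric accession, a space, "cd" plus five digits, then ": ".
RECORD_START = re.compile(r"\b(\d+) (cd\d{5}): ")

def split_records(text):
    """Split run-together text into (accession, cd_id, description) tuples."""
    starts = list(RECORD_START.finditer(text))
    records = []
    for i, match in enumerate(starts):
        # A description runs until the next record header, or to the end of the text.
        end = starts[i + 1].start() if i + 1 < len(starts) else len(text)
        description = text[match.end():end].strip()
        records.append((int(match.group(1)), match.group(2), description))
    return records

# Shortened, illustrative sample in the same layout as the listing.
sample = (
    "48538 cd02989: Phd-like family. "
    "48539 cd02990: UAS family, FAF1 subfamily; FAF1 contains a UAS domain."
)
records = split_records(sample)
```

Text that precedes the first header (for example the tail of an entry continued from an earlier line) is ignored by this sketch, so a caller that needs it would have to handle that case separately.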
48540 cd02991: UAS family, ETEA subfamily; composed of proteins similar to human ETEA protein, the translation product of a highly expressed gene in the T-cells and eosinophils of atopic dermatitis patients compared with those of normal individuals. ETEA shows homology to Fas-associated factor 1 (FAF1); both contain UAS and UBX (ubiquitin-associated) domains. Compared to FAF1, however, ETEA lacks the ubiquitin-associated UBA domain and a nuclear targeting domain. The function of ETEA is still unknown. A yeast two-hybrid assay showed that it can interact with Fas. Because of its homology to FAF1, it is postulated that ETEA could be involved in modulating Fas-mediated apoptosis of T-cells and eosinophils of atopic dermatitis patients, making them more resistant to apoptosis. 48541 cd02992: PDIa family, Quiescin-sulfhydryl oxidase (QSOX) subfamily; QSOX is a eukaryotic protein containing an N-terminal redox active TRX domain, similar to that of PDI, and a small C-terminal flavin adenine dinucleotide (FAD)-binding domain homologous to the yeast ERV1p protein. Like PDI, QSOX oxidizes thiol groups to disulfides; unlike PDI, however, this oxidation is accompanied by the reduction of oxygen to hydrogen peroxide. QSOX is localized in high concentrations in cells with heavy secretory load and prefers peptides and proteins as substrates, not monothiols like glutathione. Inside the cell, QSOX is found in the endoplasmic reticulum and Golgi. The flow of reducing equivalents in a QSOX-catalyzed reaction goes from the dithiol substrate -> dithiol of the QSOX TRX domain -> dithiols of the QSOX ERV1p domain -> FAD -> oxygen. 48542 cd02993: PDIa family, 5'-Adenylylsulfate (APS) reductase subfamily; composed of plant-type APS reductases containing a C-terminal redox active TRX domain and an N-terminal reductase domain which is part of a superfamily that includes N type ATP PPases. 
APS reductase catalyzes the reduction of activated sulfate to sulfite, a key step in the biosynthesis of sulfur-containing metabolites. Sulfate is first activated by ATP sulfurylase, forming APS, which can be phosphorylated to 3'-phosphoadenosine-5'-phosphosulfate (PAPS). Depending on the organism, either APS or PAPS can be used for sulfate reduction. Prokaryotes and fungi use PAPS, whereas plants use both APS and PAPS. Since plant-type APS reductase uses glutathione (GSH) as its electron donor, the C-terminal domain may function like glutaredoxin, a GSH-dependent member of the TRX superfamily. The flow of reducing equivalents goes from GSH -> C-terminal TRX domain -> N-terminal reductase domain -> APS. Plant-type APS reductase shows no homology to that of dissimilatory sulfate-reducing bacteria, which is an iron-sulfur flavoenzyme. Also included in the alignment is EYE2 from Chlamydomonas reinhardtii, a protein required for eyespot assembly. 48543 cd02994: PDIa family, TMX subfamily; composed of proteins similar to the TRX-related human transmembrane protein, TMX. TMX is a type I integral membrane protein; the N-terminal redox active TRX domain is present in the endoplasmic reticulum (ER) lumen while the C-terminus is oriented towards the cytoplasm. It is expressed in many cell types and its active site motif (CPAC) is unique. In vitro, TMX reduces interchain disulfides of insulin and renatures inactive RNase containing incorrect disulfide bonds. The C. elegans homolog, DPY-11, is expressed only in the hypodermis and resides in the cytoplasm. It is required for body and sensory organ morphogenesis. Another uncharacterized TRX-related transmembrane protein, human TMX4, is included in the alignment. The active site sequence of TMX4 is CPSC. 48544 cd02995: PDIa family, C-terminal TRX domain (a') subfamily; composed of the C-terminal redox active a' domains of PDI, ERp72, ERp57 (or ERp60) and EFP1. 
PDI, ERp72 and ERp57 are endoplasmic reticulum (ER)-resident eukaryotic proteins involved in oxidative protein folding. They are oxidases, catalyzing the formation of disulfide bonds of newly synthesized polypeptides in the ER. They also exhibit reductase activity in acting as isomerases to correct any non-native disulfide bonds, as well as chaperone activity to prevent protein aggregation and facilitate the folding of newly synthesized proteins. PDI and ERp57 have the abb'a' domain structure (where a and a' are redox active TRX domains while b and b' are redox inactive TRX-like domains). PDI also contains an acidic region (c domain) after the a' domain that is absent in ERp57. ERp72 has an additional a domain at the N-terminus (a"abb'a' domain structure). ERp57 interacts with the lectin chaperones, calnexin and calreticulin, and specifically promotes the oxidative folding of glycoproteins, while PDI shows a wider substrate specificity. ERp72 associates with several ER chaperones and folding factors to form complexes in the ER that bind nascent proteins. EFP1 is a binding partner protein of thyroid oxidase, which is responsible for the generation of hydrogen peroxide, a crucial substrate of thyroperoxidase, which functions to iodinate thyroglobulin and synthesize thyroid hormones. 48545 cd02996: PDIa family, endoplasmic reticulum protein 44 (ERp44) subfamily; ERp44 is an ER-resident protein, induced during stress, involved in thiol-mediated ER retention. It contains an N-terminal TRX domain, similar to that of PDIa, with a CXFS motif followed by two redox inactive TRX-like domains, homologous to the b and b' domains of PDI. The CXFS motif in the N-terminal domain allows ERp44 to form stable reversible mixed disulfides with its substrates. Through this activity, ERp44 mediates the ER localization of Ero1alpha, a protein that oxidizes protein disulfide isomerases into their active form. 
ERp44 also prevents the secretion of unassembled cargo protein with unpaired cysteines. It also modulates the activity of inositol 1,4,5-trisphosphate type I receptor (IP3R1), an intracellular channel protein that mediates calcium release from the ER to the cytosol. 48546 cd02997: PDIa family, PDIR subfamily; composed of proteins similar to human PDIR (for Protein Disulfide Isomerase Related). PDIR is composed of three redox active TRX (a) domains and an N-terminal redox inactive TRX-like (b) domain. Similar to PDI, it is involved in oxidative protein folding in the endoplasmic reticulum (ER) through its isomerase and chaperone activities. These activities are lower than those of PDI, probably due to PDIR acting only on a subset of proteins. PDIR is preferentially expressed in cells actively secreting proteins and its expression is induced by stress. Similar to PDI, the isomerase and chaperone activities of PDIR are independent; CXXC mutants lacking isomerase activity retain chaperone activity. 48547 cd02998: PDIa family, endoplasmic reticulum protein 38 (ERp38) subfamily; composed of proteins similar to the P5-like protein first isolated from alfalfa, which contains two redox active TRX (a) domains at the N-terminus, like human P5, and a C-terminal domain with homology to the C-terminal domain of ERp29, unlike human P5. The cDNA clone of this protein (named G1) was isolated from an alfalfa cDNA library by screening with human protein disulfide isomerase (PDI) cDNA. The G1 protein is constitutively expressed in all major organs of the plant and its expression is induced by treatment with tunicamycin, indicating that it may be a glucose-regulated protein. The G1 homolog in the eukaryotic social amoeba Dictyostelium discoideum is also described as a P5-like protein, which is located in the endoplasmic reticulum (ER) despite the absence of an ER-retrieval signal. 
G1 homologs from Aspergillus niger and Neurospora crassa have also been characterized, and are named TIGA and ERp38, respectively. Also included in the alignment is an atypical PDI from Leishmania donovani containing a single a domain, and the C-terminal a domain of a P5-like protein from Entamoeba histolytica. 48548 cd02999: PDIa family, endoplasmic reticulum protein 44 (ERp44)-like subfamily; composed of uncharacterized PDI-like eukaryotic proteins containing only one redox active TRX (a) domain with a CXXS motif, similar to ERp44. CXXS is still a redox active motif; however, the mixed disulfide formed with the substrate is more stable than those formed by CXXC motif proteins. PDI-related proteins are usually involved in the oxidative protein folding in the ER by acting as catalysts and folding assistants. ERp44 is involved in thiol-mediated retention in the ER. 48549 cd03000: PDIa family, TMX3 subfamily; composed of eukaryotic proteins similar to human TMX3, a TRX related transmembrane protein containing one redox active TRX domain at the N-terminus and a classical ER retrieval sequence for type I transmembrane proteins at the C-terminus. The TMX3 transcript is found in a variety of tissues with the highest levels detected in skeletal muscle and the heart. In vitro, TMX3 showed oxidase activity albeit slightly lower than that of protein disulfide isomerase. 48550 cd03001: PDIa family, P5 subfamily; composed of eukaryotic proteins similar to human P5, a PDI-related protein with a domain structure of aa'b (where a and a' are redox active TRX domains and b is a redox inactive TRX-like domain). Like PDI, P5 is located in the endoplasmic reticulum (ER) and displays both isomerase and chaperone activities, which are independent of each other. Compared to PDI, the isomerase and chaperone activities of P5 are lower. The first cysteine in the CXXC motif of both redox active domains in P5 is necessary for isomerase activity. 
The P5 gene was first isolated as an amplified gene from a hydroxyurea-resistant hamster cell line. The zebrafish P5 homolog has been implicated in establishing left/right asymmetries in the embryonic midline. Some members of this subfamily are P5-like proteins containing only one redox active TRX domain. 48551 cd03002: PDI family, MPD1-like subfamily; composed of eukaryotic proteins similar to Saccharomyces cerevisiae MPD1 protein, which contains a single redox active TRX domain located at the N-terminus, and an ER retention signal at the C-terminus indicative of an ER-resident protein. MPD1 has been shown to suppress the maturation defect of carboxypeptidase Y caused by deletion of the yeast PDI1 gene. Other characterized members of this subfamily include the Aspergillus niger prpA protein and Giardia PDI-1. PrpA is non-essential to strain viability; however, its transcript level is induced by heterologous protein expression, suggesting a possible role in oxidative protein folding during high protein production. Giardia PDI-1 has the ability to refold scrambled RNase and exhibits transglutaminase activity. 48552 cd03003: PDIa family, N-terminal ERdj5 subfamily; ERdj5, also known as JPDI and macrothioredoxin, is a protein containing an N-terminal DnaJ domain and four redox active TRX domains. This subfamily is composed of the first TRX domain of ERdj5 located after the DnaJ domain at the N-terminal half of the protein. ERdj5 is a ubiquitous protein localized in the endoplasmic reticulum (ER) and is abundant in secretory cells. Its transcription is induced during ER stress. It interacts with BiP through its DnaJ domain in an ATP-dependent manner. BiP, an ER-resident member of the Hsp70 chaperone family, functions in ER-associated degradation and protein translocation. 
48553 cd03004: PDIa family, C-terminal ERdj5 subfamily; ERdj5, also known as JPDI and macrothioredoxin, is a protein containing an N-terminal DnaJ domain and four redox active TRX domains. This subfamily is composed of the three TRX domains located at the C-terminal half of the protein. ERdj5 is a ubiquitous protein localized in the endoplasmic reticulum (ER) and is abundant in secretory cells. Its transcription is induced during ER stress. It interacts with BiP through its DnaJ domain in an ATP-dependent manner. BiP, an ER-resident member of the Hsp70 chaperone family, functions in ER-associated degradation and protein translocation. Also included in the alignment is the single complete TRX domain of an uncharacterized protein from Tetraodon nigroviridis, which also contains a DnaJ domain at its N-terminus. 48554 cd03005: PDIa family, endoplasmic reticulum protein 46 (ERp46) subfamily; ERp46 is an ER-resident protein containing three redox active TRX domains. Yeast complementation studies show that ERp46 can substitute for protein disulfide isomerase (PDI) function in vivo. It has been detected in many tissues; however, transcript and protein levels do not correlate in all tissues, suggesting regulation at a posttranscriptional level. An identical protein, named endoPDI, has been identified as an endothelial PDI that is highly expressed in the endothelium of tumors and hypoxic lesions. It has a protective effect on cells exposed to hypoxia. 48555 cd03006: PDIa family, N-terminal EFP1 subfamily; EFP1 is a binding partner protein of thyroid oxidase (ThOX), also called Duox. ThOX proteins are responsible for the generation of hydrogen peroxide, a crucial substrate of thyroperoxidase, which functions to iodinate thyroglobulin and synthesize thyroid hormones. EFP1 was isolated through a yeast two-hybrid method using the EF-hand fragment of dog Duox1 as a bait. 
It could be one of the partners in the assembly of a multiprotein complex constituting the thyroid hydrogen peroxide generating system. EFP1 contains two TRX domains related to the redox active TRX domains of protein disulfide isomerase (PDI). This subfamily is composed of the N-terminal TRX domain of EFP1, which contains a CXXS sequence in place of the typical CXXC motif, similar to ERp44. The CXXS motif allows the formation of stable mixed disulfides, crucial for the ER-retention function of ERp44. 48556 cd03007: PDIa family, endoplasmic reticulum protein 29 (ERp29) subfamily; ERp29 is a ubiquitous ER-resident protein expressed in high levels in secretory cells. It forms homodimers and higher oligomers in vitro and in vivo. It contains a redox inactive TRX-like domain at the N-terminus, which is homologous to the redox active TRX (a) domains of PDI, and a C-terminal helical domain similar to the C-terminal domain of P5. The expression profile of ERp29 suggests a role in secretory protein production distinct from that of PDI. It has also been identified as a member of the thyroglobulin folding complex. The Drosophila homolog, Wind, is the product of windbeutel, an essential gene in the development of dorsal-ventral patterning. Wind is required for correct targeting of Pipe, a Golgi-resident type II transmembrane protein with homology to 2-O-sulfotransferase. 48557 cd03008: Tryparedoxin (TryX)-like family, Rod-derived cone viability factor (RdCVF) subfamily; RdCVF is a thioredoxin (TRX)-like protein specifically expressed in photoreceptors. RdCVF was isolated and identified as a factor that supports cone survival in retinal cultures. Cone photoreceptor loss is responsible for the visual handicap resulting from the inherited disease, retinitis pigmentosa. RdCVF shows 33% similarity to TRX but does not exhibit any detectable thiol oxidoreductase activity. 
48558 cd03009: Tryparedoxin (TryX)-like family, TryX and nucleoredoxin (NRX) subfamily; TryX and NRX are thioredoxin (TRX)-like protein disulfide oxidoreductases that alter the redox state of target proteins via the reversible oxidation of an active center CXXC motif. TryX is involved in the regulation of oxidative stress in parasitic trypanosomatids by reducing TryX peroxidase, which in turn catalyzes the reduction of hydrogen peroxide and organic hydroperoxides. TryX derives reducing equivalents from reduced trypanothione, a polyamine peptide conjugate unique to trypanosomatids, which is regenerated by the NADPH-dependent flavoprotein trypanothione reductase. Vertebrate NRX is a 400-amino acid nuclear protein with one redox active TRX domain containing a CPPC active site motif followed by one redox inactive TRX-like domain. Mouse NRX transcripts are expressed in all adult tissues but are restricted to the nervous system and limb buds in embryos. Plant NRX, longer than the vertebrate NRX by about 100-200 amino acids, is a nuclear protein containing a redox inactive TRX-like domain between two redox active TRX domains. Both vertebrate and plant NRXs show thiol oxidoreductase activity in vitro. Their localization in the nucleus suggests a role in the redox regulation of nuclear proteins such as transcription factors. 48559 cd03010: TlpA-like family, DsbE (also known as CcmG and CycY) subfamily; DsbE is a membrane-anchored, periplasmic TRX-like reductase containing a CXXC motif that specifically donates reducing equivalents to apocytochrome c via CcmH, another cytochrome c maturation (Ccm) factor with a redox active CXXC motif. Assembly of cytochrome c requires the ligation of heme to reduced thiols of the apocytochrome. In bacteria, this assembly occurs in the periplasm. The reductase activity of DsbE in the oxidizing environment of the periplasm is crucial in the maturation of cytochrome c. 
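Many of the entries here distinguish domains by whether they carry a CXXC motif (two cysteines separated by any two residues, as in CPPC or CPAC) or a CXXS variant. A minimal Python sketch of how such motifs could be located in a one-letter amino acid sequence is given below. The function name `find_motifs` and the toy sequence are invented for illustration, and a sequence match alone says nothing about whether a motif is redox active or where it sits in the fold.

```python
import re

# Lookahead so that overlapping candidates (e.g. CXXCXXC) are all reported.
MOTIF = re.compile(r"(?=(C..[CS]))")

def find_motifs(sequence):
    """Return (1-based position, matched residues, motif class) for each hit."""
    hits = []
    for match in MOTIF.finditer(sequence.upper()):
        residues = match.group(1)
        motif_class = "CXXC" if residues.endswith("C") else "CXXS"
        hits.append((match.start() + 1, residues, motif_class))
    return hits

# Toy sequence made up for this example; it is not taken from any real protein.
toy_sequence = "MAWCGPCKAAACPFSLL"
hits = find_motifs(toy_sequence)
```

The pattern treats any residue as acceptable in the two middle positions, which matches the generic CXXC and CXXS notation used in the entries; a stricter search (for example only CPPC) would replace the two dots with the specific residues.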
48560 cd03011: TlpA-like family, suppressor for copper sensitivity D protein (ScsD) and actinobacterial DsbE homolog subfamily; composed of ScsD, the DsbE homolog of Mycobacterium tuberculosis (MtbDsbE) and similar proteins, all containing a redox-active CXXC motif. The Salmonella typhimurium ScsD is a thioredoxin-like protein which confers copper tolerance to copper-sensitive mutants of E. coli. MtbDsbE has been characterized as an oxidase in vitro, catalyzing the disulfide bond formation of substrates like hirudin. The reduced form of MtbDsbE is more stable than its oxidized form, consistent with an oxidase function. This is in contrast to the function of DsbE from gram-negative bacteria which is a specific reductase of apocytochrome c. 48561 cd03012: TlpA-like family, DipZ-like subfamily; composed of uncharacterized proteins containing a TlpA-like TRX domain. Some members show domain architectures similar to that of E. coli DipZ protein (also known as DsbD). The only eukaryotic members of the TlpA family belong to this subfamily. TlpA is a disulfide reductase known to have a crucial role in the biogenesis of cytochrome aa3. 48562 cd03013: Peroxiredoxin (PRX) family, PRX5-like subfamily; members are similar to the human protein, PRX5, a homodimeric TRX peroxidase, widely expressed in tissues and found cellularly in mitochondria, peroxisomes and the cytosol. The cellular location of PRX5 suggests that it may have an important antioxidant role in organelles that are major sources of reactive oxygen species (ROS), as well as a role in the control of signal transduction. PRX5 has been shown to reduce hydrogen peroxide, alkyl hydroperoxides and peroxynitrite. As with all other PRXs, the N-terminal peroxidatic cysteine of PRX5 is oxidized into a sulfenic acid intermediate upon reaction with peroxides. 
Human PRX5 is able to resolve this intermediate by forming an intramolecular disulfide bond with its C-terminal cysteine (the resolving cysteine), which can then be reduced by TRX, just like an atypical 2-cys PRX. This resolving cysteine, however, is not conserved in other members of the subfamily. In such cases, it is assumed that the oxidized cysteine is directly resolved by an external small-molecule or protein reductant, typical of a 1-cys PRX. In the case of the H. influenzae PRX5 hybrid, the resolving glutaredoxin domain is on the same protein chain as PRX. PRX5 homodimers show an A-type interface, similar to atypical 2-cys PRXs. 48563 cd03014: Peroxiredoxin (PRX) family, Atypical 2-cys PRX subfamily; composed of PRXs containing peroxidatic and resolving cysteines, similar to the homodimeric thiol specific antioxidant (TSA) protein also known as TRX-dependent thiol peroxidase (Tpx). Tpx is a bacterial periplasmic peroxidase which differs from other PRXs in that it shows substrate specificity toward alkyl hydroperoxides over hydrogen peroxide. As with all other PRXs, the peroxidatic cysteine (N-terminal) of Tpx is oxidized into a sulfenic acid intermediate upon reaction with peroxides. Tpx is able to resolve this intermediate by forming an intramolecular disulfide bond with a conserved C-terminal cysteine (the resolving cysteine), which can then be reduced by thioredoxin. This differs from the typical 2-cys PRX which resolves the oxidized cysteine by forming an intermolecular disulfide bond with the resolving cysteine from the other subunit of the homodimer. Atypical 2-cys PRX homodimers have a loop-based interface (A-type for alternate), in contrast with the B-type interface of typical 2-cys and 1-cys PRXs. 
48564 cd03015: Peroxiredoxin (PRX) family, Typical 2-Cys PRX subfamily; PRXs are thiol-specific antioxidant (TSA) proteins, which confer a protective role in cells through their peroxidase activity by reducing hydrogen peroxide, peroxynitrite, and organic hydroperoxides. The functional unit of typical 2-cys PRX is a homodimer. A unique intermolecular redox-active disulfide center is utilized for its activity. Upon reaction with peroxides, its peroxidatic cysteine is oxidized into a sulfenic acid intermediate which is resolved by bonding with the resolving cysteine from the other subunit of the homodimer. This intermolecular disulfide bond is then reduced by thioredoxin, tryparedoxin or AhpF. Typical 2-cys PRXs, like 1-cys PRXs, form decamers which are stabilized by reduction of the active site cysteine. Typical 2-cys PRX interacts through beta strands at one edge of the monomer (B-type interface) to form the functional homodimer, and uses an A-type interface (similar to the dimeric interface in atypical 2-cys PRX and PRX5) at the opposite end of the monomer to form the stable decameric (pentamer of dimers) structure. 48565 cd03016: Peroxiredoxin (PRX) family, 1-cys PRX subfamily; composed of PRXs containing only one conserved cysteine, which serves as the peroxidatic cysteine. They are homodimeric thiol-specific antioxidant (TSA) proteins that confer a protective role in cells by reducing and detoxifying hydrogen peroxide, peroxynitrite, and organic hydroperoxides. As with all other PRXs, a cysteine sulfenic acid intermediate is formed upon reaction of 1-cys PRX with its substrates. Having no resolving cysteine, the oxidized enzyme is resolved by an external small-molecule or protein reductant such as thioredoxin or glutaredoxin. Similar to typical 2-cys PRX, 1-cys PRX forms a functional dimeric unit with a B-type interface, as well as a decameric structure which is stabilized in the reduced form of the enzyme. 
Other oligomeric forms, tetramers and hexamers, have also been reported. Mammalian 1-cys PRX is localized cellularly in the cytosol and is expressed at high levels in brain, eye, testes and lung. The seed-specific plant 1-cys PRXs protect tissues from reactive oxygen species during desiccation and are also called rehydrins. 48566 cd03017: Peroxiredoxin (PRX) family, Bacterioferritin comigratory protein (BCP) subfamily; composed of thioredoxin-dependent thiol peroxidases, widely expressed in pathogenic bacteria, that protect cells against toxicity from reactive oxygen species by reducing and detoxifying hydroperoxides. The protein was named BCP based on its electrophoretic mobility before its function was known. BCP shows substrate selectivity toward fatty acid hydroperoxides rather than hydrogen peroxide or alkyl hydroperoxides. BCP contains the peroxidatic cysteine but appears not to possess a resolving cysteine (some sequences, not all, contain a second cysteine but its role is still unknown). Unlike other PRXs, BCP exists as a monomer. The plant homolog of BCP is PRX Q, which is expressed only in leaves and is cellularly localized in the chloroplasts and the guard cells of stomata. Also included in this subfamily is the fungal nuclear protein, Dot5p (for disrupter of telomere silencing protein 5), which functions as an alkyl-hydroperoxide reductase during post-diauxic growth. 48567 cd03018: Peroxiredoxin (PRX) family, AhpE-like subfamily; composed of proteins similar to Mycobacterium tuberculosis AhpE. AhpE is described as a 1-cys PRX because of the absence of a resolving cysteine. The structure and sequence of AhpE, however, show greater similarity to 2-cys PRXs than 1-cys PRXs. 
PRXs are thiol-specific antioxidant (TSA) proteins that confer a protective role in cells through their peroxidase activity in which hydrogen peroxide, peroxynitrite, and organic hydroperoxides are reduced and detoxified using reducing equivalents derived from thioredoxin, glutathione, trypanothione or AhpF. The first step of catalysis is the nucleophilic attack by the peroxidatic cysteine on the peroxide leading to the formation of a cysteine sulfenic acid intermediate. The absence of a resolving cysteine suggests that functional AhpE is regenerated by an external reductant. The solution behavior and crystal structure of AhpE show that it forms dimers and octamers. 48568 cd03019: DsbA family, DsbA subfamily; DsbA is a monomeric thiol disulfide oxidoreductase protein containing a redox active CXXC motif imbedded in a TRX fold. It is involved in the oxidative protein folding pathway in prokaryotes, and is the strongest thiol oxidant known, due to the unusual stability of the thiolate anion form of the first cysteine in the CXXC motif. The highly unstable oxidized form of DsbA directly donates disulfide bonds to reduced proteins secreted into the bacterial periplasm. This rapid and unidirectional process helps to catalyze the folding of newly-synthesized polypeptides. To regain catalytic activity, reduced DsbA is then reoxidized by the membrane protein DsbB, which generates its disulfides from oxidized quinones, which in turn are reoxidized by the electron transport chain. 48569 cd03020: DsbA family, DsbC and DsbG subfamily; V-shaped homodimeric proteins containing a redox active CXXC motif imbedded in a TRX fold. They function as protein disulfide isomerases and chaperones in the bacterial periplasm to correct non-native disulfide bonds formed by DsbA and prevent aggregation of incorrectly folded proteins. 
DsbC and DsbG are kept in their reduced state by the cytoplasmic membrane protein DsbD, which utilizes the TRX/TRX reductase system in the cytosol as a source of reducing equivalents. DsbG differs from DsbC in that it has a more limited substrate specificity, and it may preferentially act later in the folding process to catalyze disulfide rearrangements in folded or partially folded proteins. Also included in the alignment is the predicted protein TrbB, whose gene was sequenced from the enterohemorrhagic E. coli type IV pilus gene cluster, which is required for efficient plasmid transfer. 48570 cd03021: DsbA family, Glutathione (GSH) S-transferase Kappa (GSTK) subfamily; GSTK is a member of the GST family of enzymes which catalyzes the transfer of the thiol of GSH to electrophilic substrates. It is specifically located in the mitochondria and peroxisomes, unlike other members of the canonical GST family, which are mainly cytosolic. The biological substrates of GSTK are not yet known. It is presumed to have a protective role during respiration when large amounts of reactive oxygen species are generated. GSTK has the same general fold as DsbA, consisting of a thioredoxin domain interrupted by an alpha-helical domain and its biological unit is a homodimer. GSTK is closely related to the bacterial enzyme, 2-hydroxychromene-2-carboxylate (HCCA) isomerase. It shows little sequence similarity to the other members of the GST family. 48571 cd03022: DsbA family, 2-hydroxychromene-2-carboxylate (HCCA) isomerase subfamily; HCCA isomerase is a glutathione (GSH) dependent enzyme involved in the naphthalene catabolic pathway. It converts HCCA, a hemiketal formed spontaneously after ring cleavage of 1,2-dihydroxynaphthalene by a dioxygenase, into cis-o-hydroxybenzylidenepyruvate (cHBPA). This is the fourth reaction in a six-step pathway that converts naphthalene into salicylate. HCCA isomerase is unique to bacteria that degrade polycyclic aromatic compounds. 
It is closely related to the eukaryotic protein, GSH transferase kappa (GSTK). 48572 cd03023: DsbA family, Com1-like subfamily; composed of proteins similar to Com1, a 27-kDa outer membrane-associated immunoreactive protein originally found in both acute and chronic disease strains of the pathogenic bacterium Coxiella burnetii. It contains a CXXC motif, assumed to be imbedded in a DsbA-like structure. Its homology to DsbA suggests that the protein is a protein disulfide oxidoreductase. The role of such a protein in pathogenesis is unknown. 48573 cd03024: DsbA family, FrnE subfamily; FrnE is a DsbA-like protein containing a CXXC motif. It is presumed to be a thiol oxidoreductase involved in polyketide biosynthesis, specifically in the production of the aromatic antibiotics frenolicin and nanaomycins. 48574 cd03025: DsbA family, FrnE-like subfamily; composed of uncharacterized proteins containing a CXXC motif with similarity to DsbA and FrnE. FrnE is presumed to be a thiol oxidoreductase involved in polyketide biosynthesis, specifically in the production of the aromatic antibiotics frenolicin and nanaomycins. 48575 cd03026: TRX-GRX-like family, Alkyl hydroperoxide reductase F subunit (AhpF) N-terminal domain (NTD) subfamily, C-terminal TRX-fold subdomain; AhpF is a homodimeric flavoenzyme which catalyzes the NADH-dependent reduction of the peroxiredoxin AhpC, which then reduces hydrogen peroxide and organic hydroperoxides. AhpF contains an NTD containing two contiguous TRX-fold subdomains similar to Pyrococcus furiosus protein disulfide oxidoreductase (PfPDO). It also contains a catalytic core similar to TRX reductase containing FAD and NADH binding domains with an active site disulfide. The proposed mechanism of action of AhpF is similar to a TRX/TRX reductase system. The flow of reducing equivalents goes from NADH -> catalytic core of AhpF -> NTD of AhpF -> AhpC -> peroxide substrates. 
The catalytic CXXC motif of the NTD of AhpF is contained in its C-terminal TRX subdomain. 48576 cd03027: Glutaredoxin (GRX) family, Dishevelled, Egl-10, and Pleckstrin (DEP) subfamily; composed of uncharacterized proteins containing a GRX domain and additional domains DEP and DUF547, both of which have unknown functions. GRX is a glutathione (GSH) dependent reductase containing a redox active CXXC motif in a TRX fold. It has preference for mixed GSH disulfide substrates, in which it uses a monothiol mechanism where only the N-terminal cysteine is required. By altering the redox state of target proteins, GRX is involved in many cellular functions. 48577 cd03028: Glutaredoxin (GRX) family, PKC-interacting cousin of TRX (PICOT)-like subfamily; composed of PICOT and GRX-PICOT-like proteins. The non-PICOT members of this family contain only the GRX-like domain, whereas PICOT contains an N-terminal TRX-like domain followed by one to three GRX-like domains. It is interesting to note that PICOT from plants contains three repeats of the GRX-like domain, metazoan proteins (except for insect) have two repeats, while fungal sequences contain only one copy of the domain. PICOT is a protein that interacts with protein kinase C (PKC) theta, a calcium independent PKC isoform selectively expressed in skeletal muscle and T lymphocytes. PICOT inhibits the activation of c-Jun N-terminal kinase and the transcription factors, AP-1 and NF-kB, induced by PKC theta or T-cell activating stimuli. Both GRX and TRX domains of PICOT are required for its activity. Characterized non-PICOT members of this family include CXIP1, a CAX-interacting protein in Arabidopsis thaliana, and PfGLP-1, a GRX-like protein from Plasmodium falciparum. 48578 cd03029: Glutaredoxin (GRX) family, PRX5 hybrid subfamily; composed of hybrid proteins containing peroxiredoxin (PRX) and GRX domains, which are found in some pathogenic bacteria and cyanobacteria. 
PRXs are thiol-specific antioxidant (TSA) proteins that confer a protective antioxidant role in cells through their peroxidase activity in which hydrogen peroxide, peroxynitrite, and organic hydroperoxides are reduced and detoxified using reducing equivalents derived from thioredoxin, glutathione, trypanothione or AhpF. GRX is a glutathione (GSH) dependent reductase, catalyzing the disulfide reduction of target proteins. PRX-GRX hybrid proteins from Haemophilus influenzae and Neisseria meningitidis exhibit GSH-dependent peroxidase activity. The flow of reducing equivalents in the catalytic cycle of the hybrid protein goes from NADPH -> GSH reductase -> GSH -> GRX domain of hybrid -> PRX domain of hybrid -> peroxide substrate. 48579 cd03030: Glutaredoxin (GRX) family, SH3BGR (SH3 domain binding glutamic acid-rich protein) subfamily; a recently-identified subfamily composed of SH3BGR and similar proteins possessing significant sequence similarity to GRX, but without a redox active CXXC motif. The SH3BGR gene was cloned in an effort to identify genes mapping to chromosome 21, which could be involved in the pathogenesis of congenital heart disease affecting Down syndrome newborns. Several human SH3BGR-like (SH3BGRL) genes have been identified since, mapping to different locations in the chromosome. Of these, SH3BGRL3 was identified as a tumor necrosis factor (TNF) alpha inhibitory protein and was also named TIP-B1. Upregulation of expression of SH3BGRL3 is associated with differentiation. It has been suggested that it functions as a regulator of differentiation-related signal transduction pathways. 48580 cd03031: Glutaredoxin (GRX) family, GRX-like domain containing protein subfamily; composed of uncharacterized eukaryotic proteins containing a GRX-like domain having only one conserved cysteine, aligning to the C-terminal cysteine of the CXXC motif of GRXs. This subfamily is predominantly composed of plant proteins. 
GRX is a glutathione (GSH) dependent reductase, catalyzing the disulfide reduction of target proteins via a redox active CXXC motif using a dithiol mechanism similar to that employed by TRXs. GRX has preference for mixed GSH disulfide substrates, in which it uses a monothiol mechanism where only the N-terminal cysteine is required. Proteins containing only the C-terminal cysteine are generally redox inactive. 48581 cd03032: Arsenate Reductase (ArsC) family, Spx subfamily; Spx is a unique RNA polymerase (RNAP)-binding protein present in bacilli and some mollicutes. It inhibits transcription by binding to the C-terminal domain of the alpha subunit of RNAP, disrupting complex formation between RNAP and certain transcriptional activator proteins like ResD and ComA. In response to oxidative stress, Spx can also activate transcription, making it a general regulator that exerts both positive and negative control over transcription initiation. Spx has been shown to exert redox-sensitive transcriptional control over genes like trxA (TRX) and trxB (TRX reductase), genes that function in thiol homeostasis. This redox-sensitive activity is dependent on the presence of a CXXC motif, present in some members of the Spx subfamily, that acts as a thiol/disulfide switch. Spx has also been shown to repress genes in a sulfate-dependent manner independent of the presence of the CXXC motif. 48582 cd03033: Arsenate Reductase (ArsC) family, 15kD protein subfamily; composed of proteins of unknown function with similarity to thioredoxin-fold arsenic reductases, ArsC. It is encoded by an ORF present in a gene cluster associated with nitrogen fixation that also encodes dinitrogenase reductase ADP-ribosyltransferase (DRAT) and dinitrogenase reductase activating glycohydrolase (DRAG). ArsC catalyzes the reduction of arsenate [As(V)] to arsenite [As(III)], using reducing equivalents derived from glutathione via glutaredoxin, through a single catalytic cysteine. 
48583 cd03034: Arsenate Reductase (ArsC) family, ArsC subfamily; arsenic reductases similar to that encoded by arsC on the R773 plasmid of Escherichia coli. E. coli ArsC catalyzes the reduction of arsenate [As(V)] to arsenite [As(III)], the first step in the detoxification of arsenic, using reducing equivalents derived from glutathione (GSH) via glutaredoxin (GRX). ArsC contains a single catalytic cysteine, within a thioredoxin fold, that forms a covalent thiolate-As(V) intermediate, which is reduced by GRX through a mixed GSH-arsenate intermediate. This family of predominantly bacterial enzymes is unrelated to two other families of arsenate reductases which show similarity to low-molecular-weight acid phosphatases and phosphotyrosyl phosphatases. 48584 cd03035: Arsenate Reductase (ArsC) family, Yffb subfamily; Yffb is an uncharacterized bacterial protein encoded by the yffb gene, related to the thioredoxin-fold arsenic reductases, ArsC. The structure of Yffb and the conservation of the catalytic cysteine suggest that it is likely to function as a glutathione (GSH)-dependent thiol reductase. ArsC catalyzes the reduction of arsenate [As(V)] to arsenite [As(III)], using reducing equivalents derived from GSH via glutaredoxin, through a single catalytic cysteine. 48585 cd03036: Arsenate Reductase (ArsC) family, unknown subfamily; uncharacterized proteins containing a CXXC motif with similarity to thioredoxin (TRX)-fold arsenic reductases, ArsC. Proteins containing a redox active CXXC motif like TRX and glutaredoxin (GRX) function as protein disulfide oxidoreductases, altering the redox state of target proteins via the reversible oxidation of the active site dithiol. ArsC catalyzes the reduction of arsenate [As(V)] to arsenite [As(III)], using reducing equivalents derived from glutathione via GRX, through a single catalytic cysteine. 48586 cd03037: GST_N family, Glutaredoxin 2 (GRX2) subfamily; composed of bacterial proteins similar to E. 
coli GRX2, an atypical GRX with a molecular mass of about 24kD, compared with other GRXs which are 9-12kD in size. GRX2 adopts a GST fold containing an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain. It contains a redox active CXXC motif located in the N-terminal domain but is not able to reduce ribonucleotide reductase like other GRXs. However, it catalyzes GSH-dependent protein disulfide reduction of other substrates efficiently. GRX2 is thought to function primarily in catalyzing the reversible glutathionylation of proteins in cellular redox regulation including stress responses. 48587 cd03038: GST_N family, Beta etherase LigE subfamily; composed of proteins similar to Sphingomonas paucimobilis beta etherase, LigE, a GST-like protein that catalyzes the cleavage of the beta-aryl ether linkages present in low-molecular-weight lignins using GSH as the hydrogen donor. This reaction is an essential step in the degradation of lignin, a complex phenolic polymer that is the most abundant aromatic material in the biosphere. The beta etherase activity of LigE is enantioselective and it complements the activity of the other GST family beta etherase, LigF. 48588 cd03039: GST_N family, Class Sigma_like; composed of GSTs belonging to class Sigma and similar proteins, including GSTs from class Mu, Pi and Alpha. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. Vertebrate class Sigma GSTs are characterized as GSH-dependent hematopoietic prostaglandin (PG) D synthases and are responsible for the production of PGD2 by catalyzing the isomerization of PGH2. 
The functions of PGD2 include the maintenance of body temperature, inhibition of platelet aggregation, bronchoconstriction, vasodilation and mediation of allergy and inflammation. Other class Sigma members include the class II insect GSTs, S-crystallins from cephalopods and 28-kDa GSTs from parasitic flatworms. Drosophila GST2 is associated with indirect flight muscle and exhibits preference for catalyzing GSH conjugation to lipid peroxidation products, indicating an anti-oxidant role. S-crystallin constitutes the major lens protein in cephalopod eyes and is responsible for lens transparency and proper refractive index. The 28-kDa GST from Schistosoma is a multifunctional enzyme, exhibiting GSH transferase, GSH peroxidase and PGD2 synthase activities, and may play an important role in host-parasite interactions. Also included as members are novel GSTs from the fungus Cunninghamella elegans, designated as class Gamma, and from the protozoan Blepharisma japonicum, described as a light-inducible GST. 48589 cd03040: GST_N family, microsomal Prostaglandin E synthase Type 2 (mPGES2) subfamily; mPGES2 is a membrane-anchored dimeric protein containing a CXXC motif which catalyzes the isomerization of PGH2 to PGE2. Unlike cytosolic PGE synthase (cPGES) and microsomal PGES Type 1 (mPGES1), mPGES2 does not require glutathione (GSH) for its activity, although its catalytic rate is increased two- to four-fold in the presence of DTT, GSH or other thiol compounds. PGE2 is widely distributed in various tissues and is implicated in the sleep/wake cycle, relaxation/contraction of smooth muscle, excretion of sodium ions, maintenance of body temperature and mediation of inflammation. mPGES2 contains an N-terminal hydrophobic domain which is membrane associated, and a C-terminal soluble domain with a GST-like structure. 48590 cd03041: GST_N family, 2 repeats of the N-terminal domain of soluble GSTs (2 GST_N) subfamily; composed of uncharacterized proteins. 
GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. 48591 cd03042: GST_N family, Class Zeta subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. Class Zeta GSTs, also known as maleylacetoacetate (MAA) isomerases, catalyze the isomerization of MAA to fumarylacetoacetate, the penultimate step in tyrosine/phenylalanine catabolism, using GSH as a cofactor. They show little GSH-conjugating activity towards traditional GST substrates but display modest GSH peroxidase activity. They are also implicated in the detoxification of the carcinogen dichloroacetic acid by catalyzing its dechlorination to glyoxylic acid. 48592 cd03043: GST_N family, unknown subfamily 1; composed of uncharacterized proteins, predominantly from bacteria, with similarity to GSTs. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. 
GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. 48593 cd03044: GST_N family, Gamma subunit of Elongation Factor 1B (EF1Bgamma) subfamily; EF1Bgamma is part of the eukaryotic translation elongation factor-1 (EF1) complex which plays a central role in the elongation cycle during protein biosynthesis. EF1 consists of two functionally distinct units, EF1A and EF1B. EF1A catalyzes the GTP-dependent binding of aminoacyl-tRNA to the ribosomal A site concomitant with the hydrolysis of GTP. The resulting inactive EF1A:GDP complex is recycled to the active GTP form by the guanine-nucleotide exchange factor EF1B, a complex composed of at least two subunits, alpha and gamma. Metazoan EF1B contains a third subunit, beta. The EF1B gamma subunit contains a GST fold consisting of an N-terminal TRX-fold domain and a C-terminal alpha helical domain. The GST-like domain of EF1Bgamma is believed to mediate the dimerization of the EF1 complex, which in yeast is a dimer of the heterotrimer EF1A:EF1Balpha:EF1Bgamma. In addition to its role in protein biosynthesis, EF1Bgamma may also display other functions. The recombinant rice protein has been shown to possess GSH conjugating activity. The yeast EF1Bgamma binds membranes in a calcium dependent manner and is also part of a complex that binds to the msrA (methionine sulfoxide reductase) promoter suggesting a function in the regulation of its gene expression. 48594 cd03045: GST_N family, Class Delta and Epsilon subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. 
GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. The class Delta and Epsilon subfamily is made up primarily of insect GSTs, which play major roles in insecticide resistance by facilitating reductive dehydrochlorination of insecticides or conjugating them with GSH to produce water-soluble metabolites that are easily excreted. They are also implicated in protection against cellular damage by oxidative stress. 48595 cd03046: GST_N family, Saccharomyces cerevisiae GTT1-like subfamily; composed of predominantly uncharacterized proteins with similarity to the S. cerevisiae GST protein, GTT1, and the Schizosaccharomyces pombe GST-III. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GTT1, a homodimer, exhibits GST activity with standard substrates and associates with the endoplasmic reticulum. Its expression is induced after diauxic shift and remains high throughout the stationary phase. S. pombe GST-III is implicated in the detoxification of various metals. 48596 cd03047: GST_N family, unknown subfamily 2; composed of uncharacterized bacterial proteins with similarity to GSTs. 
GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. The sequence from Burkholderia cepacia was identified as part of a gene cluster involved in the degradation of 2,4,5-trichlorophenoxyacetic acid. Some GSTs (e.g. Class Zeta and Delta) are known to catalyze dechlorination reactions. 48597 cd03048: GST_N family, Ure2p-like subfamily; composed of the Saccharomyces cerevisiae Ure2p and related GSTs. Ure2p is a regulator for nitrogen catabolism in yeast. It represses the expression of several gene products involved in the use of poor nitrogen sources when rich sources are available. A transmissible conformational change of Ure2p results in a prion called [Ure3], an inactive, self-propagating and infectious amyloid. Ure2p displays a GST fold containing an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. The N-terminal TRX-fold domain is sufficient to induce the [Ure3] phenotype and is also called the prion domain of Ure2p. In addition to its role in nitrogen regulation, Ure2p confers protection to cells against heavy metal ion and oxidant toxicity, and shows glutathione (GSH) peroxidase activity. Characterized GSTs in this subfamily include Aspergillus fumigatus GSTs 1 and 2, and Schizosaccharomyces pombe GST-I. 
GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of GSH with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. 48598 cd03049: GST_N family, unknown subfamily 3; composed of uncharacterized bacterial proteins with similarity to GSTs. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. 48599 cd03050: GST_N family, Class Theta subfamily; composed of eukaryotic class Theta GSTs and bacterial dichloromethane (DCM) dehalogenase. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. Mammalian class Theta GSTs show poor GSH conjugating activity towards the standard substrates, CDNB and ethacrynic acid, differentiating them from other mammalian GSTs. GSTT1-1 shows catalytic activity similar to that of bacterial DCM dehalogenase, catalyzing the GSH-dependent hydrolytic dehalogenation of dihalomethanes. 
This is an essential process in methylotrophic bacteria to enable them to use chloromethane and DCM as sole carbon and energy sources. The presence of polymorphisms in human GSTT1-1 and its relationship to the onset of diseases including cancer is the subject of many studies. Human GSTT2-2 exhibits a highly specific sulfatase activity, catalyzing the cleavage of sulfate ions from aralkyl sulfate esters, but not from aryl or alkyl sulfate esters. 48600 cd03051: GST_N family, Saccharomyces cerevisiae GTT2-like subfamily; composed of predominantly uncharacterized proteins with similarity to the S. cerevisiae GST protein, GTT2. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GTT2, a homodimer, exhibits GST activity with standard substrates. Strains with deleted GTT2 genes are viable but exhibit increased sensitivity to heat shock. 48601 cd03052: GST_N family, Ganglioside-induced differentiation-associated protein 1 (GDAP1) subfamily; GDAP1 was originally identified as a highly expressed gene at the differentiated stage of GD3 synthase-transfected cells. More recently, mutations in GDAP1 have been reported to cause both axonal and demyelinating autosomal-recessive Charcot-Marie-Tooth (CMT) type 4A neuropathy. CMT is characterized by slow and progressive weakness and atrophy of muscles. 
Sequence analysis of GDAP1 shows similarities and differences with GSTs; it appears to contain both the N-terminal TRX-fold and C-terminal alpha helical domains of GSTs, but, unlike GSTs, it also contains additional C-terminal transmembrane domains. GDAP1 is mainly expressed in neuronal cells and is localized in the mitochondria through its transmembrane domains. It does not exhibit GST activity using standard substrates. 48602 cd03053: GST_N family, Class Phi subfamily; composed of plant-specific class Phi GSTs and related fungal and bacterial proteins. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. The class Phi GST subfamily has experienced extensive gene duplication. The Arabidopsis and Oryza genomes contain 13 and 16 Phi GSTs, respectively. They are primarily responsible for herbicide detoxification together with class Tau GSTs, showing class specificity in substrate preference. Phi enzymes are highly reactive toward chloroacetanilide and thiocarbamate herbicides. Some Phi GSTs have other functions including transport of flavonoid pigments to the vacuole, shoot regeneration and GSH peroxidase activity. 48603 cd03054: GST_N family, Metaxin subfamily; composed of metaxins and related proteins. Metaxin 1 is a component of a preprotein import complex of the mitochondrial outer membrane. It extends to the cytosol and is anchored to the mitochondrial membrane through its C-terminal domain. In mice, metaxin is required for embryonic development. In humans, alterations in the metaxin gene may be associated with Gaucher disease.
Metaxin 2 binds to metaxin 1 and may also play a role in protein translocation into the mitochondria. Genome sequencing shows that a third metaxin gene also exists in zebrafish, Xenopus, chicken and mammals. Sequence analysis suggests that all three metaxins share a common ancestry and that they possess similarity to GSTs. Also included in the subfamily are uncharacterized proteins with similarity to metaxins, including a novel GST from Rhodococcus with toluene o-monooxygenase and glutamylcysteine synthetase activities. 48604 cd03055: GST_N family, Class Omega subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. Class Omega GSTs show little or no GSH-conjugating activity towards standard GST substrates. Instead, they catalyze the GSH dependent reduction of protein disulfides, dehydroascorbate and monomethylarsonate, activities which are more characteristic of glutaredoxins. They contain a conserved cysteine equivalent to the first cysteine in the CXXC motif of glutaredoxins, which is a redox active residue capable of reducing GSH mixed disulfides in a monothiol mechanism. Polymorphisms of the class Omega GST genes may be associated with the development of some types of cancer and the age-at-onset of both Alzheimer's and Parkinson's diseases. 48605 cd03056: GST_N family, unknown subfamily 4; composed of uncharacterized bacterial proteins with similarity to GSTs. 
GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. 48606 cd03057: GST_N family, Class Beta subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. Unlike mammalian GSTs which detoxify a broad range of compounds, the bacterial class Beta GSTs exhibit limited GSH conjugating activity with a narrow range of substrates. In addition to GSH conjugation, they also bind antibiotics and reduce the antimicrobial activity of beta-lactam drugs. The structure of the Proteus mirabilis enzyme reveals that the cysteine in the active site forms a covalent bond with GSH. 48607 cd03058: GST_N family, Class Tau subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. 
The plant-specific class Tau GST subfamily has undergone extensive gene duplication. The Arabidopsis and Oryza genomes contain 28 and 40 Tau GSTs, respectively. They are primarily responsible for herbicide detoxification together with class Phi GSTs, showing class specificity in substrate preference. Tau enzymes are highly efficient in detoxifying diphenylether and aryloxyphenoxypropionate herbicides. In addition, Tau GSTs play important roles in intracellular signalling, biosynthesis of anthocyanin, responses to soil stresses and responses to auxin and cytokinin hormones. 48608 cd03059: GST_N family, Stringent starvation protein A (SspA) subfamily; SspA is an RNA polymerase (RNAP)-associated protein required for the lytic development of phage P1 and for stationary phase-induced acid tolerance of E. coli. It is implicated in survival during nutrient starvation. SspA adopts the GST fold with an N-terminal TRX-fold domain and a C-terminal alpha helical domain, but it does not bind glutathione (GSH) and lacks GST activity. SspA is highly conserved among gram-negative bacteria. Related proteins found in Neisseria (called RegF), Francisella and Vibrio regulate the expression of virulence factors necessary for pathogenesis. 48609 cd03060: GST_N family, Omega-like subfamily; composed of uncharacterized proteins with similarity to class Omega GSTs. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. Class Omega GSTs show little or no GSH-conjugating activity towards standard GST substrates.
Instead, they catalyze the GSH dependent reduction of protein disulfides, dehydroascorbate and monomethylarsonate, activities which are more characteristic of glutaredoxins. Like Omega enzymes, proteins in this subfamily contain a conserved cysteine equivalent to the first cysteine in the CXXC motif of glutaredoxins, which is a redox active residue capable of reducing GSH mixed disulfides in a monothiol mechanism. 48610 cd03061: GST_N family, Chloride Intracellular Channel (CLIC) subfamily; composed of CLIC1-5, p64, parchorin and similar proteins. They are auto-inserting, self-assembling intracellular anion channels involved in a wide variety of functions including regulated secretion, cell division and apoptosis. They can exist in both water-soluble and membrane-bound states, and are found in various vesicles and membranes. Biochemical studies of the C. elegans homolog, EXC-4, show that the membrane localization domain is present in the N-terminal part of the protein. The structure of soluble human CLIC1 reveals that it is monomeric and adopts a fold similar to GSTs, containing an N-terminal domain with a TRX fold and a C-terminal alpha helical domain. Upon oxidation, the N-terminal domain of CLIC1 undergoes a structural change to form a non-covalent dimer stabilized by the formation of an intramolecular disulfide bond between two cysteines that are far apart in the reduced form. The CLIC1 dimer bears no similarity to GST dimers. The redox-controlled structural rearrangement exposes a large hydrophobic surface, which is masked by dimerization in vitro. In vivo, this surface may represent the docking interface of CLIC1 in its membrane-bound state. The two cysteines in CLIC1 that form the disulfide bond in oxidizing conditions are essential for dimerization and chloride channel activity; however, in other subfamily members, the second cysteine is not conserved.
48611 cd03062: TRX-like [2Fe-2S] Ferredoxin (Fd) family, Sucrase subfamily; composed of proteins with similarity to a novel plant enzyme, isolated from potato, which contains a Fd-like domain and exhibits sucrolytic activity. The putative active site of the Fd-like domain of the enzyme contains two cysteines and two histidines for possible binding to iron-sulfur clusters, compared to four cysteines present in the active site of Fd. 48612 cd03063: TRX-like [2Fe-2S] Ferredoxin (Fd) family, NAD-dependent formate dehydrogenase (FDH) beta subunit; composed of proteins similar to the beta subunit of NAD-linked FDH of Ralstonia eutropha, a soluble enzyme that catalyzes the irreversible oxidation of formate to carbon dioxide accompanied by the reduction of NAD to NADH. FDH is a heteromeric enzyme composed of four nonidentical subunits (alpha, beta, gamma and delta). The FDH beta subunit contains a NADH:ubiquinone oxidoreductase (Nuo) F domain C-terminal to a Fd-like domain without the active site cysteines. The absence of conserved metal-binding residues in the putative active site suggests that members of this subfamily have lost the ability to bind iron-sulfur clusters in the N-terminal Fd-like domain. The C-terminal NuoF domain is a component of Nuo, a multisubunit complex catalyzing the electron transfer of NADH to quinone coupled with the transfer of protons across the membrane. NuoF contains one [4Fe-4S] cluster and binds NADH and FMN. 48613 cd03064: TRX-like [2Fe-2S] Ferredoxin (Fd) family, NADH:ubiquinone oxidoreductase (Nuo) subunit E subfamily; Nuo, also called respiratory chain Complex 1, is the entry point for electrons into the respiratory chains of bacteria and the mitochondria of eukaryotes. It is a multisubunit complex with at least 14 core subunits. It catalyzes the electron transfer of NADH to quinone coupled with the transfer of protons across the membrane, providing the proton motive force required for energy-consuming processes. 
Electrons are transferred from NADH to quinone through a chain of iron-sulfur clusters in Nuo, including the [2Fe-2S] cluster present in the NuoE core subunit, also called the 24 kD subunit of Complex 1. This subfamily also includes formate dehydrogenases, NiFe hydrogenases and NAD-reducing hydrogenases that contain a NuoE domain. A subset of these proteins contains both NuoE and NuoF in a single chain. NuoF, also called the 51 kD subunit of Complex 1, contains one [4Fe-4S] cluster and also binds the NADH substrate and FMN. 48614 cd03065: PDIb family, Calsequestrin subfamily, N-terminal TRX-fold domain; Calsequestrin is the major calcium storage protein in the sarcoplasmic reticulum (SR) of skeletal and cardiac muscle. It stores calcium ions in sufficient quantities (up to 20 mM) to allow repetitive contractions and is essential to maintain movement, respiration and heart beat. A missense mutation in human cardiac calsequestrin is associated with catecholamine-induced polymorphic ventricular tachycardia (CPVT), a rare disease characterized by seizures or sudden death in response to physiologic or emotional stress. Calsequestrin is a highly acidic protein with up to 50 calcium binding sites formed simply by the clustering of two or more acidic residues. The monomer contains three redox inactive TRX-fold domains. Calsequestrin is condensed as a linear polymer in the SR lumen and is membrane-anchored through binding with intra-membrane proteins triadin, junctin and ryanodine receptor (RyR) Ca2+ release channel. In addition to its role as a calcium ion buffer, calsequestrin also regulates the activity of the RyR channel, coordinating the release of calcium ions from the SR with the loading of the calcium store. The N-terminal TRX-fold domain (or domain I) mediates front-to-front dimer interaction, an important feature in the formation of calsequestrin polymers.
48615 cd03066: PDIb family, Calsequestrin subfamily, Middle TRX-fold domain; Calsequestrin is the major calcium storage protein in the sarcoplasmic reticulum (SR) of skeletal and cardiac muscle. It stores calcium ions in sufficient quantities (up to 20 mM) to allow repetitive contractions and is essential to maintain movement, respiration and heart beat. A missense mutation in human cardiac calsequestrin is associated with catecholamine-induced polymorphic ventricular tachycardia (CPVT), a rare disease characterized by seizures or sudden death in response to physiologic or emotional stress. Calsequestrin is a highly acidic protein with up to 50 calcium binding sites formed simply by the clustering of two or more acidic residues. The monomer contains three redox inactive TRX-fold domains. Calsequestrin is condensed as a linear polymer in the SR lumen and is membrane-anchored through binding with intra-membrane proteins triadin, junctin and ryanodine receptor (RyR) Ca2+ release channel. In addition to its role as a calcium ion buffer, calsequestrin also regulates the activity of the RyR channel, coordinating the release of calcium ions from the SR with the loading of the calcium store. 48616 cd03067: PDIb family, PDIR subfamily, N-terminal TRX-like b domain; composed of proteins similar to human PDIR (for Protein Disulfide Isomerase Related). PDIR is composed of three redox active TRX (a) domains and an N-terminal redox inactive TRX-like (b) domain. Similar to PDI, it is involved in oxidative protein folding in the endoplasmic reticulum (ER) through its isomerase and chaperone activities. These activities are lower compared to PDI, probably due to PDIR acting only on a subset of proteins. PDIR is preferentially expressed in cells actively secreting proteins and its expression is induced by stress. Similar to PDI, the isomerase and chaperone activities of PDIR are independent; CXXC mutants lacking isomerase activity retain chaperone activity. 
The TRX-like b domain of PDIR is critical for its chaperone activity. 48617 cd03068: PDIb family, ERp72 subfamily, first redox inactive TRX-like domain b; ERp72 exhibits both disulfide oxidase and reductase functions like PDI, by catalyzing the formation of disulfide bonds of newly synthesized polypeptides in the ER and acting as isomerases to correct any non-native disulfide bonds. It also displays chaperone activity to prevent protein aggregation and facilitate the folding of newly synthesized proteins. ERp72 contains three redox-active TRX (a) domains and two redox inactive TRX-like (b) domains. Its molecular structure is a"abb'a', compared to the abb'a' structure of PDI. ERp72 associates with several ER chaperones and folding factors to form complexes in the ER that bind nascent proteins. Similar to PDI, the b domain of ERp72 is likely involved in binding to substrates. 48618 cd03069: PDIb family, ERp57 subfamily, first redox inactive TRX-like domain b; ERp57 (or ERp60) exhibits both disulfide oxidase and reductase functions like PDI, by catalyzing the formation of disulfide bonds of newly synthesized polypeptides in the ER and acting as isomerases to correct any non-native disulfide bonds. It also displays chaperone activity to prevent protein aggregation and facilitate the folding of newly synthesized proteins. ERp57 contains two redox-active TRX (a) domains and two redox inactive TRX-like (b) domains. It shares the same domain arrangement of abb'a' as PDI, but lacks the C-terminal acid-rich region (c domain) that is present in PDI. ERp57 interacts with the lectin chaperones, calnexin and calreticulin, and specifically promotes the oxidative folding of glycoproteins. Similar to PDI, the b domain of ERp57 is likely involved in binding to substrates. 48619 cd03070: PDIb family, ERp44 subfamily, first redox inactive TRX-like domain b; ERp44 is an endoplasmic reticulum (ER)-resident protein, induced during stress, involved in thiol-mediated ER retention.
It contains an N-terminal TRX domain with a CXFS motif followed by two redox inactive TRX-like domains, homologous to the b and b' domains of PDI. Through the formation of reversible mixed disulfides, ERp44 mediates the ER localization of Ero1alpha, a protein that oxidizes protein disulfide isomerases into their active form. ERp44 also prevents the secretion of unassembled cargo protein with unpaired cysteines. ERp44 also modulates the activity of inositol 1,4,5-trisphosphate type I receptor (IP3R1), an intracellular channel protein that mediates calcium release from the ER to the cytosol. Similar to PDI, the b domain of ERp44 is likely involved in binding to substrates. 48620 cd03071: PDIb' family, NRX subgroup, redox inactive TRX-like domain b'; composed of vertebrate nucleoredoxins (NRX). NRX is a 400-amino acid nuclear protein with one redox active TRX domain followed by one redox inactive TRX-like domain homologous to the b' domain of PDI. In vitro studies show that NRX has thiol oxidoreductase activity and that it may be involved in the redox regulation of transcription, in a manner different from that of TRX or glutaredoxin. NRX enhances the activation of NF-kB by TNFalpha, as well as PMA-1 induced AP-1 and FK-induced CREB activation. Mouse NRX transcripts are expressed in all adult tissues but are restricted to the nervous system and limb buds in embryos. The mouse NRX gene is implicated in streptozotocin-induced diabetes. Similar to PDI, the b' domain of NRX is likely involved in substrate recognition. 48621 cd03072: PDIb' family, ERp44 subfamily, second redox inactive TRX-like domain b'; ERp44 is an endoplasmic reticulum (ER)-resident protein, induced during stress, involved in thiol-mediated ER retention. It contains an N-terminal TRX domain with a CXFS motif followed by two redox inactive TRX-like domains, homologous to the b and b' domains of PDI.
Through the formation of reversible mixed disulfides, ERp44 mediates the ER localization of Ero1alpha, a protein that oxidizes protein disulfide isomerases into their active form. ERp44 also prevents the secretion of unassembled cargo protein with unpaired cysteines. ERp44 also modulates the activity of inositol 1,4,5-trisphosphate type I receptor (IP3R1), an intracellular channel protein that mediates calcium release from the ER to the cytosol. Similar to PDI, the b' domain of ERp44 is likely involved in substrate recognition and may be the primary binding site. 48622 cd03073: PDIb' family, ERp72 and ERp57 subfamily, second redox inactive TRX-like domain b'; ERp72 and ERp57 are involved in oxidative protein folding in the ER, like PDI. They exhibit both disulfide oxidase and reductase functions, by catalyzing the formation of disulfide bonds of newly synthesized polypeptides and acting as isomerases to correct any non-native disulfide bonds. They also display chaperone activity to prevent protein aggregation and facilitate the folding of newly synthesized proteins. ERp57 contains two redox-active TRX (a) domains and two redox inactive TRX-like (b) domains. It shares the same domain arrangement of abb'a' as PDI, but lacks the C-terminal acid-rich region (c domain) that is present in PDI. ERp72 contains one additional redox-active TRX (a) domain at the N-terminus with a molecular structure of a"abb'a'. ERp57 interacts with the lectin chaperones, calnexin and calreticulin, and specifically promotes the oxidative folding of glycoproteins. ERp72 associates with several ER chaperones and folding factors to form complexes in the ER that bind nascent proteins. The b' domain of ERp57 is the primary binding site and is adapted for ER lectin association. Similarly, the b' domain of ERp72 is likely involved in substrate recognition.
48623 cd03074: Protein Disulfide Isomerase (PDIb') family, Calsequestrin subfamily, C-terminal TRX-fold domain; Calsequestrin is the major calcium storage protein in the sarcoplasmic reticulum (SR) of skeletal and cardiac muscle. It stores calcium ions in sufficient quantities (up to 20 mM) to allow repetitive contractions and is essential to maintain movement, respiration and heart beat. A missense mutation in human cardiac calsequestrin is associated with catecholamine-induced polymorphic ventricular tachycardia (CPVT), a rare disease characterized by seizures or sudden death in response to physiologic or emotional stress. Calsequestrin is a highly acidic protein with up to 50 calcium binding sites formed simply by the clustering of two or more acidic residues. The monomer contains three redox inactive TRX-fold domains. Calsequestrin is condensed as a linear polymer in the SR lumen and is membrane-anchored through binding with intra-membrane proteins triadin, junctin and ryanodine receptor (RyR) Ca2+ release channel. In addition to its role as a calcium ion buffer, calsequestrin also regulates the activity of the RyR channel, coordinating the release of calcium ions from the SR with the loading of the calcium store. The C-terminal TRX-fold domain (or domain III) mediates back-to-back dimer interaction and also contributes to the front-to-front dimer interface, both of which are important features in the formation of calsequestrin polymers. 48624 cd03075: GST_N family, Class Mu subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains.
The class Mu subfamily is composed of eukaryotic GSTs. In rats, at least six distinct class Mu subunits have been identified, with homologous genes in humans for five of these subunits. Class Mu GSTs can form homodimers and heterodimers, giving a large number of possible isoenzymes that can be formed, all with overlapping activities but different substrate specificities. They are the most abundant GSTs in human liver, skeletal muscle and brain, and are believed to provide protection against diseases including cancer and neurodegenerative disorders. Some isoenzymes have additional specific functions. Human GST M1-1 acts as an endogenous inhibitor of ASK1 (apoptosis signal-regulating kinase 1), thereby suppressing ASK1-mediated cell death. Human GSTM2-2 and 3-3 have been identified as prostaglandin E2 synthases in the brain and may play crucial roles in temperature and sleep-wake regulation. 48625 cd03076: GST_N family, Class Pi subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. Class Pi GST is a homodimeric eukaryotic protein. The human GSTP1 is mainly found in erythrocytes, kidney, placenta and fetal liver. It is involved in stress responses and in cellular proliferation pathways as an inhibitor of JNK (c-Jun N-terminal kinase). Following oxidative stress, monomeric GSTP1 dissociates from JNK and dimerizes, losing its ability to bind JNK and causing an increase in JNK activity, thereby promoting apoptosis. GSTP1 is expressed in various tumors and is the predominant GST in a wide range of cancer cells. 
It has been implicated in the development of multidrug-resistant tumours. 48626 cd03077: GST_N family, Class Alpha subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. The class Alpha subfamily is composed of eukaryotic GSTs which can form homodimers and heterodimers. There are at least six types of class Alpha GST subunits in rats, four of which have human counterparts, resulting in many possible isoenzymes with different activities, tissue distribution and substrate specificities. Human GSTA1-1 and GSTA2-2 show high GSH peroxidase activity. GSTA3-3 catalyzes the isomerization of intermediates in steroid hormone biosynthesis. GSTA4-4 preferentially catalyzes the GSH conjugation of alkenals. 48627 cd03078: GST_N family, Metaxin subfamily, Metaxin 1-like proteins; composed of metaxins 1 and 3, and similar proteins including Tom37 from fungi. Mammalian metaxin (or metaxin 1) and the fungal protein Tom37 are components of preprotein import complexes of the mitochondrial outer membrane. Metaxin extends to the cytosol and is anchored to the mitochondrial membrane through its C-terminal domain. In mice, metaxin is required for embryonic development. Like the murine gene, the human metaxin gene is located downstream to the glucocerebrosidase (GBA) pseudogene and is convergently transcribed. Inherited deficiency of GBA results in Gaucher disease, which presents many diverse clinical phenotypes. Alterations in the metaxin gene, in addition to GBA mutations, may be associated with Gaucher disease.
Genome sequencing shows that a third metaxin gene also exists in zebrafish, Xenopus, chicken and mammals. 48628 cd03079: GST_N family, Metaxin subfamily, Metaxin 2; a metaxin 1 binding protein identified through a yeast two-hybrid system using metaxin 1 as the bait. Metaxin 2 shares sequence similarity with metaxin 1 but does not contain a C-terminal mitochondrial outer membrane signal-anchor domain. It associates with mitochondrial membranes through its interaction with metaxin 1, which is a component of the mitochondrial preprotein import complex of the outer membrane. The biological function of metaxin 2 is unknown. It is likely that it also plays a role in protein translocation into the mitochondria. However, this has not been experimentally validated. In a recent proteomics study, it has been shown that metaxin 2 is overexpressed in response to lipopolysaccharide-induced liver injury. 48629 cd03080: GST_N family, Metaxin subfamily, Metaxin-like proteins; a heterogeneous group of proteins, predominantly uncharacterized, with similarity to metaxins and GSTs. Metaxin 1 is a component of a preprotein import complex of the mitochondrial outer membrane. It extends to the cytosol and is anchored to the mitochondrial membrane through its C-terminal domain. In mice, metaxin is required for embryonic development. In humans, alterations in the metaxin gene may be associated with Gaucher disease. One characterized member of this subgroup is a novel GST from Rhodococcus with toluene o-monooxygenase and gamma-glutamylcysteine synthetase activities. Other members are the cadmium-inducible lysosomal protein CDR-1 and its homologs from C. elegans, and the failed axon connections (fax) protein from Drosophila. CDR-1 is an integral membrane protein that functions to protect against cadmium toxicity and may also have a role in osmoregulation to maintain salt balance in C. elegans. The fax gene of Drosophila was identified as a genetic modifier of Abelson (Abl) tyrosine kinase.
The fax protein is localized in cellular membranes and is expressed in embryonic mesoderm and axons of the central nervous system. 48630 cd03081: TRX-like [2Fe-2S] Ferredoxin (Fd) family, NADH:ubiquinone oxidoreductase (Nuo) subunit E subfamily, NAD-dependent formate dehydrogenase (FDH) gamma subunit; composed of proteins similar to the gamma subunit of NAD-linked FDH of Ralstonia eutropha, a soluble enzyme that catalyzes the irreversible oxidation of formate to carbon dioxide accompanied by the reduction of NAD+ to NADH. FDH is a heteromeric enzyme composed of four nonidentical subunits (alpha, beta, gamma and delta). The FDH gamma subunit is closely related to NuoE, which is part of a multisubunit complex (Nuo) catalyzing the electron transfer of NADH to quinone coupled with the transfer of protons across the membrane. Electrons are transferred from NADH to quinone through a chain of iron-sulfur clusters in Nuo, including the [2Fe-2S] cluster present in NuoE. Similarly, the FDH gamma subunit is hypothesized to be involved in an electron transport chain involving other FDH subunits, upon the oxidation of formate. 48631 cd03082: TRX-like [2Fe-2S] Ferredoxin (Fd) family, NADH:ubiquinone oxidoreductase (Nuo) subunit E family, Tungsten-containing formate dehydrogenase (W-FDH) beta subunit; composed of proteins similar to the W-FDH beta subunit of Methylobacterium extorquens. W-FDH is a heterodimeric NAD-dependent enzyme catalyzing the conversion of formate to carbon dioxide. The beta subunit is a fusion protein containing an N-terminal NuoE domain and a C-terminal NuoF domain. NuoE and NuoF are components of Nuo, a multisubunit complex catalyzing the electron transfer of NADH to quinone coupled with the transfer of protons across the membrane. Electrons are transferred from NADH to quinone through a chain of iron-sulfur clusters in Nuo, including the [2Fe-2S] cluster in NuoE and the [4Fe-4S] cluster in NuoF. In addition, NuoF is also the NADH- and FMN-binding subunit. 
Similarly, the beta subunit of W-FDH is most likely involved in the electron transport chain during the NAD-dependent oxidation of formate. 48632 cd03083: TRX-like [2Fe-2S] Ferredoxin (Fd) family, NADH:ubiquinone oxidoreductase (Nuo) subunit E subfamily, hoxF; composed of proteins similar to the NAD-reducing hydrogenase (hoxS) alpha subunit of Alcaligenes eutrophus H16. HoxS is a cytoplasmic hydrogenase catalyzing the oxidation of molecular hydrogen accompanied by the reduction of NAD. It is composed of four structural subunits encoded by the genes hoxF, hoxU, hoxY and hoxH. The hoxF protein (or alpha subunit) is a fusion protein containing an N-terminal NuoE-like domain and a C-terminal NuoF domain. NuoE and NuoF are components of Nuo, a multisubunit complex catalyzing the electron transfer of NADH to quinone coupled with the transfer of protons across the membrane. Electrons are transferred from NADH to quinone through a chain of iron-sulfur clusters in Nuo, including the [2Fe-2S] cluster in NuoE and the [4Fe-4S] cluster in NuoF. In addition, NuoF is also the NADH- and FMN-binding subunit. HoxF may be involved in the electron transport chain during the NAD-dependent oxidation of hydrogen through its NuoF domain. The NuoE-like domain of hoxF contains only one conserved cysteine in its putative active site, compared to four cysteines in NuoE, and may have lost the ability to bind [2Fe-2S] clusters. 48633 cd03418: Glutaredoxin (GRX) family, GRX bacterial class 1 and 3 (b_1_3)-like subfamily; composed of bacterial GRXs, approximately 10 kDa in size, and proteins containing a GRX or GRX-like domain. GRX is a glutathione (GSH) dependent reductase, catalyzing the disulfide reduction of target proteins such as ribonucleotide reductase. It contains a redox active CXXC motif in a TRX fold and uses a similar dithiol mechanism employed by TRXs for intramolecular disulfide bond reduction of protein substrates. 
Unlike TRX, GRX has preference for mixed GSH disulfide substrates, in which it uses a monothiol mechanism where only the N-terminal cysteine is required. The flow of reducing equivalents in the GRX system goes from NADPH -> GSH reductase -> GSH -> GRX -> protein substrates. By altering the redox state of target proteins, GRX is involved in many cellular functions including DNA synthesis, signal transduction and the defense against oxidative stress. Different classes are known including E. coli GRX1 and GRX3, which are members of this subfamily. 48634 cd03419: Glutaredoxin (GRX) family, GRX human class 1 and 2 (h_1_2)-like subfamily; composed of proteins similar to human GRXs, approximately 10 kDa in size, and proteins containing a GRX or GRX-like domain. GRX is a glutathione (GSH) dependent reductase, catalyzing the disulfide reduction of target proteins such as ribonucleotide reductase. It contains a redox active CXXC motif in a TRX fold and uses a similar dithiol mechanism employed by TRXs for intramolecular disulfide bond reduction of protein substrates. Unlike TRX, GRX has preference for mixed GSH disulfide substrates, in which it uses a monothiol mechanism where only the N-terminal cysteine is required. The flow of reducing equivalents in the GRX system goes from NADPH -> GSH reductase -> GSH -> GRX -> protein substrates. By altering the redox state of target proteins, GRX is involved in many cellular functions including DNA synthesis, signal transduction and the defense against oxidative stress. Different classes are known including human GRX1 and GRX2, which are members of this subfamily. Also included in this subfamily are the N-terminal GRX domains of proteins similar to human thioredoxin reductase 1 and 3. 48635 cd02852: Isoamylase N-terminus domain. Isoamylase (aka glycogen 6-glucanohydrolase) is one of the starch-debranching enzymes that catalyzes the hydrolysis of alpha-1,6-glucosidic linkages specific in alpha-glucans such as amylopectin or glycogen. 
Isoamylase contains a bound calcium ion, but this is not in the same position as the conserved calcium ion that has been reported in other alpha-amylase family enzymes. The N-terminus of isoamylase may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase. 48636 cd00419: Ferrochelatase, C-terminal domain: Ferrochelatase (protoheme ferrolyase or HemH) is the terminal enzyme of the heme biosynthetic pathway. It catalyzes the insertion of ferrous iron into the protoporphyrin IX ring yielding protoheme. This enzyme is ubiquitous in nature and widely distributed in bacteria and eukaryotes. Recently, some archaeal members have been identified. The oligomeric state of these enzymes varies depending on the presence of a dimerization motif at the C-terminus. 48637 cd03409: Class II Chelatase: a family of ATP-independent monomeric or homodimeric enzymes that catalyze the insertion of metal into protoporphyrin rings. This family includes protoporphyrin IX ferrochelatase (HemH), sirohydrochlorin ferrochelatase (SirB) and the cobaltochelatases, CbiK and CbiX. HemH and SirB are involved in heme and siroheme biosynthesis, respectively, while the cobaltochelatases are associated with cobalamin biosynthesis. Excluded from this family are the ATP-dependent heterotrimeric chelatases (class I) and the multifunctional homodimeric enzymes with dehydrogenase and chelatase activities (class III). 48638 cd03411: Ferrochelatase, N-terminal domain: Ferrochelatase (protoheme ferrolyase or HemH) is the terminal enzyme of the heme biosynthetic pathway. 
It catalyzes the insertion of ferrous iron into the protoporphyrin IX ring yielding protoheme. This enzyme is ubiquitous in nature and widely distributed in bacteria and eukaryotes. Recently, some archaeal members have been identified. The oligomeric state of these enzymes varies depending on the presence of a dimerization motif at the C-terminus. 48639 cd03412: Anaerobic cobalamin biosynthetic cobalt chelatase (CbiK), N-terminal domain. CbiK is part of the cobalt-early path for cobalamin biosynthesis. It catalyzes the insertion of cobalt into the oxidized form of precorrin-2, factor II (sirohydrochlorin), the second step of the anaerobic branch of vitamin B12 biosynthesis. CbiK belongs to the class II family of chelatases and is a homomeric enzyme that does not require ATP for its enzymatic activity. 48640 cd03413: Anaerobic cobalamin biosynthetic cobalt chelatase (CbiK), C-terminal domain. CbiK is part of the cobalt-early path for cobalamin biosynthesis. It catalyzes the insertion of cobalt into the oxidized form of precorrin-2, factor II (sirohydrochlorin), the second step of the anaerobic branch of vitamin B12 biosynthesis. CbiK belongs to the class II family of chelatases, and is a homomeric enzyme that does not require ATP for its enzymatic activity. 48641 cd03414: Sirohydrochlorin cobalt chelatase (CbiX) and sirohydrochlorin iron chelatase (SirB), C-terminal domain. SirB catalyzes the ferro-chelation of sirohydrochlorin to siroheme, the prosthetic group of sulfite and nitrite reductases. CbiX is a cobaltochelatase, responsible for the chelation of Co2+ into sirohydrochlorin, an important step in the vitamin B12 biosynthetic pathway. CbiX often contains a C-terminal histidine-rich region that may be important for metal delivery and/or storage, and may also contain an iron-sulfur center. Both CbiX and SirB are found in a wide range of bacteria. 48642 cd03415: Archaeal sirohydrochlorin cobalt chelatase (CbiX) single domain. 
Proteins in this subgroup contain a single CbiX domain N-terminal to a precorrin-8X methylmutase (CbiC) domain. CbiX is a cobaltochelatase, responsible for the chelation of Co2+ into sirohydrochlorin, while CbiC catalyzes the conversion of cobalt-precorrin 8 to cobyrinic acid by methyl rearrangement. Both CbiX and CbiC are involved in vitamin B12 biosynthesis. 48643 cd03416: Sirohydrochlorin cobalt chelatase (CbiX) and sirohydrochlorin iron chelatase (SirB), N-terminal domain. SirB catalyzes the ferro-chelation of sirohydrochlorin to siroheme, the prosthetic group of sulfite and nitrite reductases. CbiX is a cobaltochelatase, responsible for the chelation of Co2+ into sirohydrochlorin, an important step in the vitamin B12 biosynthetic pathway. CbiX often contains a C-terminal histidine-rich region that may be important for metal delivery and/or storage, and may also contain an iron-sulfur center. Both are found in a wide range of bacteria. This subgroup also contains single domain proteins from archaea and bacteria which may represent the ancestral form of class II chelatases before domain duplication occurred. 48657 cd00197: VHS, ENTH and ANTH domain superfamily; composed of proteins containing a VHS, ENTH or ANTH domain. The VHS domain is present in Vps27 (Vacuolar Protein Sorting), Hrs (Hepatocyte growth factor-regulated tyrosine kinase substrate) and STAM (Signal Transducing Adaptor Molecule). It is located at the N-termini of proteins involved in intracellular membrane trafficking. The epsin N-terminal homology (ENTH) domain is an evolutionarily conserved protein module found primarily in proteins that participate in clathrin-mediated endocytosis. A set of proteins previously designated as harboring an ENTH domain in fact contains a highly similar, yet unique module referred to as an AP180 N-terminal homology (ANTH) domain. VHS, ENTH and ANTH domains are structurally similar and are composed of a superhelix of eight alpha helices. 
ENTH and ANTH (E/ANTH) domains bind both inositol phospholipids and proteins and contribute to the nucleation and formation of clathrin coats on membranes. ENTH domains also function in the development of membrane curvature through lipid remodeling during the formation of clathrin-coated vesicles. E/ANTH domain-bearing proteins have recently been shown to function with adaptor protein-1 and GGA adaptors at the trans-Golgi network, which suggests that E/ANTH domains are universal components of the machinery for clathrin-mediated membrane budding. 48658 cd03561: VHS domain family; The VHS domain is present in Vps27 (Vacuolar Protein Sorting), Hrs (Hepatocyte growth factor-regulated tyrosine kinase substrate) and STAM (Signal Transducing Adaptor Molecule). It has a superhelical structure similar to that of the ARM (Armadillo) repeats and is present at the N-termini of proteins involved in intracellular membrane trafficking. There are four general groups of VHS domain containing proteins based on their association with other domains. The first group consists of proteins of the STAM/EAST/Hbp family which has the domain composition VHS-SH3-ITAM. The second consists of proteins with a FYVE domain C-terminal to VHS. The third consists of GGA proteins with a domain composition VHS-GAT (GGA and TOM)-GAE (gamma-adaptin ear) domain. The fourth consists of proteins with a VHS domain alone or with domains other than those mentioned above. In GGA proteins, VHS domains are involved in cargo recognition in trans-Golgi, thereby having a general membrane targeting/cargo recognition role in vesicular trafficking. 48659 cd03562: CID (CTD-Interacting Domain) domain family; CID is present in several RNA-processing factors such as Pcf11 and Nrd1. Pcf11 is a conserved and essential subunit of the yeast cleavage factor IA, which is required for polyadenylation-dependent 3'-RNA processing and transcription termination. Nrd1 is implicated in polyadenylation-independent 3'-RNA processing. 
CID binds tightly to the carboxy-terminal domain (CTD) of RNA polymerase (Pol) II. During transcription, Pol II synthesizes eukaryotic messenger RNA. Transcription is coupled to RNA processing through the CTD, which consists of up to 52 repeats of the sequence Tyr 1-Ser 2-Pro 3-Thr 4-Ser 5-Pro 6-Ser 7. CID contains eight alpha-helices in a right-handed superhelical arrangement, which closely resembles that of the VHS domains and ARM (Armadillo) repeat proteins, except for its two amino-terminal helices. 48660 cd03564: ANTH domain family; composed of adaptor protein 180 (AP180), clathrin assembly lymphoid myeloid leukemia protein (CALM) and similar proteins. A set of proteins previously designated as harboring an ENTH domain in fact contains a highly similar, yet unique module referred to as an AP180 N-terminal homology (ANTH) domain. AP180 and CALM play important roles in clathrin-mediated endocytosis. AP180 is a brain-specific clathrin-binding protein which stimulates clathrin assembly during the recycling of synaptic vesicles. The ANTH domain is structurally similar to the VHS domain and is composed of a superhelix of eight alpha helices. ANTH domains bind both inositol phospholipids and proteins, and contribute to the nucleation and formation of clathrin coats on membranes. ANTH-bearing proteins have recently been shown to function with adaptor protein-1 and GGA adaptors at the trans-Golgi network, which suggests that the ANTH domain is a universal component of the machinery for clathrin-mediated membrane budding. 48661 cd03565: VHS domain family, Tom1 subfamily; The VHS domain is an essential part of Tom1 (Target of myb1 - retroviral oncogene) protein. The VHS domain has a superhelical structure similar to the structure of the ARM repeats and is present at the very N-termini of proteins. It is a right-handed superhelix of eight alpha helices. 
The VHS domain has been found in a number of proteins, some of which have been implicated in intracellular trafficking and sorting. The VHS domain of the Tom1 protein is essential for the negative regulation of Interleukin-1 and Tumor Necrosis Factor-induced signaling pathways. 48662 cd03567: VHS domain family, GGA subfamily; GGAs (Golgi-localized, Gamma-ear-containing, Arf-binding) comprise a subfamily of ubiquitously expressed, monomeric, motif-binding cargo/clathrin adaptor proteins. The VHS domain has a superhelical structure similar to the structure of the ARM (Armadillo) repeats and is present at the N-termini of proteins. GGA proteins have a multidomain structure consisting of an N-terminal VHS domain linked by a short proline-rich linker to a GAT (GGA and TOM) domain, which is followed by a long flexible linker to the C-terminal appendage, GAE (gamma-adaptin ear) domain. The VHS domain of GGA proteins binds to the acidic-cluster dileucine (DxxLL) motif found on the cytoplasmic tails of cargo proteins trafficked between the trans-Golgi network and the endosomal system. 48663 cd03568: VHS domain family, STAM subfamily; members include STAM (Signal Transducing Adaptor Molecule), EAST (EGFR-associated protein with SH3 and TAM domains) and Hbp (Hrs-binding protein). Collectively, they are referred to as STAM. All STAMs have at their N-termini a VHS domain, which is involved in cytokine-mediated intracellular signal transduction and has a superhelical structure similar to the structure of ARM (Armadillo) repeats, followed by an SH3 (Src homology 3) domain, a well-established protein-protein interaction domain. At the C-termini of most vertebrate STAMs, an ITAM (Immunoreceptor Tyrosine-based Activation) motif is present, which mediates the binding of HRS (hepatocyte growth factor-regulated tyrosine kinase substrate) in endocytic and exocytic machineries. 
48664 cd03569: VHS domain family, Hrs and Vps27p subfamily; composed of Hrs (Hepatocyte growth factor-regulated tyrosine kinase substrate) and its yeast homolog Vps27p (vacuolar protein sorting). The VHS domain, an essential part of Hrs/Vps27p, has a superhelical structure similar to the structure of ARM (Armadillo) repeats and is present at the N-termini of proteins. Hrs also contains a FYVE (Fab1p, YOTB, Vac1p, and EEA1) zinc finger domain C-terminal to VHS, as well as two coiled-coil domains. Hrs has been proposed to play a role in at least three vesicle trafficking events: exocytosis, endocytosis, and endosome to lysosome trafficking. Hrs is involved in promoting rapid recycling of endocytosed signaling receptors to the plasma membrane. 48665 cd03571: ENTH domain, Epsin family; The epsin (Eps15 interactor) N-terminal homology (ENTH) domain is an evolutionarily conserved protein module found primarily in proteins that participate in clathrin-mediated endocytosis. A set of proteins previously designated as harboring an ENTH domain in fact contains a highly similar, yet unique module referred to as an AP180 N-terminal homology (ANTH) domain. ENTH and ANTH (E/ANTH) domains are structurally similar to the VHS domain and are composed of a superhelix of eight alpha helices. E/ANTH domains bind both inositol phospholipids and proteins and contribute to the nucleation and formation of clathrin coats on membranes. ENTH domains also function in the development of membrane curvature through lipid remodeling during the formation of clathrin-coated vesicles. E/ANTH-bearing proteins have recently been shown to function with adaptor protein-1 and GGA adaptors at the trans-Golgi network, which suggests that E/ANTH domains are universal components of the machinery for clathrin-mediated membrane budding. 48666 cd03572: ENTH domain, Epsin Related family; composed of hypothetical proteins containing an ENTH-like domain. 
The epsin N-terminal homology (ENTH) domain is an evolutionarily conserved protein module found primarily in proteins that participate in clathrin-mediated endocytosis. A set of proteins previously designated as harboring an ENTH domain in fact contains a highly similar, yet unique module referred to as an AP180 N-terminal homology (ANTH) domain. ENTH and ANTH (E/ANTH) domains are structurally similar to the VHS domain and are composed of a superhelix of eight alpha helices. E/ANTH domains bind both inositol phospholipids and proteins and contribute to the nucleation and formation of clathrin coats on membranes. ENTH domains also function in the development of membrane curvature through lipid remodeling during the formation of clathrin-coated vesicles. E/ANTH-bearing proteins have recently been shown to function with adaptor protein-1 and GGA adaptors at the trans-Golgi network, which suggests that E/ANTH domains are universal components of the machinery for clathrin-mediated membrane budding. 57907 cd00037: CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent, including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model. 
57908 cd03588: CLECT_CSPGs: C-type lectin-like domain (CTLD) of the type found in chondroitin sulfate proteoglycan core proteins (CSPGs) in human and chicken aggrecan, frog brevican, and zebra fish dermacan. CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. In cartilage, aggrecan forms cartilage link protein stabilized aggregates with hyaluronan (HA). These aggregates contribute to the tissue's load bearing properties. Aggregates having other CSPGs substituting for aggrecan may contribute to the structural integrity of many different tissues. Xenopus brevican is expressed in the notochord and the brain during early embryogenesis. Zebra fish dermacan is expressed in dermal bones and may play a role in dermal bone development. CSPGs do contain LINK domain(s) which bind HA. These LINK domains are considered by one classification system to be a variety of CTLD, but are omitted from this hierarchical classification based on insignificant sequence similarity. 57909 cd03589: CLECT_CEL-1_like: C-type lectin-like domain (CTLD) of the type found in CEL-1 from Cucumaria echinata and Echinoidin from Anthocidaris crassispina. CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. The CEL-1 CTLD binds three calcium ions and has a high specificity for N-acetylgalactosamine (GalNAc). CEL-1 exhibits strong cytotoxicity which is inhibited by GalNAc. This protein may play a role as a toxin defending against predation. Echinoidin is found in the coelomic fluid of the sea urchin and is specific for GalBeta1-3GalNAc. Echinoidin has a cell adhesive activity towards human cancer cells which is not mediated through the CTLD. Both CEL-1 and Echinoidin are multimeric proteins comprised of multiple dimers linked by disulfide bonds. 
57910 cd03590: CLECT_DC-SIGN_like: C-type lectin-like domain (CTLD) of the type found in human dendritic cell (DC)-specific intercellular adhesion molecule 3-grabbing non-integrin (DC-SIGN) and the related receptor, DC-SIGN receptor (DC-SIGNR). This group also contains proteins similar to hepatic asialoglycoprotein receptor (ASGP-R) and langerin in human. These proteins are type II membrane proteins with a CTLD ectodomain. CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. DC-SIGN is thought to mediate the initial contact between dendritic cells and resting T cells, and may also mediate the rolling of DCs on epithelium. DC-SIGN and DC-SIGNR bind to oligosaccharides present on human tissues, as well as on pathogens including parasites, bacteria, and viruses. DC-SIGN and DC-SIGNR bind to HIV, enhancing viral infection of T cells. DC-SIGN and DC-SIGNR are homotetrameric, and contain four CTLDs stabilized by a coiled coil of alpha helices. The hepatic ASGP-R is an endocytic recycling receptor which binds and internalizes desialylated glycoproteins having terminal galactose or N-acetylgalactosamine residues on their N-linked carbohydrate chains, via the clathrin-coated pit mediated endocytic pathway, and delivers them to lysosomes for degradation. It has been proposed that glycoproteins bearing terminal Sia (sialic acid) alpha2,6GalNAc and Sia alpha2,6Gal are endogenous ligands for ASGP-R and that ASGP-R participates in regulating the relative concentration of serum glycoproteins bearing alpha 2,6-linked Sia. The human ASGP-R is a hetero-oligomer composed of two subunits, both of which are found within this group. Langerin is expressed in a subset of dendritic leukocytes, the Langerhans cells (LC). Langerin induces the formation of Birbeck Granules (BGs) and associates with these BGs following internalization. 
Langerin binds, in a calcium-dependent manner, to glyco-conjugates containing mannose and related sugars mediating their uptake and degradation. Langerin molecules oligomerize as trimers with three CTLDs held together by a coiled-coil of alpha helices. 57911 cd03591: CLECT_collectin_like: C-type lectin-like domain (CTLD) of the type found in human collectins including lung surfactant proteins A and D, mannose- or mannan-binding lectin (MBL), and CL-L1 (collectin liver 1). CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. The CTLDs of these collectins bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, or apoptotic cells) and mediate functions associated with killing and phagocytosis. MBPs recognize high mannose oligosaccharides in a calcium dependent manner, bind to a broad range of pathogens, and trigger cell killing by activating the complement pathway. MBP also acts directly as an opsonin. SP-A and SP-D, in addition to functioning as host defense components, are components of pulmonary surfactant which play a role in surfactant homeostasis. Pulmonary surfactant is a phospholipid-protein complex which reduces the surface tension within the lungs. SP-A binds the major surfactant lipid: dipalmitoylphosphatidylcholine (DPPC). SP-D binds two minor components of surfactant that contain sugar moieties: glucosylceramide and phosphatidylinositol (PI). MBP and SP-A, -D monomers are homotrimers with an N-terminal collagen region and three CTLDs. Multiple homotrimeric units associate to form supramolecular complexes. MBL deficiency results in an increased susceptibility to a large number of different infections and to inflammatory disease, such as rheumatoid arthritis. 57912 cd03592: CLECT_selectins_like: C-type lectin-like domain (CTLD) of the type found in the type 1 transmembrane proteins: P(platelet)-, E(endothelial)-, and L(leukocyte)- selectins (sels). 
CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. P-, E-, and L-sels are cell adhesion receptors that mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. L-sel is expressed constitutively on most leukocytes. P-sel is stored in the Weibel-Palade bodies of endothelial cells and in the alpha granules of platelets. E-sels are present on endothelial cells. Following platelet and/or endothelial cell activation, P-sel is rapidly translocated to the cell surface and E-sel expression is induced. The initial step in leukocyte migration involves interactions of selectins with fucosylated, sialylated, and sulfated carbohydrate moieties on target ligands displayed on glycoprotein scaffolds on endothelial cells and leukocytes. A major ligand of P-, E-, and L-sels is PSGL-1 (P-sel glycoprotein ligand). Interactions of E- and P-sels with tumor cells may promote extravasation of cancer cells. Regulation of L-sel and P-sel function includes proteolytic shedding of the most extracellular portion (containing the CTLD) from the cell surface. Increased levels of the soluble form of P-sel in the plasma have been found in a number of diseases including coronary disease and diabetes. E- and P-sel also play roles in the development of synovial inflammation in inflammatory arthritis. Platelet P-sel, but not endothelial P-sel, plays a role in the inflammatory response and neointimal formation after arterial injury. Selectins may also function as signal-transducing receptors. 
57913 cd03593: CLECT_NK_receptors_like: C-type lectin-like domain (CTLD) of the type found in natural killer cell receptors (NKRs), including proteins similar to oxidized low density lipoprotein (OxLDL) receptor (LOX-1), CD94, CD69, NKG2-A and -D, osteoclast inhibitory lectin (OCIL), dendritic cell-associated C-type lectin-1 (dectin-1), human myeloid inhibitory C-type lectin-like receptor (MICL), mast cell-associated functional antigen (MAFA), killer cell lectin-like receptors: subfamily F, member 1 (KLRF1) and subfamily B, member 1 (KLRB1), and lys49 receptors. CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. NKRs are variously associated with activation or inhibition of natural killer (NK) cells. Activating NKRs stimulate cytolysis by NK cells of virally infected or transformed cells; inhibitory NKRs block cytolysis upon recognition of markers of healthy self cells. Most Lys49 receptors are inhibitory; some are stimulatory. OCIL inhibits NK cell function via binding to the receptor NKRP1D. Murine OCIL, in addition to inhibiting NK cell function, inhibits osteoclast differentiation. MAFA clusters with the type I Fc epsilon receptor (FcepsilonRI) and inhibits the mast cell's secretory response to FcepsilonRI stimulus. CD72 is a negative regulator of B cell receptor signaling. NKG2D is an activating receptor for stress-induced antigens; human NKG2D ligands include the stress induced MHC-I homologs, MICA, MICB, and ULBP family of glycoproteins. Several NKRs have a carbohydrate-binding capacity which is not mediated through calcium ions (e.g. OCIL binds a range of high molecular weight sulfated glycosaminoglycans including dextran sulfate, fucoidan, and gamma-carrageenan sugars). Dectin-1 binds fungal beta-glucans and is involved in the innate immune responses to fungal pathogens. MAFA binds saccharides having terminal alpha-D mannose residues in a calcium-dependent manner. 
LOX-1 is the major receptor for OxLDL in endothelial cells and is thought to play a role in the pathology of atherosclerosis. Some NKRs exist as homodimers (e.g. Lys49, NKG2D, CD69, LOX-1) and some as heterodimers (e.g. CD94/NKG2A). Dectin-1 can function as a monomer in vitro. 57914 cd03594: CLECT_REG-1_like: C-type lectin-like domain (CTLD) of the type found in Human REG-1 (lithostathine), REG-4, and avian eggshell-specific proteins: ansocalcin, struthiocalcin-1 (SCA-1), and -2 (SCA-2). CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. REG-1 is a proliferating factor which participates in various kinds of tissue regeneration including pancreatic beta-cell regeneration, regeneration of intestinal mucosa, regeneration of motor neurons, and perhaps in tissue regeneration of damaged heart. REG-1 may play a role in the pathophysiology of Alzheimer's disease and in the development of gastric cancers. Its expression is correlated with reduced survival from early-stage colorectal cancer. REG-1 also binds and aggregates several bacterial strains from the intestinal flora and it has been suggested that it is involved in the control of the intestinal bacterial ecosystem. Rat lithostathine has calcium carbonate crystal inhibitor activity in vitro. REG-IV is upregulated in pancreatic, gastric, hepatocellular, and prostate adenocarcinomas. REG-IV activates the EGF receptor/Akt/AP-1 signaling pathway in colorectal carcinoma. Ansocalcin, SCA-1 and -2 are found at high concentration in the calcified egg shell layer of goose and ostrich, respectively and tend to form aggregates. Ansocalcin nucleates calcite crystal aggregates in vitro. 57915 cd03595: CLECT_chondrolectin_like: C-type lectin-like domain (CTLD) of the type found in the human type-1A transmembrane proteins chondrolectin (CHODL) and layilin. CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. 
CHODL is predominantly expressed in muscle cells and is associated with T-cell maturation. Various alternatively spliced isoforms of CHODL have been identified. The transmembrane form of CHODL is localized in the ER-Golgi apparatus. Layilin is widely expressed in different cell types. The extracellular CTLD of layilin binds hyaluronan (HA), a major constituent of the extracellular matrix (ECM). The cytoplasmic tail of layilin binds various members of the band 4.1/ERM superfamily (talin, radixin, and merlin). The ERM proteins are cytoskeleton-membrane linker molecules which link actin to receptors in the plasma membrane. Layilin co-localizes with talin in membrane ruffles and may mediate signals from the ECM to the cell cytoskeleton. 57916 cd03596: CLECT_tetranectin_like: C-type lectin-like domain (CTLD) of the type found in the tetranectin (TN), cartilage derived C-type lectin (CLECSF1), and stem cell growth factor (SCGF). CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. TN binds to plasminogen and stimulates activation of plasminogen, playing a key role in the regulation of proteolytic processes. The TN CTLD binds two calcium ions. Its calcium free form binds to various kringle-like protein ligands. Two residues involved in the coordination of calcium are critical for the binding of TN to the fourth kringle (K4) domain of plasminogen (Plg K4). TN binds the kringle 1-4 form of angiostatin (AST K1-4). AST K1-4 is a fragment of Plg, commonly found in cancer tissues. TN inhibits the binding of Plg and AST K1-4 to the extracellular matrix (ECM) of endothelial cells and counteracts the antiproliferative effects of AST K1-4 on these cells. TN also binds the tenth kringle domain of apolipoprotein (a). In addition, TN binds fibrin and complex polysaccharides in a Ca2+ dependent manner. The binding site for complex sulfated polysaccharides is N-terminal to the CTLD. 
TN is homotrimeric; N-terminal to the CTLD is an alpha helical domain responsible for trimerization of monomeric units. TN may modulate angiogenesis through interactions with angiostatin and coagulation through interaction with fibrin. TN may play a role in myogenesis and in bone development. Mice having a deletion in the TN gene exhibit a kyphotic spine abnormality. TN is a useful prognostic marker of certain cancer types. CLECSF1 is expressed in cartilage tissue, which is primarily extracellular matrix (ECM), and is a candidate for organizing ECM. SCGF is strongly expressed in bone marrow and is a cytokine for primitive hematopoietic progenitor cells. 57917 cd03597: CLECT_attractin_like: C-type lectin-like domain (CTLD) of the type found in human and mouse attractin (AtrN) and attractin-like protein (ALP). CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. Mouse AtrN (the product of the mahogany gene) has been shown to bind Agouti protein and to function in agouti-induced pigmentation and obesity. Mutations in AtrN have also been shown to cause spongiform encephalopathy and hypomyelination in rats and hamsters. The cytoplasmic region of mouse ALP has been shown to bind to melanocortin receptor (MCR4). Signaling through MCR4 plays a role in appetite suppression. Attractin may have therapeutic potential in the treatment of obesity. Human attractin (hAtrN) has been shown to be expressed on activated T cells and released extracellularly. The circulating serum attractin induces the spreading of monocytes that become the focus of the clustering of non-proliferating T cells. 57918 cd03598: CLECT_EMBP_like: C-type lectin-like domain (CTLD) of the type found in the human proteins, eosinophil major basic protein (EMBP) and prepro major basic protein homolog (MBPH). CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. 
Eosinophils and basophils carry out various functions in allergic, parasitic, and inflammatory diseases. EMBP is stored in eosinophil crystalloid granules and is released upon degranulation. EMBP is also expressed in basophils. The proform of EMBP is expressed in placental X cells and breast tissue and increases significantly during human pregnancy. EMBP has cytotoxic properties and damages bacteria and mammalian cells in vitro, as well as helminth parasites. EMBP deposition has been observed in the inflamed tissue of allergy patients in a variety of diseases including asthma, atopic dermatitis, and rhinitis. In addition to its cytotoxic functions, EMBP activates cells and stimulates cytokine production. EMBP has been shown to bind the proteoglycan heparin. The binding site is similar to the carbohydrate binding site of other classical CTLDs, such as mannose-binding protein (MBP1); however, heparin binding to EMBP is calcium ion independent. MBPH has reduced potency in cytotoxic and cytostimulatory assays compared with EMBP. 57919 cd03599: CLECT_DGCR2_like: C-type lectin-like domain (CTLD) of the type found in DGCR2, an integral membrane protein deleted in DiGeorge Syndrome (DGS). CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. DGS is also known as velo-cardio-facial syndrome (VCFS). DGS is a genetic abnormality that results in malformations of the heart, face, and limbs and is associated with schizophrenia and depressive disorders. DGCR2 is a candidate for involvement in the pathogenesis of DGS since the DGCR2 gene lies within the minimal DGS critical region (MDGRC) of 22q11, which when deleted gives rise to DGS, and the DGCR2 gene is in close proximity to the balanced translocation breakpoint in a DGS patient having a balanced translocation. 57920 cd03600: CLECT_thrombomodulin_like: C-type lectin-like domain (CTLD) of the type found in human thrombomodulin (TM), Endosialin, C14orf27, and C1qR. 
CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. In these thrombomodulin-like proteins the residues involved in coordinating Ca2+ in the classical MBP-A CTLD are not conserved. TM exerts anti-fibrinolytic and anti-inflammatory activity. TM also regulates blood coagulation in the anticoagulant protein C pathway. In this pathway, the procoagulant properties of thrombin (T) are lost when it binds TM. TM also plays a key role in tumor biology. It is expressed on endothelial cells and on several types of tumor cells, including squamous cell carcinoma. Loss of TM expression correlates with advanced stage and poor prognosis. Loss of TM function may be associated with arterial or venous thrombosis and with late fetal loss. Soluble molecules of TM retaining the CTLD are detected in human plasma and urine where higher levels indicate injury and/or enhanced turnover of the endothelium. C1qR is expressed on endothelial cells and stem cells. It is also expressed on monocytes and neutrophils, where it is subject to ectodomain shedding. Soluble forms of C1qR retaining the CTLD are detected in human plasma. C1qR modulates the phagocytosis of apoptotic cells in vivo. C1qR-deficient mice are defective in clearance of apoptotic cells in vivo. The cytoplasmic tail of C1qR, C-terminal to the CTLD of CD93, contains a PDZ binding domain which interacts with the PDZ domain-containing adaptor protein, GIPC. The juxtamembrane region of this tail interacts with the ezrin/radixin/moesin family. Endosialin functions in the growth and progression of abdominal tumors and is expressed in the stroma of several tumors. 57921 cd03601: CLECT_TC14_like: C-type lectin-like domain (CTLD) of the type found in lectins TC14, TC14-2, TC14-3, and TC14-4 from the budding tunicate Polyandrocarpa misakiensis and PfG6 from the Acorn worm. CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. 
TC14 is homodimeric. The CTLD of TC14 binds D-galactose and D-fucose. TC14 is expressed constitutively by multipotent epithelial and mesenchymal cells and plays a role during budding, in inducing the aggregation of undifferentiated mesenchymal cells to give rise to epithelial forming tissue. TC14-2 and TC14-3 show calcium-dependent galactose binding activity. TC14-3 is a cytostatic factor which blocks cell growth and dedifferentiation of the atrial epithelium during asexual reproduction. It may also act as a differentiation inducing factor. Galactose inhibits the cytostatic activity of TC14-3. The gene for Acorn worm PfG6 is gill-specific; PfG6 may be a secreted protein. 57922 cd03602: CLECT_1: C-type lectin (CTL)/C-type lectin-like (CTLD) domain subgroup 1; a subgroup of protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers from which ligand-binding sites project in different orientations. In some CTLDs a loop extends to the adjoining domain to form a loop-swapped dimer. 57923 cd03603: CLECT_VCBS: A bacterial subgroup of the C-type lectin-like (CTLD) domain; a subgroup of bacterial protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces including CaCO3 and ice. 
Bacterial CTLDs within this group are functionally uncharacterized. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers from which ligand-binding sites project in different orientations. In some CTLDs a loop extends to the adjoining domain to form a loop-swapped dimer. 57924 cd01849: YlqF-related GTPases. These proteins are found in bacteria, eukaryotes, and archaea. They all exhibit a circular permutation of the GTPase signature motifs so that the order of the conserved G box motifs is G4-G5-G1-G2-G3, with G4 and G5 being permuted from the C-terminal region of proteins in the Ras superfamily to the N-terminus of YlqF-related GTPases. 57925 cd01854: YjeQ/EngC. YjeQ (YloQ in Bacillus subtilis) represents a protein family whose members are broadly conserved in bacteria and have been shown to be essential to the growth of E. coli and B. subtilis. Proteins of the YjeQ family contain all sequence motifs typical of the vast class of P-loop-containing GTPases, but show a circular permutation, with a G4-G1-G3 pattern of motifs as opposed to the regular G1-G3-G4 pattern seen in most GTPases. All YjeQ family proteins display a unique domain architecture, which includes an N-terminal OB-fold RNA-binding domain, the central permuted GTPase domain, and a zinc knuckle-like C-terminal cysteine domain. This domain architecture suggests a role for YjeQ as a regulator of translation. 57926 cd01855: YqeH. YqeH is an essential GTP-binding protein. Depletion of YqeH induces an excess initiation of DNA replication, suggesting that it negatively controls initiation of chromosome replication. 
The YqeH subfamily is common in eukaryotes and sporadically present in bacteria with probable acquisition by plants from chloroplasts. Proteins of the YqeH family contain all sequence motifs typical of the vast class of P-loop-containing GTPases, but show a circular permutation, with a G4-G1-G3 pattern of motifs as opposed to the regular G1-G3-G4 pattern seen in most GTPases. 57927 cd01856: YlqF. Proteins of the YlqF family contain all sequence motifs typical of the vast class of P-loop-containing GTPases, but show a circular permutation, with a G4-G1-G3 pattern of motifs as opposed to the regular G1-G3-G4 pattern seen in most GTPases. The YlqF subfamily is represented in a phylogenetically diverse array of bacteria (including gram-positive bacteria, proteobacteria, Synechocystis, Borrelia, and Thermotoga) and in all eukaryotes. 57928 cd01857: HSR1/MMR1. Human HSR1 is localized to the human MHC class I region and is highly homologous to a putative GTP-binding protein, MMR1 from mouse. These proteins represent a new subfamily of GTP-binding proteins that has only eukaryote members. This subfamily shows a circular permutation of the GTPase signature motifs so that the C-terminal strands 5, 6, and 7 (strand 6 contains the G4 box with sequence NKXD) are relocated to the N terminus. 57929 cd01858: NGP-1. Autoantigen NGP-1 (Nucleolar G-protein gene 1) has been shown to localize in the nucleolus and nucleolar organizers in all cell types analyzed, which is indicative of a function in ribosomal assembly. NGP-1 and its homologs show a circular permutation of the GTPase signature motifs so that the C-terminal strands 5, 6, and 7 (strand 6 contains the G4 box with NKXD motif) are relocated to the N terminus. 57930 cd01859: MJ1464. This family represents archaeal GTPase typified by the protein MJ1464 from Methanococcus jannaschii. 
The members of this family show a circular permutation of the GTPase signature motifs so that C-terminal strands 5, 6, and 7 (strand 6 contains the NKxD motif) are relocated to the N terminus. 57931 cd04178: Nucleostemin-like. Nucleostemin (NS) is a nucleolar protein that functions as a regulator of cell growth and proliferation in stem cells and in several types of cancer cells, but is not expressed in the differentiated cells of most mammalian adult tissues. NS shuttles between the nucleolus and nucleoplasm bidirectionally at a rate that is fast and independent of cell type. Lowering GTP levels decreases the nucleolar retention of NS, and expression of NS is abruptly down-regulated during differentiation prior to terminal cell division. Found only in eukaryotes, NS consists of an N-terminal basic domain, a coiled-coil domain, a GTP-binding domain, an intermediate domain, and a C-terminal acidic domain. Experimental evidence indicates that NS uses its GTP-binding property as a molecular switch to control the transition between the nucleolus and nucleoplasm, and this process involves interaction between the basic, GTP-binding, and intermediate domains of the protein. 57932 cd00066: G protein alpha subunit. The alpha subunit of G proteins contains the guanine nucleotide binding site. The heterotrimeric GTP-binding proteins are signal transducers that communicate signals from many hormones, neurotransmitters, chemokines, and autocrine and paracrine factors. Extracellular signals are received by receptors, which activate the G proteins, which in turn route the signals to several distinct intracellular signaling pathways. The alpha subunit of G proteins is a weak GTPase. In the resting state, heterotrimeric G proteins are associated at the cytosolic face of the plasma membrane and the alpha subunit binds to GDP. Upon activation by a receptor, GDP is replaced with GTP, and the G-alpha/GTP complex dissociates from the beta and gamma subunits. 
This results in activation of downstream signaling pathways, such as cAMP synthesis by adenylyl cyclase, which is terminated when GTP is hydrolyzed and the heterotrimers reconstitute. 57933 cd00154: Rab family. Rab GTPases form the largest family within the Ras superfamily. There are at least 60 Rab genes in the human genome, and a number of Rab GTPases are conserved from yeast to humans. Rab GTPases are small, monomeric proteins that function as molecular switches to regulate vesicle trafficking pathways. The different Rab GTPases are localized to the cytosolic face of specific intracellular membranes, where they regulate distinct steps in membrane traffic pathways. In the GTP-bound form, Rab GTPases recruit specific sets of effector proteins onto membranes. Through their effectors, Rab GTPases regulate vesicle formation, actin- and tubulin-dependent vesicle movement, and membrane fusion. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which mask C-terminal lipid binding and promote cytosolic localization. While most unicellular organisms possess 5-20 Rab members, several have been found to possess 60 or more Rabs; for many of these Rab isoforms, homologous proteins are not found in other organisms. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Since crystal structures often lack C-terminal residues, the lipid modification site is not available for annotation in many of the CDs in the hierarchy, but is included where possible. 57934 cd00157: Rho (Ras homology) family. Members of the Rho family include RhoA, Cdc42, Rac, Rnd, Wrch1, RhoBTB, and Rop. 
There are 22 human Rho family members identified currently. These proteins are all involved in the reorganization of the actin cytoskeleton in response to external stimuli. They also have roles in cell transformation by Ras, in cytokinesis, in focal adhesion formation, and in the stimulation of stress-activated kinase. These various functions are controlled through distinct effector proteins and mediated through a GTP-binding/GTPase cycle involving three classes of regulating proteins: GAPs (GTPase-activating proteins), GEFs (guanine nucleotide exchange factors), and GDIs (guanine nucleotide dissociation inhibitors). Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Rho proteins. Since crystal structures often lack C-terminal residues, this feature is not available for annotation in many of the CDs in the hierarchy. 57935 cd00876: Ras family. The Ras family of the Ras superfamily includes classical N-Ras, H-Ras, and K-Ras, as well as R-Ras, Rap, Ral, Rheb, Rhes, ARHI, RERG, Rin/Rit, RSR1, RRP22, Ras2, Ras-dva, and RGK proteins. Ras proteins regulate cell growth, proliferation and differentiation. Ras is activated by guanine nucleotide exchange factors (GEFs) that release GDP and allow GTP binding. Many RasGEFs have been identified. These are sequestered in the cytosol until activation by growth factors triggers recruitment to the plasma membrane or Golgi, where the GEF colocalizes with Ras. Active GTP-bound Ras interacts with several effector proteins: among the best characterized are the Raf kinases, phosphatidylinositol 3-kinase (PI3K), RalGEFs and NORE/MST1. Most Ras proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. 
Lipid binding is essential for membrane attachment, a key feature of most Ras proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 57936 cd00877: Ran (Ras-related nuclear proteins) /TC4 subfamily of small GTPases. Ran GTPase is involved in diverse biological functions, such as nuclear transport, spindle formation during mitosis, DNA replication, and cell division. Among the Ras superfamily, Ran is a unique small G protein. It does not have a lipid modification motif at the C-terminus to bind to the membrane, which is often observed within the Ras superfamily. Ran may therefore interact with a wide range of proteins in various intracellular locations. Like other GTPases, Ran exists in GTP- and GDP-bound conformations that interact differently with effectors. Conversion between these forms and the assembly or disassembly of effector complexes requires the interaction of regulator proteins. The intrinsic GTPase activity of Ran is very low, but it is greatly stimulated by a GTPase-activating protein (RanGAP1) located in the cytoplasm. By contrast, RCC1, a guanine nucleotide exchange factor that generates RanGTP, is bound to chromatin and confined to the nucleus. Ran itself is mobile and is actively imported into the nucleus by a mechanism involving NTF-2. Together with the compartmentalization of its regulators, this is thought to produce a relatively high concentration of RanGTP in the nucleus. 57937 cd00878: Arf (ADP-ribosylation factor)/Arl (Arf-like) small GTPases. Arf proteins are activators of phospholipase D isoforms. Unlike Ras proteins they lack cysteine residues at their C-termini and therefore are unlikely to be prenylated. Arfs are N-terminally myristoylated. Members of the Arf family are regulators of vesicle formation in intracellular traffic that interact reversibly with membranes of the secretory and endocytic compartments in a GTP-dependent manner. 
They depart from other small GTP-binding proteins by a unique structural device, interswitch toggle, that implements front-back communication from N-terminus to the nucleotide binding site. Arf-like (Arl) proteins are close relatives of the Arf, but only Arl1 has been shown to function in membrane traffic like the Arf proteins. Arl2 has an unrelated function in the folding of native tubulin, and Arl4 may function in the nucleus. Most other Arf family proteins are so far relatively poorly characterized. Thus, despite their significant sequence homologies, Arf family proteins may regulate unrelated functions. 57938 cd00879: Sar1 subfamily. Sar1 is an essential component of COPII vesicle coats involved in export of cargo from the ER. The GTPase activity of Sar1 functions as a molecular switch to control protein-protein and protein-lipid interactions that direct vesicle budding from the ER. Activation of the GDP to the GTP-bound form of Sar1 involves the membrane-associated guanine nucleotide exchange factor (GEF) Sec12. Sar1 is unlike all Ras superfamily GTPases that use either myristoyl or prenyl groups to direct membrane association and function, in that Sar1 lacks such modification. Instead, Sar1 contains a unique nine-amino-acid N-terminal extension. This extension contains an evolutionarily conserved cluster of bulky hydrophobic amino acids, referred to as the Sar1-N-terminal activation recruitment (STAR) motif. The STAR motif mediates the recruitment of Sar1 to ER membranes and facilitates its interaction with mammalian Sec12 GEF leading to activation. 57939 cd00880: Era (E. coli Ras-like protein)-like. This family includes several distinct subfamilies (TrmE/ThdF, FeoB, YihA (EngB), Era, and EngA/YfgK) that generally show sequence conservation in the region between the Walker A and B motifs (G1 and G3 box motifs), to the exclusion of other GTPases. TrmE is ubiquitous in bacteria and is a widespread mitochondrial protein in eukaryotes, but is absent from archaea. 
The yeast member of TrmE family, MSS1, is involved in mitochondrial translation; bacterial members are often present in translation-related operons. FeoB represents an unusual adaptation of GTPases for high-affinity iron (II) transport. YihA (EngB) family of GTPases is typified by the E. coli YihA, which is an essential protein involved in cell division control. Era is characterized by a distinct derivative of the KH domain (the pseudo-KH domain) which is located C-terminal to the GTPase domain. EngA and its orthologs are composed of two GTPase domains and, since the sequences of the two domains are more similar to each other than to other GTPases, it is likely that an ancient gene duplication, rather than a fusion of evolutionarily distinct GTPases, gave rise to this family. 57940 cd00881: GTP translation factor family. This family consists primarily of translation initiation, elongation, and release factors, which play specific roles in protein translation. In addition, the family includes Snu114p, a component of the U5 small nuclear riboprotein particle which is a component of the spliceosome and is involved in excision of introns, TetM, a tetracycline resistance gene that protects the ribosome from tetracycline binding, and the unusual subfamily CysN/ATPS, which has an unrelated function (ATP sulfurylase) acquired through lateral transfer of the EF1-alpha gene and development of a new function. 57941 cd00882: Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions. 57942 cd01850: CDC/Septin. Septins are a conserved family of GTP-binding proteins associated with diverse processes in dividing and non-dividing cells. They were first discovered in the budding yeast S. cerevisiae as a set of genes (CDC3, CDC10, CDC11 and CDC12) required for normal bud morphology. Septins are also present in metazoan cells, where they are required for cytokinesis in some systems, and implicated in a variety of other processes involving organization of the cell cortex and exocytosis. In humans, 12 septin genes generate dozens of polypeptides, many of which comprise heterooligomeric complexes. Since septin mutants are commonly defective in cytokinesis and formation of the neck filaments/septin rings, septins have been considered to be the primary constituents of the neck filaments. Septins belong to the GTPase superfamily for their conserved GTPase motifs and enzymatic activities. 57943 cd01851: Guanylate-binding protein (GBP), N-terminal domain. Guanylate-binding proteins (GBPs) define a group of proteins that are synthesized after activation of the cell by interferons. The biochemical properties of GBPs are clearly different from those of Ras-like and heterotrimeric GTP-binding proteins. 
They bind guanine nucleotides with low affinity (micromolar range), are stable in their absence, and have high-turnover GTPase activity. In addition to binding GDP/GTP, they have the unique ability to bind GMP with equal affinity and hydrolyze GTP not only to GDP, but also to GMP. Furthermore, two unique regions around the base and the phosphate-binding areas, the guanine and the phosphate caps, respectively, give the nucleotide-binding site a unique appearance not found in the canonical GTP-binding proteins. The phosphate cap, which constitutes the region analogous to switch I, completely shields the phosphate-binding site from solvent such that a potential GTPase-activating protein (GAP) cannot approach. 57944 cd01852: AIG1 (avrRpt2-induced gene 1). This represents Arabidopsis protein AIG1 that appears to be involved in plant resistance to bacteria. The Arabidopsis disease resistance gene RPS2 is involved in recognition of bacterial pathogens carrying the avirulence gene avrRpt2. AIG1 exhibits RPS2- and avrRpt2-dependent induction early after infection with Pseudomonas syringae carrying avrRpt2. This subfamily also includes IAN-4 protein, which has GTP-binding activity and shares sequence homology with a novel family of putative GTP-binding proteins: the immuno-associated nucleotide (IAN) family. The evolutionary conservation of the IAN family provides a unique example of a plant pathogen response gene conserved in animals. The IAN/IMAP subfamily has been proposed to regulate apoptosis in vertebrates and angiosperm plants, particularly in relation to cancer, diabetes, and infections. The human IAN genes were renamed GIMAP (GTPase of the immunity associated proteins). 57945 cd01853: Toc34-like (Translocon at the Outer-envelope membrane of Chloroplasts). This family contains several Toc proteins, including Toc34, Toc33, Toc120, Toc159, Toc86, Toc125, and Toc90. 
The Toc complex at the outer envelope membrane of chloroplasts is a molecular machine of ~500 kDa that contains a single Toc159 protein, four Toc75 molecules, and four or five copies of Toc34. Toc64 and Toc12 are associated with the translocon, but do not appear to be part of the core complex. The Toc translocon initiates the import of nuclear-encoded preproteins from the cytosol into the organelle. Toc34 and Toc159 are both GTPases, while Toc75 is a beta-barrel integral membrane protein. Toc159 is equally distributed between a soluble cytoplasmic form and a membrane-inserted form, suggesting that assembly of the Toc complex is dynamic. Toc34 and Toc75 act sequentially to mediate docking and insertion of Toc159 resulting in assembly of the functional translocon. 57946 cd01860: Rab5-related subfamily. This subfamily includes Rab5 and Rab22 of mammals, Ypt51/Ypt52/Ypt53 of yeast, and RabF of plants. The members of this subfamily are involved in endocytosis and endocytic-sorting pathways. In mammals, Rab5 GTPases localize to early endosomes and regulate fusion of clathrin-coated vesicles to early endosomes and fusion between early endosomes. In yeast, Ypt51p family members similarly regulate membrane trafficking through prevacuolar compartments. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 
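As an illustration of the C-terminal lipid modification motifs named in the Rab entries (CC, CXC, or CCX), the following is a minimal Python sketch and is not part of the database annotation. The function name and the precedence order used when more than one motif could match are assumptions made for the example. It only inspects the last residues of the sequence it is given, so a truncated sequence (as noted for several of these CDs) will typically return no match, and a match says nothing about whether the site is actually modified.

```python
from typing import Optional


def classify_rab_c_terminus(sequence: str) -> Optional[str]:
    """Label the C-terminal cysteine motif of a protein sequence.

    Hypothetical helper: returns "CC", "CXC", or "CCX" (X = any residue)
    when the sequence ends with that pattern, otherwise None.
    Precedence CC > CXC > CCX is an arbitrary choice for this sketch.
    """
    tail = sequence.strip().upper()
    # Motif CC: the final two residues are both cysteine.
    if len(tail) >= 2 and tail[-2:] == "CC":
        return "CC"
    # Motif CXC: cysteine, any residue, cysteine at the very end.
    if len(tail) >= 3 and tail[-3] == "C" and tail[-1] == "C":
        return "CXC"
    # Motif CCX: two cysteines followed by one final residue.
    if len(tail) >= 3 and tail[-3:-1] == "CC":
        return "CCX"
    return None


if __name__ == "__main__":
    # Made-up toy sequences, not real Rab proteins.
    for toy in ("MKSGCC", "MKSCGC", "MKCCS", "MKSGA"):
        print(toy, classify_rab_c_terminus(toy))
```

The same approach could be adapted to the CaaX motif described for the Rho and Ras families by testing the last four residues instead.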
57947 cd01861: Rab6 subfamily. Rab6 is involved in microtubule-dependent transport pathways through the Golgi and from endosomes to the Golgi. Rab6A of mammals is implicated in retrograde transport through the Golgi stack, and is also required for a slow, COPI-independent, retrograde transport pathway from the Golgi to the endoplasmic reticulum (ER). This pathway may allow Golgi residents to be recycled through the ER for scrutiny by ER quality-control systems. Yeast Ypt6p, the homolog of the mammalian Rab6 GTPase, is not essential for cell viability. Ypt6p acts in endosome-to-Golgi, in intra-Golgi retrograde transport, and possibly also in Golgi-to-ER trafficking. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 57948 cd01862: Rab7 subfamily. Rab7 is a small Rab GTPase that regulates vesicular traffic from early to late endosomal stages of the endocytic pathway. The yeast Ypt7 and mammalian Rab7 are both involved in transport to the vacuole/lysosome, whereas Ypt7 is also required for homotypic vacuole fusion. Mammalian Rab7 is an essential participant in the autophagic pathway for sequestration and targeting of cytoplasmic components to the lytic compartment. Mammalian Rab7 is also proposed to function as a tumor suppressor. 
GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 57949 cd01863: Rab18 subfamily. Mammalian Rab18 is implicated in endocytic transport and is expressed most highly in polarized epithelial cells. However, trypanosomal Rab, TbRAB18, is upregulated in the BSF (Blood Stream Form) stage and localized predominantly to elements of the Golgi complex. In human and mouse cells, Rab18 has been identified in lipid droplets, organelles that store neutral lipids. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 57950 cd01864: Rab19 subfamily. Rab19 proteins are associated with Golgi stacks. 
Similarity analysis indicated that Rab41 is closely related to Rab19. However, the function of these Rabs is not yet characterized. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 57951 cd01865: Rab3 subfamily. The Rab3 subfamily contains Rab3A, Rab3B, Rab3C, and Rab3D. All four isoforms were found in mouse brain and endocrine tissues, with varying levels of expression. Rab3A, Rab3B, and Rab3C localized to synaptic and secretory vesicles; Rab3D was expressed at high levels only in adipose tissue, exocrine glands, and the endocrine pituitary, where it is localized to cytoplasmic secretory granules. Rab3 appears to control Ca2+-regulated exocytosis. The appropriate GDP/GTP exchange cycle of Rab3A is required for Ca2+-regulated exocytosis to occur, and interaction of the GTP-bound form of Rab3A with effector molecule(s) is widely believed to be essential for this process. Functionally, most studies point toward a role for Rab3 in the secretion of hormones and neurotransmitters. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. 
Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 57952 cd01866: Rab2 subfamily. Rab2 is localized on cis-Golgi membranes and interacts with Golgi matrix proteins. Rab2 is also implicated in the maturation of vesicular tubular clusters (VTCs), which are microtubule-associated intermediates in transport between the ER and Golgi apparatus. In plants, Rab2 regulates vesicle trafficking between the ER and the Golgi bodies and is important to pollen tube growth. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 57953 cd01867: Rab8/Sec4/Ypt2. Rab8/Sec4/Ypt2 are known or suspected to be involved in post-Golgi transport to the plasma membrane. It is likely that these Rabs have functions that are specific to the mammalian lineage and have no orthologs in plants. 
Rab8 modulates polarized membrane transport through reorganization of actin and microtubules, induces the formation of new surface extensions, and has an important role in directed membrane transport to cell surfaces. The Ypt2 gene of the fission yeast Schizosaccharomyces pombe encodes a member of the Ypt/Rab family of small GTP-binding proteins, related in sequence to Sec4p of Saccharomyces cerevisiae but closer to mammalian Rab8. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 57954 cd01868: Rab11-like. Rab11a, Rab11b, and Rab25 are closely related, evolutionary conserved Rab proteins that are differentially expressed. Rab11a is ubiquitously synthesized, Rab11b is enriched in brain and heart and Rab25 is only found in epithelia. Rab11/25 proteins seem to regulate recycling pathways from endosomes to the plasma membrane and to the trans-Golgi network. Furthermore, Rab11a is thought to function in the histamine-induced fusion of tubulovesicles containing H+, K+ ATPase with the plasma membrane in gastric parietal cells and in insulin-stimulated insertion of GLUT4 in the plasma membrane of cardiomyocytes. Overexpression of Rab25 has recently been observed in ovarian cancer and breast cancer, and has been correlated with worsened outcomes in both diseases. 
In addition, Rab25 overexpression has also been observed in prostate cancer, transitional cell carcinoma of the bladder, and invasive breast tumor cells. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 57955 cd01869: Rab1/Ypt1 subfamily. Rab1 is found in every eukaryote and is a key regulatory component for the transport of vesicles from the ER to the Golgi apparatus. Studies on mutations of Ypt1, the yeast homolog of Rab1, showed that this protein is necessary for the budding of vesicles of the ER as well as for their transport to, and fusion with, the Golgi apparatus. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 
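The Rab entries above state that most Rab GTPases carry a C-terminal lipid modification site with the motifs CC, CXC, or CCX. As a minimal sketch of what checking for such a motif could look like, assuming a full-length one-letter protein sequence is available (the truncated sequences in these CDs would not qualify), the hypothetical helper below classifies the last residues of a sequence. The function name and the example sequences are illustrative assumptions and are not part of CDD or of any library.

```python
from typing import Optional


def rab_cterm_motif(seq: str) -> Optional[str]:
    """Return 'CC', 'CXC', or 'CCX' if the sequence ends with that motif, else None.

    Hypothetical helper for illustration only; X stands for any residue.
    """
    tail = seq.strip().upper()[-3:]  # at most the last three residues
    if tail.endswith("CC"):
        return "CC"
    if len(tail) == 3 and tail[0] == "C" and tail[2] == "C":
        return "CXC"
    if len(tail) == 3 and tail.startswith("CC"):
        return "CCX"
    return None


# Toy sequences, not real Rab proteins.
for toy in ("MKTGSCC", "MKTGCSC", "MKTCCS", "MKTGSA"):
    print(toy, rab_cterm_motif(toy))
```

A sequence ending in three cysteines satisfies all three patterns; this sketch reports it as CC because that test runs first.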
57956 cd01870: RhoA-like subfamily. The RhoA subfamily consists of RhoA, RhoB, and RhoC. RhoA promotes the formation of stress fibers and focal adhesions, regulating cell shape, attachment, and motility. RhoA can bind to multiple effector proteins, thereby triggering different downstream responses. In many cell types, RhoA mediates local assembly of the contractile ring, which is necessary for cytokinesis. RhoA is vital for muscle contraction; in vascular smooth muscle cells, RhoA plays a key role in cell contraction, differentiation, migration, and proliferation. RhoA activities appear to be elaborately regulated in a time- and space-dependent manner to control cytoskeletal changes. Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Rho proteins. RhoA and RhoC are observed only in geranylgeranylated forms; however, RhoB can be present in palmitoylated, farnesylated, and geranylgeranylated forms. RhoA and RhoC are highly relevant for tumor progression and invasiveness; however, RhoB has recently been suggested to be a tumor suppressor. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 57957 cd01871: Rac1-like subfamily. The Rac1-like subfamily consists of Rac1, Rac2, and Rac3 proteins, plus the splice variant Rac1b that contains a 19-residue insertion near switch II relative to Rac1. While Rac1 is ubiquitously expressed, Rac2 and Rac3 are largely restricted to hematopoietic and neural tissues respectively. Rac1 stimulates the formation of actin lamellipodia and membrane ruffles. It also plays a role in cell-matrix adhesion and cell anoikis. In intestinal epithelial cells, Rac1 is an important regulator of migration and mediates apoptosis. 
Rac1 is also essential for RhoA-regulated actin stress fiber and focal adhesion complex formation. In leukocytes, Rac1 and Rac2 have distinct roles in regulating cell morphology, migration, and invasion, but are not essential for macrophage migration or chemotaxis. Rac3 has biochemical properties that are closely related to Rac1, such as effector interaction, nucleotide binding, and hydrolysis; Rac2 has a slower nucleotide association and is more efficiently activated by the RacGEF Tiam1. Both Rac1 and Rac3 have been implicated in the regulation of cell migration and invasion in human metastatic breast cancer. Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Rho proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 57958 cd01873: RhoBTB subfamily. Members of the RhoBTB subfamily of Rho GTPases are present in vertebrates, Drosophila, and Dictyostelium. RhoBTB proteins are characterized by a modular organization, consisting of a GTPase domain, a proline rich region, a tandem of two BTB (Broad-Complex, Tramtrack, and Bric a brac) domains, and a C-terminal region of unknown function. RhoBTB proteins may act as docking points for multiple components participating in signal transduction cascades. RhoBTB genes appeared upregulated in some cancer cell lines, suggesting a participation of RhoBTB proteins in the pathogenesis of particular tumors. Note that the Dictyostelium RacA GTPase domain is more closely related to Rac proteins than to RhoBTB proteins, where RacA actually belongs. Thus, the Dictyostelium RacA is not included here. Most Rho proteins contain a lipid modification site at the C-terminus; however, RhoBTB is one of few Rho subfamilies that lack this feature. 57959 cd01874: Cdc42 subfamily. 
Cdc42 is an essential GTPase that belongs to the Rho family of Ras-like GTPases. These proteins act as molecular switches by responding to exogenous and/or endogenous signals and relaying those signals to activate downstream components of a biological pathway. Cdc42 transduces signals to the actin cytoskeleton to initiate and maintain polarized growth and to mitogen-activated protein morphogenesis. In the budding yeast Saccharomyces cerevisiae, Cdc42 plays an important role in multiple actin-dependent morphogenetic events such as bud emergence, mating-projection formation, and pseudohyphal growth. In mammalian cells, Cdc42 regulates a variety of actin-dependent events and induces the JNK/SAPK protein kinase cascade, which leads to the activation of transcription factors within the nucleus. Cdc42 mediates these processes through interactions with a myriad of downstream effectors, whose number and regulation we are just starting to understand. In addition, Cdc42 has been implicated in a number of human diseases through interactions with its regulators and downstream effectors. Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Rho proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 57960 cd01875: RhoG subfamily. RhoG is a GTPase with high sequence similarity to members of the Rac subfamily, including the regions involved in effector recognition and binding. However, RhoG does not bind to known Rac1 and Cdc42 effectors, including proteins containing a Cdc42/Rac interacting binding (CRIB) motif. Instead, RhoG interacts directly with Elmo, an upstream regulator of Rac1, in a GTP-dependent manner and forms a ternary complex with Dock180 to induce activation of Rac1. 
The RhoG-Elmo-Dock180 pathway is required for activation of Rac1 and cell spreading mediated by integrin, as well as for neurite outgrowth induced by nerve growth factor. Thus RhoG activates Rac1 through Elmo and Dock180 to control cell morphology. RhoG has also been shown to play a role in caveolar trafficking and has a novel role in signaling the neutrophil respiratory burst stimulated by G protein-coupled receptor (GPCR) agonists. Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Rho proteins. 57961 cd01876: The YihA (EngB) subfamily. This subfamily of GTPases is typified by the E. coli YihA, an essential protein involved in cell division control. YihA and its orthologs are small proteins that typically contain fewer than 200 amino acid residues and consist of the GTPase domain only (some of the eukaryotic homologs contain an N-terminal extension of about 120 residues that might be involved in organellar targeting). Homologs of yihA are found in most Gram-positive and Gram-negative pathogenic bacteria, with the exception of Mycobacterium tuberculosis. The broad-spectrum nature of YihA and its essentiality for cell viability in bacteria make it an attractive antibacterial target. 57962 cd01878: HflX subfamily. A distinct conserved domain with a glycine-rich segment N-terminal of the GTPase domain characterizes the HflX subfamily. The E. coli HflX has been implicated in the control of the lambda cII repressor proteolysis, but the actual biological functions of these GTPases remain unclear. HflX is widespread, but not universally represented in all three superkingdoms. 57963 cd01879: Ferrous iron transport protein B (FeoB) subfamily. E. 
coli has an iron(II) transport system, known as feo, which may make an important contribution to the iron supply of the cell under anaerobic conditions. FeoB has been identified as part of this transport system. FeoB is a large 700-800 amino acid integral membrane protein. The N terminus contains a P-loop motif suggesting that iron transport may be ATP dependent. 57964 cd01881: The Obg-like subfamily consists of five well-delimited, ancient subfamilies, namely Obg, DRG, YyaF/YchF, Ygr210, and NOG1. Four of these groups (Obg, DRG, YyaF/YchF, and Ygr210) are characterized by a distinct glycine-rich motif immediately following the Walker B motif (G3 box). Obg/CgtA is an essential gene that is involved in the initiation of sporulation and DNA replication in the bacteria Caulobacter and Bacillus, but its exact molecular role is unknown. Furthermore, several OBG family members possess a C-terminal RNA-binding domain, the TGS domain, which is also present in threonyl-tRNA synthetase and in bacterial guanosine polyphosphatase SpoT. Nog1 is a nucleolar protein that might function in ribosome assembly. The DRG and Nog1 subfamilies are ubiquitous in archaea and eukaryotes, the Ygr210 subfamily is present in archaea and fungi, and the Obg and YyaF/YchF subfamilies are ubiquitous in bacteria and eukaryotes. The Obg/Nog1 and DRG subfamilies appear to form one major branch of the Obg family and the Ygr210 and YchF subfamilies form another branch. No GEFs, GAPs, or GDIs for Obg have been identified. 57965 cd01882: Bms1. Bms1 is an essential, evolutionarily conserved, nucleolar protein. Its depletion interferes with processing of the 35S pre-rRNA at sites A0, A1, and A2, and the formation of 40S subunits. Bms1, the putative endonuclease Rcl1, and the essential U3 small nucleolar RNA form a stable subcomplex that is believed to control an early step in the formation of the 40S subunit. 
The C-terminal domain of Bms1 contains a GTPase-activating protein (GAP) that functions intramolecularly. It is believed that Rcl1 activates Bms1 by acting as a guanine-nucleotide exchange factor (GEF) to promote GDP/GTP exchange, and that activated (GTP-bound) Bms1 delivers Rcl1 to the preribosomes. 57966 cd01883: Eukaryotic elongation factor 1 (EF1) alpha subfamily. EF1 is responsible for the GTP-dependent binding of aminoacyl-tRNAs to the ribosomes. EF1 is composed of four subunits: the alpha chain, which binds GTP and aminoacyl-tRNAs; the gamma chain, which probably plays a role in anchoring the complex to other cellular components; and the beta and delta (or beta') chains. This subfamily is the alpha subunit, and represents the counterpart of bacterial EF-Tu for the archaea (aEF1-alpha) and eukaryotes (eEF1-alpha). eEF1-alpha interacts with the actin of the eukaryotic cytoskeleton and may thereby play a role in cellular transformation and apoptosis. EF-Tu can have no such role in bacteria. In humans, the isoform eEF1A2 is overexpressed in 2/3 of breast cancers and has been identified as a putative oncogene. This subfamily also includes Hbs1, a G protein known to be important for efficient growth and protein synthesis under conditions of limiting translation initiation in yeast, and to associate with Dom34. It has been speculated that yeast Hbs1 and Dom34 proteins may function as part of a complex with a role in gene expression. 57967 cd01884: EF-Tu subfamily. This subfamily includes orthologs of translation elongation factor EF-Tu in bacteria, mitochondria, and chloroplasts. It is one of several GTP-binding translation factors found in the larger family of GTP-binding elongation factors. The eukaryotic counterpart, eukaryotic translation elongation factor 1 (eEF-1 alpha), is excluded from this family. 
EF-Tu is one of the most abundant proteins in bacteria, as well as one of the most highly conserved, and in a number of species the gene is duplicated with identical function. When bound to GTP, EF-Tu can form a complex with any (correctly) aminoacylated tRNA except those for initiation and for selenocysteine, in which case EF-Tu is replaced by other factors. Transfer RNA is carried to the ribosome in these complexes for protein translation. 57968 cd01885: EF2 (for archaea and eukarya). Translocation requires hydrolysis of a molecule of GTP and is mediated by EF-G in bacteria and by eEF2 in eukaryotes. The eukaryotic elongation factor eEF2 is a GTPase involved in the translocation of the peptidyl-tRNA from the A site to the P site on the ribosome. The 95-kDa protein is highly conserved, with 60% amino acid sequence identity between the human and yeast proteins. Two major mechanisms are known to regulate protein elongation and both involve eEF2. First, eEF2 can be modulated by reversible phosphorylation. Increased levels of phosphorylated eEF2 reduce elongation rates presumably because phosphorylated eEF2 fails to bind the ribosomes. Treatment of mammalian cells with agents that raise the cytoplasmic Ca2+ and cAMP levels reduces elongation rates by activating the kinase responsible for phosphorylating eEF2. In contrast, treatment of cells with insulin increases elongation rates by promoting eEF2 dephosphorylation. Second, the protein can be post-translationally modified by ADP-ribosylation. Various bacterial toxins perform this reaction after modification of a specific histidine residue to diphthamide, but there is evidence for endogenous ADP ribosylase activity. Similar to the bacterial toxins, it is presumed that modification by the endogenous enzyme also inhibits eEF2 activity. 57969 cd01886: Elongation factor G (EF-G) subfamily. Translocation is mediated by EF-G (also called translocase). 
The structure of EF-G closely resembles that of the complex between EF-Tu and tRNA. This is an example of molecular mimicry; a protein domain evolved so that it mimics the shape of a tRNA molecule. EF-G in the GTP form binds to the ribosome, primarily through the interaction of its EF-Tu-like domain with the 50S subunit. The binding of EF-G to the ribosome in this manner stimulates the GTPase activity of EF-G. On GTP hydrolysis, EF-G undergoes a conformational change that forces its arm deeper into the A site on the 30S subunit. To accommodate this domain, the peptidyl-tRNA in the A site moves to the P site, carrying the mRNA and the deacylated tRNA with it. The ribosome may be prepared for these rearrangements by the initial binding of EF-G as well. The dissociation of EF-G leaves the ribosome ready to accept the next aminoacyl-tRNA into the A site. This group contains both eukaryotic and bacterial members. 57970 cd01887: IF2/eIF5B (initiation factors 2/ eukaryotic initiation factor 5B) subfamily. IF2/eIF5B contribute to ribosomal subunit joining and function as GTPases that are maximally activated by the presence of both ribosomal subunits. As seen in other GTPases, IF2/eIF5B undergoes conformational changes between its GTP- and GDP-bound states. Eukaryotic IF2/eIF5Bs possess three characteristic segments, including a divergent N-terminal region followed by conserved central and C-terminal segments. This core region is conserved among all known eukaryotic and archaeal IF2/eIF5Bs and eubacterial IF2s. 57971 cd01888: eIF2-gamma (gamma subunit of initiation factor 2). eIF2 is a heterotrimeric translation initiation factor that consists of alpha, beta, and gamma subunits. The GTP-bound gamma subunit also binds initiator methionyl-tRNA and delivers it to the 40S ribosomal subunit. Following hydrolysis of GTP to GDP, eIF2:GDP is released from the ribosome. 
The gamma subunit has no intrinsic GTPase activity, but is stimulated by the GTPase activating protein (GAP) eIF5, and GDP/GTP exchange is stimulated by the guanine nucleotide exchange factor (GEF) eIF2B. eIF2B is a heteropentamer, and the epsilon chain binds eIF2. Both eIF5 and eIF2B-epsilon are known to bind strongly to eIF2-beta, but have also been shown to bind directly to eIF2-gamma. It is possible that eIF2-beta serves simply as a high-affinity docking site for eIF5 and eIF2B-epsilon, or that eIF2-beta serves a regulatory role. eIF2-gamma is found only in eukaryotes and archaea. It is closely related to SelB, the selenocysteine-specific elongation factor from eubacteria. The translational factor components of the ternary complex, IF2 in eubacteria and eIF2 in eukaryotes are not the same protein (despite their unfortunately similar names). Both factors are GTPases; however, eubacterial IF-2 is a single polypeptide, while eIF2 is heterotrimeric. eIF2-gamma is a member of the same family as eubacterial IF2, but the two proteins are only distantly related. This family includes translation initiation, elongation, and release factors. 57972 cd01889: SelB subfamily. SelB is an elongation factor needed for the co-translational incorporation of selenocysteine. Selenocysteine is coded by a UGA stop codon in combination with a specific downstream mRNA hairpin. In bacteria, the C-terminal part of SelB recognizes this hairpin, while the N-terminal part binds GTP and tRNA in analogy with elongation factor Tu (EF-Tu). It specifically recognizes the selenocysteine charged tRNAsec, which has a UCA anticodon, in an EF-Tu like manner. This allows insertion of selenocysteine at in-frame UGA stop codons. In E. coli SelB binds GTP, selenocysteyl-tRNAsec and a stem-loop structure immediately downstream of the UGA codon (the SECIS sequence). The absence of active SelB prevents the participation of selenocysteyl-tRNAsec in translation. 
Archaeal and animal mechanisms of selenocysteine incorporation are more complex. Although the SECIS elements have different secondary structures and conserved elements between archaea and eukaryotes, they do share a common feature. Unlike in E. coli, these SECIS elements are located in the 3' UTRs. This group contains eukaryotic SelBs and some from archaea. 57973 cd01890: LepA subfamily. LepA belongs to the GTPase family and exhibits significant homology to the translation factors EF-G and EF-Tu, indicating its possible involvement in translation and association with the ribosome. LepA is ubiquitous in bacteria and eukaryota (e.g. yeast GUF1p), but is missing from archaea. This pattern of phyletic distribution suggests that LepA evolved through a duplication of the EF-G gene in bacteria, followed by early transfer into the eukaryotic lineage, most likely from the promitochondrial endosymbiont. Yeast GUF1p is not essential and mutant cells did not reveal any marked phenotype. 57974 cd01891: TypA (tyrosine phosphorylated protein A)/BipA subfamily. BipA is a protein belonging to the ribosome-binding family of GTPases and is widely distributed in bacteria and plants. BipA was originally described as a protein that is induced in Salmonella typhimurium after exposure to bactericidal/permeability-inducing protein (a cationic antimicrobial protein produced by neutrophils), and has since been identified in E. coli as well. The properties thus far described for BipA are related to its role in the process of pathogenesis by enteropathogenic E. coli. It appears to be involved in the regulation of several processes important for infection, including rearrangements of the cytoskeleton of the host, bacterial resistance to host defense peptides, flagellum-mediated cell motility, and expression of K5 capsular genes. It has been proposed that BipA may utilize a novel mechanism to regulate the expression of target genes. In addition, BipA from enteropathogenic E. 
coli has been shown to be phosphorylated on a tyrosine residue, while BipA from Salmonella and from E. coli K12 strains is not phosphorylated under the conditions assayed. The phosphorylation apparently modifies the rate of nucleotide hydrolysis, with the phosphorylated form showing greatly increased GTPase activity. 57975 cd01892: Miro2 subfamily. Miro (mitochondrial Rho) proteins have tandem GTP-binding domains separated by a linker region containing putative calcium-binding EF hand motifs. Genes encoding Miro-like proteins were found in several eukaryotic organisms. This CD represents the putative GTPase domain in the C terminus of Miro proteins. These atypical Rho GTPases have roles in mitochondrial homeostasis and apoptosis. Most Rho proteins contain a lipid modification site at the C-terminus; however, Miro is one of few Rho subfamilies that lack this feature. 57976 cd01893: Miro1 subfamily. Miro (mitochondrial Rho) proteins have tandem GTP-binding domains separated by a linker region containing putative calcium-binding EF hand motifs. Genes encoding Miro-like proteins were found in several eukaryotic organisms. This CD represents the N-terminal GTPase domain of Miro proteins. These atypical Rho GTPases have roles in mitochondrial homeostasis and apoptosis. Most Rho proteins contain a lipid modification site at the C-terminus; however, Miro is one of few Rho subfamilies that lack this feature. 57977 cd01894: EngA1 subfamily. This CD represents the first GTPase domain of EngA and its orthologs, which are composed of two adjacent GTPase domains. Since the sequences of the two domains are more similar to each other than to other GTPases, it is likely that an ancient gene duplication, rather than a fusion of evolutionarily distinct GTPases, gave rise to this family. Although the exact function of these proteins has not been elucidated, studies have revealed that the E. coli EngA homolog, Der, and Neisseria gonorrhoeae EngA are essential for cell viability. 
A recent report suggests that E. coli Der functions in ribosome assembly and stability. 57978 cd01895: EngA2 subfamily. This CD represents the second GTPase domain of EngA and its orthologs, which are composed of two adjacent GTPase domains. Since the sequences of the two domains are more similar to each other than to other GTPases, it is likely that an ancient gene duplication, rather than a fusion of evolutionarily distinct GTPases, gave rise to this family. Although the exact function of these proteins has not been elucidated, studies have revealed that the E. coli EngA homolog, Der, and Neisseria gonorrhoeae EngA are essential for cell viability. A recent report suggests that E. coli Der functions in ribosome assembly and stability. 57979 cd01896: The developmentally regulated GTP-binding protein (DRG) subfamily is an uncharacterized member of the Obg family, an evolutionary branch of GTPase superfamily proteins. GTPases act as molecular switches regulating diverse cellular processes. DRG2 and DRG1 comprise the DRG subfamily in eukaryotes. In view of their widespread expression in various tissues and high conservation among distantly related species in eukaryotes and archaea, DRG proteins may regulate fundamental cellular processes. It is proposed that the DRG subfamily proteins play their physiological roles through RNA binding. 57980 cd01897: NOG1 is a nucleolar GTP-binding protein present in eukaryotes ranging from trypanosomes to humans. NOG1 is functionally linked to ribosome biogenesis and found in association with the nuclear pore complexes and identified in many preribosomal complexes. Thus, defects in NOG1 can lead to defects in 60S biogenesis. The S. cerevisiae NOG1 gene is essential for cell viability, and mutations in the predicted G motifs abrogate function. It is a member of the ODN family of GTP-binding proteins that also includes the bacterial Obg and DRG proteins. 57981 cd01898: Obg subfamily. 
The Obg nucleotide binding protein subfamily has been implicated in stress response, chromosome partitioning, replication initiation, mycelium development, and sporulation. Obg proteins are among a large group of GTP binding proteins conserved from bacteria to humans. The E. coli homolog, ObgE, is believed to function in ribosomal biogenesis. Members of the subfamily contain two equally and highly conserved domains, a C-terminal GTP binding domain and an N-terminal glycine-rich domain. 57982 cd01899: Ygr210 subfamily. Ygr210 is a member of the Obg-like family and is present in archaea and fungi. Members are characterized by a distinct glycine-rich motif immediately following the Walker B motif. The Ygr210 and YyaF/YchF subfamilies appear to form one major branch of the Obg-like family. Among eukaryotes, the Ygr210 subfamily is represented only in fungi. These fungal proteins form a tight cluster with their archaeal orthologs, which suggests the possibility of horizontal transfer from archaea to fungi. 57983 cd01900: YchF subfamily. YchF is a member of the Obg family, which includes four other subfamilies of GTPases: Obg, DRG, Ygr210, and NOG1. Obg is an essential gene that is involved in DNA replication in C. crescentus and Streptomyces griseus and is associated with the ribosome. Several members of the family, including YchF, possess the TGS domain related to the RNA-binding proteins. Experimental data and genomic analysis suggest that YchF may be part of a nucleoprotein complex and may function as a GTP-dependent translational factor. 57984 cd04101: RabL4 (Rab-like4) subfamily. RabL4s are novel proteins that have high sequence similarity with Rab family members, but display features that are distinct from Rabs, and have been termed Rab-like. As in other Rab-like proteins, RabL4 lacks a prenylation site at the C-terminus. The specific function of RabL4 remains unknown. 57985 cd04102: RabL3 (Rab-like3) subfamily. 
RabL3s are novel proteins that have high sequence similarity with Rab family members, but display features that are distinct from Rabs, and have been termed Rab-like. As in other Rab-like proteins, RabL3 lacks a prenylation site at the C-terminus. The specific function of RabL3 remains unknown. 57986 cd04103: Centaurin gamma. The centaurins (alpha, beta, gamma, and delta) are large, multi-domain proteins that all contain an ArfGAP domain and ankyrin repeats, and in some cases, numerous additional domains. Centaurin gamma contains an additional GTPase domain near its N-terminus. The specific function of this GTPase domain has not been well characterized, but centaurin gamma 2 (CENTG2) may play a role in the development of autism. Centaurin gamma 1 is also called PIKE (phosphatidyl inositol (PI) 3-kinase enhancer) and centaurin gamma 2 is also known as AGAP (ArfGAP protein with a GTPase-like domain, ankyrin repeats and a Pleckstrin homology domain) or GGAP. Three isoforms of PIKE have been identified. PIKE-S (short) and PIKE-L (long) are brain-specific isoforms, with PIKE-S restricted to the nucleus and PIKE-L found in multiple cellular compartments. A third isoform, PIKE-A was identified in human glioblastoma brain cancers and has been found in various tissues. GGAP has been shown to have high GTPase activity due to a direct intramolecular interaction between the N-terminal GTPase domain and the C-terminal ArfGAP domain. In human tissue, AGAP mRNA was detected in skeletal muscle, kidney, placenta, brain, heart, colon, and lung. Reduced expression levels were also observed in the spleen, liver, and small intestine. 57987 cd04104: p47 (47-kDa) family. The p47 GTPase family consists of several highly homologous proteins, including IGTP, TGTP/Mg21, IRG-47, GTPI, LRG-47, and IIGP1. They are found in higher eukaryotes where they play a role in immune resistance against intracellular pathogens. 
p47 proteins exist at low resting levels in mouse cells, but are strongly induced by Type II interferon (IFN-gamma). IGTP is critical for resistance to Toxoplasma gondii infection and is involved in inhibition of Coxsackievirus-B3-induced apoptosis. TGTP was shown to limit vesicular stomatitis virus (VSV) infection of fibroblasts in vitro. IRG-47 is involved in resistance to T. gondii infection. LRG-47 has been implicated in resistance to T. gondii, Listeria monocytogenes, Leishmania, and mycobacterial infections. IIGP1 has been shown to localize to the ER and to the Golgi membranes in IFN-induced cells and inflamed tissues. In macrophages, IIGP1 interacts with hook3, a microtubule binding protein that participates in the organization of the cis-Golgi compartment. 57988 cd04105: Signal recognition particle receptor, beta subunit (SR-beta). SR-beta and SR-alpha form the heterodimeric signal recognition particle (SRP or SR) receptor that binds SRP to regulate protein translocation across the ER membrane. Nascent polypeptide chains are synthesized with an N-terminal hydrophobic signal sequence that binds SRP54, a component of the SRP. SRP directs targeting of the ribosome-nascent chain complex (RNC) to the ER membrane via interaction with the SR, which is localized to the ER membrane. The RNC is then transferred to the protein-conducting channel, or translocon, which facilitates polypeptide translocation across the ER membrane or integration into the ER membrane. SR-beta is found only in eukaryotes; it is believed to control the release of the signal sequence from SRP54 upon binding of the ribosome to the translocon. High expression of SR-beta has been observed in human colon cancer, suggesting it may play a role in the development of this type of cancer. 57989 cd04106: Rab23-like subfamily. Rab23 is a member of the Rab family of small GTPases. In mouse, Rab23 has been shown to function as a negative regulator in the sonic hedgehog (Shh) signalling pathway. 
Rab23 mediates the activity of Gli2 and Gli3, transcription factors that regulate Shh signaling in the spinal cord, primarily by preventing Gli2 activation in the absence of Shh ligand. Rab23 also regulates a step in the cytoplasmic signal transduction pathway that mediates the effect of Smoothened (one of two integral membrane proteins that are essential components of the Shh signaling pathway in vertebrates). In humans, Rab23 is expressed in the retina. Mice contain an isoform that shares 93% sequence identity with the human Rab23 and an alternative splicing isoform that is specific to the brain. This isoform causes the murine open brain phenotype, indicating it may have a role in the development of the central nervous system. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 57990 cd04107: Rab38/Rab32 subfamily. Rab32 and Rab38 are members of the Rab family of small GTPases. Human Rab32 was first identified in platelets but it is expressed in a variety of cell types, where it functions as an A-kinase anchoring protein (AKAP). Rab38 has been shown to be melanocyte-specific. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. 
Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. 57991 cd04108: Rab34/Rab36 subfamily. Rab34, found primarily in the Golgi, interacts with its effector, Rab-interacting lysosomal protein (RILP). This enables its participation in microtubular dynein-dynactin-mediated repositioning of lysosomes from the cell periphery to the Golgi. A Rab34 (Rah) isoform that lacks the consensus GTP-binding region has been identified in mice. This isoform is associated with membrane ruffles and promotes macropinosome formation. Rab36 has been mapped to human chromosome 22q11.2, a region that is homozygously deleted in malignant rhabdoid tumors (MRTs). However, experimental assessments do not implicate Rab36 as a tumor suppressor that would enable tumor formation through a loss-of-function mechanism. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. 57992 cd04109: Rab28 subfamily. First identified in maize, Rab28 has been shown to be a late embryogenesis-abundant (Lea) protein that is regulated by the plant hormone abscisic acid (ABA). 
In Arabidopsis, Rab28 is expressed during embryo development and is generally restricted to provascular tissues in mature embryos. Unlike maize Rab28, it is not ABA-inducible. Characterization of the human Rab28 homolog revealed two isoforms, which differ by a 95-base pair insertion, producing an alternative sequence for the 30 amino acids at the C-terminus. The two human isoforms are presumably the result of alternative splicing. Since they differ at the C-terminus but not in the GTP-binding region, they are predicted to be targeted to different cellular locations. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. 57993 cd04110: Rab35 subfamily. Rab35 is one of several Rab proteins to be found to participate in the regulation of osteoclast cells in rats. In addition, Rab35 has been identified as a protein that interacts with nucleophosmin-anaplastic lymphoma kinase (NPM-ALK) in human cells. Overexpression of NPM-ALK is a key oncogenic event in some anaplastic large-cell lymphomas; since Rab35 interacts with NPM-ALK, it may provide a target for cancer treatments. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. 
Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. 57994 cd04111: Rab39 subfamily. Found in eukaryotes, Rab39 is mainly found in epithelial cell lines, but is distributed widely in various human tissues and cell lines. It is believed to be a novel Rab protein involved in regulating Golgi-associated vesicular transport during cellular endocytosis. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. 57995 cd04112: Rab26 subfamily. First identified in rat pancreatic acinar cells, Rab26 is believed to play a role in recruiting mature granules to the plasma membrane upon beta-adrenergic stimulation. Rab26 belongs to the Rab functional group III, which are considered key regulators of intracellular vesicle transport during exocytosis. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. 
Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. 57996 cd04113: Rab4 subfamily. Rab4 has been implicated in numerous functions within the cell. It helps regulate endocytosis through the sorting, recycling, and degradation of early endosomes. Mammalian Rab4 is involved in the regulation of many surface proteins including G-protein-coupled receptors, transferrin receptor, integrins, and surfactant protein A. Experimental data implicate Rab4 in regulation of the recycling of internalized receptors back to the plasma membrane. It is also believed to influence receptor-mediated antigen processing in B-lymphocytes, in calcium-dependent exocytosis in platelets, in alpha-amylase secretion in pancreatic cells, and in insulin-induced translocation of Glut4 from internal vesicles to the cell surface. Rab4 is known to share effector proteins with Rab5 and Rab11. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 57997 cd04114: Rab30 subfamily. 
Rab30 appears to be associated with the Golgi stack. It is expressed in a wide variety of tissue types and in humans maps to chromosome 11. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 57998 cd04115: Rab33B/Rab33A subfamily. Rab33B is ubiquitously expressed in mouse tissues and cells, where it is localized to the medial Golgi cisternae. It colocalizes with alpha-mannosidase II. Together with the other cisternal Rabs, Rab6A and Rab6A', it is believed to regulate the Golgi response to stress and is likely a molecular target in stress-activated signaling pathways. Rab33A (previously known as S10) is expressed primarily in the brain and immune system cells. In humans, it is located on the X chromosome at Xq26 and its expression is down-regulated in tuberculosis patients. Experimental evidence suggests that Rab33A is a novel CD8+ T cell factor that likely plays a role in tuberculosis disease processes. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. 
Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 57999 cd04116: Rab9 subfamily. Rab9 is found in late endosomes, together with mannose 6-phosphate receptors (MPRs) and the tail-interacting protein of 47 kD (TIP47). Rab9 is a key mediator of vesicular transport from late endosomes to the trans-Golgi network (TGN) by redirecting the MPRs. Rab9 has been identified as a key component for the replication of several viruses, including HIV1, Ebola, Marburg, and measles, making it a potential target for inhibiting a variety of viruses. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 58000 cd04117: Rab15 subfamily. Rab15 colocalizes with the transferrin receptor in early endosome compartments, but not with late endosomal markers. It codistributes with Rab4 and Rab5 on early/sorting endosomes, and with Rab11 on pericentriolar recycling endosomes. 
It is believed to function as an inhibitory GTPase that regulates distinct steps in early endocytic trafficking. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 58001 cd04118: Rab24 subfamily. Rab24 is distinct from other Rabs in several ways. It exists primarily in the GTP-bound state, having a low intrinsic GTPase activity; it is not efficiently geranyl-geranylated at the C-terminus; it does not form a detectable complex with Rab GDP-dissociation inhibitors (GDIs); and it has recently been shown to undergo tyrosine phosphorylation when overexpressed in vitro. The specific function of Rab24 still remains unknown. It is found in a transport route between ER-cis-Golgi and late endocytic compartments. It is putatively involved in an autophagic pathway, possibly directing misfolded proteins in the ER to degradative pathways. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. 
Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. 58002 cd04119: RJL (RabJ-Like) subfamily. RJLs are found in many protists and as chimeras with C-terminal DNAJ domains in deuterostome metazoa. They are not found in plants, fungi, and protostome metazoa, suggesting a horizontal gene transfer between protists and deuterostome metazoa. RJLs lack any known membrane targeting signal and contain a degenerate phosphate/magnesium-binding 3 (PM3) motif, suggesting an impaired ability to hydrolyze GTP. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. 58003 cd04120: Rab12 subfamily. Rab12 was first identified in canine cells, where it was localized to the Golgi complex. The specific function of Rab12 remains unknown, and inconsistent results about its cellular localization have been reported. More recent studies have identified Rab12 associated with post-Golgi vesicles, or with other small vesicle-like structures but not with the Golgi complex. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. 
Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. 58004 cd04121: Rab40 subfamily. This subfamily contains Rab40a, Rab40b, and Rab40c, which are all highly homologous. In rat, Rab40c is localized to the perinuclear recycling compartment (PRC), and is distributed in a tissue-specific manner, with high expression in brain, heart, kidney, and testis, low expression in lung and liver, and no expression in spleen and skeletal muscle. Rab40c is highly expressed in differentiated oligodendrocytes but minimally expressed in oligodendrocyte progenitors, suggesting a role in the vesicular transport of myelin components. Unlike most other Ras-superfamily proteins, Rab40c was shown to have a much lower affinity for GTP, and an affinity for GDP that is lower than for GTP. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. 58005 cd04122: Rab14 subfamily. Rab14 GTPases are localized to biosynthetic compartments, including the rough ER, the Golgi complex, and the trans-Golgi network, and to endosomal compartments, including early endosomal vacuoles and associated vesicles. 
Rab14 is believed to function in both the biosynthetic and recycling pathways between the Golgi and endosomal compartments. Rab14 has also been identified on GLUT4 vesicles, and has been suggested to help regulate GLUT4 translocation. In addition, Rab14 is believed to play a role in the regulation of phagocytosis. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 58006 cd04123: Rab21 subfamily. The localization and function of Rab21 are not clearly defined, with conflicting data reported. Rab21 has been reported to localize in the ER in human intestinal epithelial cells, with partial colocalization with alpha-glucosidase, a late endosomal/lysosomal marker. More recently, Rab21 was shown to colocalize with and affect the morphology of early endosomes. In Dictyostelium, GTP-bound Rab21, together with two novel LIM domain proteins, LimF and ChLim, has been shown to regulate phagocytosis. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. 
Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 58007 cd04124: RabL2 (Rab-like2) subfamily. RabL2s are novel Rab proteins identified recently which display features that are distinct from other Rabs, and have been termed Rab-like. RabL2 contains RabL2a and RabL2b, two very similar Rab proteins that share >98% sequence identity in humans. RabL2b maps to the subtelomeric region of chromosome 22q13.3 and RabL2a maps to 2q13, a region that suggests it is also a subtelomeric gene. Both genes are believed to be expressed ubiquitously, suggesting that RabL2s are the first example of duplicated genes in human proximal subtelomeric regions that are both expressed actively. Like other Rab-like proteins, RabL2s lack a prenylation site at the C-terminus. The specific functions of RabL2a and RabL2b remain unknown. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. 58008 cd04125: RabA-like subfamily. RabA was first identified in D. discoideum, where its expression levels were compared to other Rabs in growing and developing cells. The RabA mRNA levels were below the level of detection by Northern blot analysis, suggesting a very low level of expression. The function of RabA remains unknown. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. 
Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. 58009 cd04126: Rab20 subfamily. Rab20 is one of several Rab proteins that appear to be restricted in expression to the apical domain of murine polarized epithelial cells. It is expressed on the apical side of polarized kidney tubule and intestinal epithelial cells, and in non-polarized cells. It also localizes to vesico-tubular structures below the apical brush border of renal proximal tubule cells and in the apical region of duodenal epithelial cells. Rab20 has also been shown to colocalize with vacuolar H+-ATPases (V-ATPases) in mouse kidney cells, suggesting a role in the regulation of V-ATPase traffic in specific portions of the nephron. It was also shown to be one of several proteins whose expression is upregulated in human myelodysplastic syndrome (MDS) patients. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. 58010 cd04127: Rab27a subfamily. 
The Rab27a subfamily consists of Rab27a and its highly homologous isoform, Rab27b. Unlike most Rab proteins whose functions remain poorly defined, Rab27a has many known functions. Rab27a has multiple effector proteins, and depending on which effector it binds, Rab27a has different functions as well as tissue distribution and/or cellular localization. Putative functions have been assigned to Rab27a when associated with the effector proteins Slp1, Slp2, Slp3, Slp4, Slp5, DmSlp, rabphilin, Dm/Ce-rabphilin, Slac2-a, Slac2-b, Slac2-c, Noc2, JFC1, and Munc13-4. Rab27a has been associated with several human diseases, including hemophagocytic syndrome (Griscelli syndrome or GS), Hermansky-Pudlak syndrome, and choroideremia. In the case of GS, a rare, autosomal recessive disease, a Rab27a mutation is directly responsible for the disorder. When Rab27a is localized to the secretory granules of pancreatic beta cells, it is believed to mediate glucose-stimulated insulin secretion, making it a potential target for diabetes therapy. When bound to JFC1 in prostate cells, Rab27a is believed to regulate the exocytosis of prostate-specific markers. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 58011 cd04128: Spg1p. Spg1p (septum-promoting GTPase) was first identified in the fission yeast S. 
pombe, where it regulates septum formation in the septation initiation network (SIN) through the cdc7 protein kinase. Spg1p is an essential gene that localizes to the spindle pole bodies. When GTP-bound, it binds cdc7 and causes it to translocate to spindle poles. Sid4p (septation initiation defective) is required for localization of Spg1p to the spindle pole body, and the ability of Spg1p to promote septum formation from any point in the cell cycle depends on Sid4p. Spg1p is negatively regulated by Byr4 and cdc16, which form a two-component GTPase activating protein (GAP) for Spg1p. The existence of a SIN-related pathway in plants has been proposed. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. 58012 cd04129: Rho2 subfamily. Rho2 is a fungal GTPase that plays a role in cell morphogenesis, control of cell wall integrity, control of growth polarity, and maintenance of growth direction. Rho2 activates the protein kinase C homolog Pck2, and Pck2 controls Mok1, the major (1-3) alpha-D-glucan synthase. Together with Rho1 (RhoA), Rho2 regulates the construction of the cell wall. Unlike Rho1, Rho2 is not an essential protein, but its overexpression is lethal. Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for proper intracellular localization via membrane attachment. As with other Rho family GTPases, the GDP/GTP cycling is regulated by GEFs (guanine nucleotide exchange factors), GAPs (GTPase-activating proteins) and GDIs (guanine nucleotide dissociation inhibitors). 
58013 cd04130: Wrch-1 subfamily. Wrch-1 (Wnt-1 responsive Cdc42 homolog) is a Rho family GTPase that shares significant sequence and functional similarity with Cdc42. Wrch-1 was first identified in mouse mammary epithelial cells, where its transcription is upregulated in Wnt-1 transformation. Wrch-1 contains N- and C-terminal extensions relative to cdc42, suggesting potential differences in cellular localization and function. The Wrch-1 N-terminal extension contains putative SH3 domain-binding motifs and has been shown to bind the SH3 domain-containing protein Grb2, which increases the level of active Wrch-1 in cells. Unlike Cdc42, which localizes to the cytosol and perinuclear membranes, Wrch-1 localizes extensively with the plasma membrane and endosomes. The membrane association, localization, and biological activity of Wrch-1 indicate an atypical model of regulation distinct from other Rho family GTPases. Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Rho proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 58014 cd04131: Rnd subfamily. The Rnd subfamily contains Rnd1/Rho6, Rnd2/Rho7, and Rnd3/RhoE/Rho8. These novel Rho family proteins have substantial structural differences compared to other Rho members, including N- and C-terminal extensions relative to other Rhos. Rnd3/RhoE is farnesylated at the C-terminal prenylation site, unlike most other Rho proteins that are geranylgeranylated. In addition, Rnd members are unable to hydrolyze GTP and are resistant to GAP activity. They are believed to exist only in the GTP-bound conformation, and are antagonists of RhoA activity. 
Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Rho proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 58015 cd04132: Rho4-like subfamily. Rho4 is a GTPase that controls septum degradation by regulating secretion of Eng1 or Agn1 during cytokinesis. Rho4 also plays a role in cell morphogenesis. Rho4 regulates septation and cell morphology by controlling the actin cytoskeleton and cytoplasmic microtubules. The localization of Rho4 is modulated by Rdi1, which may function as a GDI, and by Rga9, which is believed to function as a GAP. In S. pombe, both Rho4 deletion and Rho4 overexpression result in a defective cell wall, suggesting a role for Rho4 in maintaining cell wall integrity. Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Rho proteins. 58016 cd04133: Rop subfamily. The Rop (Rho-related protein from plants) subfamily plays a role in diverse cellular processes, including cytoskeletal organization, pollen and vegetative cell growth, hormone responses, stress responses, and pathogen resistance. Rops are able to regulate several downstream pathways to amplify a specific signal by acting as master switches early in the signaling cascade. They transmit a variety of extracellular and intracellular signals. Rops are involved in establishing cell polarity in root-hair development, root-hair elongation, pollen-tube growth, cell-shape formation, responses to hormones such as abscisic acid (ABA) and auxin, responses to abiotic stresses such as oxygen deprivation, and disease resistance and disease susceptibility. 
An individual Rop can have a unique function or an overlapping function shared with other Rop proteins; in addition, a given Rop-regulated function can be controlled by one or multiple Rop proteins. For example, Rop1, Rop3, and Rop5 are all involved in pollen-tube growth; Rop2 plays a role in response to low-oxygen environments, cell-morphology, and root-hair development; root-hair development is also regulated by Rop4 and Rop6; Rop6 is also responsible for ABA response, and ABA response is also regulated by Rop10. Plants retain some of the regulatory mechanisms that are shared by other members of the Rho family, but have also developed a number of unique modes for regulating Rops. Unique RhoGEFs have been identified that are exclusively active toward Rop proteins, such as those containing the domain PRONE (plant-specific Rop nucleotide exchanger). Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Rho proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 58017 cd04134: Rho3 subfamily. Rho3 is a member of the Rho family found only in fungi. Rho3 is believed to regulate cell polarity by interacting with the diaphanous/formin family protein For3 to control both the actin cytoskeleton and microtubules. Rho3 is also believed to have a direct role in exocytosis that is independent of its role in regulating actin polarity. The function in exocytosis may be two-pronged: first, in the transport of post-Golgi vesicles from the mother cell to the bud, mediated by myosin (Myo2); second, in the docking and fusion of vesicles to the plasma membrane, mediated by an exocyst (Exo70) protein. 
Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Rho proteins. 58018 cd04135: TC10 subfamily. TC10 is a Rho family protein that has been shown to induce microspike formation and neurite outgrowth in vitro. Its expression changes dramatically after peripheral nerve injury, suggesting an important role in promoting axonal outgrowth and regeneration. TC10 regulates translocation of insulin-stimulated GLUT4 in adipocytes and has also been shown to bind directly to Golgi COPI coat proteins. GTP-bound TC10 in vitro can bind numerous potential effectors. Depending on its subcellular localization and distinct functional domains, TC10 can differentially regulate two types of filamentous actin in adipocytes. TC10 mRNAs are highly expressed in three types of mouse muscle tissues: leg skeletal muscle, cardiac muscle, and uterus; they were also present in brain, with higher levels in adults than in newborns. TC10 has also been shown to play a role in regulating the expression of cystic fibrosis transmembrane conductance regulator (CFTR) through interactions with CFTR-associated ligand (CAL). The GTP-bound form of TC10 directs the trafficking of CFTR from the juxtanuclear region to the secretory pathway toward the plasma membrane, away from CAL-mediated CFTR degradation in the lysosome. Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Rho proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 58019 cd04136: Rap-like subfamily. The Rap subfamily consists of Rap1, Rap2, and RSR1. 
Rap subfamily proteins perform different cellular functions, depending on the isoform and its subcellular localization. For example, in rat salivary gland, neutrophils, and platelets, Rap1 localizes to secretory granules and is believed to regulate exocytosis or the formation of secretory granules. Rap1 has also been shown to localize in the Golgi of rat fibroblasts, zymogen granules, plasma membrane, and microsomal membrane of the pancreatic acini, as well as in the endocytic compartment of skeletal muscle cells and fibroblasts. Rap1 localizes in the nucleus of human oropharyngeal squamous cell carcinomas (SCCs) and cell lines. Rap1 plays a role in phagocytosis by controlling the binding of adhesion receptors (typically integrins) to their ligands. In yeast, Rap1 has been implicated in multiple functions, including activation and silencing of transcription and maintenance of telomeres. Rap2 is involved in multiple functions, including activation of c-Jun N-terminal kinase (JNK) to regulate the actin cytoskeleton and activation of the Wnt/beta-catenin signaling pathway in embryonic Xenopus. A number of effector proteins for Rap2 have been identified, including isoform 3 of the human mitogen-activated protein kinase kinase kinase kinase 4 (MAP4K4) and Traf2- and Nck-interacting kinase (TNIK), and the RalGEFs RalGDS, RGL, and Rlf, which also interact with Rap1 and Ras. RSR1 is the fungal homolog of Rap1 and Rap2. In budding yeasts, it is involved in selecting a site for bud growth, which directs the establishment of cell polarization. The Rho family GTPase Cdc42 and its GEF, Cdc24, then establish an axis of polarized growth. It is believed that Cdc42 interacts directly with RSR1 in vivo. In filamentous fungi such as Ashbya gossypii, RSR1 is a key regulator of polar growth in the hypha. Most Ras proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. 
Lipid binding is essential for membrane attachment, a key feature of most Ras proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 58020 cd04137: Rheb (Ras Homolog Enriched in Brain) subfamily. Rheb was initially identified in rat brain, where its expression is elevated by seizures or by long-term potentiation. It is expressed ubiquitously, with elevated levels in muscle and brain. Rheb functions as an important mediator between the tuberous sclerosis complex proteins, TSC1 and TSC2, and the mammalian target of rapamycin (TOR) kinase to stimulate cell growth. TOR kinase regulates cell growth by controlling nutrient availability, growth factors, and the energy status of the cell. TSC1 and TSC2 form a dimeric complex that has tumor suppressor activity, and TSC2 is a GTPase activating protein (GAP) for Rheb. The TSC1/TSC2 complex inhibits the activation of TOR kinase through Rheb. Rheb has also been shown to induce the formation of large cytoplasmic vacuoles in a process that is dependent on the GTPase cycle of Rheb, but independent of the TOR kinase, suggesting Rheb plays a role in endocytic trafficking that leads to cell growth and cell-cycle progression. Most Ras proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Ras proteins. 58021 cd04138: H-Ras/N-Ras/K-Ras subfamily. H-Ras, N-Ras, and K-Ras4A/4B are the prototypical members of the Ras family. These isoforms generate distinct signal outputs despite interacting with a common set of activators and effectors, and are strongly associated with oncogenic progression in tumor initiation. Mutated versions of Ras that are insensitive to GAP stimulation (and are therefore constitutively active) are found in a significant fraction of human cancers. 
Many Ras guanine nucleotide exchange factors (GEFs) have been identified. They are sequestered in the cytosol until activation by growth factors triggers recruitment to the plasma membrane or Golgi, where the GEF colocalizes with Ras. Active (GTP-bound) Ras interacts with several effector proteins that stimulate a variety of diverse cytoplasmic signaling activities. Some are known to positively mediate the oncogenic properties of Ras, including Raf, phosphatidylinositol 3-kinase (PI3K), RalGEFs, and Tiam1. Others are proposed to play negative regulatory roles in oncogenesis, including RASSF and NORE/MST1. Most Ras proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Ras proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 58022 cd04139: RalA/RalB subfamily. The Ral (Ras-like) subfamily consists of the highly homologous RalA and RalB. Ral proteins are believed to play a crucial role in tumorigenesis, metastasis, endocytosis, and actin cytoskeleton dynamics. Despite their high sequence similarity (>80% sequence identity), nonoverlapping and opposing functions have been assigned to RalA and RalB in tumor migration. In human bladder and prostate cancer cells, RalB promotes migration while RalA inhibits it. A Ral-specific set of GEFs has been identified that are activated by Ras binding. This RalGEF activity is enhanced by Ras binding to another of its target proteins, phosphatidylinositol 3-kinase (PI3K). Ral effectors include RLIP76/RalBP1, a Rac/cdc42 GAP, and the exocyst (Sec6/8) complex, a heterooctameric protein complex that is involved in tethering vesicles to specific sites on the plasma membrane prior to exocytosis. 
In rat kidney cells, RalB is required for functional assembly of the exocyst and for localizing the exocyst to the leading edge of migrating cells. In human cancer cells, RalA is required to support anchorage-independent proliferation and RalB is required to suppress apoptosis. RalA has been shown to localize to the plasma membrane while RalB is localized to the intracellular vesicles. Most Ras proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Ras proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 58023 cd04140: ARHI subfamily. ARHI (A Ras homolog member I) is a member of the Ras family with several unique structural and functional properties. ARHI is expressed in normal human ovarian and breast tissue, but its expression is decreased or eliminated in breast and ovarian cancer. ARHI contains an N-terminal extension of 34 residues (human) that is required to retain its tumor suppressive activity. Unlike most other Ras family members, ARHI is maintained in the constitutively active (GTP-bound) state in resting cells and has modest GTPase activity. ARHI inhibits STAT3 (signal transducers and activators of transcription 3), a latent transcription factor whose abnormal activation plays a critical role in oncogenesis. Most Ras proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Ras proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 58024 cd04141: Rit/Rin/Ric subfamily. 
Rit (Ras-like protein in all tissues), Rin (Ras-like protein in neurons) and Ric (Ras-related protein which interacts with calmodulin) form a subfamily with several unique structural and functional characteristics. These proteins all lack the C-terminal CaaX lipid-binding motif typical of Ras family proteins, and Rin and Ric contain calmodulin-binding domains. Rin, which is expressed only in neurons, induces neurite outgrowth in rat pheochromocytoma cells through its association with calmodulin and its activation of endogenous Rac/cdc42. Rit, which is ubiquitously expressed in mammals, inhibits growth-factor withdrawal-mediated apoptosis and induces neurite extension in pheochromocytoma cells. Rit and Rin are both able to form a ternary complex with PAR6, a cell polarity-regulating protein, and Rac/cdc42. This ternary complex is proposed to have physiological function in processes such as tumorigenesis. Activated Ric is likely to signal in parallel with the Ras pathway or stimulate the Ras pathway at some upstream point, and binding of calmodulin to Ric may negatively regulate Ric activity. 58025 cd04142: RRP22 subfamily. RRP22 (Ras-related protein on chromosome 22) subfamily consists of proteins that inhibit cell growth and promote caspase-independent cell death. Unlike most Ras proteins, RRP22 is down-regulated in many human tumor cells due to promoter methylation. RRP22 localizes to the nucleolus in a GTP-dependent manner, suggesting a novel function in modulating transport of nucleolar components. Most Ras proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Ras proteins. Like most Ras family proteins, RRP22 is farnesylated. 58026 cd04143: Rhes_like subfamily. This subfamily includes Rhes (Ras homolog enriched in striatum) and Dexras1/AGS1 (activator of G-protein signaling 1). 
These proteins are homologous, but exhibit significant differences in tissue distribution and subcellular localization. Rhes is found primarily in the striatum of the brain, but is also expressed in other areas of the brain, such as the cerebral cortex, hippocampus, inferior colliculus, and cerebellum. Rhes expression is controlled by thyroid hormones. In rat PC12 cells, Rhes is farnesylated and localizes to the plasma membrane. Rhes binds and activates PI3K, and plays a role in coupling serpentine membrane receptors with heterotrimeric G-protein signaling. Rhes has recently been shown to be reduced under conditions of dopamine supersensitivity and may play a role in determining dopamine receptor sensitivity. Dexras1/AGS1 is a dexamethasone-induced Ras protein that is expressed primarily in the brain, with low expression levels in other tissues. Dexras1 localizes primarily to the cytoplasm, and is a critical regulator of the circadian master clock to photic and nonphotic input. Most Ras proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Ras proteins. 58027 cd04144: Ras2 subfamily. The Ras2 subfamily, found exclusively in fungi, was first identified in Ustilago maydis. In U. maydis, Ras2 is regulated by Sql2, a protein that is homologous to GEFs (guanine nucleotide exchange factors) of the CDC25 family. Ras2 has been shown to induce filamentous growth, but the signaling cascade through which Ras2 and Sql2 regulate cell morphology is not known. Most Ras proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Ras proteins. 58028 cd04145: M-Ras/R-Ras-like subfamily. 
This subfamily contains R-Ras2/TC21, M-Ras/R-Ras3, and related members of the Ras family. M-Ras is expressed in lympho-hematopoietic cells. It interacts with some of the known Ras effectors, but appears to also have its own effectors. Expression of mutated M-Ras leads to transformation of several types of cell lines, including hematopoietic cells, mammary epithelial cells, and fibroblasts. Overexpression of M-Ras is observed in carcinomas from breast, uterus, thyroid, stomach, colon, kidney, lung, and rectum. In addition, expression of a constitutively active M-Ras mutant in murine bone marrow induces a malignant mast cell leukemia that is distinct from the monocytic leukemia induced by H-Ras. TC21, along with H-Ras, has been shown to regulate the branching morphogenesis of ureteric bud cells in mice. Most Ras proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Ras proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 58029 cd04146: RERG/RasL11-like subfamily. RERG (Ras-related and Estrogen-Regulated Growth inhibitor) and Ras-like 11 are members of a novel subfamily of Ras that were identified based on their behavior in breast and prostate tumors, respectively. RERG expression was decreased or lost in a significant fraction of primary human breast tumors that lack estrogen receptor and are correlated with poor clinical prognosis. Elevated RERG expression correlated with favorable patient outcome in a breast tumor subtype that is positive for estrogen receptor expression. In contrast to most Ras proteins, RERG overexpression inhibited the growth of breast tumor cells in vitro and in vivo. RasL11 was found to be ubiquitously expressed in human tissue, but down-regulated in prostate tumors. 
Both RERG and RasL11 lack the C-terminal CaaX prenylation motif, where a = an aliphatic amino acid and X = any amino acid, and are localized primarily in the cytoplasm. Both are believed to have tumor suppressor activity. 58030 cd04147: Ras-dva subfamily. Ras-dva (Ras - dorsal-ventral anterior localization) subfamily consists of a set of proteins characterized to date only in Xenopus laevis. In Xenopus, Ras-dva expression is activated by the transcription factor Otx2 and begins during gastrulation throughout the anterior ectoderm. Ras-dva expression is inhibited in the anterior neural plate by factor Xanf1. Downregulation of Ras-dva results in head development abnormalities through the inhibition of several regulators of the anterior neural plate and folds patterning, including Otx2, BF-1, Xag2, Pax6, Slug, and Sox9. Downregulation of Ras-dva also interferes with the FGF-8a signaling within the anterior ectoderm. Most Ras proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Ras proteins. 58031 cd04148: RGK subfamily. The RGK (Rem, Rem2, Rad, Gem/Kir) subfamily of Ras GTPases are expressed in a tissue-specific manner and are dynamically regulated by transcriptional and posttranscriptional mechanisms in response to environmental cues. RGK proteins bind to the beta subunit of L-type calcium channels, causing functional down-regulation of these voltage-dependent calcium channels, and either termination of calcium-dependent secretion or modulation of electrical conduction and contractile function. Inhibition of L-type calcium channels by Rem2 may provide a mechanism for modulating calcium-triggered exocytosis in hormone-secreting cells, and has been proposed to influence the secretion of insulin in pancreatic beta cells. 
RGK proteins also interact with and inhibit the Rho/Rho kinase pathway to modulate remodeling of the cytoskeleton. Two characteristics of RGK proteins cited in the literature are N-terminal and C-terminal extensions beyond the GTPase domain typical of Ras superfamily members. The N-terminal extension is not conserved among family members; the C-terminal extension is reported to be conserved among the family and lack the CaaX prenylation motif typical of membrane-associated Ras proteins. However, a putative CaaX motif has been identified in the alignment of the C-terminal residues of this CD. 58032 cd04149: Arf6 subfamily. Arf6 (ADP ribosylation factor 6) proteins localize to the plasma membrane, where they perform a wide variety of functions. In its active, GTP-bound form, Arf6 is involved in cell spreading, Rac-induced formation of plasma membrane ruffles, cell migration, wound healing, and Fc-mediated phagocytosis. Arf6 appears to change the actin structure at the plasma membrane by activating Rac, a Rho family protein involved in membrane ruffling. Arf6 is required for and enhances Rac formation of ruffles. Arf6 can regulate dendritic branching in hippocampal neurons, and in yeast it localizes to the growing bud, where it plays a role in polarized growth and bud site selection. In leukocytes, Arf6 is required for chemokine-stimulated migration across endothelial cells. Arf6 also plays a role in down-regulation of beta2-adrenergic receptors and luteinizing hormone receptors by facilitating the release of sequestered arrestin to allow endocytosis. Arf6 is believed to function at multiple sites on the plasma membrane through interaction with a specific set of GEFs, GAPs, and effectors. Arf6 has been implicated in breast cancer and melanoma cell invasion, and in actin remodelling at the invasion site of Chlamydia infection. 58033 cd04150: Arf1-Arf5-like subfamily. This subfamily contains Arf1, Arf2, Arf3, Arf4, Arf5, and related proteins. 
Arfs1-5 are soluble proteins that are crucial for assembling coat proteins during vesicle formation. Each contains an N-terminal myristoylated amphipathic helix that is folded into the protein in the GDP-bound state. GDP/GTP exchange exposes the helix, which anchors to the membrane. Following GTP hydrolysis, the helix dissociates from the membrane and folds back into the protein. A general feature of Arf1-5 signaling may be the cooperation of two Arfs at the same site. Arfs1-5 are generally considered to be interchangeable in function and location, but some specific functions have been assigned. Arf1 localizes to the early/cis-Golgi, where it is activated by GBF1 and recruits the coat protein COPI. It also localizes to the trans-Golgi network (TGN), where it is activated by BIG1/BIG2 and recruits the AP1, AP3, AP4, and GGA proteins. Humans, but not rodents and other lower eukaryotes, lack Arf2. Human Arf3 shares 96% sequence identity with Arf1 and is believed to generally function interchangeably with Arf1. Human Arf4 in the activated (GTP-bound) state has been shown to interact with the cytoplasmic domain of epidermal growth factor receptor (EGFR) and mediate the EGF-dependent activation of phospholipase D2 (PLD2), leading to activation of the activator protein 1 (AP-1) transcription factor. Arf4 has also been shown to recognize the C-terminal sorting signal of rhodopsin and regulate its incorporation into specialized post-Golgi rhodopsin transport carriers (RTCs). There is some evidence that Arf5 functions at the early-Golgi and the trans-Golgi to affect Golgi-associated alpha-adaptin homology Arf-binding proteins (GGAs). 58034 cd04151: Arl1 subfamily. Arl1 (Arf-like 1) localizes to the Golgi complex, where it is believed to recruit effector proteins to the trans-Golgi network. Like most members of the Arf family, Arl1 is myristoylated at its N-terminal helix and mutation of the myristoylation site disrupts Golgi targeting. 
In humans, the Golgi-localized proteins golgin-97 and golgin-245 have been identified as Arl1 effectors. Golgins are large coiled-coil proteins found in the Golgi, and these golgins contain a C-terminal GRIP domain, which is the site of Arl1 binding. Additional Arl1 effectors include the GARP (Golgi-associated retrograde protein)/VFT (Vps53) vesicle-tethering complex and Arfaptin 2. Arl1 is not required for exocytosis, but appears necessary for trafficking from the endosomes to the Golgi. In Drosophila zygotes, mutation of Arl1 is lethal, and in the host-bloodstream form of Trypanosoma brucei, Arl1 is essential for viability. 58035 cd04152: Arl4/Arl7 subfamily. Arl4 (Arf-like 4) is highly expressed in testicular germ cells, and is found in the nucleus and nucleolus. In mice, Arl4 is developmentally expressed during embryogenesis, and a role in somite formation and central nervous system differentiation has been proposed. Arl7 has been identified as the only Arf/Arl protein to be induced by agonists of liver X-receptor and retinoid X-receptor and by cholesterol loading in human macrophages. Arl7 is proposed to play a role in transport between a perinuclear compartment and the plasma membrane, apparently linked to the ABCA1-mediated cholesterol secretion pathway. Older literature suggests that Arl6 is a part of the Arl4/Arl7 subfamily, but analyses based on more recent sequence data place Arl6 in its own subfamily. 58036 cd04153: Arl5/Arl8 subfamily. Arl5 (Arf-like 5) and Arl8, like Arl4 and Arl7, are localized to the nucleus and nucleolus. Arl5 is developmentally regulated during embryogenesis in mice. Human Arl5 interacts with the heterochromatin protein 1-alpha (HP1alpha), a nonhistone chromosomal protein that is associated with heterochromatin and telomeres, and prevents telomere fusion. Arl5 may also play a role in embryonic nuclear dynamics and/or signaling cascades. Arl8 was identified from a fetal cartilage cDNA library. 
It is found in brain, heart, lung, cartilage, and kidney. No function has been assigned for Arl8 to date. 58037 cd04154: Arl2 subfamily. Arl2 (Arf-like 2) GTPases are members of the Arf family that bind GDP and GTP with very low affinity. Unlike most Arf family proteins, Arl2 is not myristoylated at its N-terminal helix. The protein PDE-delta, first identified in photoreceptor rod cells, binds specifically to Arl2 and is structurally very similar to RhoGDI. Despite the high structural similarity between Arl2 and Rho proteins and between PDE-delta and RhoGDI, the interactions between the GTPases and their effectors are very different. In its GTP-bound form, Arl2 interacts with the protein Binder of Arl2 (BART), and the complex is believed to play a role in mitochondrial adenine nucleotide transport. In its GDP-bound form, Arl2 interacts with tubulin-folding cofactor D; this interaction is believed to play a role in regulation of microtubule dynamics that impact the cytoskeleton, cell division, and cytokinesis. 58038 cd04155: Arl3 subfamily. Arl3 (Arf-like 3) is an Arf family protein that differs from most Arf family members in the N-terminal extension. In its inactive, GDP-bound form, the N-terminal extension forms an elongated loop that is hydrophobically anchored into the membrane surface; however, it has been proposed that this region might form a helix in the GTP-bound form. The delta subunit of the rod-specific cyclic GMP phosphodiesterase type 6 (PDEdelta) is an Arl3 effector. Arl3 binds microtubules in a regulated manner to alter specific aspects of cytokinesis via interactions with retinitis pigmentosa 2 (RP2). It has been proposed that RP2 functions in concert with Arl3 to link the cell membrane and the cytoskeleton in photoreceptors as part of the cell signaling or vesicular transport machinery. In mice, the absence of Arl3 is associated with abnormal epithelial cell proliferation and cyst formation. 58039 cd04156: ARLTS1 subfamily. 
ARLTS1 (Arf-like tumor suppressor gene 1), also known as Arl11, is a member of the Arf family of small GTPases that is believed to play a major role in apoptotic signaling. ARLTS1 is widely expressed and functions as a tumor suppressor gene in several human cancers. ARLTS1 is a low-penetrance suppressor that accounts for a small percentage of familial melanoma or familial chronic lymphocytic leukemia (CLL). ARLTS1 inactivation seems to occur most frequently through biallelic down-regulation by hypermethylation of the promoter. In breast cancer, ARLTS1 alterations were typically a combination of a hypomorphic polymorphism plus loss of heterozygosity. In a case of thyroid adenoma, ARLTS1 alterations were polymorphism plus promoter hypermethylation. The nonsense polymorphism Trp149Stop occurs with significantly greater frequency in familial cancer cases than in sporadic cancer cases, and the Cys148Arg polymorphism is associated with an increase in high-risk familial breast cancer. 58040 cd04157: Arl6 subfamily. Arl6 (Arf-like 6) forms a subfamily of the Arf family of small GTPases. Arl6 expression is limited to the brain and kidney in adult mice, but it is expressed in the neural plate and somites during embryogenesis, suggesting a possible role for Arl6 in early development. Arl6 is also believed to have a role in cilia or flagella function. Several proteins have been identified that bind Arl6, including Arl6 interacting protein (Arl6ip), and SEC61beta, a subunit of the heterotrimeric conducting channel SEC61p. Based on Arl6 binding to these effectors, Arl6 is also proposed to play a role in protein transport, membrane trafficking, or cell signaling during hematopoietic maturation. At least three specific homozygous Arl6 mutations in humans have been found to cause Bardet-Biedl syndrome, a disorder characterized by obesity, retinopathy, polydactyly, renal and cardiac malformations, learning disabilities, and hypogenitalism. 
Older literature suggests that Arl6 is a part of the Arl4/Arl7 subfamily, but analyses based on more recent sequence data place Arl6 in its own subfamily. 58041 cd04158: ARD1 subfamily. ARD1 (ADP-ribosylation factor domain protein 1) is an unusual member of the Arf family. In addition to the C-terminal Arf domain, ARD1 has an additional 46-kDa N-terminal domain that contains a RING finger domain, two predicted B-Boxes, and a coiled-coil protein interaction motif. This domain belongs to the TRIM (tripartite motif) or RBCC (RING, B-Box, coiled-coil) family. Like most Arfs, the ARD1 Arf domain lacks detectable GTPase activity. However, unlike most Arfs, the full-length ARD1 protein has significant GTPase activity due to the GAP (GTPase-activating protein) activity exhibited by the 46-kDa N-terminal domain. The GAP domain of ARD1 is specific for its own Arf domain and does not bind other Arfs. The rate of GDP dissociation from the ARD1 Arf domain is slowed by the adjacent 15 amino acids, which act as a GDI (GDP-dissociation inhibitor) domain. ARD1 is ubiquitously expressed in cells and localizes to the Golgi and to the lysosomal membrane. Two Tyr-based motifs in the Arf domain are responsible for Golgi localization, while the GAP domain controls lysosomal localization. 58042 cd04159: Arl10-like subfamily. Arl9/Arl10 was identified from a human cancer-derived EST dataset. No functional information about the subfamily is available at the current time, but crystal structures of human Arl10b and Arl10c have been solved. 58043 cd04160: Arfrp1 subfamily. Arfrp1 (Arf-related protein 1), formerly known as ARP, is a membrane-associated Arf family member that lacks the N-terminal myristoylation motif. Arfrp1 is mainly associated with the trans-Golgi compartment and the trans-Golgi network, where it regulates the targeting of Arl1 and the GRIP domain-containing proteins, golgin-97 and golgin-245, onto Golgi membranes. 
It is also involved in the anterograde transport of the vesicular stomatitis virus G protein from the Golgi to the plasma membrane, and in the retrograde transport of TGN38 and Shiga toxin from endosomes to the trans-Golgi network. Arfrp1 also inhibits Arf/Sec7-dependent activation of phospholipase D. Deletion of Arfrp1 in mice causes embryonic lethality at the gastrulation stage and apoptosis of mesodermal cells, indicating its importance in development. 58044 cd04161: Arl2l1/Arl13 subfamily. Arl2l1 (Arl2-like protein 1) and Arl13 form a subfamily of the Arf family of small GTPases. Arl2l1 was identified in human cells during a search for the gene(s) responsible for Bardet-Biedl syndrome (BBS). Like Arl6, the identified BBS gene, Arl2l1 is proposed to have cilia-specific functions. Arl13 is found on the X chromosome, but its expression has not been confirmed; it may be a pseudogene. 58045 cd04162: Arl9/Arfrp2-like subfamily. Arl9 (Arf-like 9) was first identified as part of the Human Cancer Genome Project. It maps to chromosome 4q12 and is sometimes referred to as Arfrp2 (Arf-related protein 2). This is a novel subfamily identified in human cancers that is uncharacterized to date. 58046 cd04163: Era subfamily. Era (E. coli Ras-like protein) is a multifunctional GTPase found in all bacteria except some eubacteria. It binds to the 16S ribosomal RNA (rRNA) of the 30S subunit and appears to play a role in the assembly of the 30S subunit, possibly by chaperoning the 16S rRNA. It also contacts several assembly elements of the 30S subunit. Era couples cell growth with cytokinesis and plays a role in cell division and energy metabolism. Homologs have also been found in eukaryotes. Era contains two domains: an N-terminal GTPase domain and a C-terminal KH domain that is critical for RNA binding. Both domains are important for Era function. 
Era is functionally able to compensate for deletion of RbfA, a cold-shock adaptation protein that is required for efficient processing of the 16S rRNA. 58047 cd04164: TrmE (MnmE, ThdF, MSS1) is a 3-domain protein found in bacteria and eukaryotes. It controls modification of the uridine at the wobble position (U34) of tRNAs that read codons ending with A or G in the mixed codon family boxes. TrmE contains a GTPase domain that forms a canonical Ras-like fold. It functions as a molecular switch GTPase, and apparently uses a conformational change associated with GTP hydrolysis to promote the tRNA modification reaction, in which the conserved cysteine in the C-terminal domain is thought to function as a catalytic residue. In bacteria that are able to survive in extremely low pH conditions, TrmE regulates glutamate-dependent acid resistance. 58048 cd04165: GTPBP1-like. Mammalian GTP binding protein 1 (GTPBP1), GTPBP2, and nematode homologs AGP-1 and CGP-1 are GTPases whose specific functions remain unknown. In mouse, GTPBP1 is expressed in macrophages, in smooth muscle cells of various tissues and in some neurons of the cerebral cortex; GTPBP2 tissue distribution appears to overlap that of GTPBP1. In human leukemia and macrophage cell lines, expression of both GTPBP1 and GTPBP2 is enhanced by interferon-gamma (IFN-gamma). The chromosomal location of both genes has been identified in humans, with GTPBP1 located in chromosome 22q12-13.1 and GTPBP2 located in chromosome 6p21-12. Human glioblastoma multiforme (GBM), a highly malignant astrocytic glioma and the most common cancer in the central nervous system, has been linked to chromosomal deletions and a translocation on chromosome 6. The GBM translocation results in a fusion of GTPBP2 and PTPRZ1, a protein involved in oligodendrocyte differentiation, recovery, and survival. This fusion product may contribute to the onset of GBM. 58049 cd04166: CysN_ATPS subfamily. 
CysN, together with protein CysD, forms the ATP sulfurylase (ATPS) complex in some bacteria and lower eukaryotes. ATPS catalyzes the production of adenosine 5'-phosphosulfate (APS) and pyrophosphate (PPi) from ATP and sulfate. CysD, which catalyzes ATP hydrolysis, is a member of the ATP pyrophosphatase (ATP PPase) family. CysN hydrolysis of GTP is required for CysD hydrolysis of ATP; however, CysN hydrolysis of GTP is not dependent on CysD hydrolysis of ATP. CysN is an example of lateral gene transfer followed by acquisition of new function. In many organisms, an ATPS exists which is not GTP-dependent and shares no sequence or structural similarity to CysN. 58050 cd04167: Snu114p subfamily. Snu114p is one of several proteins that make up the U5 small nuclear ribonucleoprotein (snRNP) particle. U5 is a component of the spliceosome, which catalyzes the splicing of pre-mRNA to remove introns. Snu114p is homologous to EF-2, but typically contains an additional N-terminal domain not found in EF-2. This protein is part of the GTP translation factor family and the Ras superfamily, characterized by five G-box motifs. 58051 cd04168: Tet(M)-like subfamily. Tet(M), Tet(O), Tet(W), and OtrA are tetracycline resistance genes found in Gram-positive and Gram-negative bacteria. Tetracyclines inhibit protein synthesis by preventing aminoacyl-tRNA from binding to the ribosomal acceptor site. This subfamily contains tetracycline resistance proteins that function through ribosomal protection and are typically found on mobile genetic elements, such as transposons or plasmids, and are often conjugative. Ribosomal protection proteins are homologous to the elongation factors EF-Tu and EF-G. EF-G and Tet(M) compete for binding on the ribosomes. Tet(M) has a higher affinity than EF-G, suggesting these two proteins may have overlapping binding sites and that Tet(M) must be released before EF-G can bind. Tet(M) and Tet(O) have been shown to have ribosome-dependent GTPase activity. 
These proteins are part of the GTP translation factor family, which includes EF-G, EF-Tu, EF2, LepA, and SelB. 58052 cd04169: RF3 subfamily. Peptide chain release factor 3 (RF3) is a protein involved in the termination step of translation in bacteria. Termination occurs when class I release factors (RF1 or RF2) recognize the stop codon at the A-site of the ribosome and activate the release of the nascent polypeptide. The class II release factor RF3 then initiates the release of the class I RF from the ribosome. RF3 binds to the RF/ribosome complex in the inactive (GDP-bound) state. GDP/GTP exchange occurs, followed by the release of the class I RF. Subsequent hydrolysis of GTP to GDP triggers the release of RF3 from the ribosome. RF3 also enhances the efficiency of class I RFs at less preferred stop codons and at stop codons in weak contexts. 58053 cd04170: Elongation factor G (EF-G) subfamily. Translocation is mediated by EF-G (also called translocase). The structure of EF-G closely resembles that of the complex between EF-Tu and tRNA. This is an example of molecular mimicry; a protein domain evolved so that it mimics the shape of a tRNA molecule. EF-G in the GTP form binds to the ribosome, primarily through the interaction of its EF-Tu-like domain with the 50S subunit. The binding of EF-G to the ribosome in this manner stimulates the GTPase activity of EF-G. On GTP hydrolysis, EF-G undergoes a conformational change that forces its arm deeper into the A site on the 30S subunit. To accommodate this domain, the peptidyl-tRNA in the A site moves to the P site, carrying the mRNA and the deacylated tRNA with it. The ribosome may be prepared for these rearrangements by the initial binding of EF-G as well. The dissociation of EF-G leaves the ribosome ready to accept the next aminoacyl-tRNA into the A site. This group contains only bacterial members. 58054 cd04171: SelB subfamily. 
SelB is an elongation factor needed for the co-translational incorporation of selenocysteine. Selenocysteine is coded by a UGA stop codon in combination with a specific downstream mRNA hairpin. In bacteria, the C-terminal part of SelB recognizes this hairpin, while the N-terminal part binds GTP and tRNA in analogy with elongation factor Tu (EF-Tu). It specifically recognizes the selenocysteine-charged tRNAsec, which has a UCA anticodon, in an EF-Tu-like manner. This allows insertion of selenocysteine at in-frame UGA stop codons. In E. coli, SelB binds GTP, selenocysteyl-tRNAsec, and a stem-loop structure immediately downstream of the UGA codon (the SECIS sequence). The absence of active SelB prevents the participation of selenocysteyl-tRNAsec in translation. Archaeal and animal mechanisms of selenocysteine incorporation are more complex. Although the SECIS elements have different secondary structures and conserved elements between archaea and eukaryotes, they do share a common feature. Unlike in E. coli, these SECIS elements are located in the 3' UTRs. This group contains bacterial SelBs, as well as one from archaea. 58055 cd04172: Rnd3/RhoE/Rho8 subfamily. Rnd3/RhoE/Rho8 is a member of the novel Rho subfamily Rnd, together with Rnd1/Rho6 and Rnd2/Rho7. Rnd3/RhoE is known to bind the serine-threonine kinase ROCK I. Unphosphorylated Rnd3/RhoE associates primarily with membranes, but ROCK I-phosphorylated Rnd3/RhoE localizes in the cytosol. Phosphorylation of Rnd3/RhoE correlates with its activity in disrupting RhoA-induced stress fibers and inhibiting Ras-induced fibroblast transformation. In cells that lack stress fibers, such as macrophages and monocytes, Rnd3/RhoE induces a redistribution of actin, causing morphological changes in the cell. In addition, Rnd3/RhoE has been shown to inhibit cell cycle progression in G1 phase at a point upstream of the pRb family pocket protein checkpoint. 
Rnd3/RhoE has also been shown to inhibit Ras- and Raf-induced fibroblast transformation. In mammary epithelial tumor cells, Rnd3/RhoE regulates the assembly of the apical junction complex and tight junction formation. Rnd3/RhoE is underexpressed in prostate cancer cells both in vitro and in vivo; re-expression of Rnd3/RhoE suppresses cell cycle progression and increases apoptosis, suggesting it may play a role in tumor suppression. Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Rho proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 58056 cd04173: Rnd2/Rho7 subfamily. Rnd2/Rho7 is a member of the novel Rho subfamily Rnd, together with Rnd1/Rho6 and Rnd3/RhoE/Rho8. Rnd2/Rho7 is transiently expressed in radially migrating cells in the brain while they are within the subventricular zone of the hippocampus and cerebral cortex. These migrating cells typically develop into pyramidal neurons. Cells that exogenously expressed Rnd2/Rho7 failed to migrate to upper layers of the brain, suggesting that Rnd2/Rho7 plays a role in the radial migration and morphological changes of developing pyramidal neurons, and that Rnd2/Rho7 degradation is necessary for proper cellular migration. The Rnd2/Rho7 GEF Rapostlin is found primarily in the brain and together with Rnd2/Rho7 induces dendrite branching. Unlike Rnd1/Rho6 and Rnd3/RhoE/Rho8, which are RhoA antagonists, Rnd2/Rho7 binds the GEF Pragmin and significantly stimulates RhoA activity and Rho-A mediated cell contraction. Rnd2/Rho7 is also found to be expressed in spermatocytes and early spermatids, with male-germ-cell Rac GTPase-activating protein (MgcRacGAP), where it localizes to the Golgi-derived pro-acrosomal vesicle. 
Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Rho proteins. 58057 cd04174: Rnd1/Rho6 subfamily. Rnd1/Rho6 is a member of the novel Rho subfamily Rnd, together with Rnd2/Rho7 and Rnd3/RhoE/Rho8. Rnd1/Rho6 binds GTP but does not hydrolyze it to GDP, indicating that it is constitutively active. In rat, Rnd1/Rho6 is highly expressed in the cerebral cortex and hippocampus during synapse formation, and plays a role in spine formation. Rnd1/Rho6 is also expressed in the liver and in endothelial cells, and is upregulated in uterine myometrial cells during pregnancy. Like Rnd3/RhoE/Rho8, Rnd1/Rho6 is believed to function as an antagonist to RhoA. Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Rho proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 58058 cd04175: Rap1 subgroup. The Rap1 subgroup is part of the Rap subfamily of the Ras family. It can be further divided into the Rap1a and Rap1b isoforms. In humans, Rap1a and Rap1b share 95% sequence homology, but are products of two different genes located on chromosomes 1 and 12, respectively. Rap1a is sometimes called smg p21 or Krev1 in the older literature. Rap1 proteins are believed to perform different cellular functions, depending on the isoform, its subcellular localization, and the effector proteins it binds. For example, in rat salivary gland, neutrophils, and platelets, Rap1 localizes to secretory granules and is believed to regulate exocytosis or the formation of secretory granules. 
Rap1 has also been shown to localize in the Golgi of rat fibroblasts, zymogen granules, plasma membrane, and the microsomal membrane of pancreatic acini, as well as in the endocytic compartment of skeletal muscle cells and fibroblasts. High expression of Rap1 has been observed in the nucleus of human oropharyngeal squamous cell carcinomas (SCCs) and cell lines; interestingly, in the SCCs, the active GTP-bound form localized to the nucleus, while the inactive GDP-bound form localized to the cytoplasm. Rap1 plays a role in phagocytosis by controlling the binding of adhesion receptors (typically integrins) to their ligands. In yeast, Rap1 has been implicated in multiple functions, including activation and silencing of transcription and maintenance of telomeres. Rap1a, which is stimulated by T-cell receptor (TCR) activation, is a positive regulator of T cells by directing integrin activation and augmenting lymphocyte responses. In murine hippocampal neurons, Rap1b determines which neurite will become the axon and directs the recruitment of Cdc42, which is required for formation of dendrites and axons. In murine platelets, Rap1b is required for normal homeostasis in vivo and is involved in integrin activation. Most Ras proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Ras proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 58059 cd04176: Rap2 subgroup. The Rap2 subgroup is part of the Rap subfamily of the Ras family. It consists of Rap2a, Rap2b, and Rap2c. Both isoform 3 of the human mitogen-activated protein kinase kinase kinase kinase 4 (MAP4K4) and Traf2- and Nck-interacting kinase (TNIK) are putative effectors of Rap2 in mediating the activation of c-Jun N-terminal kinase (JNK) to regulate the actin cytoskeleton. 
In human platelets, Rap2 was shown to interact with the cytoskeleton by binding the actin filaments. In embryonic Xenopus development, Rap2 is necessary for the Wnt/beta-catenin signaling pathway. The Rap2 interacting protein 9 (RPIP9) is highly expressed in human breast carcinomas and correlates with a poor prognosis, suggesting a role for Rap2 in breast cancer oncogenesis. Rap2b, but not Rap2a, Rap2c, Rap1a, or Rap1b, is expressed in human red blood cells, where it is believed to be involved in vesiculation. A number of additional effector proteins for Rap2 have been identified, including the RalGEFs RalGDS, RGL, and Rlf, which also interact with Rap1 and Ras. Most Ras proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Ras proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation. 58060 cd04177: RSR1 subgroup. RSR1/Bud1p is a member of the Rap subfamily of the Ras family that is found in fungi. In budding yeasts, RSR1 is involved in selecting a site for bud growth on the cell cortex, which directs the establishment of cell polarization. The Rho family GTPase cdc42 and its GEF, cdc24, then establish an axis of polarized growth by organizing the actin cytoskeleton and secretory apparatus at the bud site. It is believed that cdc42 interacts directly with RSR1 in vivo. In filamentous fungi, polar growth occurs at the tips of hyphae and at novel growth sites along the extending hypha. In Ashbya gossypii, RSR1 is a key regulator of hyphal growth, localizing at the tip region and regulating apical polarization of the actin cytoskeleton. Most Ras proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. 
Lipid binding is essential for membrane attachment, a key feature of most Ras proteins. 58061 cd01514: Elongation factor G C-terminus. This domain includes the carboxyl terminal regions of the elongation factors (EFs) bacterial EF-G, eukaryotic and archaeal EF-2, and eukaryotic mitochondrial mtEFG1s and mtEFG2s. This group also includes proteins similar to the ribosomal protection proteins Tet(M) and Tet(O), BipA, LepA, and the spliceosomal proteins human 116kD U5 small nuclear ribonucleoprotein (snRNP) protein (U5-116 kD) and its yeast counterpart Snu114p. This domain adopts a ferredoxin-like fold consisting of an alpha-beta sandwich with anti-parallel beta-sheets, resembling the topology of domain III found in the elongation factors EF-G and eukaryotic EF-2, with which it forms the C-terminal block. The two domains, however, are not superimposable, and domain III lacks some of the characteristics of this domain. EF-2/EF-G, in complex with GTP, promotes the translocation step of translation. During translocation the peptidyl-tRNA is moved from the A site to the P site, the uncharged tRNA from the P site to the E-site, and the mRNA is shifted one codon relative to the ribosome. Tet(M) and Tet(O) mediate tetracycline (Tc) resistance. Typical Tcs bind to the ribosome and inhibit the elongation phase of protein synthesis by inhibiting the occupation of site A by aminoacyl-tRNA. Tet(M) and Tet(O) catalyze the release of Tc from the ribosome in a GTP-dependent manner. BipA is a highly conserved protein with global regulatory properties in Escherichia coli. Yeast Snu114p is essential for cell viability and for splicing in vivo. Experiments suggest that GTP binding and probably GTP hydrolysis are important for the function of the U5-116 kD/Snu114p. The function of LepA proteins is unknown. 58062 cd03709: lepA_C: This family represents the C-terminal region of LepA, a GTP-binding protein localized in the cytoplasmic membrane. LepA is ubiquitous in Bacteria and Eukaryota (e.g. 
Saccharomyces cerevisiae GUF1p), but is missing from Archaea. LepA exhibits significant homology to elongation factors (EFs) Tu and G. The function(s) of the proteins in this family are unknown. The N-terminal domain of LepA is homologous to a domain of similar size found in initiation factor 2 (IF2), and in EF-Tu and EF-G (factors required for translation in Escherichia coli). Two types of phylogenetic tree, rooted by other GTP-binding proteins, suggest that eukaryotic homologs (including S. cerevisiae GUF1) originated within the bacterial LepA family. LepA has never been observed in archaea, and eukaryal LepA is organellar. LepA is therefore a true bacterial GTPase, found only in the bacterial lineage. 58063 cd03710: BipA_TypA_C: a C-terminal portion of BipA or TypA having homology to the C-terminal domains of the elongation factors EF-G and EF-2. A member of the ribosome binding GTPase superfamily, BipA is widely distributed in bacteria and plants. BipA is a highly conserved protein with global regulatory properties in Escherichia coli. BipA is phosphorylated on a tyrosine residue under some cellular conditions. Mutants show altered regulation of some pathways. BipA functions as a translation factor that is required specifically for the expression of the transcriptional modulator Fis. BipA binds to ribosomes at a site that coincides with that of EF-G and has a GTPase activity that is sensitive to high GDP:GTP ratios and is stimulated by 70S ribosomes programmed with mRNA and aminoacylated tRNAs. The growth rate-dependent induction of BipA allows the efficient expression of Fis, thereby modulating a range of downstream processes, including DNA metabolism and type III secretion. 58064 cd03711: Tet_C: C-terminus of ribosomal protection proteins Tet(M) and Tet(O). This domain has homology to the C-terminal domains of the elongation factors EF-G and EF-2. 
Tet(M) and Tet(O) catalyze the release of tetracycline (Tc) from the ribosome in a GTP-dependent manner thereby mediating Tc resistance. Tcs are broad-spectrum antibiotics. Typical Tcs bind to the ribosome and inhibit the elongation phase of protein synthesis, by inhibiting the occupation of site A by aminoacyl-tRNA. 58065 cd03713: EFG_mtEFG_C: domains similar to the C-terminal domain of the bacterial translational elongation factor (EF) EF-G. Included in this group is the C-terminus of mitochondrial Elongation factor G1 (mtEFG1) and G2 (mtEFG2) proteins. Eukaryotic cells harbor 2 protein synthesis systems: one localized in the cytoplasm, the other in the mitochondria. Most factors regulating mitochondrial protein synthesis are encoded by nuclear genes, translated in the cytoplasm, and then transported to the mitochondria. The eukaryotic system of elongation factor (EF) components is more complex than that in prokaryotes, with both cytoplasmic and mitochondrial elongation factors and multiple isoforms being expressed in certain species. During the process of peptide synthesis and tRNA site changes, the ribosome is moved along the mRNA a distance equal to one codon with the addition of each amino acid. In bacteria this translocation step is catalyzed by EF-G_GTP, which is hydrolyzed to provide the required energy. Thus, this action releases the uncharged tRNA from the P site and transfers the newly formed peptidyl-tRNA from the A site to the P site. Eukaryotic mtEFG1 proteins show significant homology to bacterial EF-Gs. Mutants in yeast mtEFG1 have impaired mitochondrial protein synthesis, respiratory defects and a tendency to lose mitochondrial DNA. No clear phenotype has been found for mutants in the yeast homologue of mtEFG2, MEF2. 
58066 cd04096: eEF2_snRNP_like_C: this family represents a C-terminal domain of eukaryotic elongation factor 2 (eEF-2) and a homologous domain of the spliceosomal human 116kD U5 small nuclear ribonucleoprotein (snRNP) protein (U5-116 kD) and its yeast counterpart Snu114p. Yeast Snu114p is essential for cell viability and for splicing in vivo. U5-116 kD binds GTP. Experiments suggest that GTP binding and probably GTP hydrolysis are important for the function of the U5-116 kD/Snu114p. In complex with GTP, EF-2 promotes the translocation step of translation. During translocation the peptidyl-tRNA is moved from the A site to the P site, the uncharged tRNA from the P site to the E-site, and the mRNA is shifted one codon relative to the ribosome. 58067 cd04097: mtEFG1_C: C-terminus of mitochondrial Elongation factor G1 (mtEFG1)-like proteins found in eukaryotes. Eukaryotic cells harbor 2 protein synthesis systems: one localized in the cytoplasm, the other in the mitochondria. Most factors regulating mitochondrial protein synthesis are encoded by nuclear genes, translated in the cytoplasm, and then transported to the mitochondria. The eukaryotic system of elongation factor (EF) components is more complex than that in prokaryotes, with both cytoplasmic and mitochondrial elongation factors and multiple isoforms being expressed in certain species. Eukaryotic EF-2 operates in the cytosolic protein synthesis machinery of eukaryotes, whereas EF-Gs operate in protein synthesis in bacteria. Eukaryotic mtEFG1 proteins show significant homology to bacterial EF-Gs. Mutants in yeast mtEFG1 have impaired mitochondrial protein synthesis, respiratory defects and a tendency to lose mitochondrial DNA. There are two forms of mtEFG present in mammals (designated mtEFG1s and mtEFG2s); mtEFG2s are not present in this group. 
58068 cd04098: eEF2_C_snRNP: This family includes a C-terminal portion of the spliceosomal human 116kD U5 small nuclear ribonucleoprotein (snRNP) protein (U5-116 kD) and its yeast counterpart Snu114p. This domain is homologous to the C-terminal domain of the eukaryotic translational elongation factor EF-2. Yeast Snu114p is essential for cell viability and for splicing in vivo. U5-116 kD binds GTP. Experiments suggest that GTP binding and probably GTP hydrolysis are important for the function of the U5-116 kD/Snu114p. In complex with GTP, EF-2 promotes the translocation step of translation. During translocation the peptidyl-tRNA is moved from the A site to the P site, the uncharged tRNA from the P site to the E-site, and the mRNA is shifted one codon relative to the ribosome. 58069 cd01513: Domain III of Elongation factor (EF) Tu (EF-TU) and EF-G. Elongation factors (EF) EF-Tu and EF-G participate in the elongation phase during protein biosynthesis on the ribosome. Their functional cycles depend on GTP binding and its hydrolysis. The EF-Tu complexed with GTP and aminoacyl-tRNA delivers tRNA to the ribosome, whereas EF-G stimulates translocation, a process in which tRNA and mRNA movements occur in the ribosome. Experimental data showed that: (1) intrinsic GTPase activity of EF-G is influenced by excision of its domain III; (2) EF-G lacking domain III has a 1,000-fold decreased GTPase activity on the ribosome and a slightly decreased affinity for GTP; and (3) EF-G lacking domain III does not stimulate translocation, despite the physical presence of domain IV, which is also very important for translocation. These findings indicate an essential contribution of domain III to activation of GTP hydrolysis. Domains III and V of EF-G have the same fold (although they are not completely superimposable), the double split beta-alpha-beta fold. 
This fold is observed in a large number of ribonucleotide binding proteins and is also referred to as the ribonucleoprotein (RNP) or RNA recognition (RRM) motif. This domain III is found in several elongation factors, as well as in peptide chain release factors and in the GP-1 family of GTPases (GTPBP1). 58070 cd03704: This family represents the eEF1alpha-like C-terminal region of eRF3, homologous to the domain III of EF-Tu. eRF3 is a GTPase, which enhances the termination efficiency by stimulating the eRF1 activity in a GTP-dependent manner. The C-terminal region is responsible for translation termination activity and is essential for viability. Saccharomyces cerevisiae eRF3 (Sup35p) is a translation termination factor which is divided into three regions: N, M, and a C-terminal eEF1a-like region essential for translation termination. Sup35NM is a non-pathogenic prion-like protein with the property of aggregating into polymer-like fibrils. 58071 cd03705: Domain III of EF-1. Eukaryotic elongation factor 1 (EF-1) is responsible for the GTP-dependent binding of aminoacyl-tRNAs to ribosomes. EF-1 is composed of four subunits: the alpha chain, which binds GTP and aminoacyl-tRNAs; the gamma chain, which probably plays a role in anchoring the complex to other cellular components; and the beta and delta (or beta') chains. This family is the alpha subunit, and represents the counterpart of bacterial EF-Tu for the archaea (aEF-1 alpha) and eukaryotes (eEF-1 alpha). 58072 cd03706: Domain III of mitochondrial EF-TU (mtEF-TU). mtEF-TU is highly conserved and is 55-60% identical to bacterial EF-TU. The overall structure is similar to that observed in the Escherichia coli and Thermus aquaticus EF-TU. However, compared with that observed in prokaryotic EF-TU, the nucleotide-binding domain (domain I) of mtEF-TU is in a different orientation relative to the rest of the structure. Furthermore, domain III is followed by a short 11-amino acid extension that forms one helical turn. 
This extension seems to be specific to the mitochondrial factors and has not been observed in any of the prokaryotic factors. 58073 cd03707: Domain III of elongation factor (EF) Tu. EF-Tu consists of three structural domains, designated I, II and III. Domain III adopts a beta barrel structure. Domain III is involved in binding to both charged tRNA and elongation factor Ts (EF-Ts). EF-Ts is the guanine-nucleotide-exchange factor for EF-Tu. EF-Tu and EF-G participate in the elongation phase during protein biosynthesis on the ribosome. Their functional cycles depend on GTP binding and its hydrolysis. The EF-Tu complexed with GTP and aminoacyl-tRNA delivers tRNA to the ribosome, whereas EF-G stimulates translocation, a process in which tRNA and mRNA movements occur in the ribosome. Crystallographic studies revealed structural similarities ("molecular mimicry") between tertiary structures of EF-G and the EF-Tu-aminoacyl-tRNA ternary complex. Domains III, IV, and V of EF-G mimic the tRNA structure in the EF-Tu ternary complex; domains III, IV and V can be related to the acceptor stem, anticodon helix and T stem of tRNA, respectively. 58074 cd03708: Domain III of the GP-1 family of GTPase. This group includes proteins similar to GTPBP1 and GTPBP2. GTPBP1 is structurally related to elongation factor 1 alpha, a key component of protein biosynthesis machinery. Immunohistochemical analyses on mouse tissues revealed that GTPBP1 is expressed in some neurons and smooth muscle cells of various organs as well as macrophages. Immunofluorescence analyses revealed that GTPBP1 is localized exclusively in cytoplasm and shows a diffuse granular network forming a gradient from the nucleus to the periphery of the cells in smooth muscle cell lines and macrophages. No significant difference was observed in the immune response to protein antigen between mutant mice and wild-type mice, suggesting normal function of antigen-presenting cells of the mutant mice. 
The absence of an eminent phenotype in GTPBP1-deficient mice may be due to functional compensation by GTPBP2, which is similar to GTPBP1 in structure and tissue distribution. 58075 cd04093: HBS1_C: this family represents the C-terminal domain of Hsp70 subfamily B suppressor 1 (HBS1) which is homologous to the domain III of EF-1alpha. This group contains proteins similar to yeast Hbs1, a G protein known to be important for efficient growth and protein synthesis under conditions of limiting translation initiation and to associate with Dom34. It has been speculated that yeast Hbs1 and Dom34 proteins may function as part of a complex with a role in gene expression. 58076 cd04094: This family represents the domain of elongation factor SelB, homologous to domain III of EF-Tu. SelB may function by replacing EF-Tu. In prokaryotes, the incorporation of selenocysteine as the 21st amino acid, encoded by TGA, requires several elements: SelC is the tRNA itself, SelD acts as a donor of reduced selenium, SelA modifies a serine residue on SelC into selenocysteine, and SelB is a selenocysteine-specific translation elongation factor. 3' or 5' non-coding elements of mRNA have been found as probable structures for directing selenocysteine incorporation. 58077 cd04095: TCysN_NoDQ_II: This subfamily represents the domain II of the large subunit of ATP sulfurylase (ATPS): CysN or the N-terminal portion of NodQ, found mainly in proteobacteria and homologous to the domain II of EF-Tu. Escherichia coli ATPS consists of CysN and a smaller subunit, CysD. ATPS produces adenosine-5'-phosphosulfate (APS) from ATP and sulfate, coupled with GTP hydrolysis. In the subsequent reaction, APS is phosphorylated by an APS kinase (CysC) to produce 3'-phosphoadenosine-5'-phosphosulfate (PAPS) for use in amino acid (aa) biosynthesis. The Rhizobiaceae group (alpha-proteobacteria) appears to carry out the same chemistry for the sulfation of a nodulation factor. 
In Rhizobium meliloti, the heterodimeric complex composed of NodP and NodQ appears to possess both ATPS and APS kinase activities. The N and C termini of NodQ correspond to CysN and CysC, respectively. Other eubacteria, Archaea, and eukaryotes use a different ATP sulfurylase, which shows no aa sequence similarity to CysN or NodQ. CysN and the N-terminal portion of NodQ show similarity to GTPases involved in translation, in particular, EF-Tu and EF-1alpha. 58078 cd01342: Translation_Factor_II_like: Elongation factor Tu (EF-Tu) domain II-like proteins. Elongation factor Tu consists of three structural domains; this family represents the second domain. Domain II adopts a beta barrel structure and is involved in binding to charged tRNA. Domain II is found in other proteins such as elongation factor G and translation initiation factor IF-2. This group also includes the C2 subdomain of domain IV of IF-2 that has the same fold as domain II of EF-Tu. Like IF-2 from certain prokaryotes such as Thermus thermophilus, mitochondrial IF-2 lacks domain II, which is thought to be involved in binding of E. coli IF-2 to 30S subunits. 58079 cd03688: eIF2_gamma_II: this subfamily represents the domain II of the gamma subunit of eukaryotic translation initiation factor 2 (eIF2-gamma) found in Eukaryota and Archaea. eIF2 is a G protein that delivers the methionyl initiator tRNA to the small ribosomal subunit and releases it upon GTP hydrolysis after the recognition of the initiation codon. eIF2 is composed of three subunits, alpha, beta and gamma. Subunit gamma shows the strongest conservation, and it confers both tRNA binding and GTP/GDP binding. 58080 cd03689: RF3_II: this subfamily represents the domain II of bacterial Release Factor 3 (RF3). Termination of protein synthesis by the ribosome requires two release factor (RF) classes. The class II RF3 is a GTPase that removes class I RFs (RF1 or RF2) from the ribosome after release of the nascent polypeptide. 
RF3 in the GDP state binds to the ribosomal class I RF complex, followed by an exchange of GDP for GTP and release of the class I RF. Sequence comparison of class II release factors with elongation factors shows that prokaryotic RF3 is more similar to EF-G whereas eukaryotic eRF3 is more similar to eEF1A, implying that their precise function may differ. 58081 cd03690: Tet_II: This subfamily represents domain II of ribosomal protection proteins Tet(M) and Tet(O). This domain has homology to domain II of the elongation factors EF-G and EF-2. Tet(M) and Tet(O) catalyze the release of tetracycline (Tc) from the ribosome in a GTP-dependent manner, thereby mediating Tc resistance. Tcs are broad-spectrum antibiotics. Typical Tcs bind to the ribosome and inhibit the elongation phase of protein synthesis by inhibiting the occupation of site A by aminoacyl-tRNA. 58082 cd03691: BipA_TypA_II: domain II of BipA (also called TypA) having homology to domain II of the elongation factors (EFs) EF-G and EF-Tu. BipA is a highly conserved protein with global regulatory properties in Escherichia coli. BipA is phosphorylated on a tyrosine residue under some cellular conditions. Mutants show altered regulation of some pathways. BipA functions as a translation factor that is required specifically for the expression of the transcriptional modulator Fis. BipA binds to ribosomes at a site that coincides with that of EF-G and has a GTPase activity that is sensitive to high GDP:GTP ratios and is stimulated by 70S ribosomes programmed with mRNA and aminoacylated tRNAs. The growth rate-dependent induction of BipA allows the efficient expression of Fis, thereby modulating a range of downstream processes, including DNA metabolism and type III secretion. 
58083 cd03692: mtIF2_IVc: this family represents the C2 subdomain of domain IV of mitochondrial translation initiation factor 2 (mtIF2) which adopts a beta-barrel fold displaying a high degree of structural similarity with domain II of the translation elongation factor EF-Tu. The C-terminal part of mtIF2 contains the entire fMet-tRNAfMet binding site of IF-2 and is resistant to proteolysis. This C-terminal portion consists of two domains, IF2 C1 and IF2 C2. IF2 C2 has been shown to contain all molecular determinants necessary and sufficient for the recognition and binding of fMet-tRNAfMet. Like IF2 from certain prokaryotes such as Thermus thermophilus, mtIF2 lacks domain II, which is thought to be involved in binding of E. coli IF-2 to 30S subunits. 58084 cd03693: EF1_alpha_II: this family represents the domain II of elongation factor 1-alpha (EF-1a) that is found in archaea and all eukaryotic lineages. EF-1A is very abundant in the cytosol, where it is involved in the GTP-dependent binding of aminoacyl-tRNAs to the A site of the ribosomes in the second step of translation from mRNAs to proteins. Both domain II of EF1A and domain IV of IF2/eIF5B have been implicated in recognition of the 3'-ends of tRNA. More than 61% of eukaryotic elongation factor 1A (eEF-1A) in cells is estimated to be associated with actin cytoskeleton. The binding of eEF1A to actin is a noncanonical function that may link two distinct cellular processes, cytoskeleton organization and gene expression. 58085 cd03694: Domain II of the GP-1 family of GTPases. This group includes proteins similar to GTPBP1 and GTPBP2. GTPBP1 is structurally related to elongation factor 1 alpha, a key component of protein biosynthesis machinery. Immunohistochemical analyses on mouse tissues revealed that GTPBP1 is expressed in some neurons and smooth muscle cells of various organs as well as macrophages. 
Immunofluorescence analyses revealed that GTPBP1 is localized exclusively in cytoplasm and shows a diffuse granular network forming a gradient from the nucleus to the periphery of the cells in smooth muscle cell lines and macrophages. No significant difference was observed in the immune response to protein antigen between mutant mice and wild-type mice, suggesting normal function of antigen-presenting cells of the mutant mice. The absence of an eminent phenotype in GTPBP1-deficient mice may be due to functional compensation by GTPBP2, which is similar to GTPBP1 in structure and tissue distribution. 58086 cd03695: CysN_NodQ_II: This subfamily represents the domain II of the large subunit of ATP sulfurylase (ATPS): CysN or the N-terminal portion of NodQ, found mainly in proteobacteria and homologous to the domain II of EF-Tu. Escherichia coli ATPS consists of CysN and a smaller subunit, CysD. ATPS produces adenosine-5'-phosphosulfate (APS) from ATP and sulfate, coupled with GTP hydrolysis. In the subsequent reaction, APS is phosphorylated by an APS kinase (CysC) to produce 3'-phosphoadenosine-5'-phosphosulfate (PAPS) for use in amino acid (aa) biosynthesis. The Rhizobiaceae group (alpha-proteobacteria) appears to carry out the same chemistry for the sulfation of a nodulation factor. In Rhizobium meliloti, the heterodimeric complex composed of NodP and NodQ appears to possess both ATPS and APS kinase activities. The N and C termini of NodQ correspond to CysN and CysC, respectively. Other eubacteria, Archaea, and eukaryotes use a different ATP sulfurylase, which shows no aa sequence similarity to CysN or NodQ. CysN and the N-terminal portion of NodQ show similarity to GTPases involved in translation, in particular, EF-Tu and EF-1alpha. 58087 cd03696: selB_II: this subfamily represents the domain of elongation factor SelB, homologous to domain II of EF-Tu. SelB may function by replacing EF-Tu. 
In prokaryotes, the incorporation of selenocysteine as the 21st amino acid, encoded by TGA, requires several elements: SelC is the tRNA itself, SelD acts as a donor of reduced selenium, SelA modifies a serine residue on SelC into selenocysteine, and SelB is a selenocysteine-specific translation elongation factor. 3' or 5' non-coding elements of mRNA have been found as probable structures for directing selenocysteine incorporation. 58088 cd03697: EFTU_II: Elongation factor Tu domain II. Elongation factors Tu (EF-Tu) are three-domain GTPases with an essential function in the elongation phase of mRNA translation. The GTPase center of EF-Tu is in the N-terminal domain (domain I), also known as the catalytic or G-domain. The G-domain is composed of about 200 amino acid residues, arranged into a predominantly parallel six-stranded beta-sheet core surrounded by seven alpha-helices. Non-catalytic domains II and III are beta-barrels of seven and six, respectively, antiparallel beta-strands that share an extended interface. Each non-catalytic domain is composed of about 100 amino acid residues. EF-Tu proteins exist in two principal conformations: in a compact one, EF-Tu*GTP, with tight interfaces between all three domains and a high affinity for aminoacyl-tRNA, and in an open one, EF-Tu*GDP, with essentially no G-domain-domain II interactions and a low affinity for aminoacyl-tRNA. EF-Tu has approximately a 100-fold higher affinity for GDP than for GTP. 58089 cd03698: eRF3_II_like: domain similar to domain II of the eukaryotic class II release factor (eRF3). In eukaryotes, translation termination is mediated by two interacting release factors, eRF1 and eRF3, which act as class I and II factors, respectively. eRF1 functions as an omnipotent release factor, decoding all three stop codons and triggering the release of the nascent peptide catalyzed by the ribosome. eRF3 is a GTPase, which enhances the termination efficiency by stimulating the eRF1 activity in a GTP-dependent manner. 
Sequence comparison of class II release factors with elongation factors shows that eRF3 is more similar to eEF1alpha whereas prokaryote RF3 is more similar to EF-G, implying that their precise function may differ. Only eukaryote RF3s are found in this group. Saccharomyces cerevisiae eRF3 (Sup35p) is a translation termination factor which is divided into three regions N, M and a C-terminal eEF1a-like region essential for translation termination. Sup35NM is a non-pathogenic prion-like protein with the property of aggregating into polymer-like fibrils. This group also contains proteins similar to S. cerevisiae Hbs1, a G protein known to be important for efficient growth and protein synthesis under conditions of limiting translation initiation and to associate with Dom34. It has been speculated that yeast Hbs1 and Dom34 proteins may function as part of a complex with a role in gene expression. 58090 cd03699: lepA_II: This subfamily represents the domain II of LepA, a GTP-binding protein localized in the cytoplasmic membrane. The N-terminal domain of LepA shares regions of homology to translation factors. In terms of interaction with the ribosome, EF-G, EF-Tu and IF2 have all been demonstrated to interact at overlapping sites on the ribosome. Chemical protection studies demonstrate that they all include the universally conserved alpha-sarcin loop as part of their binding site. These data indicate that LepA may bind to this location on the ribosome as well. LepA has never been observed in archaea, and eukaryal LepA is organellar. LepA is therefore a true bacterial GTPase, found only in the bacterial lineage. 58091 cd03700: EF2_snRNP_like_II: this subfamily represents domain II of elongation factor (EF) EF-2 found in eukaryotes and archaea, and the C-terminal portion of the spliceosomal human 116kD U5 small nuclear ribonucleoprotein (snRNP) protein (U5-116 kD) and its yeast counterpart Snu114p. 
During the process of peptide synthesis and tRNA site changes, the ribosome is moved along the mRNA a distance equal to one codon with the addition of each amino acid. This translocation step is catalyzed by EF-2_GTP, which is hydrolyzed to provide the required energy. Thus, this action releases the uncharged tRNA from the P site and transfers the newly formed peptidyl-tRNA from the A site to the P site. Yeast Snu114p is essential for cell viability and for splicing in vivo. U5-116 kD binds GTP. Experiments suggest that GTP binding and probably GTP hydrolysis are important for the function of the U5-116 kD/Snu114p. 58092 cd03701: IF2_IF5B_II: This family represents the domain II of prokaryotic Initiation Factor 2 (IF2) and its archaeal and eukaryotic homologue aeIF5B. IF2, the largest initiation factor, is an essential GTP binding protein. In E. coli three natural forms of IF2 exist in the cell, IF2alpha, IF2beta1, and IF2beta2. Disruption of the eIF5B gene (FUN12) in yeast causes a severe slow-growth phenotype, associated with a defect in translation. eIF5B has a function analogous to prokaryotic IF2 in mediating the joining of the 60S ribosomal subunit. The eIF5B consists of three N-terminal domains (I, II, III) connected by a long helix to domain IV. Domain I is a G domain, domains II and IV are beta-barrels and domain III has a novel alpha-beta-alpha sandwich fold. The G domain and the beta-barrel domain II display a similar structure and arrangement to the homologous domains in EF1A, eEF1A and aeIF2gamma. 58093 cd03702: This family represents the domain II of bacterial Initiation Factor 2 (IF2) and its eukaryotic mitochondrial homologue mtIF2. IF2, the largest initiation factor, is an essential GTP binding protein. In E. coli three natural forms of IF2 exist in the cell, IF2alpha, IF2beta1, and IF2beta2. Bacterial IF-2 is structurally and functionally related to eukaryotic mitochondrial mtIF-2. 
58094 cd03703: aeIF5B_II: This family represents the domain II of archaeal and eukaryotic aeIF5B. aeIF5B is a homologue of prokaryotic Initiation Factor 2 (IF2). Disruption of the eIF5B gene (FUN12) in yeast causes a severe slow-growth phenotype, associated with a defect in translation. eIF5B has a function analogous to prokaryotic IF2 in mediating the joining of 60S subunits. The eIF5B consists of three N-terminal domains (I, II, III) connected by a long helix to domain IV. Domain I is a G domain, domains II and IV are beta-barrels and domain III has a novel alpha-beta-alpha sandwich fold. The G domain and the beta-barrel domain II display a similar structure and arrangement to the homologous domains of EF1A, eEF1A and aeIF2gamma. 58095 cd04088: EFG_mtEFG_II: this subfamily represents the domain II of elongation factor G (EF-G) in bacteria and the C-terminus of mitochondrial Elongation factor G1 (mtEFG1) and G2 (mtEFG2)-like proteins found in eukaryotes. During the process of peptide synthesis and tRNA site changes, the ribosome is moved along the mRNA a distance equal to one codon with the addition of each amino acid. In bacteria this translocation step is catalyzed by EF-G_GTP, which is hydrolyzed to provide the required energy. Thus, this action releases the uncharged tRNA from the P site and transfers the newly formed peptidyl-tRNA from the A site to the P site. Eukaryotic cells harbor 2 protein synthesis systems: one localized in the cytoplasm, the other in the mitochondria. Most factors regulating mitochondrial protein synthesis are encoded by nuclear genes, translated in the cytoplasm, and then transported to the mitochondria. The eukaryotic system of elongation factor (EF) components is more complex than that in prokaryotes, with both cytoplasmic and mitochondrial elongation factors and multiple isoforms being expressed in certain species. mtEFG1 and mtEFG2 show significant homology to bacterial EF-Gs. 
Mutants in yeast mtEFG1 have impaired mitochondrial protein synthesis, respiratory defects and a tendency to lose mitochondrial DNA. No clear phenotype has been found for mutants in the yeast homologue of mtEFG2, MEF2. 58096 cd04089: eRF3_II: domain II of the eukaryotic class II release factor (eRF3). In eukaryotes, translation termination is mediated by two interacting release factors, eRF1 and eRF3, which act as class I and II factors, respectively. eRF1 functions as an omnipotent release factor, decoding all three stop codons and triggering the release of the nascent peptide catalyzed by the ribosome. eRF3 is a GTPase, which enhances the termination efficiency by stimulating the eRF1 activity in a GTP-dependent manner. Sequence comparison of class II release factors with elongation factors shows that eRF3 is more similar to eEF1alpha whereas prokaryote RF3 is more similar to EF-G, implying that their precise function may differ. Only eukaryote RF3s are found in this group. Saccharomyces cerevisiae eRF3 (Sup35p) is a translation termination factor which is divided into three regions N, M and a C-terminal eEF1a-like region essential for translation termination. Sup35NM is a non-pathogenic prion-like protein with the property of aggregating into polymer-like fibrils. 58097 cd04090: Loc2 eEF2_C_snRNP, cd01514/C terminal domain:eEF2_C_snRNP: This family includes the C-terminal portion of the spliceosomal human 116kD U5 small nuclear ribonucleoprotein (snRNP) protein (U5-116 kD) and its yeast counterpart Snu114p. This domain is homologous to domain II of the eukaryotic translational elongation factor EF-2. Yeast Snu114p is essential for cell viability and for splicing in vivo. U5-116 kD binds GTP. Experiments suggest that GTP binding and probably GTP hydrolysis are important for the function of the U5-116 kD/Snu114p. In complex with GTP, EF-2 promotes the translocation step of translation. 
During translocation the peptidyl-tRNA is moved from the A site to the P site, the uncharged tRNA from the P site to the E-site, and the mRNA is shifted one codon relative to the ribosome. 58098 cd04091: mtEFG1_C: C-terminus of mitochondrial Elongation factor G1 (mtEFG1)-like proteins found in eukaryotes. Eukaryotic cells harbor 2 protein synthesis systems: one localized in the cytoplasm, the other in the mitochondria. Most factors regulating mitochondrial protein synthesis are encoded by nuclear genes, translated in the cytoplasm, and then transported to the mitochondria. The eukaryotic system of elongation factor (EF) components is more complex than that in prokaryotes, with both cytoplasmic and mitochondrial elongation factors and multiple isoforms being expressed in certain species. Eukaryotic EF-2 operates in the cytosolic protein synthesis machinery of eukaryotes, EF-Gs in protein synthesis in bacteria. Eukaryotic mtEFG1 proteins show significant homology to bacterial EF-Gs. Mutants in yeast mtEFG1 have impaired mitochondrial protein synthesis, respiratory defects and a tendency to lose mitochondrial DNA. There are two forms of mtEFG present in mammals (designated mtEFG1s and mtEFG2s); mtEFG2s are not present in this group. 58099 cd04092: mtEFG2_C: C-terminus of mitochondrial Elongation factor G2 (mtEFG2)-like proteins found in eukaryotes. Eukaryotic cells harbor 2 protein synthesis systems: one localized in the cytoplasm, the other in the mitochondria. Most factors regulating mitochondrial protein synthesis are encoded by nuclear genes, translated in the cytoplasm, and then transported to the mitochondria. The eukaryotic system of elongation factor (EF) components is more complex than that in prokaryotes, with both cytoplasmic and mitochondrial elongation factors and multiple isoforms being expressed in certain species. Eukaryotic EF-2 operates in the cytosolic protein synthesis machinery of eukaryotes, EF-Gs in protein synthesis in bacteria. 
Eukaryotic mtEFG1 proteins show significant homology to bacterial EF-Gs. No clear phenotype has been found for mutants in the yeast homologue of mtEFG2, MEF2. There are two forms of mtEFG present in mammals (designated mtEFG1s and mtEFG2s); mtEFG1s are not present in this group. 58100 cd00121: MATH (meprin and TRAF-C homology) domain; an independent folding unit with an eight-stranded beta-sandwich structure found in meprins, TRAFs and other proteins. Meprins comprise a class of extracellular metalloproteases which are anchored to the membrane and are capable of cleaving growth factors, extracellular matrix proteins, and biologically active peptides. TRAF molecules serve as adapter proteins that link cell surface receptors of the Tumor Necrosis Factor and interleukin-1/Toll-like families to downstream kinase cascades, which results in the activation of transcription factors and the regulation of cell survival, proliferation and stress responses in the immune and inflammatory systems. Other members include the ubiquitin ligases, TRIM37 and SPOP, and the ubiquitin-specific proteases, HAUSP and Ubp21p. A large number of uncharacterized members mostly from lineage-specific expansions in C. elegans and rice contain MATH and BTB domains, similar to SPOP. The MATH domain has been shown to bind peptide/protein substrates in TRAFs and HAUSP. It is possible that the MATH domain in other members of this superfamily also interacts with various protein substrates. The TRAF domain may also be involved in the trimerization of TRAFs. Based on homology, it is postulated that the MATH domain in meprins may be involved in its tetramer assembly and that the MATH domain, in general, may take part in diverse modular arrangements defined by adjacent multimerization domains. 
58101 cd00270: Tumor Necrosis Factor Receptor (TNFR)-Associated Factor (TRAF) family, TRAF domain, C-terminal MATH subdomain; TRAF molecules serve as adapter proteins that link cell surface TNFRs and receptors of the interleukin-1/Toll-like family to downstream kinase signaling cascades which results in the activation of transcription factors and the regulation of cell survival, proliferation and stress responses in the immune and inflammatory systems. There are at least six mammalian and three Drosophila proteins containing TRAF domains. The mammalian TRAFs display varying expression profiles, indicating independent and cell type-specific regulation. They display distinct, as well as overlapping functions and interactions with receptors. Most TRAFs, except TRAF1, share N-terminal homology and contain a RING domain, multiple zinc finger domains, and a TRAF domain. TRAFs form homo- and heterotrimers through their TRAF domains. The TRAF domain can be divided into a more divergent N-terminal alpha helical region (TRAF-N), and a highly conserved C-terminal MATH subdomain (TRAF-C) with an eight-stranded beta-sandwich structure. TRAF-N mediates trimerization while TRAF-C interacts with receptors. 58102 cd03771: Meprin family, MATH domain; Meprins are multidomain, highly glycosylated extracellular metalloproteases, which are either anchored to the membrane or secreted into extracellular spaces. They are expressed in renal and intestinal brush border membranes, leukocytes, and cancer cells, and are capable of cleaving growth factors, cytokines, extracellular matrix proteins, and biologically active peptides. Meprin proteases are composed of two related subunits, alpha and beta, which form homo- or hetero-complexes where the basic unit is a disulfide-linked dimer. Despite their similarity, the two subunits differ in their ability to self-associate, in proteolytic processing during biosynthesis and in substrate specificity. 
Both subunits are synthesized as membrane-spanning proteins; however, the alpha subunit is cleaved during biosynthesis and loses its transmembrane domain. Meprin beta forms homodimers or heterotetramers while meprin alpha oligomerizes into large complexes containing 10-100 subunits. Both alpha and beta subunits contain a catalytic astacin (M12 family) protease domain followed by the adhesion or interaction domains MAM, MATH and AM. The MATH and MAM domains provide symmetrical intersubunit disulfide bonds necessary for the dimerization of meprin subunits. The MATH domain may also be required for folding of an activable zymogen. 58103 cd03772: Herpesvirus-associated ubiquitin-specific protease (HAUSP, also known as USP7) family, N-terminal MATH (TRAF-like) domain; composed of proteins similar to human HAUSP, an enzyme that specifically catalyzes the deubiquitylation of p53 and MDM2, hence playing an important role in the p53-MDM2 pathway. It contains an N-terminal TRAF-like domain and a C-terminal catalytic protease (C19 family) domain. The tumor suppressor p53 protein is a transcription factor that responds to many cellular stress signals and is regulated primarily through ubiquitylation and subsequent degradation. MDM2 is a RING-finger E3 ubiquitin ligase that promotes p53 ubiquitylation. p53 and MDM2 bind to the same site in the N-terminal TRAF-like domain of HAUSP in a mutually exclusive manner. HAUSP also interacts with the Epstein-Barr nuclear antigen 1 (EBNA1) protein of the Epstein-Barr virus (EBV), which efficiently immortalizes infected cells predisposing the host to a variety of cancers. EBNA1 plays several important roles in EBV latent infection and cellular transformation. It binds the same pocket as p53 in the HAUSP TRAF-like domain. Through interactions with p53, MDM2 and EBNA1, HAUSP plays a role in cell proliferation, apoptosis and EBV-mediated immortalization. 
58104 cd03773: Tripartite motif containing protein 37 (TRIM37) family, MATH domain; TRIM37 is a peroxisomal protein and is a member of the tripartite motif (TRIM) protein subfamily, also known as the RING-B-box-coiled-coil (RBCC) subfamily of zinc-finger proteins. Mutations in the human TRIM37 gene (also known as MUL) cause Mulibrey (muscle-liver-brain-eye) nanism, a rare growth disorder of prenatal onset characterized by dysmorphic features, pericardial constriction and hepatomegaly. TRIM37, similar to other TRIMs, contains a cysteine-rich, zinc-binding RING-finger domain followed by another cysteine-rich zinc-binding domain, the B-box, and a coiled-coil domain. TRIM37 is autoubiquitinated in a RING domain-dependent manner, indicating that it functions as a ubiquitin E3 ligase. In addition to the tripartite motif, TRIM37 also contains a MATH domain C-terminal to the coiled-coil domain. The MATH domain of TRIM37 has been shown to interact with the TRAF domain of six known TRAFs in vitro; however, it is unclear whether this is physiologically relevant. Eleven TRIM37 mutations have been associated with Mulibrey nanism so far. One mutation, Gly322Val, is located in the MATH domain and is the only mutation that does not affect the length of the protein. It results in the incorrect subcellular localization of TRIM37. 58105 cd03774: Speckle-type POZ protein (SPOP) family, MATH domain; composed of proteins with similarity to human SPOP. SPOP was isolated as a novel antigen recognized by serum from a scleroderma patient, whose overexpression in COS cells results in a discrete speckled pattern in the nuclei. It contains an N-terminal MATH domain and a C-terminal BTB (also called POZ) domain. Together with Cul3, SPOP constitutes a ubiquitin E3 ligase which is able to ubiquitinate the PcG protein BMI1, the variant histone macroH2A1 and the death domain-associated protein Daxx. 
Therefore, SPOP may be involved in the regulation of these proteins and may play a role in transcriptional regulation, apoptosis and X-chromosome inactivation. Cul3 binds to the BTB domain of SPOP whereas Daxx and the macroH2A1 nonhistone region have been shown to bind to the MATH domain. Both MATH and BTB domains are necessary for the nuclear speckled accumulation of SPOP. There are many proteins, mostly uncharacterized, containing both MATH and BTB domains from C. elegans and plants which are excluded from this family. 58106 cd03775: Ubiquitin-specific protease 21 (Ubp21p) family, MATH domain; composed of fungal proteins with similarity to Ubp21p of fission yeast. Ubp21p is a deubiquitinating enzyme that may be involved in the regulation of the protein kinase Prp4p, which controls the formation of active spliceosomes. Members of this family are similar to human HAUSP (Herpesvirus-associated ubiquitin-specific protease) in that they contain an N-terminal MATH domain and a C-terminal catalytic protease (C19 family) domain. HAUSP is also a ubiquitin-specific protease that specifically catalyzes the deubiquitylation of p53 and MDM2. The MATH domain of HAUSP contains the binding site for p53 and MDM2. Similarly, the MATH domain of members in this family may be involved in substrate binding. 58107 cd03776: Tumor Necrosis Factor Receptor (TNFR)-Associated Factor (TRAF) family, TRAF6 subfamily, TRAF domain, C-terminal MATH subdomain; composed of proteins with similarity to human TRAF6, including the Drosophila protein DTRAF2. TRAF molecules serve as adapter proteins that link TNFRs and downstream kinase cascades resulting in the activation of transcription factors and the regulation of cell survival, proliferation and stress responses. TRAF6 is the most divergent in its TRAF domain among the mammalian TRAFs. In addition to mediating TNFR family signaling, it is also an essential signaling molecule of the interleukin-1/Toll-like receptor superfamily. 
Whereas other TRAF molecules display similar and overlapping TNFR-binding specificities, TRAF6 binds completely different sites on receptors such as CD40 and RANK. TRAF6 serves as a molecular bridge between innate and adaptive immunity and plays a central role in osteoimmunology. DTRAF2, as an activator of nuclear factor-kappaB, plays a pivotal role in Drosophila development and innate immunity. TRAF6 contains a RING finger domain, five zinc finger domains, and a TRAF domain. The TRAF domain can be divided into a more divergent N-terminal alpha helical region (TRAF-N), and a highly conserved C-terminal MATH subdomain (TRAF-C) with an eight-stranded beta-sandwich structure. TRAF-N mediates trimerization while TRAF-C interacts with receptors. 58108 cd03777: Tumor Necrosis Factor Receptor (TNFR)-Associated Factor (TRAF) family, TRAF3 subfamily, TRAF domain; TRAF molecules serve as adapter proteins that link TNFRs and downstream kinase cascades resulting in the activation of transcription factors and the regulation of cell survival, proliferation and stress responses. TRAF3 was first described as a molecule that binds the cytoplasmic tail of CD40. However, it is not required for CD40 signaling. More recently, TRAF3 has been identified as a key regulator of type I interferon (IFN) production and the mammalian innate antiviral immunity. It mediates IFN responses in Toll-like receptor (TLR)-dependent as well as TLR-independent viral recognition pathways. It is also a key element in immunological homeostasis through its regulation of the anti-inflammatory cytokine interleukin-10. TRAF3 contains a RING finger domain, five zinc finger domains, and a TRAF domain. The TRAF domain can be divided into a more divergent N-terminal alpha helical region (TRAF-N), and a highly conserved C-terminal MATH subdomain (TRAF-C) with an eight-stranded beta-sandwich structure. TRAF-N mediates trimerization while TRAF-C interacts with receptors. 
58109 cd03778: Tumor Necrosis Factor Receptor (TNFR) Associated Factor (TRAF) family, TRAF2 subfamily, TRAF domain; TRAF molecules serve as adapter proteins that link TNFRs and downstream kinase cascades resulting in the activation of transcription factors and the regulation of cell survival, proliferation and stress responses. TRAF2 associates with the receptors TNFR-1, TNFR-2, RANK (which mediates differentiation and maturation of osteoclasts) and CD40 (which is important for the proliferation and activation of B cells), among others. It regulates distinct pathways that lead to the activation of nuclear factor-kappaB and Jun NH2-terminal kinases. TRAF2 also indirectly associates with death receptors through its interaction with TRADD (TNFR-associated death domain protein). It is involved in regulating oxidative stress or ROS-induced cell death and in the preconditioning of cells by sublethal stress for protection from subsequent injury. TRAF2 contains a RING finger domain, five zinc finger domains, and a TRAF domain. The TRAF domain can be divided into a more divergent N-terminal alpha helical region (TRAF-N), and a highly conserved C-terminal MATH subdomain (TRAF-C) with an eight-stranded beta-sandwich structure. TRAF-N mediates trimerization while TRAF-C interacts with receptors. 58110 cd03779: Tumor Necrosis Factor Receptor (TNFR) Associated Factor (TRAF) family, TRAF1 subfamily, TRAF domain, C-terminal MATH subdomain; TRAF molecules serve as adapter proteins that link TNFRs and downstream kinase cascades resulting in the activation of transcription factors and the regulation of cell survival, proliferation and stress responses. TRAF1 expression is the most restricted among the TRAFs. It is found exclusively in activated lymphocytes, dendritic cells and certain epithelia. TRAF1 associates, directly or indirectly through heterodimerization with TRAF2, with the TNFR family receptors TNFR-2, CD30, RANK, CD40 and LMP1, among others. 
It also binds the intracellular proteins TRADD, TANK, TRIP, RIP1, RIP2 and FLIP. TRAF1 is unique among the TRAFs in that it lacks a RING domain, which is critical for the activation of nuclear factor-kappaB and Jun NH2-terminal kinase. Studies on TRAF1-deficient mice suggest that TRAF1 has a negative regulatory role in TNFR-mediated signaling events. TRAF1 contains one zinc finger and one TRAF domain. The TRAF domain can be divided into a more divergent N-terminal alpha helical region (TRAF-N), and a highly conserved C-terminal MATH subdomain (TRAF-C) with an eight-stranded beta-sandwich structure. TRAF-N mediates trimerization while TRAF-C interacts with receptors. 58111 cd03780: Tumor Necrosis Factor Receptor (TNFR)-Associated Factor (TRAF) family, TRAF5 subfamily, TRAF domain, C-terminal MATH subdomain; TRAF molecules serve as adapter proteins that link TNFRs and downstream kinase cascades resulting in the activation of transcription factors and the regulation of cell survival, proliferation and stress responses. TRAF5 was identified as an activator of nuclear factor-kappaB and a regulator of lymphotoxin-beta receptor and CD40 signaling. Its interaction with CD40 is indirect, involving hetero-oligomerization with TRAF3. In addition, TRAF5 has been shown to associate with other TNFRs including CD27, CD30, OX40 and GITR (glucocorticoid-induced TNFR). It plays a role in modulating Th2 immune responses (driven by OX40 costimulation) and T-cell activation (triggered by GITR). It is also involved in osteoclastogenesis. TRAF5 contains a RING finger domain, five zinc finger domains, and a TRAF domain. The TRAF domain can be divided into a more divergent N-terminal alpha helical region (TRAF-N), and a highly conserved C-terminal MATH subdomain (TRAF-C) with an eight-stranded beta-sandwich structure. TRAF-N mediates trimerization while TRAF-C interacts with receptors. 
58112 cd03781: Tumor Necrosis Factor Receptor (TNFR)-Associated Factor (TRAF) family, TRAF4 subfamily, TRAF domain, C-terminal MATH subdomain; composed of proteins with similarity to human TRAF4, including the Drosophila protein DTRAF1. TRAF molecules serve as adapter proteins that link TNFRs and downstream kinase cascades resulting in the activation of transcription factors and the regulation of cell survival, proliferation and stress responses. TRAF4 is highly expressed during embryogenesis, especially in the central and peripheral nervous system. Studies using TRAF4-deficient mice show that TRAF4 is required for neurogenesis, as well as the development of the trachea and the axial skeleton. In addition, TRAF4 augments nuclear factor-kappaB activation triggered by GITR (glucocorticoid-induced TNFR), a receptor expressed in T-cells, B-cells and macrophages. It also participates in counteracting the signaling mediated by Toll-like receptors through its association with TRAF6 and TRIF. DTRAF1 plays a pivotal role in the development of eye imaginal discs and photosensory neuron arrays in Drosophila. TRAF4 contains a RING finger domain, seven zinc finger domains, and a TRAF domain. The TRAF domain can be divided into a more divergent N-terminal alpha helical region (TRAF-N), and a highly conserved C-terminal MATH subdomain (TRAF-C) with an eight-stranded beta-sandwich structure. TRAF-N mediates trimerization while TRAF-C interacts with receptors. 58113 cd03782: Meprin family, Beta subunit, MATH domain; Meprins are multidomain extracellular metalloproteases capable of cleaving growth factors, cytokines, extracellular matrix proteins, and biologically active peptides. They are composed of two related subunits, alpha and beta, which form homo- or hetero-complexes where the basic unit is a disulfide-linked dimer. The beta subunit is a type I membrane protein, which forms homodimers or heterotetramers (alpha2beta2 or alpha3beta).
Meprin beta shows preference for acidic residues at the P1 and P1' sites of its substrate. Among its best substrates are growth factors and chemokines such as gastrin and osteopontin. Both alpha and beta subunits contain a catalytic astacin (M12 family) protease domain followed by the adhesion or interaction domains MAM, MATH and AM. The MATH and MAM domains provide symmetrical intersubunit disulfide bonds necessary for the dimerization of meprin subunits. The MATH domain may also be required for folding of an activable zymogen. 58114 cd03783: Meprin family, Alpha subunit, MATH domain; Meprins are multidomain extracellular metalloproteases capable of cleaving growth factors, cytokines, extracellular matrix proteins, and biologically active peptides. They are composed of two related subunits, alpha and beta, which form homo- or hetero-complexes where the basic unit is a disulfide-linked dimer. The alpha subunit is synthesized as a membrane-spanning protein; however, it is cleaved during biosynthesis and loses its transmembrane domain. It oligomerizes into large complexes, containing 10-100 subunits (dimers that associate noncovalently), which are secreted as latent proteases and can move through extracellular spaces in a nondestructive manner. This allows delivery of the concentrated protease to sites containing activating enzymes, such as sites of inflammation, infection or cancerous growth. Meprin alpha shows preference for small or hydrophobic residues at the P1 and P1' sites of its substrate. Both alpha and beta subunits contain a catalytic astacin (M12 family) protease domain followed by the adhesion or interaction domains MAM, MATH and AM. The MATH and MAM domains provide symmetrical intersubunit disulfide bonds necessary for the dimerization of meprin subunits. The MATH domain may also be required for folding of an activable zymogen.
58115 cd00338: Serine Recombinase family, catalytic domain; a DNA binding domain may be present either N- or C-terminal to the catalytic domain. These enzymes perform site-specific recombination of DNA molecules by a concerted, four-strand cleavage and rejoining mechanism which involves a transient phosphoserine linkage between DNA and serine recombinase. Serine recombinases demonstrate functional versatility and include resolvases, invertases, integrases, and transposases. Resolvases and invertases (i.e. Tn3, gamma-delta, Tn5044 resolvases, Gin and Hin invertases) in this family contain a C-terminal DNA binding domain and comprise a major phylogenic group. Also included are phage- and bacterial-encoded recombinases such as phiC31 integrase, SpoIVCA excisionase, and Tn4451 TnpX transposase. These integrases and transposases have larger C-terminal domains compared to resolvases/invertases and are referred to as large serine recombinases. Also belonging to this family are proteins with N-terminal DNA binding domains similar to IS607- and IS1535-transposases from Helicobacter and Mycobacterium. 58116 cd03767: Serine recombinase (SR) family, Partitioning (par)-Resolvase subfamily, catalytic domain; Serine recombinases catalyze site-specific recombination of DNA molecules by a concerted, four-strand cleavage and rejoining mechanism which involves a transient phosphoserine linkage between DNA and the enzyme. They are functionally versatile and include resolvases, invertases, integrases, and transposases. This subgroup is composed of proteins similar to the E. coli resolvase found in the par region of the RP4 plasmid, which encodes a highly efficient partitioning system. This protein is part of a complex stabilization system involved in the resolution of plasmid dimers during cell division. Similar to Tn3 and other resolvases, members of this family may contain a C-terminal DNA binding domain. 
58117 cd03768: Serine Recombinase (SR) family, Resolvase and Invertase subfamily, catalytic domain; members contain a C-terminal DNA binding domain. Serine recombinases catalyze site-specific recombination of DNA molecules by a concerted, four-strand cleavage and rejoining mechanism which involves a transient phosphoserine linkage between DNA and the enzyme. They are functionally versatile and include resolvases, invertases, integrases, and transposases. Resolvases and invertases affect resolution or inversion and comprise a major phylogenic group. Resolvases (e.g. Tn3, gamma-delta, and Tn5044) normally recombine two sites in direct repeat causing deletion of the DNA between the sites. Invertases (e.g. Gin and Hin) recombine sites in inverted repeat to invert the DNA between the sites. Cointegrate resolution with gamma-delta resolvase requires the formation of a synaptosome of three resolvase dimers bound to each of two res sites on the DNA. Also included in this subfamily are some putative integrases including a sequence from bacteriophage phi-FC1. 58118 cd03769: Serine Recombinase (SR) family, IS607-like transposase subfamily, catalytic domain; members contain a DNA binding domain with homology to MerR/SoxR located N-terminal to the catalytic domain. Serine recombinases catalyze site-specific recombination of DNA molecules by a concerted, four-strand cleavage and rejoining mechanism which involves a transient phosphoserine linkage between DNA and the enzyme. They are functionally versatile and include resolvases, invertases, integrases, and transposases. This subfamily is composed of proteins that catalyze the transposition of insertion sequence (IS) elements such as IS607 from Helicobacter and IS1535 from Mycobacterium, and similar proteins from other bacteria and several archaeal species. IS elements are DNA segments that move to new sites in prokaryotic and eukaryotic genomes causing insertion mutations and gene rearrangements. 
58119 cd03770: Serine Recombinase (SR) family, TndX-like transposase subfamily, catalytic domain; composed of large serine recombinases similar to Clostridium TndX and TnpX transposases. Serine recombinases catalyze site-specific recombination of DNA molecules by a concerted, four-strand cleavage and rejoining mechanism which involves a transient phosphoserine linkage between DNA and the enzyme. They are functionally versatile and include resolvases, invertases, integrases, and transposases. TndX mediates the excision and circularization of the conjugative transposon Tn5397 from Clostridium difficile. TnpX is responsible for the movement of the nonconjugative chloramphenicol resistance elements of the Tn4451/3 family. Mobile genetic elements such as transposons are important vehicles for the transmission of virulence and antibiotic resistance in many microorganisms. 58120 cd03587: SOCS (suppressors of cytokine signaling) box. The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as, other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 58121 cd03716: SOCS (suppressors of cytokine signaling) box of ASB (ankyrin repeat and SOCS box) and SSB (SPRY domain-containing SOCS box proteins) protein families. ASB family members have a C-terminal SOCS box and an N-terminal ankyrin-related sequence of a variable number of repeats. SSB proteins contain a central SPRY domain and a C-terminal SOCS. 
Recently, it has been shown that all four SSB proteins interact with MET, the receptor protein-tyrosine kinase for hepatocyte growth factor (HGF), and that SSB-1, SSB-2, and SSB-4 interact with prostate apoptosis response protein-4. Both types of interactions are mediated through the SPRY domain. 58122 cd03717: SOCS (suppressors of cytokine signaling) box of SOCS-like proteins. The CIS/SOCS family of proteins is characterized by the presence of a C-terminal SOCS box and a central SH2 domain. These intracellular proteins regulate the responses of immune cells to cytokines. Identified as negative regulators of the cytokine-JAK-STAT pathway, they seem to play a role in many immunological and pathological processes. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. Related SOCS boxes are also present in Rab40-like proteins and insect proteins of unknown function that also contain a NEUZ (domain in neuralized proteins) domain. 58123 cd03718: SOCS (suppressors of cytokine signaling) box of SSB1 and SSB4 (SPRY domain-containing SOCS box proteins)-like proteins. SSB proteins contain a central SPRY domain and a C-terminal SOCS. SSB1 and SSB4 have been shown to bind to MET, the receptor protein-tyrosine kinase for hepatocyte growth factor (HGF), and also interact with prostate apoptosis response protein-4. Both types of interactions are mediated through the SPRY domain. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 58124 cd03719: SOCS (suppressors of cytokine signaling) box of SSB2 (SPRY domain-containing SOCS box proteins)-like proteins. SSB proteins contain a central SPRY domain and a C-terminal SOCS.
SSB2 has been shown to bind to MET, the receptor protein-tyrosine kinase for hepatocyte growth factor (HGF). SSB2, like SSB4 and SSB1, also interacts with prostate apoptosis response protein-4. Both types of interactions are mediated through the SPRY domain. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 58125 cd03720: SOCS (suppressors of cytokine signaling) box of ASB1-like proteins. ASB family members have a C-terminal SOCS box and an N-terminal ankyrin-related sequence. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 58126 cd03721: SOCS (suppressors of cytokine signaling) box of ASB2-like proteins. ASB family members have a C-terminal SOCS box and an N-terminal ankyrin-related sequence. ASB2 targets specific proteins to destruction by the proteasome in leukemia cells that have been induced to differentiate. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 58127 cd03722: SOCS (suppressors of cytokine signaling) box of ASB3-like proteins. ASB family members have a C-terminal SOCS box and an N-terminal ankyrin-related sequence. 
ASB3 has been shown to be a negative regulator of TNF-R2-mediated cellular responses to TNF-alpha by direct targeting of tumor necrosis factor receptor II (TNF-R2) for ubiquitination and proteasome-mediated degradation. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 58128 cd03723: SOCS (suppressors of cytokine signaling) box of ASB4 and ASB18 proteins. ASB family members have a C-terminal SOCS box and an N-terminal ankyrin-related sequence. Asb4 was identified as an imprinted gene in mice. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 58129 cd03724: SOCS (suppressors of cytokine signaling) box of ASB5-like proteins. ASB family members have a C-terminal SOCS box and an N-terminal ankyrin-related sequence. ASB5 has been implicated in the initiation of arteriogenesis. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 58130 cd03725: SOCS (suppressors of cytokine signaling) box of ASB6-like proteins. ASB family members have a C-terminal SOCS box and an N-terminal ankyrin-related sequence. ASB6 interacts with the adaptor protein APS and recruits elongin B/C to the insulin receptor signaling complex.
The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 58131 cd03726: SOCS (suppressors of cytokine signaling) box of ASB7-like proteins. ASB family members have a C-terminal SOCS box and an N-terminal ankyrin-related sequence. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 58132 cd03727: SOCS (suppressors of cytokine signaling) box of ASB8-like proteins. ASB family members have a C-terminal SOCS box and an N-terminal ankyrin-related sequence. Human ASB8 is highly transcribed in skeletal muscle and in lung carcinoma cell lines. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 58133 cd03728: SOCS (suppressors of cytokine signaling) box of ASB9 and 11 proteins. ASB family members have a C-terminal SOCS box and an N-terminal ankyrin-related sequence. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 
58134 cd03729: SOCS (suppressors of cytokine signaling) box of ASB13-like proteins. ASB family members have a C-terminal SOCS box and an N-terminal ankyrin-related sequence. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 58135 cd03730: SOCS (suppressors of cytokine signaling) box of ASB14-like proteins. ASB family members have a C-terminal SOCS box and an N-terminal ankyrin-related sequence. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 58136 cd03731: SOCS (suppressors of cytokine signaling) box of ASB15-like proteins. ASB family members have a C-terminal SOCS box and an N-terminal ankyrin-related sequence. Human ASB15 is expressed predominantly in skeletal muscle and participates in the regulation of protein turnover and muscle cell development by stimulating protein synthesis and regulating differentiation of muscle cells. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 58137 cd03733: SOCS (suppressors of cytokine signaling) box of WSB/SWiP-like proteins. 
This subfamily contains WSB-1 (SOCS-box-containing WD-40 protein), part of an E3 ubiquitin ligase for the thyroid-hormone-activating type 2 iodothyronine deiodinase (D2), and SWiP-1 (SOCS box and WD-repeats in Protein), a WD40-containing protein that is expressed in embryonic structures of chickens and regulated by Sonic Hedgehog (Shh), as well as, their isoforms WSB-2 and SWiP-2. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 58138 cd03734: SOCS (suppressors of cytokine signaling) box of CIS (cytokine-inducible SH2 protein) 1-like proteins. Together with the SOCS proteins, the CIS/SOCS family of proteins is characterized by the presence of a C-terminal SOCS box and a central SH2 domain. CIS1, like SOCS1 and SOCS3, is involved in the down-regulation of the JAK/STAT pathway. CIS1 binds to cytokine receptors at STAT5-docking sites, which prohibits recruitment of STAT5 to the receptor signaling complex and results in the down-regulation of activation by STAT5. 58139 cd03735: SOCS (suppressors of cytokine signaling) box of SOCS1-like proteins. Together with CIS1, the CIS/SOCS family of proteins is characterized by the presence of a C-terminal SOCS box and a central SH2 domain. SOCS1, like CIS1 and SOCS3, is involved in the down-regulation of the JAK/STAT pathway. SOCS1 has a dual function as a direct potent JAK kinase inhibitor and as a component of an E3 ubiquitin-ligase complex recruiting substrates to the protein degradation machinery. 58140 cd03736: SOCS (suppressors of cytokine signaling) box of SOCS2-like proteins. Together with CIS1, the CIS/SOCS family of proteins is characterized by the presence of a C-terminal SOCS box and a central SH2 domain. 
SOCS2 has recently been shown to regulate neuronal differentiation by controlling expression of a neurogenic transcription factor, Neurogenin-1. SOCS2 binds to GH receptors and inhibits the activation of STAT5b induced by GH. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 58141 cd03737: SOCS (suppressors of cytokine signaling) box of SOCS3-like proteins. Together with CIS1, the CIS/SOCS family of proteins is characterized by the presence of a C-terminal SOCS box and a central SH2 domain. SOCS3, like CIS1 and SOCS1, is involved in the down-regulation of the JAK/STAT pathway. SOCS3 inhibits JAK activity indirectly through recruitment to the cytokine receptors. SOCS3 has been shown to play an essential role in placental development and a non-essential role in embryo development. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 58142 cd03738: SOCS (suppressors of cytokine signaling) box of SOCS4-like proteins. Together with CIS1, the CIS/SOCS family of proteins is characterized by the presence of a C-terminal SOCS box and a central SH2 domain. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 
58143 cd03739: SOCS (suppressors of cytokine signaling) box of SOCS5-like proteins. Together with CIS1, the CIS/SOCS family of proteins is characterized by the presence of a C-terminal SOCS box and a central SH2 domain. SOCS5 inhibits Th2 differentiation by inhibiting IL-4 signaling. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 58144 cd03740: SOCS (suppressors of cytokine signaling) box of SOCS6-like proteins. Together with CIS1, the CIS/SOCS family of proteins is characterized by the presence of a C-terminal SOCS box and a central SH2 domain. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 58145 cd03741: SOCS (suppressors of cytokine signaling) box of SOCS7-like proteins. Together with CIS1, the CIS/SOCS family of proteins is characterized by the presence of a C-terminal SOCS box and a central SH2 domain. SOCS7 is important in the functioning of neuronal cells. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 58146 cd03742: SOCS (suppressors of cytokine signaling) box of Rab40-like proteins. 
Rab40 is part of the Rab family of small GTP-binding proteins that form the largest family within the Ras superfamily. Rab proteins regulate vesicular trafficking pathways, behaving as membrane-associated molecular switches. Rab40 is characterized by a SOCS box C-terminal to the GTPase domain. The SOCS boxes interact with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 58147 cd03743: SOCS (suppressors of cytokine signaling) box of SSB4 (SPRY domain-containing SOCS box proteins)-like proteins. SSB proteins contain a central SPRY domain and a C-terminal SOCS. SSB4 has been shown to bind to MET, the receptor protein-tyrosine kinase for hepatocyte growth factor (HGF). SSB4, like SSB2 and SSB1, also interacts with prostate apoptosis response protein-4. Both types of interactions are mediated through the SPRY domain. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 58148 cd03744: SOCS (suppressors of cytokine signaling) box of SSB1 (SPRY domain-containing SOCS box proteins)-like proteins. SSB proteins contain a central SPRY domain and a C-terminal SOCS. SSB1 has been shown to bind to MET, the receptor protein-tyrosine kinase for hepatocyte growth factor (HGF), in both the absence and the presence of HGF, and enhances the HGF-MET-induced mitogen-activated protein kinases Erk-transcription factor Elk-1-serum response elements (SRE) pathway. SSB1, like SSB2 and SSB4, also interacts with prostate apoptosis response protein-4. Both types of interactions are mediated through the SPRY domain.
The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 58149 cd03745: SOCS (suppressors of cytokine signaling) box of WSB2/SWiP2-like proteins. This family consists of WSB-2 (SOCS-box-containing WD-40 protein) and SWiP-2 (SOCS box and WD-repeats in Protein). No functional information is available for WSB2 or SWiP-2, but limited information is available for the isoforms WSB-1 and SWiP-1. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 58150 cd03746: SOCS (suppressors of cytokine signaling) box of WSB1/SWiP1-like proteins. This subfamily contains WSB-1 (SOCS-box-containing WD-40 protein), part of an E3 ubiquitin ligase for the thyroid-hormone-activating type 2 iodothyronine deiodinase (D2) and SWiP-1 (SOCS box and WD-repeats in Protein), a WD40-containing protein that is expressed in embryonic structures of chickens and regulated by Sonic Hedgehog (Shh). The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions. 
58151 cd00512: Coenzyme B12-dependent-methylmalonyl coenzyme A (CoA) mutase (MCM)-like family; contains proteins similar to MCM, and the large subunit of Streptomyces coenzyme B12-dependent isobutyryl-CoA mutase (ICM). MCM catalyzes the isomerization of methylmalonyl-CoA to succinyl-CoA. The reaction proceeds via radical intermediates beginning with a substrate-induced homolytic cleavage of the Co-C bond of coenzyme B12 to produce cob(II)alamin and the deoxyadenosyl radical. MCM plays an important role in the conversion of propionyl-CoA to succinyl-CoA during the degradation of propionate for the Krebs cycle. In higher animals, MCM is involved in the breakdown of odd-chain fatty acids, several amino acids, and cholesterol. Methylobacterium extorquens MCM participates in the glyoxylate regeneration pathway. In M. extorquens, MCM forms a complex with MeaB; MeaB may protect MCM from irreversible inactivation. In some bacteria, MCM is involved in the reverse metabolic reaction, the rearrangement of succinyl-CoA to methylmalonyl-CoA. Examples include Propionibacterium shermanii MCM during propionic acid fermentation, E. coli MCM in a pathway for the conversion of succinate to propionate and Streptomyces MCM in polyketide biosynthesis. P. shermanii and Streptomyces cinnamonensis MCMs are alpha/beta heterodimers, with both subunits being homologous members of this family. It has been shown for P. shermanii MCM that only the alpha subunit binds coenzyme B12 and substrates. Human MCM is a homodimer with two active sites. Mouse and E. coli MCMs are also homodimers. ICM from S. cinnamonensis is comprised of a large and a small subunit. The holoenzyme appears to be an alpha2beta2 heterotetramer with up to 2 molecules of coenzyme B12 bound. The small subunit binds coenzyme B12. ICM catalyzes the reversible rearrangement of n-butyryl-CoA to isobutyryl-CoA (intermediates in fatty acid and valine catabolism, which in S. 
cinnamonensis can be converted to methylmalonyl-CoA and used in polyketide synthesis). In humans, impaired activity of MCM results in methylmalonic aciduria, a disorder of propionic acid metabolism. 58152 cd03677: Coenzyme B12-dependent-methylmalonyl coenzyme A (CoA) mutase (MCM) family, Beta subunit-like subfamily; contains bacterial proteins similar to the beta subunit of MCMs from Propionibacterium shermanii and Streptomyces cinnamonensis, which are alpha/beta heterodimers. For P. shermanii MCM, it is known that only the alpha subunit binds coenzyme B12 and substrates. The role of the beta subunit is unclear. MCM catalyzes the isomerization of methylmalonyl-CoA to succinyl-CoA. The reaction proceeds via radical intermediates beginning with a substrate-induced homolytic cleavage of the Co-C bond of coenzyme B12 to produce cob(II)alamin and the deoxyadenosyl radical. MCM plays an important role in the conversion of propionyl-CoA to succinyl-CoA during the degradation of propionate for the Krebs cycle. Methylobacterium extorquens MCM participates in the glyoxylate regeneration pathway. In M. extorquens, MCM forms a complex with MeaB; MeaB may protect MCM from irreversible inactivation. In some bacteria, MCM is involved in the reverse metabolic reaction, the rearrangement of succinyl-CoA to methylmalonyl-CoA. Examples include P. shermanii MCM during propionic acid fermentation and Streptomyces MCM in polyketide biosynthesis. 58153 cd03678: Coenzyme B12-dependent-methylmalonyl coenzyme A (CoA) mutase (MCM) family, unknown subfamily 1; composed of uncharacterized bacterial proteins containing a C-terminal MCM domain. MCM catalyzes the isomerization of methylmalonyl-CoA to succinyl-CoA. The reaction proceeds via radical intermediates beginning with a substrate-induced homolytic cleavage of the Co-C bond of coenzyme B12 to produce cob(II)alamin and the deoxyadenosyl radical. 
MCM plays an important role in the conversion of propionyl-CoA to succinyl-CoA during the degradation of propionate for the Krebs cycle. In some bacteria, MCM is involved in the reverse metabolic reaction, the rearrangement of succinyl-CoA to methylmalonyl-CoA. Members of this subfamily also contain an N-terminal coenzyme B12 binding domain followed by a domain similar to the E. coli ArgK membrane ATPase. 58154 cd03679: Coenzyme B12-dependent-methylmalonyl coenzyme A (CoA) mutase (MCM) family, Alpha subunit-like subfamily; contains proteins similar to the alpha subunit of Propionibacterium shermanii MCM, as well as human and E. coli MCM. Members of this subfamily contain an N-terminal MCM domain and a C-terminal coenzyme B12 binding domain. MCM catalyzes the isomerization of methylmalonyl-CoA to succinyl-CoA. The reaction proceeds via radical intermediates beginning with a substrate-induced homolytic cleavage of the Co-C bond of coenzyme B12 to produce cob(II)alamin and the deoxyadenosyl radical. MCM plays an important role in the conversion of propionyl-CoA to succinyl-CoA during the degradation of propionate for the Krebs cycle. In higher animals, MCM is involved in the breakdown of odd-chain fatty acids, several amino acids, and cholesterol. Methylobacterium extorquens MCM participates in the glyoxylate regeneration pathway. In M. extorquens, MCM forms a complex with MeaB; MeaB may protect MCM from irreversible inactivation. In some bacteria, MCM is involved in the reverse metabolic reaction, the rearrangement of succinyl-CoA to methylmalonyl-CoA. Examples include P. shermanii MCM during propionic acid fermentation, E. coli MCM in a pathway for the conversion of succinate to propionate and Streptomyces MCM in polyketide biosynthesis. Sinorhizobium meliloti strain SU47 MCM plays a role in the polyhydroxyalkanoate degradation pathway. P. shermanii and Streptomyces cinnamonensis MCMs are alpha/beta heterodimers. It has been shown for P. 
shermanii MCM that only the alpha subunit binds coenzyme B12 and substrates. Human MCM is a homodimer with two active sites. Mouse and E. coli MCMs are also homodimers. In humans, impaired activity of MCM results in methylmalonic aciduria, a disorder of propionic acid metabolism. 58155 cd03680: Coenzyme B12-dependent-methylmalonyl coenzyme A (CoA) mutase (MCM) family, isobutyryl-CoA mutase (ICM)-like subfamily; contains archaeal and bacterial proteins similar to the large subunit of Streptomyces cinnamonensis coenzyme B12-dependent ICM. ICM from S. cinnamonensis is comprised of a large and a small subunit. The holoenzyme appears to be an alpha2beta2 heterotetramer with up to 2 molecules of coenzyme B12 bound. The small subunit binds coenzyme B12. ICM catalyzes the reversible rearrangement of n-butyryl-CoA to isobutyryl-CoA, intermediates in fatty acid and valine catabolism, which in S. cinnamonensis can be converted to methylmalonyl-CoA and used in polyketide synthesis. 58156 cd03681: Coenzyme B12-dependent-methylmalonyl coenzyme A (CoA) mutase (MCM) family, MeaA-like subfamily; contains various methylmalonyl coenzyme A (CoA) mutase (MCM)-like proteins similar to the Streptomyces cinnamonensis MeaA, Methylobacterium extorquens MeaA and Streptomyces collinus B12-dependent mutase. Members of this subfamily contain an N-terminal MCM domain and a C-terminal coenzyme B12 binding domain. S. cinnamonensis MeaA is a putative B12-dependent mutase which provides methylmalonyl-CoA precursors for the biosynthesis of the monensin polyketide via an unknown pathway. S. collinus B12-dependent mutase may be involved in a pathway for acetate assimilation. 58157 cd01102: The link domain is a hyaluronan (HA)-binding domain. It functions to mediate adhesive interactions during inflammatory leukocyte homing and tumor metastasis. It is found in the CD44 receptor and in human TSG-6. TSG-6 is the protein product of the tumor necrosis factor-stimulated gene-6. 
TSG-6 has a strong anti-inflammatory effect in models of acute inflammation and autoimmune arthritis and plays an essential role in female fertility. This group also contains the link domains of the chondroitin sulfate proteoglycan core proteins (CSPG) including aggrecan, versican, neurocan, and brevican and the link domains of the vertebrate HAPLN (HA and proteoglycan binding link) protein family. In cartilage, aggrecan forms cartilage link protein stabilized aggregates with HA. These aggregates contribute to the tissue's load bearing properties. Aggregates in which other CSPGs substitute for aggrecan might contribute to the structural integrity of many different tissues. Members of the vertebrate HAPLN gene family are physically linked adjacent to CSPG genes. TSG-6 contains a single link module which supports high affinity binding with HA. The functional HA-binding domain of CD44 is an extended domain comprised of a link module flanked with N- and C-terminal extensions. These extensions are essential for folding and functional activity. CSPGs are characterized by an N-terminal globular domain (G1 domain) containing two contiguous link modules (modules 1 and 2). Both link modules of the G1 domain of the CSPG aggrecan are involved in interaction with HA. Aggrecan in addition contains a second globular domain (G2) which contains link modules 3 and 4 which lack HA-binding activity. HAPLNs contain two contiguous link modules. 58158 cd03515: This is the extracellular link domain of the type found in human TSG-6. The link domain is a hyaluronan (HA)-binding domain. TSG-6 is the protein product of tumor necrosis factor-stimulated gene-6. TSG-6 is up-regulated in inflammatory lesions and in the ovary during ovulation. It has a strong anti-inflammatory and chondroprotective effect in models of acute inflammation and autoimmune arthritis and plays an essential role in female fertility. 
Also included in this group are the stabilins: stabilin-1 (FEEL-1, CLEVER-1) and stabilin-2 (FEEL-2). Stabilin-2 functions as the major liver and lymph node-scavenging receptor for HA and related glycosaminoglycans. Stabilin-2 is a scavenger receptor with a broad range of ligands including advanced glycation end (AGE) products, acetylated low density lipoprotein and procollagen peptides. In contrast, stabilin-1 does not bind HA, but binds acetylated low density lipoprotein and AGEs with lower affinity. As AGEs accumulate in vascular tissues during aging and diabetes, these receptors may be implicated in the pathologies of these states. Both stabilins are present in the early endocytic pathway in hepatic sinusoidal epithelium associating with clathrin/AP-2. Stabilin-1 is expressed in macrophages. Stabilin-2 is absent from the latter. In macrophages, stabilin-1 is involved in trafficking between early/sorting endosomes and the trans-Golgi network. Stabilin-1 has also been implicated in angiogenesis and possibly leucocyte trafficking. Both stabilins bind gram-positive and gram-negative bacteria. TSG-6 and stabilins contain a single link module which supports high affinity binding to HA. 58159 cd03516: This domain is a hyaluronan (HA)-binding domain. It is found in CD44 receptor and mediates adhesive interactions during inflammatory leukocyte homing and tumor metastasis. It also plays an important role in arteriogenesis. The functional HA-binding domain of CD44 is an extended domain comprised of a single link module flanked with N- and C-terminal extensions. These extensions are essential for folding and for functional activity. This group also contains the cell surface retention sequence (CRS) binding protein-1 (CRSBP-1) and lymph vessel endothelial receptor-1 (LYVE-1). CRSBP-1 is a cell surface binding protein for the CRS motif of PDGF-BB (platelet-derived growth factor-BB) and is responsible for the cell surface retention of PDGF-BB in SSV-transformed cells. 
CRSBP-1 may play a role in autocrine regulation of cell growth mediated by CRS containing growth regulators. LYVE-1 is preferentially expressed on the lymphatic endothelium and is used as a molecular marker for the detection and characterization of lymphatic vessels in tumors. 58160 cd03517: Link_domain_CSPGs_modules_1_3; this extracellular link domain is found in the first and third link modules of the chondroitin sulfate proteoglycan core protein (CSPG) aggrecan. In addition, it is found in the first link module of three other CSPGs: versican, neurocan, and brevican. The link domain is a hyaluronan (HA)-binding domain. CSPGs are characterized by an N-terminal globular domain (G1 domain) containing two contiguous link modules (modules 1 and 2). Both link modules of the G1 domain of aggrecan are involved in interaction with HA. In addition, aggrecan contains a second globular domain (G2) which contains link modules 3 and 4. G2 appears to lack HA-binding activity. In cartilage, aggrecan forms cartilage link protein stabilized aggregates with HA. These aggregates contribute to the tissue's load bearing properties. Aggregates having other CSPGs substituting for aggrecan may contribute to the structural integrity of many different tissues. Members of the vertebrate HAPLN (hyaluronan/HA and proteoglycan binding link) protein family are physically linked adjacent to CSPG genes. 58161 cd03518: Link_domain_HAPLN_module_1; this link domain is found in the first link module of proteins similar to the vertebrate HAPLN (hyaluronan/HA and proteoglycan binding link) protein family which includes cartilage link protein. The link domain is a HA-binding domain. HAPLNs contain two contiguous link modules. Both link modules of cartilage link protein are involved in interaction with HA. In cartilage, a chondroitin sulfate proteoglycan core protein (CSPG) aggrecan forms cartilage link protein stabilized aggregates with HA. 
These aggregates contribute to the tissue's load bearing properties. Aggregates with other CSPGs substituting for aggrecan may contribute to the structural integrity of many different tissues. Members of the vertebrate HAPLN gene family are physically linked adjacent to CSPG genes. 58162 cd03519: Link_domain_HAPLN_module_2; this link domain is found in the second link module of proteins similar to the vertebrate HAPLN (hyaluronan/HA and proteoglycan binding link) protein family which includes cartilage link protein. The link domain is a HA-binding domain. HAPLNs contain two contiguous link modules. Both link modules of cartilage link protein are involved in interaction with HA. In cartilage, a chondroitin sulfate proteoglycan core protein (CSPG) aggrecan forms cartilage link protein stabilized aggregates with HA. These aggregates contribute to the tissue's load bearing properties. Aggregates with other CSPGs substituting for aggrecan may contribute to the structural integrity of many different tissues. Members of the vertebrate HAPLN gene family are physically linked adjacent to CSPG genes. 58163 cd03520: Link_domain_CSPGs_modules_2_4; this link domain is found in the second and fourth link modules of the chondroitin sulfate proteoglycan core protein (CSPG) aggrecan and in the second link module of three other CSPGs: versican, neurocan, and brevican. The link domain is a hyaluronan (HA)-binding domain. CSPGs are characterized by an N-terminal globular domain (G1 domain) containing two contiguous link modules (modules 1 and 2). Both link modules of the G1 domain of aggrecan are involved in interaction with HA. Aggrecan in addition contains a second globular domain (G2) having link modules 3 and 4 which lack HA-binding activity. In cartilage, aggrecan forms cartilage link protein stabilized aggregates with HA. These aggregates contribute to the tissue's load bearing properties. 
Aggregates having other CSPGs substituting for aggrecan may contribute to the structural integrity of many different tissues. Members of the vertebrate HAPLN (hyaluronan/HA and proteoglycan binding link) protein family are physically linked adjacent to CSPG genes. 58164 cd03521: Link_domain_KIAA0527_like; this domain is found in the human protein KIAA0527. Sequence-wise, it is highly similar to the link domain. The link domain is a hyaluronan (HA)-binding domain. KIAA0527 contains a single link module. The KIAA0527 gene was originally cloned from human brain tissue. 58165 cd00758: MoCF_BD: molybdenum cofactor (MoCF) binding domain (BD). This domain is found in a variety of proteins involved in biosynthesis of molybdopterin cofactor, like MoaB, MogA, and MoeA. The domain is presumed to bind molybdopterin. 58166 cd00885: Competence-damaged protein. CinA is the first gene in the competence-inducible (cin) operon and is thought to be specifically required at some stage in the process of transformation. This domain is closely related to a domain, found in a variety of proteins involved in biosynthesis of molybdopterin cofactor, where the domain is presumed to bind molybdopterin. 58167 cd00886: MogA_MoaB family. Members of this family are involved in biosynthesis of the molybdenum cofactor (MoCF), an essential cofactor of a diverse group of redox enzymes. MoCF biosynthesis is an evolutionarily conserved pathway present in eubacteria, archaea, and eukaryotes. MoCF contains a tricyclic pyranopterin, termed molybdopterin (MPT). MogA, together with MoeA, is responsible for the metal incorporation into MPT, the third step in MoCF biosynthesis. The plant homolog Cnx1 is a MoeA-MogA fusion protein. The mammalian homolog gephyrin is a MogA-MoeA fusion protein that plays a critical role in postsynaptic anchoring of inhibitory glycine receptors and major GABAa receptor subtypes. In contrast, MoaB shows high similarity to MogA, but little is known about its physiological role. 
All well-studied members of this family form highly stable trimers. 58168 cd00887: MoeA family. Members of this family are involved in biosynthesis of the molybdenum cofactor (MoCF), an essential cofactor of a diverse group of redox enzymes. MoCF biosynthesis is an evolutionarily conserved pathway present in eubacteria, archaea and eukaryotes. MoCF contains a tricyclic pyranopterin, termed molybdopterin (MPT). MoeA, together with MoaB, is responsible for the metal incorporation into MPT, the third step in MoCF biosynthesis. The plant homolog Cnx1 is a MoeA-MogA fusion protein. The mammalian homolog gephyrin is a MogA-MoeA fusion protein that plays a critical role in postsynaptic anchoring of inhibitory glycine receptors and major GABAa receptor subtypes. 58169 cd03522: MoeA_like. This domain is similar to a domain found in a variety of proteins involved in biosynthesis of molybdopterin cofactor, like MoaB, MogA, and MoeA. There this domain is presumed to bind molybdopterin. The exact function of this subgroup is unknown. 58170 cd01060: The membrane fatty acid desaturase (Membrane_FADS)-like CD includes membrane FADSs, alkane hydroxylases, beta carotene ketolases (CrtW-like), hydroxylases (CrtR-like), and other related proteins. They are present in all groups of organisms with the exception of archaea. Membrane FADSs are non-heme, iron-containing, oxygen-dependent enzymes involved in regioselective introduction of double bonds in fatty acyl aliphatic chains. They play an important role in the maintenance of the proper structure and functioning of biological membranes. Alkane hydroxylases are bacterial, integral-membrane di-iron enzymes that share a requirement for iron and oxygen for activity similar to that of membrane FADSs, and are involved in the initial oxidation of unactivated alkanes. Beta-carotene ketolase and beta-carotene hydroxylase are carotenoid biosynthetic enzymes for astaxanthin and zeaxanthin, respectively. 
This superfamily domain has extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. Comparison of these sequences also reveals three regions of conserved histidine cluster motifs that contain eight histidine residues: HXXX(X)H, HXX(X)HH, and HXXHH (an additional conserved histidine residue is seen between clusters 2 and 3). Spectroscopic and genetic evidence points to a nitrogen-rich coordination environment located in the cytoplasm with as many as eight histidines coordinating the two iron ions and a carboxylate residue bridging the two metals in the Pseudomonas oleovorans alkane hydroxylase (AlkB). In addition, the eight histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within the rat stearoyl CoA delta-9 desaturase. 58171 cd03505: The Delta9 Fatty Acid Desaturase (Delta9-FADS)-like CD includes the delta-9 and delta-11 acyl CoA desaturases found in various eukaryotes including vertebrates, insects, higher plants, and fungi. The delta-9 acyl-lipid desaturases are found in a wide range of bacteria. These enzymes play essential roles in fatty acid metabolism and the regulation of cell membrane fluidity. Acyl-CoA desaturases are the enzymes involved in the CoA-bound desaturation of fatty acids. Mammalian stearoyl-CoA delta-9 desaturase is a key enzyme in the biosynthesis of monounsaturated fatty acids, and in yeast, the delta-9 acyl-CoA desaturase (OLE1) reaction accounts for all de novo unsaturated fatty acid production in Saccharomyces cerevisiae. These non-heme, iron-containing, ER membrane-bound enzymes are part of a three-component enzyme system involving cytochrome b5, cytochrome b5 reductase, and the delta-9 fatty acid desaturase. 
This complex catalyzes the NADH- and oxygen-dependent insertion of a cis double bond between carbons 9 and 10 of the saturated fatty acyl substrates, palmitoyl (16:0)-CoA or stearoyl (18:0)-CoA, yielding the monoenoic products palmitoleic (16:1) or oleic (18:1) acids, respectively. In cyanobacteria, the biosynthesis of unsaturated fatty acids is initiated by delta 9 acyl-lipid desaturase (DesC) which introduces the first double bond at the delta-9 position of a saturated fatty acid that has been esterified to a glycerolipid. This domain family has extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. Comparison of sequences also reveals the existence of three regions of conserved histidine cluster motifs that contain the residues: HXXXXH, HXXHH, and H/QXXHH. These histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within the rat stearoyl CoA delta-9 desaturase. Some eukaryotic (Fungi, Euglenozoa, Mycetozoa, Rhodophyta) desaturase domains have an adjacent C-terminal cytochrome b5-like domain. 58172 cd03506: The Delta6 Fatty Acid Desaturase (Delta6-FADS)-like CD includes the integral-membrane enzymes: delta-4, delta-5, delta-6, delta-8, delta-8-sphingolipid, and delta-11 desaturases found in vertebrates, higher plants, fungi, and bacteria. These desaturases are required for the synthesis of highly unsaturated fatty acids (HUFAs), which are mainly esterified into phospholipids and contribute to maintaining membrane fluidity. While HUFAs may be required for cold tolerance in bacteria, plants and fish, the primary role of HUFAs in mammals is cell signaling. These enzymes are described as front-end desaturases because they introduce a double bond between the pre-existing double bond and the carboxyl (front) end of the fatty acid. Various substrates are involved, with both acyl-coenzyme A (CoA) and acyl-lipid desaturases present in this CD. 
Acyl-lipid desaturases are localized in the membranes of cyanobacterial thylakoid, plant endoplasmic reticulum (ER), and plastid; and acyl-CoA desaturases are present in ER membrane. ER-bound plant acyl-lipid desaturases and acyl-CoA desaturases require cytochrome b5 as an electron donor. Most of the eukaryotic desaturase domains have an adjacent N-terminal cytochrome b5-like domain. This domain family has extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. Comparison of sequences also reveals the existence of three regions of conserved histidine cluster motifs that contain the residues: HXXXH, HXX(X)HH, and Q/HXXHH. These histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within the homolog, stearoyl CoA desaturase. 58173 cd03507: The Delta12 Fatty Acid Desaturase (Delta12-FADS)-like CD includes the integral-membrane enzymes, delta-12 acyl-lipid desaturases, oleate 12-hydroxylases, omega3 and omega6 fatty acid desaturases, and other related proteins, found in a wide range of organisms including higher plants, green algae, diatoms, nematodes, fungi, and bacteria. The expression of these proteins appears to be temperature dependent: decreases in temperature result in increased levels of fatty acid desaturation within membrane lipids subsequently altering cell membrane fluidity. An important enzyme for the production of polyunsaturates in plants is the oleate delta-12 desaturase (Arabidopsis FAD2) of the endoplasmic reticulum. This enzyme accepts 1-acyl-2-oleoyl-sn-glycero-3-phosphocholine as substrate and requires NADH:cytochrome b oxidoreductase, cytochrome b, and oxygen for activity. FAD2 converts oleate (18:1) to linoleate (18:2) and is closely related to oleate 12-hydroxylase which catalyzes the hydroxylation of oleate to ricinoleate. 
Plastid-bound desaturases (Arabidopsis delta-12 desaturase (FAD6), omega-3 desaturase (FAD8), omega-6 desaturase (FAD6)), as well as, the cyanobacterial thylakoid-bound FADSs require oxygen, ferredoxin, and ferredoxin oxidoreductase for activity. As in higher plants, the cyanobacteria delta-12 (DesA) and omega-3 (DesB) FADSs desaturate oleate (18:1) to linoleate (18:2) and linoleate (18:2) to linolenate (18:3), respectively. Omega-3 (DesB/FAD8) and omega-6 (DesD/FAD6) desaturases catalyze reactions that introduce a double bond between carbons three and four, and carbons six and seven, respectively, from the methyl end of fatty acids. As with other members of this superfamily, this domain family has extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. Comparison of sequences also reveals the existence of three regions of conserved histidine cluster motifs that contain eight histidine residues: HXXXH, HXX(X)HH, and HXXHH. These histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within the homologue, stearoyl CoA desaturase. Mutation of any one of four of these histidines in the Synechocystis delta-12 acyl-lipid desaturase resulted in complete inactivity. 58174 cd03508: The Delta4-sphingolipid Fatty Acid Desaturase (Delta4-sphingolipid-FADS)-like CD includes the integral-membrane enzymes, dihydroceramide Delta-4 desaturase, involved in the synthesis of sphingosine; and the human membrane fatty acid (lipid) desaturase (MLD), reported to modulate biosynthesis of the epidermal growth factor receptor; and other related proteins. These proteins are found in various eukaryotes including vertebrates, higher plants, and fungi. Studies show that MLD is localized to the endoplasmic reticulum. As with other members of this superfamily, this domain family has extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. 
Comparison of sequences also reveals the existence of three regions of conserved histidine cluster motifs that contain eight histidine residues: HXXXH, HXXHH, and HXXHH. These histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within the homolog, stearoyl CoA desaturase. 58175 cd03509: Fatty acid desaturase protein family subgroup, a delta-12 acyl-lipid desaturase-like, DesA-like, yet uncharacterized subgroup of membrane fatty acid desaturase proteins found in alpha-, beta-, and gamma-proteobacteria. Sequences of this domain family appear to be structurally related to membrane fatty acid desaturases and alkane hydroxylases. They all share in common extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. Comparison of these sequences also reveals three regions of conserved histidine cluster motifs that contain eight histidine residues: HXXXH, HXXHH, and HXXHH. These histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within homologs, stearoyl CoA desaturase and alkane hydroxylase. 58176 cd03510: This CD includes the dihydrorhizobitoxine fatty acid desaturase (RtxC) characterized in Bradyrhizobium japonicum USDA110, and other related proteins. Dihydrorhizobitoxine desaturase is reported to be involved in the final step of rhizobitoxine biosynthesis. This domain family appears to be structurally related to the membrane fatty acid desaturases and the alkane hydroxylases. They all share in common extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. Comparison of sequences also reveals the existence of three regions of conserved histidine cluster motifs that contain eight histidine residues: HXXXH, HXX(X)HH, and HXXHH. 
These histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within homologs, stearoyl CoA desaturase and alkane hydroxylase. 58177 cd03511: This CD includes the putative hydrocarbon oxygenase, MocD, a bacterial rhizopine (3-O-methyl-scyllo-inosamine, 3-O-MSI) oxygenase, and other related proteins. It has been proposed that MocD, MocE (Rieske-like ferredoxin), and MocF (ferredoxin reductase) under the regulation of MocR, act in concert to form a ferredoxin oxygenase system that demethylates 3-O-MSI to form scyllo-inosamine. This domain family appears to be structurally related to the membrane fatty acid desaturases and the alkane hydroxylases. They all share in common extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. Comparison of sequences also reveals the existence of three regions of conserved histidine cluster motifs that contain eight histidine residues: HXXXH, HXXHH, and HXXHH. These histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within homologs, stearoyl CoA desaturase and alkane hydroxylase. 58178 cd03512: Alkane hydroxylase is a bacterial, integral-membrane di-iron enzyme that shares a requirement for iron and oxygen for activity similar to that of the non-heme integral-membrane acyl coenzyme A (CoA) desaturases and acyl lipid desaturases. The alk genes in Pseudomonas oleovorans encode conversion of alkanes to acyl CoA. The alkane omega-hydroxylase (AlkB) system is responsible for the initial oxidation of unactivated alkanes. It is a three-component system comprising a soluble NADH-rubredoxin reductase (AlkT), a soluble rubredoxin (AlkG), and the integral membrane oxygenase (AlkB). AlkB utilizes the oxygen rebound mechanism to hydroxylate alkanes. 
This mechanism involves homolytic cleavage of the C-H bond by an electrophilic metal-oxo intermediate to generate a substrate-based radical. As with other members of this superfamily, this domain family has extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. The active site structure of AlkB is not known, however, spectroscopic and genetic evidence points to a nitrogen-rich coordination environment located in the cytoplasm with as many as eight histidines coordinating the two iron ions and a carboxylate residue bridging the two metals. Like all other members of this superfamily, there are eight conserved histidines seen in the histidine cluster motifs: HXXXH, HXXXHH, and HXXHH. These histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within the homolog, stearoyl CoA desaturase. Also included in this CD are terminal alkane hydroxylases (AlkM), xylene monooxygenase hydroxylases (XylM), p-cymene monooxygenase hydroxylases (CymAa), and other related proteins. 58179 cd03513: Beta-carotene ketolase/oxygenase (CrtW, also known as CrtO), the carotenoid astaxanthin biosynthetic enzyme, initially catalyzes the addition of two keto groups to carbons C4 and C4' of beta-carotene. Carotenoids are important natural pigments produced by many microorganisms and plants. Astaxanthin is reported to be an antioxidant, an anti-cancer agent, and an immune system stimulant. A number of bacteria and green algae can convert beta-carotene into astaxanthin by using several ketocarotenoids as intermediates and CrtW and a beta-carotene hydroxylase (CrtZ). CrtW initially converts beta-carotene to canthaxanthin via echinenone, and CrtZ initially mediates the conversion of beta-carotene to zeaxanthin via beta-cryptoxanthin. After a few more intermediates are formed, CrtW and CrtZ act in combination to produce astaxanthin. 
Sequences of this domain family appear to be structurally related to membrane fatty acid desaturases and alkane hydroxylases. They all share in common extensive hydrophobic regions that are capable of spanning the membrane bilayer at least twice. Comparison of these sequences also reveals three regions of conserved histidine cluster motifs that contain eight histidine residues: HXXXH, HXXHH, and HXXHH. These histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within homologs, stearoyl CoA desaturase and alkane hydroxylase. 58180 cd03514: Beta-carotene hydroxylase (CrtR), the carotenoid zeaxanthin biosynthetic enzyme catalyzes the addition of hydroxyl groups to the beta-ionone rings of beta-carotene to form zeaxanthin and is found in bacteria and red algae. Carotenoids are important natural pigments; zeaxanthin and lutein are the only dietary carotenoids that accumulate in the macular region of the retina and lens. It is proposed that these carotenoids protect ocular tissues against photooxidative damage. CrtR does not show overall amino acid sequence similarity to the beta-carotene hydroxylases similar to CrtZ, an astaxanthin biosynthetic beta-carotene hydroxylase. However, CrtR does show sequence similarity to the green alga, Haematococcus pluvialis, beta-carotene ketolase (CrtW), which converts beta-carotene to canthaxanthin. Sequences of the CrtR_beta-carotene-hydroxylase domain family, as well as, the CrtW_beta-carotene-ketolase domain family appear to be structurally related to membrane fatty acid desaturases and alkane hydroxylases. They all share in common extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. Comparison of these sequences also reveals three regions of conserved histidine cluster motifs that contain eight histidine residues: HXXXH, HXXHH, and HXXHH. 
These histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within homologs, stearoyl CoA desaturase and alkane hydroxylase. 58181 cd00267: ABC (ATP-binding cassette) transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 58182 cd03213: ABCG transporters are involved in eye pigment (EP) precursor transport, regulation of lipid-trafficking mechanisms, and pleiotropic drug resistance (DR). DR is a well-described phenomenon occurring in fungi and shares several similarities with processes in bacteria and higher eukaryotes. Compared to other members of the ABC transporter subfamilies, the ABCG transporter family is composed of proteins that have an ATP-binding cassette domain at the N-terminus and a TM (transmembrane) domain at the C-terminus. 58183 cd03214: ABC transporters, involved in the uptake of siderophores, heme, and vitamin B12, are widely conserved in bacteria and archaea. Only very few species lack representatives of the siderophore family transporters. The E. coli BtuCD protein is an ABC transporter mediating vitamin B12 uptake. The two ATP-binding cassettes (BtuD) are in close contact with each other, as are the two membrane-spanning subunits (BtuC); this arrangement is distinct from that observed for the E. coli lipid flippase MsbA. 
The BtuC subunits provide 20 transmembrane helices grouped around a translocation pathway that is closed to the cytoplasm by a gate region, whereas the dimer arrangement of the BtuD subunits resembles the ATP-bound form of the Rad50 DNA repair enzyme. A prominent cytoplasmic loop of BtuC forms the contact region with the ATP-binding cassette and represents a conserved motif among the ABC transporters. 58184 cd03215: This family represents domain II of the carbohydrate uptake proteins that transport only monosaccharides (Monos). The Carb_Monos family is involved in the uptake of monosaccharides, including pentoses (such as xylose, arabinose, and ribose) and hexoses (such as glucose, galactose, and fructose), that cannot be broken down to simple sugars by hydrolysis. In members of the Carb_Monos family, the single hydrophobic gene product forms a homodimer, while the ABC protein represents a fusion of two nucleotide-binding domains. However, it is assumed that two copies of the ABC domains are present in the assembled transporter. 58185 cd03216: This family represents domain I of the carbohydrate uptake proteins that transport only monosaccharides (Monos). The Carb_Monos family is involved in the uptake of monosaccharides, including pentoses and hexoses, that cannot be broken down to simple sugars by hydrolysis. Pentoses include xylose, arabinose, and ribose. Important hexoses include glucose, galactose, and fructose. In members of the Carb_Monos family, the single hydrophobic gene product forms a homodimer, while the ABC protein represents a fusion of two nucleotide-binding domains. However, it is assumed that two copies of the ABC domains are present in the assembled transporter. 58186 cd03217: ABC-type transport system involved in Fe-S cluster assembly, ATPase component. Biosynthesis of iron-sulfur clusters (Fe-S) depends on multiprotein systems. The SUF system of E.
coli and Erwinia chrysanthemi is important for Fe-S biogenesis under stressful conditions. The SUF system is made of six proteins: SufC is an atypical cytoplasmic ABC-ATPase, which forms a complex with SufB and SufD; SufA plays the role of a scaffold protein for assembly of iron-sulfur clusters and delivery to target proteins; SufS is a cysteine desulfurase which mobilizes the sulfur atom from cysteine and provides it to the cluster; SufE has no associated function yet. 58187 cd03218: The ABC transporters belonging to the YhbG family are similar to members of the Mj1267_LivG family, which is involved in the transport of branched-chain amino acids. The genes yhbG and yhbN are located in a single operon and may function together in cell envelope during biogenesis. YhbG is the putative ATP-binding cassette component and YhbN is the putative periplasmic-binding protein. Depletion of each gene product leads to growth arrest, irreversible cell damage, and loss of viability in E. coli. The YhbG homolog (NtrA) is essential in Rhizobium meliloti, a symbiotic nitrogen-fixing bacterium. 58188 cd03219: The Mj1267/LivG ABC transporter subfamily is involved in the transport of the hydrophobic amino acids leucine, isoleucine, and valine. MJ1267 is a branched-chain amino acid transporter with 29% similarity to both the LivF and LivG components of the E. coli branched-chain amino acid transporter. MJ1267 contains an insertion from residues 114 to 123 characteristic of LivG (Leucine-Isoleucine-Valine) homologs. The branched-chain amino acid transporter from E. coli comprises a heterodimer of ABCs (LivF and LivG), a heterodimer of six-helix TM domains (LivM and LivH), and one of two alternative soluble periplasmic substrate binding proteins (LivK or LivJ). 58189 cd03220: ABC_KpsT_Wzt The KpsT/Wzt ABC transporter subfamily is involved in extracellular polysaccharide export.
Among the variety of membrane-linked or extracellular polysaccharides excreted by bacteria, only capsular polysaccharides, lipopolysaccharides, and teichoic acids have been shown to be exported by ABC transporters. A typical system is made of a conserved integral membrane protein and an ABC protein. In addition to these proteins, capsular polysaccharide exporter systems require two 'accessory' proteins to perform their function: a periplasmic (E. coli) or a lipid-anchored outer membrane protein called OMA (Neisseria meningitidis and Haemophilus influenzae) and a cytoplasmic membrane protein MPA2. 58190 cd03221: ABCF_EF-3 Elongation factor 3 (EF-3) is a cytosolic protein required by fungal ribosomes for in vitro protein synthesis and for in vivo growth. EF-3 stimulates the binding of the EF-1: GTP: aa-tRNA ternary complex to the ribosomal A site by facilitated release of the deacylated tRNA from the E site. The reaction requires ATP hydrolysis. EF-3 contains two ATP nucleotide binding sequence (NBS) motifs. NBSI is sufficient for the intrinsic ATPase activity. NBSII is essential for the ribosome-stimulated functions. 58191 cd03222: The ABC ATPase RNase L inhibitor (RLI) is a key enzyme in ribosomal biogenesis, formation of translation preinitiation complexes, and assembly of HIV capsids. RLIs are not transport proteins, and thus cluster with a group of soluble proteins that lack the transmembrane components commonly found in other members of the family. Structurally, RLIs have an N-terminal Fe-S domain and two nucleotide-binding domains, which are arranged to form two composite active sites in their interface cleft. RLI is one of the most conserved enzymes between archaea and eukaryotes with a sequence identity of more than 48%. The high degree of evolutionary conservation suggests that RLI performs a central role in archaeal and eukaryotic physiology.
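Several of the records above describe proteins with two nucleotide-binding sites (the NBSI and NBSII motifs of EF-3, the two nucleotide-binding domains of RLI), and the Walker A motif/P-loop is named throughout this list as a shared feature of ABC ATPases. The following is a minimal sketch of how such sites might be located in a sequence. It assumes the widely used PROSITE-style Walker A consensus [AG]-x(4)-G-K-[ST], which is not stated in the records themselves; the function name and the toy sequence are hypothetical and do not come from any real protein.

```python
import re

# Walker A / P-loop consensus in PROSITE-like form: [AG]-x(4)-G-K-[ST].
# Assumption: this consensus is supplied for illustration only; the records
# above name the motif but do not give a pattern for it.
WALKER_A = re.compile(r"[AG].{4}GK[ST]")


def find_walker_a(seq):
    """Return (start, matched_text) for each non-overlapping match.

    start is a 0-based index into seq; the sequence is upper-cased first.
    """
    return [(m.start(), m.group()) for m in WALKER_A.finditer(seq.upper())]


# Hypothetical toy sequence containing two P-loop-like stretches.
toy = "MKLLGPSGAGKSTLLRAAAAIVGPNGSGKTTLMR"
hits = find_walker_a(toy)
for start, text in hits:
    print(start, text)
```

A regular-expression hit of this kind is only a crude first filter. The conserved-domain models listed here are built from multiple alignments, so a pattern match alone should not be read as evidence that a sequence belongs to any of these families.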
58192 cd03223: Peroxisomal ATP-binding cassette transporter (Pat) is involved in the import of very long-chain fatty acids (VLCFA) into the peroxisome. The peroxisomal membrane forms a permeability barrier for a wide variety of metabolites required for and formed during fatty acid beta-oxidation. To communicate with the cytoplasm and mitochondria, peroxisomes need dedicated proteins to transport such hydrophilic molecules across their membranes. X-linked adrenoleukodystrophy (X-ALD) is caused by mutations in the ALD gene, which encodes ALDP (adrenoleukodystrophy protein), a peroxisomal integral membrane protein that is a member of the ATP-binding cassette (ABC) transporter protein family. The disease is characterized by a striking and unpredictable variation in phenotypic expression. Phenotypes include the rapidly progressive childhood cerebral form (CCALD), the milder adult form, adrenomyeloneuropathy (AMN), and variants without neurologic involvement (i.e., asymptomatic). 58193 cd03224: LivF (TM1139) is part of the LIV-I bacterial ABC-type two-component transport system that imports neutral, branched-chain amino acids. The E. coli branched-chain amino acid transporter comprises a heterodimer of ABC transporters (LivF and LivG), a heterodimer of six-helix TM domains (LivM and LivH), and one of two alternative soluble periplasmic substrate binding proteins (LivK or LivJ). ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. 58194 cd03225: Domain I of the ABC component of a cobalt transport family found in bacteria, archaea, and eukaryota. The transition metal cobalt is an essential component of many enzymes and must be transported into cells in appropriate amounts when needed. This ABC transport system of the CbiMNQO family is involved in cobalt transport in association with the cobalamin (vitamin B12) biosynthetic pathways.
Most cobalt (Cbi) transport systems possess a separate CbiN component, the cobalt-binding periplasmic protein, and they are encoded by the conserved gene cluster cbiMNQO. Both the CbiM and CbiQ proteins are integral cytoplasmic membrane proteins, and the CbiO protein has the linker peptide and the Walker A and B motifs commonly found in the ATPase components of the ABC-type transport systems. 58195 cd03226: Domain II of the ABC component of a cobalt transport family found in bacteria, archaea, and eukaryota. The transition metal cobalt is an essential component of many enzymes and must be transported into cells in appropriate amounts when needed. The CbiMNQO family ABC transport system is involved in cobalt transport in association with the cobalamin (vitamin B12) biosynthetic pathways. Most cobalt (Cbi) transport systems possess a separate CbiN component, the cobalt-binding periplasmic protein, and they are encoded by the conserved gene cluster cbiMNQO. Both the CbiM and CbiQ proteins are integral cytoplasmic membrane proteins, and the CbiO protein has the linker peptide and the Walker A and B motifs commonly found in the ATPase components of the ABC-type transport systems. 58196 cd03227: ABC-type Class 2 contains systems involved in cellular processes other than transport. These families are characterised by the fact that the ABC subunit is made up of duplicated, fused ABC modules (ABC2). No known transmembrane proteins or domains are associated with these proteins. 58197 cd03228: The MRP (Multidrug Resistance Protein)-like transporters are involved in drug, peptide, and lipid export. They belong to the subfamily C of the ATP-binding cassette (ABC) superfamily of transport proteins. The ABCC subfamily contains transporters with a diverse functional spectrum that includes ion transport, cell surface receptor, and toxin secretion activities.
The MRP-like family, similar to all ABC proteins, has a common four-domain core structure constituted by two membrane-spanning domains, each composed of six transmembrane (TM) helices, and two nucleotide-binding domains (NBD). ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 58198 cd03229: This class is comprised of all BPD (Binding Protein Dependent) systems that are largely represented in archaea and eubacteria and are primarily involved in scavenging solutes from the environment. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 58199 cd03230: This family of ATP-binding proteins belongs to a multisubunit transporter involved in drug resistance (BcrA and DrrA), nodulation, lipid transport, and lantibiotic immunity. In bacteria and archaea, these transporters usually include an ATP-binding protein and one or two integral membrane proteins. Eukaryote systems of the ABCA subfamily display ABC domains that are quite similar to this family. The ATP-binding domain shows the highest similarity between all members of the ABC transporter family.
ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 58200 cd03231: CcmA, the ATP-binding component of the bacterial CcmAB transporter. The CCM family is involved in bacterial cytochrome c biogenesis. Cytochrome c maturation in E. coli requires the ccm operon, which encodes eight membrane proteins (CcmABCDEFGH). CcmE is a periplasmic heme chaperone that binds heme covalently and transfers it onto apocytochrome c in the presence of CcmF, CcmG, and CcmH. The CcmAB proteins represent an ABC transporter and the CcmCD proteins participate in heme transfer to CcmE. 58201 cd03232: The pleiotropic drug resistance-like (PDR) family of ATP-binding cassette (ABC) transporters. PDR is a well-described phenomenon occurring in fungi and shares several similarities with processes in bacteria and higher eukaryotes. This PDR subfamily represents domain I of its (ABC-IM)2 organization. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds including sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 58202 cd03233: The pleiotropic drug resistance (PDR) family of ATP-binding cassette (ABC) transporters. PDR is a well-described phenomenon occurring in fungi and shares several similarities with processes in bacteria and higher eukaryotes. This PDR subfamily represents domain I of its (ABC-IM)2 organization. 
ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds including sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 58203 cd03234: The White subfamily represents ABC transporters homologous to the Drosophila white gene, which acts as a dimeric importer for eye pigment precursors. The eye pigmentation of Drosophila is developed from the synthesis and deposition in the cells of red pigments, which are synthesized from guanine, and brown pigments, which are synthesized from tryptophan. The pigment precursors are encoded by the white, brown, and scarlet genes, respectively. Evidence from genetic and biochemical studies suggest that the White and Brown proteins function as heterodimers to import guanine, while the White and Scarlet proteins function to import tryptophan. However, a recent study also suggests that White may be involved in the transport of a metabolite, such as 3-hydroxykynurenine, across intracellular membranes. Mammalian ABC transporters belonging to the White subfamily (ABCG1, ABCG5, and ABCG8) have been shown to be involved in the regulation of lipid-trafficking mechanisms in macrophages, hepatocytes, and intestinal mucosa cells. ABCG1 (ABC8), the human homolog of the Drosophila white gene is induced in monocyte-derived macrophages during cholesterol influx mediated by acetylated low-density lipoprotein. It is possible that human ABCG1 forms heterodimers with several heterologous partners. 58204 cd03235: ABC component of the metal-type transporters. 
This family includes transporters involved in the uptake of various metallic cations such as iron, manganese, and zinc. The ATPases of this group of transporters are very similar to members of the iron-siderophore uptake family, suggesting that they share a common ancestor. The best characterized metal-type ABC transporters are the YfeABCD system of Y. pestis, the SitABCD system of Salmonella enterica serovar Typhimurium, and the SitABCD transporter of Shigella flexneri. Moreover, other uncharacterized homologs of these metal-type transporters are mainly found in pathogens like Haemophilus or enteroinvasive E. coli isolates. 58205 cd03236: The ATPase domain 1 of RNase L inhibitor. The ABC ATPase, RNase L inhibitor (RLI), is a key enzyme in ribosomal biogenesis, formation of translation preinitiation complexes, and assembly of HIV capsids. RLIs are not transport proteins and thus cluster with a group of soluble proteins that lack the transmembrane components commonly found in other members of the family. Structurally, RLIs have an N-terminal Fe-S domain and two nucleotide binding domains which are arranged to form two composite active sites in their interface cleft. RLI is one of the most conserved enzymes between archaea and eukaryotes with a sequence identity of more than 48%. The high degree of evolutionary conservation suggests that RLI performs a central role in archaeal and eukaryotic physiology. 58206 cd03237: The ATPase domain 2 of RNase L inhibitor. The ABC ATPase, RNase L inhibitor (RLI), is a key enzyme in ribosomal biogenesis, formation of translation preinitiation complexes, and assembly of HIV capsids. RLIs are not transport proteins and thus cluster with a group of soluble proteins that lack the transmembrane components commonly found in other members of the family. Structurally, RLIs have an N-terminal Fe-S domain and two nucleotide-binding domains which are arranged to form two composite active sites in their interface cleft.
RLI is one of the most conserved enzymes between archaea and eukaryotes with a sequence identity of more than 48%. The high degree of evolutionary conservation suggests that RLI performs a central role in archaeal and eukaryotic physiology. 58207 cd03238: The excision repair protein UvrA; Nucleotide excision repair in eubacteria is a process that repairs DNA damage by the removal of a 12-13-mer oligonucleotide containing the lesion. Recognition and cleavage of the damaged DNA is a multistep ATP-dependent reaction that requires the UvrA, UvrB, and UvrC proteins. Both UvrA and UvrB are ATPases, with UvrA having two ATP binding sites, which have the characteristic signature of the family of ABC proteins, and UvrB having one ATP binding site that is structurally related to that of helicases. 58208 cd03239: The structural maintenance of chromosomes (SMC) proteins are essential for successful chromosome transmission during replication and segregation of the genome in all organisms. SMCs are generally present as single proteins in bacteria, and as at least six distinct proteins in eukaryotes. The proteins range in size from approximately 110 to 170 kDa, and each has five distinct domains: amino- and carboxy-terminal globular domains, which contain sequences characteristic of ATPases, two coiled-coil regions separating the terminal domains , and a central flexible hinge. SMC proteins function together with other proteins in a range of chromosomal transactions, including chromosome condensation, sister-chromatid cohesion, recombination, DNA repair, and epigenetic silencing of gene expression. 58209 cd03240: The catalytic domains of Rad50 are similar to the ATP-binding cassette of ABC transporters, but are not associated with membrane-spanning domains. The conserved ATP-binding motifs common to Rad50 and the ABC transporter family include the Walker A and Walker B motifs, the Q loop, a histidine residue in the switch region, a D-loop, and a conserved LSGG sequence. 
This conserved sequence, LSGG, is the most specific and characteristic motif of this family and is thus known as the ABC signature sequence. 58210 cd03241: RecN ATPase involved in DNA repair; ABC (ATP-binding cassette) transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds including sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 58211 cd03242: RecF is a recombinational DNA repair ATPase that maintains replication in the presence of DNA damage. When replication is prematurely disrupted by DNA damage, several recF pathway gene products play critical roles processing the arrested replication fork, allowing it to resume and complete its task. This CD represents the nucleotide binding domain of RecF. RecF belongs to a large superfamily of ABC transporters involved in the transport of a wide variety of different compounds including sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases with a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 58212 cd03243: The MutS protein initiates DNA mismatch repair by recognizing mispaired and unpaired bases embedded in duplex DNA and activating endo- and exonucleases to remove the mismatch. 
Members of the MutS family also possess a conserved ATPase activity that belongs to the ATP binding cassette (ABC) superfamily. MutS homologs (MSH) have been identified in most prokaryotic and all eukaryotic organisms examined. Prokaryotes have two homologs (MutS1 and MutS2), whereas seven MSH proteins (MSH1 to MSH7) have been identified in eukaryotes. The homodimer MutS1 and heterodimers MSH2-MSH3 and MSH2-MSH6 are primarily involved in mitotic mismatch repair, whereas MSH4-MSH5 is involved in resolution of Holliday junctions during meiosis. All members of the MutS family contain the highly conserved Walker A/B ATPase domain, and many share a common mechanism of action. MutS1, MSH2-MSH3, MSH2-MSH6, and MSH4-MSH5 dimerize to form sliding clamps, and recognition of specific DNA structures or lesions results in ADP/ATP exchange. 58213 cd03244: Domain 2 of the ABC subfamily C. This family is also known as MRP (multidrug resistance-associated protein). Some of the MRP members have five additional transmembrane segments in their N-terminus, but the function of these additional membrane-spanning domains is not clear. The MRP was found in the multidrug-resistant lung cancer cell in which p-glycoprotein was not overexpressed. MRP exports glutathione by drug stimulation, as well as certain substrates in conjugated forms with anions, such as glutathione, glucuronate, and sulfate. 58214 cd03245: ABC-type bacteriocin exporters. Many non-lantibiotic bacteriocins of lactic acid bacteria are produced as precursors which have N-terminal leader peptides that share similarities in amino acid sequence and contain a conserved processing site of two glycine residues in positions -1 and -2. A dedicated ATP-binding cassette (ABC) transporter is responsible for the proteolytic cleavage of the leader peptides and subsequent translocation of the bacteriocins across the cytoplasmic membrane.
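The double-glycine processing site described in cd03245 (two glycine residues at positions -2 and -1 relative to the cleavage point) can be illustrated with a small sketch. It assumes, as a simplification, that the first Gly-Gly pair found within an N-terminal window marks the cleavage site; the window length `max_leader`, the function name, and the example precursor are hypothetical and are not taken from the record or from any real bacteriocin.

```python
def split_double_glycine(precursor, max_leader=30):
    """Split a precursor at the first Gly-Gly in its N-terminal window.

    Returns (leader, mature), where the leader ends with the two glycines
    (positions -2 and -1), or None when no Gly-Gly is found in the window.
    max_leader is an illustrative assumption, not a documented limit.
    """
    window = precursor[:max_leader].upper()
    idx = window.find("GG")
    if idx == -1:
        return None
    cut = idx + 2  # cleavage occurs immediately after the second glycine
    return precursor[:cut], precursor[cut:]


# Hypothetical precursor: a 12-residue leader ending in GG, then the mature part.
result = split_double_glycine("MSNKELQTISGGKYYGNGVTC")
print(result)
```

Real leader peptides share further sequence similarities beyond the Gly-Gly pair, so a practical tool would score the whole leader rather than rely on the two glycines alone.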
58215 cd03246: This family represents the ABC component of the protease secretion system PrtD, a 60-kDa integral membrane protein sharing 37% identity with HlyB, the ABC component of the alpha-hemolysin secretion pathway, in the C-terminal domain. They export degradative enzymes by using a type I protein secretion system and lack an N-terminal signal peptide, but contain a C-terminal secretion signal. The Type I secretion apparatus is made up of three components: an ABC transporter, a membrane fusion protein (MFP), and an outer membrane protein (OMP). For the HlyA transporter complex, HlyB (ABC transporter) and HlyD (MFP) reside in the inner membrane of E. coli. The OMP component is TolC, which is thought to interact with the MFP to form a continuous channel across the periplasm from the cytoplasm to the exterior. HlyB belongs to the family of ABC transporters, which are ubiquitous, ATP-dependent transmembrane pumps or channels. The spectrum of transport substrates ranges from inorganic ions, nutrients such as amino acids, sugars, or peptides, hydrophobic drugs, to large polypeptides, such as HlyA. 58216 cd03247: The CYD subfamily implicated in cytochrome bd biogenesis. The CydC and CydD proteins are important for the formation of cytochrome bd terminal oxidase of E. coli and it has been proposed that they were necessary for biosynthesis of the cytochrome bd quinol oxidase and for periplasmic c-type cytochromes. CydCD were proposed to determine a heterooligomeric complex important for heme export into the periplasm or to be involved in the maintenance of the proper redox state of the periplasmic space. In Bacillus subtilis, the absence of CydCD does not affect the presence of holo-cytochrome c in the membrane and this observation suggests that CydCD proteins are not involved in the export of heme in this organism.
58217 cd03248: TAP, the Transporter Associated with Antigen Processing; TAP is essential for peptide delivery from the cytosol into the lumen of the endoplasmic reticulum (ER), where these peptides are loaded on major histocompatibility complex (MHC) I molecules. Loaded MHC I leave the ER and display their antigenic cargo on the cell surface to cytotoxic T cells. Subsequently, virus-infected or malignantly transformed cells can be eliminated. TAP belongs to the large family of ATP-binding cassette (ABC) transporters, which translocate a vast variety of solutes across membranes. 58218 cd03249: MTABC3 (also known as ABCB6) is a mitochondrial ATP-binding cassette protein involved in iron homeostasis and one of four ABC transporters expressed in the mitochondrial inner membrane, the other three being MDL1 (ABC7), MDL2, and ATM1. In fact, the yeast MDL1 (multidrug resistance-like protein 1) and MDL2 (multidrug resistance-like protein 2) transporters are also included in this CD. MDL1 is an ATP-dependent permease that acts as a high-copy suppressor of ATM1 and is thought to have a role in resistance to oxidative stress. Interestingly, subfamily B is more closely related to the carboxyl-terminal component of subfamily C than the two halves of ABCC molecules are with one another. 58219 cd03250: Domain 1 of the ABC subfamily C. This family is also known as MRP (multidrug resistance-associated protein). Some of the MRP members have five additional transmembrane segments in their N-terminus, but the function of these additional membrane-spanning domains is not clear. The MRP was found in the multidrug-resistant lung cancer cell in which p-glycoprotein was not overexpressed. MRP exports glutathione by drug stimulation, as well as certain substrates in conjugated forms with anions, such as glutathione, glucuronate, and sulfate. 58220 cd03251: MsbA is an essential ABC transporter, closely related to eukaryotic MDR proteins.
ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 58221 cd03252: The ABC-transporter hemolysin B is a central component of the secretion machinery that translocates the toxin, hemolysin A, in a Sec-independent fashion across both membranes of E. coli. The hemolysin A (HlyA) transport machinery is composed of the ATP-binding cassette (ABC) transporter HlyB located in the inner membrane, hemolysin D (HlyD), also anchored in the inner membrane, and TolC, which resides in the outer membrane. HlyD apparently forms a continuous channel that bridges the entire periplasm, interacting with TolC and HlyB. This arrangement prevents the appearance of periplasmic intermediates of HlyA during substrate transport. Little is known about the molecular details of HlyA transport, but it is evident that ATP-hydrolysis by the ABC-transporter HlyB is a necessary source of energy. 58222 cd03253: ATM1 is an ABC transporter that is expressed in the mitochondria. Although the specific function of ATM1 is unknown, its disruption results in the accumulation of excess mitochondrial iron, loss of mitochondrial cytochromes, oxidative damage to mitochondrial DNA, and decreased levels of cytosolic heme proteins. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. 
ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 58223 cd03254: Glucan exporter ATP-binding protein. In A. tumefaciens, cyclic beta-1,2-glucan must be transported into the periplasmic space to exert its action as a virulence factor. This subfamily belongs to the MRP-like family and is involved in drug, peptide, and lipid export. The MRP-like family, like all ABC proteins, has a common four-domain core structure constituted by two membrane-spanning domains, each composed of six transmembrane (TM) helices, and two nucleotide-binding domains (NBD). ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 58224 cd03255: This family comprises the MJ0796 ATP-binding cassette, the macrolide-specific ABC-type efflux carrier (MacAB), and proteins involved in cell division (FtsE) and in the release of lipoproteins from the cytoplasmic membrane (LolCDE). They are clustered together phylogenetically. MacAB is an exporter that confers resistance to macrolides, while the LolCDE system is not a transporter at all. FtsE null mutants showed filamentous growth and appeared viable on high-salt medium only, indicating a role for FtsE in cell division and/or salt transport. The LolCDE complex catalyses the release of lipoproteins from the cytoplasmic membrane prior to their targeting to the outer membrane. 58225 cd03256: ABC-type phosphate/phosphonate transport system. Phosphonates are a class of organophosphorus compounds characterized by a chemically stable carbon-to-phosphorus (C-P) bond.
Phosphonates are widespread among naturally occurring compounds in all kingdoms of life, but only prokaryotic microorganisms are able to cleave this bond. Certain bacteria such as E. coli can use alkylphosphonates as a phosphorus source. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 58226 cd03257: The ABC transporter subfamily specific for the transport of dipeptides, oligopeptides (OppD), and nickel (NikDE). The NikABCDE system of E. coli belongs to this family and is composed of the periplasmic binding protein NikA, two integral membrane components (NikB and NikC), and two ATPases (NikD and NikE). The NikABCDE transporter is synthesized under anaerobic conditions to meet the increased demand for nickel resulting from hydrogenase synthesis. The molecular mechanism of nickel uptake in many bacteria and most archaea is not known. Many other members of this ABC family are also involved in the uptake of dipeptides and oligopeptides. The oligopeptide transport system (Opp) is a five-component ABC transporter composed of a membrane-anchored substrate-binding protein (SBP), OppA; two transmembrane proteins, OppB and OppC; and two ATP-binding domains, OppD and OppF. 58227 cd03258: MetN (also known as YusC) is an ABC-type transporter encoded by metN of the metNPQ operon in Bacillus subtilis that is involved in methionine transport. Other members of this system include the MetP permease and the MetQ substrate binding protein. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 58228 cd03259: ABC Carbohydrate and Solute Transporters-like subgroup.
This family comprises proteins involved in the transport of apparently unrelated solutes and proteins specific for di- and oligosaccharides and polyols. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 58229 cd03260: Phosphate uptake is of fundamental importance in the cell physiology of bacteria because phosphate is required as a nutrient. The Pst system of E. coli comprises four distinct subunits encoded by the pstS, pstA, pstB, and pstC genes. The PstS protein is a phosphate-binding protein located in the periplasmic space. PstA and PstC are hydrophobic and form the transmembrane portion of the Pst system. PstB is the catalytic subunit, often referred to as the ABC protein, which couples the energy of ATP hydrolysis to the import of phosphate across cellular membranes through the Pst system. PstB belongs to one of the largest superfamilies of proteins characterized by a highly conserved adenosine triphosphate (ATP) binding cassette (ABC), which is also a nucleotide binding domain (NBD). 58230 cd03261: ABC (ATP-binding cassette) transport system involved in resistance to organic solvents; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family.
ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 58231 cd03262: HisP and GlnQ are the ATP-binding components of the bacterial periplasmic histidine and glutamine permeases, respectively. Histidine permease is a multisubunit complex containing the HisQ and HisM integral membrane subunits and two copies of HisP. HisP has properties intermediate between those of integral and peripheral membrane proteins and is accessible from both sides of the membrane, presumably by its interaction with HisQ and HisM. The two HisP subunits form a homodimer within the complex. The domain structure of the amino acid uptake systems is typical for prokaryote extracellular solute binding protein-dependent uptake systems. All of the amino acid uptake systems also have at least one, and in a few cases two, extracellular solute binding proteins located in the periplasm of Gram-negative bacteria, or attached to the cell membrane of Gram-positive bacteria. The best-studied member of the PAAT (polar amino acid transport) family is the HisJQMP system of S. typhimurium, where HisJ is the extracellular solute binding protein and HisP is the ABC protein. 58232 cd03263: The ABCA subfamily mediates the transport of a variety of lipid compounds. Mutations of members of the ABCA subfamily are associated with human genetic diseases, such as familial high-density lipoprotein (HDL) deficiency, neonatal surfactant deficiency, degenerative retinopathies, and congenital keratinization disorders. The ABCA1 protein is involved in disorders of cholesterol transport and high-density lipoprotein (HDL) biosynthesis. The ABCA4 (ABCR) protein transports vitamin A derivatives in the outer segments of photoreceptor cells, and therefore performs a crucial step in the visual cycle.
The ABCA genes are not present in yeast. However, evolutionary studies of ABCA genes indicate that they arose as transporters that subsequently duplicated and that certain sets of ABCA genes were lost in different eukaryotic lineages. 58233 cd03264: ABC-type multidrug transport system, ATPase component. The biological function of this family is not well characterized, but its members display ABC domains similar to those of members of the ABCA subfamily. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 58234 cd03265: DrrA is the ATP-binding protein component of a bacterial exporter complex that confers resistance to the antibiotics daunorubicin and doxorubicin. In addition to DrrA, the complex includes an integral membrane protein called DrrB. DrrA belongs to the ABC family of transporters and shares sequence and functional similarities with a protein found in cancer cells called P-glycoprotein. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins.
58235 cd03266: NatA is the ATPase component of a bacterial ABC-type Na+ transport system called NatAB, which catalyzes ATP-dependent electrogenic Na+ extrusion without mechanically coupled proton or K+ uptake. NatB possesses six putative membrane-spanning regions at its C-terminus. In B. subtilis, NatAB is inducible by agents such as ethanol and protonophores, which lower the protonmotive force across the membrane. The closest sequence similarity to NatA is exhibited by DrrA of the two-component daunomycin- and doxorubicin-efflux system. Hence, the functional NatAB is presumably assembled with two copies of a single ATP-binding protein and a single integral membrane protein. 58236 cd03267: Similar in sequence to NatA, this is the ATPase component of a bacterial ABC-type Na+ transport system called NatAB, which catalyzes ATP-dependent electrogenic Na+ extrusion without mechanically coupled proton or K+ uptake. NatB possesses six putative membrane-spanning regions at its C-terminus. In B. subtilis, NatAB is inducible by agents such as ethanol and protonophores, which lower the protonmotive force across the membrane. The closest sequence similarity to NatA is exhibited by DrrA of the two-component daunomycin- and doxorubicin-efflux system. Hence, the functional NatAB is presumably assembled with two copies of the single ATP-binding protein and the single integral membrane protein. 58237 cd03268: The BcrA subfamily represents ABC transporters involved in peptide antibiotic resistance. Bacitracin is a dodecapeptide antibiotic produced by B. licheniformis and B. subtilis. The synthesis of bacitracin is non-ribosomally catalyzed by a multienzyme complex BcrABC. Bacitracin has potent antibiotic activity against gram-positive bacteria. The inhibition of peptidoglycan biosynthesis is the best characterized bacterial effect of bacitracin. The bacitracin resistance of B.
licheniformis is mediated by the ABC transporter Bcr which is composed of two identical BcrA ATP-binding subunits and one each of the integral membrane proteins, BcrB and BcrC. B. subtilis cells carrying bcr genes on high-copy number plasmids develop collateral detergent sensitivity, a similar phenomenon in human cells with overexpressed multi-drug resistance P-glycoprotein. 58238 cd03269: This subfamily is involved in drug resistance, nodulation, lipid transport, and bacteriocin and lantibiotic immunity. In eubacteria and archaea, the typical organization consists of one ABC and one or two IMs. Eukaryote systems of the ABCA subfamily display ABC domains strongly similar to this family. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 58239 cd03270: The excision repair protein UvrA domain I; Nucleotide excision repair in eubacteria is a process that repairs DNA damage by the removal of a 12-13-mer oligonucleotide containing the lesion. Recognition and cleavage of the damaged DNA is a multistep ATP-dependent reaction that requires the UvrA, UvrB, and UvrC proteins. Both UvrA and UvrB are ATPases, with UvrA having two ATP binding sites, which have the characteristic signature of the family of ABC proteins, and UvrB having one ATP binding site that is structurally related to that of helicases. 58240 cd03271: The excision repair protein UvrA domain II; Nucleotide excision repair in eubacteria is a process that repairs DNA damage by the removal of a 12-13-mer oligonucleotide containing the lesion. 
Recognition and cleavage of the damaged DNA is a multistep ATP-dependent reaction that requires the UvrA, UvrB, and UvrC proteins. Both UvrA and UvrB are ATPases, with UvrA having two ATP binding sites, which have the characteristic signature of the family of ABC proteins, and UvrB having one ATP binding site that is structurally related to that of helicases. 58241 cd03272: Eukaryotic SMC3 proteins; SMC proteins are large (approximately 110 to 170 kDa), and each is arranged into five recognizable domains. Amino-acid sequence homology of SMC proteins between species is largely confined to the amino- and carboxy-terminal globular domains. The amino-terminal domain contains a 'Walker A' nucleotide-binding domain (GxxGxGKS/T, in the single-letter amino-acid code), which by mutational studies has been shown to be essential in several proteins. The carboxy-terminal domain contains a sequence (the DA-box) that resembles a 'Walker B' motif, and a motif with homology to the signature sequence of the ATP-binding cassette (ABC) family of ATPases. The sequence homology within the carboxy-terminal domain is relatively high within the SMC1-SMC4 group, whereas SMC5 and SMC6 show some divergence in both of these sequences. In eukaryotic cells, the proteins are found as heterodimers of SMC1 paired with SMC3, SMC2 with SMC4, and SMC5 with SMC6 (formerly known as Rad18). 58242 cd03273: Eukaryotic SMC2 proteins; SMC proteins are large (approximately 110 to 170 kDa), and each is arranged into five recognizable domains. Amino-acid sequence homology of SMC proteins between species is largely confined to the amino- and carboxy-terminal globular domains. The amino-terminal domain contains a 'Walker A' nucleotide-binding domain (GxxGxGKS/T, in the single-letter amino-acid code), which by mutational studies has been shown to be essential in several proteins.
The carboxy-terminal domain contains a sequence (the DA-box) that resembles a 'Walker B' motif, and a motif with homology to the signature sequence of the ATP-binding cassette (ABC) family of ATPases. The sequence homology within the carboxy-terminal domain is relatively high within the SMC1-SMC4 group, whereas SMC5 and SMC6 show some divergence in both of these sequences. In eukaryotic cells, the proteins are found as heterodimers of SMC1 paired with SMC3, SMC2 with SMC4, and SMC5 with SMC6 (formerly known as Rad18). 58243 cd03274: Eukaryotic SMC4 proteins; SMC proteins are large (approximately 110 to 170 kDa), and each is arranged into five recognizable domains. Amino-acid sequence homology of SMC proteins between species is largely confined to the amino- and carboxy-terminal globular domains. The amino-terminal domain contains a 'Walker A' nucleotide-binding domain (GxxGxGKS/T, in the single-letter amino-acid code), which by mutational studies has been shown to be essential in several proteins. The carboxy-terminal domain contains a sequence (the DA-box) that resembles a 'Walker B' motif, and a motif with homology to the signature sequence of the ATP-binding cassette (ABC) family of ATPases. The sequence homology within the carboxy-terminal domain is relatively high within the SMC1-SMC4 group, whereas SMC5 and SMC6 show some divergence in both of these sequences. In eukaryotic cells, the proteins are found as heterodimers of SMC1 paired with SMC3, SMC2 with SMC4, and SMC5 with SMC6 (formerly known as Rad18). 58244 cd03275: Eukaryotic SMC1 proteins; SMC proteins are large (approximately 110 to 170 kDa), and each is arranged into five recognizable domains. Amino-acid sequence homology of SMC proteins between species is largely confined to the amino- and carboxy-terminal globular domains.
The amino-terminal domain contains a 'Walker A' nucleotide-binding domain (GxxGxGKS/T, in the single-letter amino-acid code), which by mutational studies has been shown to be essential in several proteins. The carboxy-terminal domain contains a sequence (the DA-box) that resembles a 'Walker B' motif, and a motif with homology to the signature sequence of the ATP-binding cassette (ABC) family of ATPases. The sequence homology within the carboxy-terminal domain is relatively high within the SMC1-SMC4 group, whereas SMC5 and SMC6 show some divergence in both of these sequences. In eukaryotic cells, the proteins are found as heterodimers of SMC1 paired with SMC3, SMC2 with SMC4, and SMC5 with SMC6 (formerly known as Rad18). 58245 cd03276: Eukaryotic SMC6 proteins; SMC proteins are large (approximately 110 to 170 kDa), and each is arranged into five recognizable domains. Amino-acid sequence homology of SMC proteins between species is largely confined to the amino- and carboxy-terminal globular domains. The amino-terminal domain contains a 'Walker A' nucleotide-binding domain (GxxGxGKS/T, in the single-letter amino-acid code), which by mutational studies has been shown to be essential in several proteins. The carboxy-terminal domain contains a sequence (the DA-box) that resembles a 'Walker B' motif, and a motif with homology to the signature sequence of the ATP-binding cassette (ABC) family of ATPases. The sequence homology within the carboxy-terminal domain is relatively high within the SMC1-SMC4 group, whereas SMC5 and SMC6 show some divergence in both of these sequences. In eukaryotic cells, the proteins are found as heterodimers of SMC1 paired with SMC3, SMC2 with SMC4, and SMC5 with SMC6 (formerly known as Rad18). 58246 cd03277: Eukaryotic SMC5 proteins; SMC proteins are large (approximately 110 to 170 kDa), and each is arranged into five recognizable domains.
Amino-acid sequence homology of SMC proteins between species is largely confined to the amino- and carboxy-terminal globular domains. The amino-terminal domain contains a 'Walker A' nucleotide-binding domain (GxxGxGKS/T, in the single-letter amino-acid code), which by mutational studies has been shown to be essential in several proteins. The carboxy-terminal domain contains a sequence (the DA-box) that resembles a 'Walker B' motif, and a motif with homology to the signature sequence of the ATP-binding cassette (ABC) family of ATPases. The sequence homology within the carboxy-terminal domain is relatively high within the SMC1-SMC4 group, whereas SMC5 and SMC6 show some divergence in both of these sequences. In eukaryotic cells, the proteins are found as heterodimers of SMC1 paired with SMC3, SMC2 with SMC4, and SMC5 with SMC6 (formerly known as Rad18). 58247 cd03278: Barmotin is a tight junction-associated protein expressed in rat epithelial cells which is thought to have an important regulatory role in tight junction barrier function. Barmotin belongs to the SMC protein family. SMC proteins are large (approximately 110 to 170 kDa), and each is arranged into five recognizable domains. Amino-acid sequence homology of SMC proteins between species is largely confined to the amino- and carboxy-terminal globular domains. The amino-terminal domain contains a 'Walker A' nucleotide-binding domain (GxxGxGKS/T, in the single-letter amino-acid code), which by mutational studies has been shown to be essential in several proteins. The carboxy-terminal domain contains a sequence (the DA-box) that resembles a 'Walker B' motif, and a motif with homology to the signature sequence of the ATP-binding cassette (ABC) family of ATPases. The sequence homology within the carboxy-terminal domain is relatively high within the SMC1-SMC4 group, whereas SMC5 and SMC6 show some divergence in both of these sequences.
In eukaryotic cells, the proteins are found as heterodimers of SMC1 paired with SMC3, SMC2 with SMC4, and SMC5 with SMC6 (formerly known as Rad18). 58248 cd03279: SbcCD and other Mre11/Rad50 (MR) complexes are implicated in the metabolism of DNA ends. They cleave ends sealed by hairpin structures and are thought to play a role in removing protein bound to DNA termini. 58249 cd03280: MutS2 homologs in bacteria and eukaryotes. The MutS protein initiates DNA mismatch repair by recognizing mispaired and unpaired bases embedded in duplex DNA and activating endo- and exonucleases to remove the mismatch. Members of the MutS family also possess a conserved ATPase activity that belongs to the ATP binding cassette (ABC) superfamily. MutS homologs (MSH) have been identified in most prokaryotic and all eukaryotic organisms examined. Prokaryotes have two homologs (MutS1 and MutS2), whereas seven MSH proteins (MSH1 to MSH7) have been identified in eukaryotes. The homodimer MutS1 and heterodimers MSH2-MSH3 and MSH2-MSH6 are primarily involved in mitotic mismatch repair, whereas MSH4-MSH5 is involved in resolution of Holliday junctions during meiosis. All members of the MutS family contain the highly conserved Walker A/B ATPase domain, and many share a common mechanism of action. MutS1, MSH2-MSH3, MSH2-MSH6, and MSH4-MSH5 dimerize to form sliding clamps, and recognition of specific DNA structures or lesions results in ADP/ATP exchange. 58250 cd03281: MutS5 homolog in eukaryotes. The MutS protein initiates DNA mismatch repair by recognizing mispaired and unpaired bases embedded in duplex DNA and activating endo- and exonucleases to remove the mismatch. Members of the MutS family possess C-terminal domain with a conserved ATPase activity that belongs to the ATP binding cassette (ABC) superfamily. MutS homologs (MSH) have been identified in most prokaryotic and all eukaryotic organisms examined.
Prokaryotes have two homologs (MutS1 and MutS2), whereas seven MSH proteins (MSH1 to MSH7) have been identified in eukaryotes. The homodimer MutS1 and heterodimers MSH2-MSH3 and MSH2-MSH6 are primarily involved in mitotic mismatch repair, whereas MSH4-MSH5 is involved in resolution of Holliday junctions during meiosis. All members of the MutS family contain the highly conserved Walker A/B ATPase domain, and many share a common mechanism of action. MutS1, MSH2-MSH3, MSH2-MSH6, and MSH4-MSH5 dimerize to form sliding clamps, and recognition of specific DNA structures or lesions results in ADP/ATP exchange. 58251 cd03282: MutS4 homolog in eukaryotes. The MutS protein initiates DNA mismatch repair by recognizing mispaired and unpaired bases embedded in duplex DNA and activating endo- and exonucleases to remove the mismatch. Members of the MutS family possess C-terminal domain with a conserved ATPase activity that belongs to the ATP binding cassette (ABC) superfamily. MutS homologs (MSH) have been identified in most prokaryotic and all eukaryotic organisms examined. Prokaryotes have two homologs (MutS1 and MutS2), whereas seven MSH proteins (MSH1 to MSH7) have been identified in eukaryotes. The homodimer MutS1 and heterodimers MSH2-MSH3 and MSH2-MSH6 are primarily involved in mitotic mismatch repair, whereas MSH4-MSH5 is involved in resolution of Holliday junctions during meiosis. All members of the MutS family contain the highly conserved Walker A/B ATPase domain, and many share a common mechanism of action. MutS1, MSH2-MSH3, MSH2-MSH6, and MSH4-MSH5 dimerize to form sliding clamps, and recognition of specific DNA structures or lesions results in ADP/ATP exchange. 58252 cd03283: MutS-like homolog in eukaryotes. The MutS protein initiates DNA mismatch repair by recognizing mispaired and unpaired bases embedded in duplex DNA and activating endo- and exonucleases to remove the mismatch. 
Members of the MutS family possess C-terminal domain with a conserved ATPase activity that belongs to the ATP binding cassette (ABC) superfamily. MutS homologs (MSH) have been identified in most prokaryotic and all eukaryotic organisms examined. Prokaryotes have two homologs (MutS1 and MutS2), whereas seven MSH proteins (MSH1 to MSH7) have been identified in eukaryotes. The homodimer MutS1 and heterodimers MSH2-MSH3 and MSH2-MSH6 are primarily involved in mitotic mismatch repair, whereas MSH4-MSH5 is involved in resolution of Holliday junctions during meiosis. All members of the MutS family contain the highly conserved Walker A/B ATPase domain, and many share a common mechanism of action. MutS1, MSH2-MSH3, MSH2-MSH6, and MSH4-MSH5 dimerize to form sliding clamps, and recognition of specific DNA structures or lesions results in ADP/ATP exchange. 58253 cd03284: MutS1 homolog in eukaryotes. The MutS protein initiates DNA mismatch repair by recognizing mispaired and unpaired bases embedded in duplex DNA and activating endo- and exonucleases to remove the mismatch. Members of the MutS family possess C-terminal domain with a conserved ATPase activity that belongs to the ATP binding cassette (ABC) superfamily. MutS homologs (MSH) have been identified in most prokaryotic and all eukaryotic organisms examined. Prokaryotes have two homologs (MutS1 and MutS2), whereas seven MSH proteins (MSH1 to MSH7) have been identified in eukaryotes. The homodimer MutS1 and heterodimers MSH2-MSH3 and MSH2-MSH6 are primarily involved in mitotic mismatch repair, whereas MSH4-MSH5 is involved in resolution of Holliday junctions during meiosis. All members of the MutS family contain the highly conserved Walker A/B ATPase domain, and many share a common mechanism of action. MutS1, MSH2-MSH3, MSH2-MSH6, and MSH4-MSH5 dimerize to form sliding clamps, and recognition of specific DNA structures or lesions results in ADP/ATP exchange. 58254 cd03285: MutS2 homolog in eukaryotes. 
The MutS protein initiates DNA mismatch repair by recognizing mispaired and unpaired bases embedded in duplex DNA and activating endo- and exonucleases to remove the mismatch. Members of the MutS family possess C-terminal domain with a conserved ATPase activity that belongs to the ATP binding cassette (ABC) superfamily. MutS homologs (MSH) have been identified in most prokaryotic and all eukaryotic organisms examined. Prokaryotes have two homologs (MutS1 and MutS2), whereas seven MSH proteins (MSH1 to MSH7) have been identified in eukaryotes. The homodimer MutS1 and heterodimers MSH2-MSH3 and MSH2-MSH6 are primarily involved in mitotic mismatch repair, whereas MSH4-MSH5 is involved in resolution of Holliday junctions during meiosis. All members of the MutS family contain the highly conserved Walker A/B ATPase domain, and many share a common mechanism of action. MutS1, MSH2-MSH3, MSH2-MSH6, and MSH4-MSH5 dimerize to form sliding clamps, and recognition of specific DNA structures or lesions results in ADP/ATP exchange. 58255 cd03286: MutS6 homolog in eukaryotes. The MutS protein initiates DNA mismatch repair by recognizing mispaired and unpaired bases embedded in duplex DNA and activating endo- and exonucleases to remove the mismatch. Members of the MutS family possess C-terminal domain with a conserved ATPase activity that belongs to the ATP binding cassette (ABC) superfamily. MutS homologs (MSH) have been identified in most prokaryotic and all eukaryotic organisms examined. Prokaryotes have two homologs (MutS1 and MutS2), whereas seven MSH proteins (MSH1 to MSH7) have been identified in eukaryotes. The homodimer MutS1 and heterodimers MSH2-MSH3 and MSH2-MSH6 are primarily involved in mitotic mismatch repair, whereas MSH4-MSH5 is involved in resolution of Holliday junctions during meiosis. All members of the MutS family contain the highly conserved Walker A/B ATPase domain, and many share a common mechanism of action. 
MutS1, MSH2-MSH3, MSH2-MSH6, and MSH4-MSH5 dimerize to form sliding clamps, and recognition of specific DNA structures or lesions results in ADP/ATP exchange. 58256 cd03287: MutS3 homolog in eukaryotes. The MutS protein initiates DNA mismatch repair by recognizing mispaired and unpaired bases embedded in duplex DNA and activating endo- and exonucleases to remove the mismatch. Members of the MutS family possess C-terminal domain with a conserved ATPase activity that belongs to the ATP binding cassette (ABC) superfamily. MutS homologs (MSH) have been identified in most prokaryotic and all eukaryotic organisms examined. Prokaryotes have two homologs (MutS1 and MutS2), whereas seven MSH proteins (MSH1 to MSH7) have been identified in eukaryotes. The homodimer MutS1 and heterodimers MSH2-MSH3 and MSH2-MSH6 are primarily involved in mitotic mismatch repair, whereas MSH4-MSH5 is involved in resolution of Holliday junctions during meiosis. All members of the MutS family contain the highly conserved Walker A/B ATPase domain, and many share a common mechanism of action. MutS1, MSH2-MSH3, MSH2-MSH6, and MSH4-MSH5 dimerize to form sliding clamps, and recognition of specific DNA structures or lesions results in ADP/ATP exchange. 58257 cd03288: The SUR domain 2. The sulfonylurea receptor SUR is an ATP binding cassette (ABC) protein of the ABCC/MRP family. Unlike other ABC proteins, it has no intrinsic transport function, neither active nor passive, but associates with the potassium channel proteins Kir6.1 or Kir6.2 to form the ATP-sensitive potassium (K(ATP)) channel. Within the channel complex, SUR serves as a regulatory subunit that fine-tunes the gating of Kir6.x in response to alterations in cellular metabolism. It constitutes a major pharmaceutical target as it binds numerous drugs, K(ATP) channel openers and blockers, capable of up- or down-regulating channel activity. 58258 cd03289: The CFTR subfamily domain 2. 
The cystic fibrosis transmembrane regulator (CFTR), the product of the gene mutated in patients with cystic fibrosis, has adapted the ABC transporter structural motif to form a tightly regulated anion channel at the apical surface of many epithelia. Use of the term assembly of a functional ion channel implies the coming together of subunits or at least smaller not-yet functional components of the active whole. In fact, on the basis of current knowledge only the CFTR polypeptide itself is required to form an ATP- and protein kinase A-dependent low-conductance chloride channel of the type present in the apical membrane of many epithelial cells. CFTR displays the typical organization (IM-ABC)2 and carries a characteristic hydrophilic R-domain that separates IM1-ABC1 from IM2-ABC2. 58259 cd03290: The SUR domain 1. The sulfonylurea receptor SUR is an ATP transporter of the ABCC/MRP family with tandem ATPase binding domains. Unlike other ABC proteins, it has no intrinsic transport function, neither active nor passive, but associates with the potassium channel proteins Kir6.1 or Kir6.2 to form the ATP-sensitive potassium (K(ATP)) channel. Within the channel complex, SUR serves as a regulatory subunit that fine-tunes the gating of Kir6.x in response to alterations in cellular metabolism. It constitutes a major pharmaceutical target as it binds numerous drugs, K(ATP) channel openers and blockers, capable of up- or down-regulating channel activity. 58260 cd03291: The CFTR subfamily domain 1. The cystic fibrosis transmembrane regulator (CFTR), the product of the gene mutated in patients with cystic fibrosis, has adapted the ABC transporter structural motif to form a tightly regulated anion channel at the apical surface of many epithelia. Use of the term assembly of a functional ion channel implies the coming together of subunits, or at least smaller not-yet functional components of the active whole. 
In fact, on the basis of current knowledge only the CFTR polypeptide itself is required to form an ATP- and protein kinase A-dependent low-conductance chloride channel of the type present in the apical membrane of many epithelial cells. CFTR displays the typical organization (IM-ABC)2 and carries a characteristic hydrophilic R-domain that separates IM1-ABC1 from IM2-ABC2. 58261 cd03292: FtsE is a hydrophilic nucleotide-binding protein that binds FtsX to form a heterodimeric ATP-binding cassette (ABC)-type transporter that associates with the bacterial inner membrane. The FtsE/X transporter is thought to be involved in cell division and is important for assembly or stability of the septal ring. 58262 cd03293: NrtD and SsuB are the ATP-binding subunits of the bacterial ABC-type nitrate and sulfonate transport systems, respectively. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 58263 cd03294: This family comprises the glycine betaine/L-proline ATP binding subunit in bacteria and its equivalents in archaea. This transport system belongs to the larger ATP-Binding Cassette (ABC) transporter superfamily. The characteristic feature of these transporters is the obligatory coupling of ATP hydrolysis to substrate translocation. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 
58264 cd03295: OpuCA is the ATP-binding component of a bacterial solute transporter that serves a protective role to cells growing in a hyperosmolar environment. ABC (ATP-binding cassette) transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 58265 cd03296: Part of the ABC transporter complex cysAWTP involved in sulfate import. Responsible for energy coupling to the transport system. The complex is composed of two ATP-binding proteins (cysA), two transmembrane proteins (cysT and cysW), and a solute-binding protein (cysP). ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 58266 cd03297: ModC is an ABC-type transporter and the ATPase component of a molybdate transport system that also includes the periplasmic binding protein ModA and the membrane protein ModB. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides and more complex organic molecules. 
The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 58267 cd03298: ABC-type thiamine transport system; part of the binding-protein-dependent transport system tbpA-thiPQ for thiamine and TPP. Probably responsible for the translocation of thiamine across the membrane. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 58268 cd03299: Archaeal protein closely related to ModC. ModC is an ABC-type transporter and the ATPase component of a molybdate transport system that also includes the periplasmic binding protein ModA and the membrane protein ModB. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 
58269 cd03300: PotA is an ABC-type transporter and the ATPase component of the spermidine/putrescine-preferential uptake system consisting of PotA, -B, -C, and -D. PotA has two domains with the N-terminal domain containing the ATPase activity and the residues required for homodimerization with PotA and heterodimerization with PotB. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. 58270 cd03301: The N-terminal ATPase domain of the maltose transporter, MalK. ATP binding cassette (ABC) proteins function from bacteria to human, mediating the translocation of substances into and out of cells or organelles. ABC transporters contain two transmembrane-spanning domains (TMDs) or subunits and two nucleotide binding domains (NBDs) or subunits that couple transport to the hydrolysis of ATP. In the maltose transport system, the periplasmic maltose binding protein (MBP) stimulates the ATPase activity of the membrane-associated transporter, which consists of two transmembrane subunits, MalF and MalG, and two copies of the ATP binding subunit, MalK, and becomes tightly bound to the transporter in the catalytic transition state, ensuring that maltose is passed to the transporter as ATP is hydrolyzed. 58271 cd03369: Domain 2 of NFT1 (New full-length MRP-type transporter 1). NFT1 belongs to the MRP (multidrug resistance-associated protein) family of ABC transporters. 
Some of the MRP members have five additional transmembrane segments in their N-termini, but the function of these additional membrane-spanning domains is not clear. The MRP was found in multidrug-resistant lung cancer cells in which p-glycoprotein was not overexpressed. MRP exports glutathione upon drug stimulation, as well as certain substrates in conjugated forms with anions such as glutathione, glucuronate, and sulfate. 58272 cd00245: Coenzyme B12-dependent glutamate mutase epsilon subunit-like family; contains proteins similar to Clostridium cochlearium glutamate mutase (Glm) and Streptomyces tendae Tu901 NikV. Glm catalyzes a carbon-skeleton rearrangement of L-glutamate to L-threo-3-methylaspartate. The first step in the catalysis is a homolytic cleavage of the Co-C bond of the coenzyme B12 cofactor to generate a 5'-deoxyadenosyl radical. This radical then initiates the rearrangement reaction. C. cochlearium Glm is a sigma2epsilon2 heterotetramer. Glm plays a role in glutamate fermentation in Clostridium sp. and in members of the family Enterobacteriaceae, and in the synthesis of the lipopeptide antibiotic friulimicin in Actinoplanes friuliensis. S. tendae Tu901 glutamate mutase-like proteins NikU and NikV participate in the synthesis of the peptidyl nucleoside antibiotic nikkomycin. NikU and NikV proteins have sequence similarity to Clostridium Glm sigma and epsilon components respectively, and may catalyze the rearrangement of 2-oxoglutaric acid to 2-keto-3-methylsuccinic acid during nikkomycin synthesis. 58273 cd00580: 5-carboxymethyl-2-hydroxymuconate isomerase (CHMI) is a trimeric enzyme catalyzing the isomerization of the unsaturated ketone 5-(carboxymethyl)-2-hydroxymuconate to 5-(carboxymethyl)-2-oxo-3-hexene-1,6-dionate. This is one step in the homoprotocatechuate pathway, one of the microbial meta-fission pathways that degrade aromatic carbon sources to citric acid cycle intermediates. 
Despite the structural similarity of CHMI with 4-oxalocrotonate tautomerase (4-OT) and macrophage migration inhibitory factor (MIF), there is no significant sequence similarity among these protein families, and therefore, they are not combined in one hierarchy. 58274 cd01434: EFG_mtEFG1_IV: domains similar to domain IV of the bacterial translational elongation factor (EF) EF-G. Included in this group is a domain of mitochondrial Elongation factor G1 (mtEFG1) proteins homologous to domain IV of EF-G. Eukaryotic cells harbor 2 protein synthesis systems: one localized in the cytoplasm, the other in the mitochondria. Most factors regulating mitochondrial protein synthesis are encoded by nuclear genes, translated in the cytoplasm, and then transported to the mitochondria. The eukaryotic system of elongation factor (EF) components is more complex than that in prokaryotes, with both cytoplasmic and mitochondrial elongation factors and multiple isoforms being expressed in certain species. During the process of peptide synthesis and tRNA site changes, the ribosome is moved along the mRNA a distance equal to one codon with the addition of each amino acid. In bacteria this translocation step is catalyzed by EF-G_GTP, which is hydrolyzed to provide the required energy. Thus, this action releases the uncharged tRNA from the P site and transfers the newly formed peptidyl-tRNA from the A site to the P site. Eukaryotic mtEFG1 proteins show significant homology to bacterial EF-Gs. Mutants in yeast mtEFG1 have impaired mitochondrial protein synthesis, respiratory defects and a tendency to lose mitochondrial DNA. There are two forms of mtEFG present in mammals (designated mtEFG1s and mtEFG2s) mtEFG2s are not present in this group. 58275 cd01680: Elongation Factor G-like domain IV. 
This family includes the translational elongation factor termed EF-2 (for Archaea and Eukarya) and EF-G (for Bacteria), ribosomal protection proteins that mediate tetracycline resistance, and an evolutionarily conserved U5 snRNP-specific protein (U5-116kD). In complex with GTP, EF-G/EF-2 promotes the translocation step of translation. During translocation the peptidyl-tRNA is moved from the A site to the P site of the small subunit of the ribosome and the mRNA is shifted one codon relative to the ribosome. It has been shown that the EF-G/EF-2_IV domain mimics the shape of the anticodon arm of the tRNA in the structurally homologous ternary complex of Phe-tRNA, EF-Tu (another translational elongation factor) and GTP analog. The tip portion of this domain is found in a position that overlaps the anticodon arm of the A-site tRNA, implying that EF-G/EF-2 displaces the A-site tRNA to the P-site by physical interaction with the anticodon arm. 58276 cd01681: This family represents domain IV of archaeal and eukaryotic elongation factor 2 (aeEF-2) and of an evolutionarily conserved U5 snRNP-specific protein. U5 snRNP is a GTP-binding factor closely related to the ribosomal translocase EF-2. In complex with GTP, EF-2 promotes the translocation step of translation. During translocation the peptidyl-tRNA is moved from the A site to the P site of the small subunit of the ribosome and the mRNA is shifted one codon relative to the ribosome. It has been shown that the EF-2_IV domain mimics the shape of the anticodon arm of the tRNA in the structurally homologous ternary complex of Phe-tRNA, EF-1 (another translational elongation factor) and GTP analog. The tip portion of this domain is found in a position that overlaps the anticodon arm of the A-site tRNA, implying that EF-2 displaces the A-site tRNA to the P-site by physical interaction with the anticodon arm. 
58277 cd01683: EF-2_domain IV_snRNP domain is a part of the 116kD U5-specific protein of the U5 small nuclear ribonucleoprotein (snRNP) particle, an essential component of the spliceosome. The protein is structurally closely related to the eukaryotic translational elongation factor EF2. This domain has also been identified in the 114kD U5-specific protein of Saccharomyces cerevisiae and may play an important role either in the splicing process itself or in the recycling of spliceosomal snRNP. 58278 cd01684: EF-G_domain IV_RPP domain is a part of the bacterial ribosomal protection proteins (RPP) family. RPPs such as tetracycline resistance proteins Tet(M) and Tet(O) mediate tetracycline resistance in both gram-positive and -negative species. Tetracyclines inhibit the accommodation of aminoacyl-tRNA into the ribosomal A site and therefore prevent the addition of new amino acids to the growing polypeptide. RPPs such as Tet(M) confer tetracycline resistance by releasing tetracycline from the ribosome and thereby freeing the ribosome from inhibitory effects of the drug, such that aa-tRNA can bind to the A site and protein synthesis can continue. 58279 cd01693: mtEF-G2 domain IV. This subfamily is a part of the mitochondrial translational elongation factor, mtEF-G2. Mitochondrial translation is crucial for maintaining mitochondrial function and mutations in this system lead to a breakdown in the respiratory chain-oxidative phosphorylation system and to impaired maintenance of mitochondrial DNA. In complex with GTP, EF-G promotes the translocation step of translation. During translocation the peptidyl-tRNA is moved from the A site to the P site of the small subunit of the ribosome and the mRNA is shifted one codon relative to the ribosome. 58280 COG0209: Ribonucleotide reductase, alpha subunit [Nucleotide transport and metabolism]. 58281 COG0470: ATPase involved in DNA replication [DNA replication, recombination, and repair]. 58282 COG1964: Predicted Fe-S oxidoreductases [General function prediction only]. 
58283 COG3298: Predicted 3'-5' exonuclease related to the exonuclease domain of PolB [DNA replication, recombination, and repair]. 58284 COG3468: Type V secretory pathway, adhesin AidA [Cell envelope biogenesis, outer membrane / Intracellular trafficking and secretion]. 58285 COG4608: ABC-type oligopeptide transport system, ATPase component [Amino acid transport and metabolism]. 58286 pfam00478: IMP dehydrogenase / GMP reductase domain. This family is involved in biosynthesis of guanosine nucleotides. Members of this family contain a TIM barrel structure. In the inosine monophosphate dehydrogenases, 2 CBS domains (pfam00571) are inserted in the TIM barrel. This family is a member of the common phosphate binding site TIM barrel family. 58287 pfam00593: TonB dependent receptor. 58288 pfam00730: HhH-GPD superfamily base excision DNA repair protein. This family contains a diverse range of structurally related DNA repair proteins. The superfamily is called the HhH-GPD family after its hallmark Helix-hairpin-helix and Gly/Pro rich loop followed by a conserved aspartate. This includes endonuclease III, EC:4.2.99.18, and MutY, an A/G-specific adenine glycosylase; both have a C-terminal 4Fe-4S cluster. The family also includes 8-oxoguanine DNA glycosylases. The methyl-CpG binding protein MBD4 also contains a related domain that is a thymine DNA glycosylase. The family also includes DNA-3-methyladenine glycosylase II EC:3.2.2.21 and other members of the AlkA family. 58289 pfam01137: RNA 3'-terminal phosphate cyclase. RNA cyclases are a family of RNA-modifying enzymes that are conserved in all cellular organisms. They catalyse the ATP-dependent conversion of the 3'-phosphate to the 2',3'-cyclic phosphodiester at the end of RNA, in a reaction involving formation of the covalent AMP-cyclase intermediate. The structure of RTC demonstrates that RTCs comprise two domains. The larger domain contains an insert domain of approximately 100 amino acids. 
58290 pfam01193: RNA polymerase Rpb3/Rpb11 dimerisation domain. The two eukaryotic subunits Rpb3 and Rpb11 dimerise to form a platform onto which the other subunits of the RNA polymerase assemble (D/L in archaea). The prokaryotic equivalent to the Rpb3/Rpb11 platform is the alpha-alpha dimer. The dimerisation domain of the alpha subunit/Rpb3 is interrupted by an insert domain (pfam01000). Some of the alpha subunits also contain iron-sulphur binding domains (pfam00037). Rpb11 is found as a continuous domain. Members of this family include: alpha subunits from eubacteria; alpha subunits from chloroplasts; Rpb3 subunits from eukaryotes; Rpb11 subunits from eukaryotes; RpoD subunits from archaea; RpoL subunits from archaea. 58291 pfam01323: DSBA-like thioredoxin domain. This family contains a diverse set of proteins with a thioredoxin-like structure (pfam00085). This family also includes 2-hydroxychromene-2-carboxylate (HCCA) isomerase enzymes, which catalyse one step in prokaryotic polyaromatic hydrocarbon (PAH) catabolic pathways. This family also contains members with functions other than HCCA isomerisation, such as Kappa family GSTs, whose similarity to HCCA isomerases was not previously recognised. Some members of this family may have been mis-annotated in protein sequence databases. 58292 pfam01590: GAF domain. Domain present in phytochromes and cGMP-specific phosphodiesterases. 58293 pfam01609: Transposase DDE domain. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, whose members contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction. This family contains transposases for IS4, IS421, IS5377, IS427, IS402, IS1355, IS5, which was originally isolated in bacteriophage lambda. 58294 pfam01697: Domain of unknown function. 
This family consists of an approximately 300 residue long region found in C. elegans and Drosophila proteins; the function of this region is unknown. The aligned region contains several conserved cysteine residues and several charged residues that may be catalytic residues. 58295 pfam01757: Acyltransferase family. This family includes a range of acyltransferase enzymes. This domain is found in many as yet uncharacterised C. elegans proteins and it is approximately 300 amino acids long. 58296 pfam02305: Capsid protein (F protein). This is a family of proteins from single-stranded DNA bacteriophages. Protein F is the major capsid component, sixty copies of which are found in the virion. 58297 pfam02535: ZIP Zinc transporter. The ZIP family consists of zinc transport proteins and many putative metal transporters. The main contribution to this family is from the Arabidopsis thaliana ZIP protein family; these proteins are responsible for zinc uptake in the plant. Also found within this family are C. elegans proteins of unknown function which are annotated as being similar to human growth arrest inducible gene product, although this protein is not found within this family. 58298 pfam03104: DNA polymerase family B, exonuclease domain. This domain has 3' to 5' exonuclease activity and adopts a ribonuclease H type fold. 58299 pfam04313: Type I restriction enzyme R protein N terminus (HSDR_N). This family consists of a number of N terminal regions found in type I restriction enzyme R (HSDR) proteins. Restriction and modification (R/M) systems are found in a wide variety of prokaryotes and are thought to protect the host bacterium from the uptake of foreign DNA. Type I restriction and modification systems are encoded by three genes: hsdR, hsdM, and hsdS. The three polypeptides, HsdR, HsdM, and HsdS, often assemble to give an enzyme (R2M2S1) that modifies hemimethylated DNA and restricts unmethylated DNA. 58300 pfam04520: Protein of unknown function, DUF584. 
This family contains several uncharacterized proteins. 58301 pfam04928: Poly(A) polymerase central domain. The central domain of Poly(A) polymerase shares structural similarity with the allosteric activity domain of ribonucleotide reductase R1, which comprises a four-helix bundle and a three-stranded mixed beta-sheet. Even though the two enzymes bind ATP, the ATP-recognition motifs are different. 58302 pfam05221: S-adenosyl-L-homocysteine hydrolase. 58303 smart00079: Eukaryotic homologues of bacterial periplasmic substrate binding proteins. Prokaryotic homologues are represented by a separate alignment: PBPb. 58304 smart00464: Found in ATP-dependent protease La (LON); N-terminal domain of the ATP-dependent protease La (LON), present also in other bacterial ORFs. 58305 smart00642: Alpha-amylase domain. 58306 cd00016: Alkaline phosphatase homologues; alkaline phosphatases are non-specific phosphomonoesterases that catalyze the hydrolysis reaction via a phosphoseryl intermediate to produce inorganic phosphate and the corresponding alcohol, optimally at high pH. Alkaline phosphatase exists as a dimer, each monomer binding 2 zinc atoms and one magnesium atom, which are essential for enzymatic activity. 58307 cd00071: Guanosine monophosphate kinase (GMPK, EC 2.7.4.8), also known as guanylate kinase (GKase), catalyzes the reversible phosphoryl transfer from adenosine triphosphate (ATP) to guanosine monophosphate (GMP) to yield adenosine diphosphate (ADP) and guanosine diphosphate (GDP). It plays an essential role in the biosynthesis of guanosine triphosphate (GTP). This enzyme is also important for the activation of some antiviral and anticancer agents, such as acyclovir, ganciclovir, carbovir, and thiopurines. 58308 cd00081: Hedgehog/Intein domain, found in Hedgehog proteins as well as proteins which contain inteins and undergo protein splicing (e.g. DnaB, RIR1-2, GyrA and Pol). 
In protein splicing an intervening polypeptide sequence - the intein - is excised from a protein, and the flanking polypeptide sequences - the exteins - are joined by a peptide bond. In addition to the autocatalytic splicing domain, many inteins contain an inserted endonuclease domain, which plays a role in spreading inteins. Hedgehog proteins are a major class of intercellular signaling molecules, which control inductive interactions during animal development. The mature signaling forms of hedgehog proteins are the N-terminal fragments, which are covalently linked to cholesterol at their C-termini. This modification is the result of an autoprocessing step catalyzed by the C-terminal fragments, which are aligned here. 58310 cd00123: L-Aminopeptidase domain; D-amidase/D-esterase domain; activated by auto-catalyzed protein splicing liberating an alpha-amino group presumably used as a general base in the catalytic mechanism; exopeptidases which catalyze the hydrolysis of the amino- terminal residue from polypeptide substrates; aminopeptidases are mainly divalent cation-dependent or thiol enzymes; this is one of the rare aminopeptidases that are not metalloenzymes; synthesized as a single polypeptide precursor; active form consists of 2 peptides resulting from the unique cleavage of the Gly-Ser peptide bond in the precursor; both residues are essential for protein maturation and catalysis; a heterodimer which forms a homotetramer. 58311 cd00137: Phospholipase C, catalytic domain; Phosphoinositide-specific phospholipases C catalyze hydrolysis of phosphatidylinositol-4,5-bisphosphate (PIP2) to D-myo-inositol-1,4,5-trisphosphate (1,4,5-IP3) and sn-1,2-diacylglycerol (DAG). Both products function as second messengers in eukaryotic signal transduction cascades. 1,4,5-IP3 triggers inflow of calcium from intracellular stores; the membrane resident product DAG controls cellular protein phosphorylation states by activating various protein kinase C isozymes. 
The enzyme comprises 2 regions (X and Y) connected via a linker which may contain inserted domains; X and Y together form a TIM barrel-like structure containing the active site residues. 58312 cd00139: Phosphatidylinositol phosphate kinases (PIPK) catalyze the phosphorylation of phosphatidylinositol phosphate on the fourth or fifth hydroxyl of the inositol ring, to form phosphatidylinositol bisphosphate. CD alignment includes type II phosphatidylinositol phosphate kinases (PIPKII-beta), type I and II PIPK (-alpha, -beta, and -gamma) kinases and related yeast Fab1p and Mss4p kinases. Signaling by phosphorylated species of phosphatidylinositol regulates secretion, vesicular trafficking, membrane translocation, cell adhesion, chemotaxis, DNA synthesis, and cell cycling. The catalytic core domains of PIPKs are structurally similar to PI3K, PI4K, and cAMP-dependent protein kinases (PKA); the dimerization region is a unique feature of the PIPKs. 58313 cd00777: Asp tRNA synthetase (aspRS) class II core domain. Class II assignment is based upon its structure and the presence of three characteristic sequence motifs. AspRS is a homodimer, which attaches a specific amino acid to the 3' OH group of ribose of the appropriate tRNA. The catalytic core domain is primarily responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. AspRS in this family differ from those found in the AsxRS family by a GAD insert in the core domain. 58314 cd00186: DNA Topoisomerase, subtype IA; DNA-binding, ATP-binding and catalytic domain of bacterial DNA topoisomerases I and III, and eukaryotic DNA topoisomerase III and eubacterial and archaeal reverse gyrases. Topoisomerases cleave single- or double-stranded DNA and then rejoin the broken phosphodiester backbone. Proposed catalytic mechanism of single stranded DNA cleavage is by phosphoryl transfer through a tyrosine nucleophile using acid/base catalysis. 
Tyr is activated by a nearby group (not yet identified) acting as a general base for nucleophilic attack on the 5' phosphate of the scissile bond. Arg and Lys stabilize the pentavalent transition state. Glu then acts as a proton donor for the leaving 3'-oxygen, upon cleavage of the scissile strand. 58315 cd00216: Dehydrogenases with pyrrolo-quinoline quinone (PQQ) as cofactor, like ethanol, methanol, and membrane bound glucose dehydrogenases. The alignment model contains an 8-bladed beta-propeller. 58316 cd00287: ribokinase/pfkB superfamily: Kinases that accept a wide variety of substrates, including carbohydrates and aromatic small molecules, all of which are phosphorylated at a hydroxyl group. The superfamily includes ribokinase, fructokinase, ketohexokinase, 2-dehydro-3-deoxygluconokinase, 1-phosphofructokinase, the minor 6-phosphofructokinase (PfkB), inosine-guanosine kinase, and adenosine kinase. Even though there is a high degree of structural conservation within this superfamily, their multimerization level varies widely: monomeric (e.g. adenosine kinase), dimeric (e.g. ribokinase), and trimeric (e.g. THZ kinase). 58317 cd00308: Enolase-superfamily, characterized by the presence of an enolate anion intermediate which is generated by abstraction of the alpha-proton of the carboxylate substrate by an active site residue and is stabilized by coordination to the essential Mg2+ ion. Enolase superfamily contains different enzymes, like enolases, glutarate-, fucanate- and galactonate dehydratases, o-succinylbenzoate synthase, N-acylamino acid racemase, L-alanine-DL-glutamate epimerase, mandelate racemase, muconate lactonizing enzyme and 3-methylaspartase. 58318 cd00309: chaperonin families, type I and type II. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings, each composed of 7-9 subunits. There are 2 main chaperonin groups. 
The symmetry of type I is seven-fold and they are found in eubacteria (GroEL) and in organelles of eubacterial descent (hsp60 and RBP). The symmetry of type II is eight- or nine-fold and they are found in archaea (thermosome), thermophilic bacteria (TF55) and in the eukaryotic cytosol (CCT). Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. 58319 cd00314: Plant peroxidase superfamily. Along with animal peroxidases, these enzymes belong to a group of heme-dependent peroxidases containing a heme prosthetic group (ferriprotoporphyrin IX), which catalyzes a multistep oxidative reaction involving hydrogen peroxide as the electron acceptor. The plant peroxidase superfamily is comprised of three structurally and functionally divergent groups. They are found in all living kingdoms and carry out a variety of biosynthetic and degradative functions. Class I includes intracellular peroxidases present in fungi, plants, and archaeal and bacterial enzymes, called catalase-peroxidases, that can exhibit both catalase and broad-spectrum peroxidase activities depending on the steady-state concentration of hydrogen peroxide. Catalase-peroxidases are typically comprised of two homologous domains that probably arose via a single gene duplication event. Class II includes ligninase and other extracellular fungal peroxidases, while class III is comprised of classic extracellular plant peroxidases, like horseradish peroxidase. 58320 cd00315: Cytosine-C5 specific DNA methylases; Methyl transfer reactions play an important role in many aspects of biology. Cytosine-specific DNA methylases are found both in prokaryotes and eukaryotes. DNA methylation, or the covalent addition of a methyl group to cytosine within the context of the CpG dinucleotide, has profound effects on the mammalian genome. 
These effects include transcriptional repression via inhibition of transcription factor binding or the recruitment of methyl-binding proteins and their associated chromatin remodeling factors, X chromosome inactivation, imprinting and the suppression of parasitic DNA sequences. DNA methylation is also essential for proper embryonic development and is an important player in both DNA repair and genome stability. 58321 cd00327: Condensing enzymes; Family of enzymes that catalyze a (decarboxylating or non-decarboxylating) Claisen-like condensation reaction. Members share strong structural similarity, and are involved in the synthesis and degradation of fatty acids, and the production of polyketides, a diverse group of natural products. 58322 cd00347: Flavin-utilizing monooxygenases. 58323 cd00368: Molybdopterin-Binding (MopB) domain of the MopB superfamily of proteins, a large, diverse, heterogeneous superfamily of enzymes that, in general, bind molybdopterin as a cofactor. The MopB domain is found in a wide variety of molybdenum- and tungsten-containing enzymes, including formate dehydrogenase-H (Fdh-H) and -N (Fdh-N), several forms of nitrate reductase (Nap, Nas, NarG), dimethylsulfoxide reductase (DMSOR), thiosulfate reductase, formylmethanofuran dehydrogenase, and arsenite oxidase. Molybdenum is present in most of these enzymes in the form of molybdopterin, a modified pterin ring with a dithiolene side chain, which is responsible for ligating the Mo. In many bacterial and archaeal species, molybdopterin is in the form of a dinucleotide, with two molybdopterin dinucleotide units per molybdenum. These proteins can function as monomers, heterodimers, or heterotrimers, depending on the protein and organism. 
Also included in the MopB superfamily is the eukaryotic/eubacterial protein domain family of the 75-kDa subunit/Nad11/NuoG (second domain) of respiratory complex 1/NADH-quinone oxidoreductase which is postulated to have lost an ancestral formate dehydrogenase activity and only vestigial sequence evidence remains of a molybdopterin binding site. 58324 cd00382: Carbonic anhydrases (CA) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism in which the nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide is followed by the regeneration of an active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. CAs are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionarily distinct families of CAs (the alpha-, beta-, and gamma-CAs) which show no significant sequence identity or structural similarity. Within the beta-CA family there are four evolutionarily distinct clades (A through D). The beta-CAs are multimeric enzymes (forming dimers, tetramers, hexamers and octamers) which are present in higher plants, algae, fungi, archaea and prokaryotes. 58325 cd00404: Aconitase swivel domain. Aconitase (aconitate hydratase) catalyzes the reversible isomerization of citrate and isocitrate as part of the TCA cycle. This is the aconitase swivel domain, which undergoes swivelling conformational change in the enzyme mechanism. The aconitase family contains the following proteins: - Iron-responsive element binding protein (IRE-BP). IRE-BP is a cytosolic protein that binds to iron-responsive elements (IREs). IREs are stem-loop structures found in the 5' UTR of ferritin and delta-aminolevulinic acid synthase mRNAs, and in the 3' UTR of transferrin receptor mRNA. IRE-BP also expresses aconitase activity. 
- 3-isopropylmalate dehydratase (isopropylmalate isomerase), the enzyme that catalyzes the second step in the biosynthesis of leucine. - Homoaconitase (homoaconitate hydratase), an enzyme that participates in the alpha-aminoadipate pathway of lysine biosynthesis and that converts cis-homoaconitate into homoisocitric acid. 58326 cd00418: Glutamyl-tRNA synthetase (GluRS)/Glutaminyl-tRNA synthetase (GlnRS) catalytic core domain. These enzymes attach Glu or Gln, respectively, to the appropriate tRNA. Like other class I tRNA synthetases, they aminoacylate the 2'-OH of the nucleotide at the 3' end of the tRNA. The core domain is based on the Rossmann fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. It contains the characteristic class I HIGH and KMSKS motifs, which are involved in ATP binding. These enzymes function as monomers. Archaea, cellular organelles, and some bacteria lack GlnRS. In these cases, the "non-discriminating" form of GluRS aminoacylates both tRNA(Glu) and tRNA(Gln) with Glu, which is converted to Gln when appropriate by a transamidation enzyme. The discriminating form of GluRS differs from GlnRS and the non-discriminating form of GluRS in their C-terminal anti-codon binding domains. 58327 cd00439: Transaldolase. Enzymes found in the non-oxidative branch of the pentose phosphate pathway, that catalyze the reversible transfer of a dihydroxyacetone group from fructose-6-phosphate to erythrose-4-phosphate yielding sedoheptulose-7-phosphate and glyceraldehyde-3-phosphate. They are members of the class I aldolases, which are characterized by using a Schiff-base mechanism for stabilization of the reaction intermediates. 58328 cd00443: Adenosine/AMP deaminase. Adenosine deaminases (ADAs) are present in pro- and eukaryotic organisms and catalyze the zinc dependent irreversible deamination of adenosine nucleosides to inosine nucleosides and ammonia. 
The eukaryotic AMP deaminase catalyzes a similar reaction leading to the hydrolytic removal of an amino group at the 6 position of the adenine nucleotide ring, a branch point in the adenylate catabolic pathway. 58329 cd00519: Lipase (class 3). Lipases are esterases that can hydrolyze long-chain acyl-triglycerides into di- and monoglycerides, glycerol, and free fatty acids at a water/lipid interface. A typical feature of lipases is "interfacial activation," the process of becoming active at the lipid/water interface, although several examples of lipases have been identified that do not undergo interfacial activation. The active site of a lipase contains a catalytic triad consisting of Ser - His - Asp/Glu, but unlike most serine proteases, the active site is buried inside the structure. A "lid" or "flap" covers the active site, making it inaccessible to solvent and substrates. The lid opens during the process of interfacial activation, allowing the lipid substrate access to the active site. 58330 cd00550: Oxyanion-translocating ATPase (ArsA). This ATPase is involved in transport of arsenite, antimonite or other oxyanions across biological membranes in all three kingdoms of life. ArsA contains a highly conserved AAA motif present in the AAA+ ATPase superfamily associated with a variety of cellular activities. To form a functional ATP-driven pump, ArsA interacts with the permease ArsB, which is a channel-forming integral membrane protein. One of the most interesting features of ArsA is the allosteric activation by its transport substrates. A divalent cation, typically Mg2+, is required for its enzymatic activity. 58331 cd00567: Acyl-CoA dehydrogenase (ACAD). Both mitochondrial acyl-CoA dehydrogenases (ACAD) and peroxisomal acyl-CoA oxidases (AXO) catalyze the alpha,beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. 
The reduced form of ACAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. In contrast, AXO catalyzes a different oxidative half-reaction, in which the reduced FAD is reoxidized by molecular oxygen. The ACAD family includes the eukaryotic beta-oxidation enzymes, short (SCAD), medium (MCAD), long (LCAD) and very-long (VLCAD) chain acyl-CoA dehydrogenases. These enzymes all share high sequence similarity, but differ in their substrate specificities. The ACAD family also includes amino acid catabolism enzymes such as Isovaleryl-CoA dehydrogenase (IVD), short/branched chain acyl-CoA dehydrogenases (SBCAD), Isobutyryl-CoA dehydrogenase (IBDH), glutaryl-CoA dehydrogenase (GCD) and Crotonobetainyl-CoA dehydrogenase. The mitochondrial ACADs are generally homotetramers, except for VLCAD, which is a homodimer. Related enzymes include the SOS adaptive response protein aidB, Naphthocyclinone hydroxylase (NcnH), and Dibenzothiophene (DBT) desulfurization enzyme C (DszC). 58332 cd00576: RNR_PFL. Ribonucleotide reductase (RNR) and pyruvate formate lyase (PFL) have a structurally similar ten-stranded alpha-beta barrel active site domain and are believed to have diverged from a common ancestor. RNRs are found in all organisms and provide the only mechanism by which nucleotides are converted to deoxynucleotides, while PFL, an essential enzyme in anaerobic bacteria, catalyzes the conversion of pyruvate and CoA to acetyl-CoA and formate. Both RNR and PFL are glycyl radical enzymes. 58333 cd00587: The HCP family of iron-sulfur proteins includes hybrid cluster protein (HCP), acetyl-CoA synthase (ACS), and carbon monoxide dehydrogenase (CODH), all of which contain [Fe4-S4] metal clusters at their active sites. These proteins have a conserved alpha-beta Rossmann fold domain. 
HCP, formerly known as prismane, is thought to play a role in nitrogen metabolism but its specific function is unknown. Acetyl-CoA synthase (ACS) is found in acetogenic and methanogenic organisms and is responsible for the synthesis and breakdown of acetyl-CoA. ACS forms a heterotetramer with carbon monoxide dehydrogenase (CODH) consisting of two ACS and two CODH subunits. CODH reduces carbon dioxide to carbon monoxide and ACS then synthesizes acetyl-CoA from carbon monoxide and CoA. 58334 cd00618: Phospholipase A2; Cleaves the sn-2 position of the glycerol backbone of phospholipids that have arachidonic acid at the sn-2 position. This reaction is metal dependent. The resulting products are either dietary or used in synthetic pathways for leukotrienes and prostaglandins. As a toxin, it is a potent presynaptic neurotoxin that acts by blocking release of neurotransmitters by competitive inhibition, since the key catalytic residue is missing. May form dimers or oligomers and appears to recognize specific receptors on the cell membrane. 58335 cd00653: RNA polymerase beta subunit. RNA polymerases catalyse the DNA dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). Each RNA polymerase complex contains two related members of this family, in each case they are the two largest subunits. The clamp is a mobile structure that grips DNA during elongation. 58336 cd00668: This is the catalytic core domain of isoleucyl, leucyl, valyl and methionyl tRNA synthetases. These class I enzymes are all monomers. However, in some species, MetRS functions as a homodimer, as a result of an additional C-terminal domain. These enzymes aminoacylate the 2'-OH of the nucleotide at the 3' end of the appropriate tRNA. The core domain is based on the Rossmann fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. 
It contains the characteristic class I HIGH and KMSKS motifs, which are involved in ATP binding. Enzymes in this subfamily share an insertion in the core domain, which is subject to both deletions and rearrangements. This editing region hydrolyzes mischarged cognate tRNAs and thus prevents the incorporation of chemically similar amino acids. MetRS has a significantly shorter insertion, which lacks the editing function. 58337 cd00669: Asp_Lys_Asn_tRNA synthetase class II core domain. This domain is the core catalytic domain of class II aminoacyl-tRNA synthetases of the subgroup containing aspartyl, lysyl, and asparaginyl tRNA synthetases. It is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. Class II assignment is based upon its structure and the presence of three characteristic sequence motifs. Nearly all class II tRNA synthetases are dimers and enzymes in this subgroup are homodimers. These enzymes attach a specific amino acid to the 3' OH group of ribose of the appropriate tRNA. 58338 cd00672: This is the catalytic core domain of cysteinyl tRNA synthetase (CysRS). This class I enzyme is a monomer, which aminoacylates the 2'-OH of the nucleotide at the 3' end of the appropriate tRNA. The core domain is based on the Rossmann fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. It contains the characteristic class I HIGH and KMSKS motifs, which are involved in ATP binding. 58339 cd00685: Trans-Isoprenyl Diphosphate Synthases (Trans_IPPS), head-to-tail (HT) (1'-4) condensation reactions. This CD includes all-trans (E)-isoprenyl diphosphate synthases which synthesize various chain length (C10, C15, C20, C25, C30, C35, C40, C45, and C50) linear isoprenyl diphosphates from precursors, isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP). They catalyze the successive 1'-4 condensation of the 5-carbon IPP to allylic substrates geranyl-, farnesyl-, or geranylgeranyl-diphosphate. 
Isoprenoid chain elongation reactions proceed via electrophilic alkylations in which a new carbon-carbon single bond is generated through interaction between a highly reactive electron-deficient allylic carbocation and an electron-rich carbon-carbon double bond. The catalytic site consists of a large central cavity formed by mostly antiparallel alpha helices with two aspartate-rich regions (DDXX(XX)D) located on opposite walls. These residues mediate binding of prenyl phosphates via bridging Mg2+ ions, inducing proposed conformational changes that close the active site to solvent, protecting and stabilizing reactive carbocation intermediates. Farnesyl diphosphate synthases produce the precursors of steroids, cholesterol, sesquiterpenes, farnesylated proteins, heme, and vitamin K12; and geranylgeranyl diphosphate and longer chain synthases produce the precursors of carotenoids, retinoids, diterpenes, geranylgeranylated chlorophylls, ubiquinone, and archaeal ether linked lipids. Isoprenyl diphosphate synthases are widely distributed among archaea, bacteria, and eukarya. 58340 cd00772: Prolyl-tRNA synthetase (ProRS) class II core catalytic domain. ProRS is a homodimer. It is responsible for the attachment of proline to the 3' OH group of ribose of the appropriate tRNA. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. Class II assignment is based upon its structure and the presence of three characteristic sequence motifs in the core domain. 58341 cd00773: Class II Histidinyl-tRNA synthetase (HisRS)-like catalytic core domain. HisRS is a homodimer. It is responsible for the attachment of histidine to the 3' OH group of ribose of the appropriate tRNA. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. Class II assignment is based upon its structure and the presence of three characteristic sequence motifs. 
This domain is also found at the C-terminus of eukaryotic GCN2 protein kinase and at the N-terminus of the ATP phosphoribosyltransferase accessory subunit, HisZ. HisZ along with HisG catalyzes the first reaction in histidine biosynthesis. HisZ is found only in a subset of bacteria and differs from HisRS in lacking a C-terminal anti-codon binding domain. 58342 cd00774: Glycyl-tRNA synthetase (GlyRS)-like class II core catalytic domain. GlyRS functions as a homodimer in eukaryotes, archaea and some bacteria and as a heterotetramer in the remainder of prokaryotes. It is responsible for the attachment of glycine to the 3' OH group of ribose of the appropriate tRNA. This domain is primarily responsible for ATP binding and hydrolysis. This alignment contains only sequences from the GlyRS form which homodimerizes. The heterotetramer glyQ is in a different family of class II aaRS. Class II assignment is based upon its structure and the presence of three characteristic sequence motifs. This domain is also found at the N-terminus of the accessory subunit of mitochondrial polymerase gamma (Pol gamma b). Pol gamma b stimulates processive DNA synthesis and is functional as a homodimer, which can associate with the catalytic subunit Pol gamma alpha to form a heterotrimer. Despite significant structural and sequence similarity with GlyRS, Pol gamma b lacks conservation of several class II functional residues. 58343 cd00779: Prolyl-tRNA synthetase (ProRS) class II core catalytic domain. ProRS is a homodimer. It is responsible for the attachment of proline to the 3' OH group of ribose of the appropriate tRNA. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. Class II assignment is based upon its structure and the presence of three characteristic sequence motifs in the core domain. This subfamily contains the core domain of ProRS from prokaryotes and from the mitochondria of eukaryotes. 
58344 cd00802: Class I aminoacyl-tRNA synthetase (aaRS) catalytic core domain. These enzymes are mostly monomers, which aminoacylate the 2'-OH of the nucleotide at the 3' end of the appropriate tRNA. The core domain is based on the Rossmann fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. It contains the characteristic class I HIGH and KMSKS motifs, which are involved in ATP binding. 58345 cd00807: Glutaminyl-tRNA synthetase (GlnRS) and non-discriminating Glutamyl-tRNA synthetase (GluRS) catalytic core domain. These enzymes attach Gln or Glu, respectively, to the appropriate tRNA. Like other class I tRNA synthetases, they aminoacylate the 2'-OH of the nucleotide at the 3' end of the tRNA. The core domain is based on the Rossmann fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. It contains the characteristic class I HIGH and KMSKS motifs, which are involved in ATP binding. These enzymes function as monomers. Archaea and most bacteria lack GlnRS. In these organisms, the "non-discriminating" form of GluRS aminoacylates both tRNA(Glu) and tRNA(Gln) with Glu, which is converted to Gln when appropriate by a transamidation enzyme. 58346 cd00808: Discriminating Glutamyl-tRNA synthetase (GluRS) catalytic core domain. The discriminating form of GluRS is only found in bacteria and cellular organelles. GluRS is a monomer that attaches Glu to the appropriate tRNA. Like other class I tRNA synthetases, GluRS aminoacylates the 2'-OH of the nucleotide at the 3' end of the tRNA. The core domain is based on the Rossmann fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. It contains the characteristic class I HIGH and KMSKS motifs, which are involved in ATP binding. 58347 cd00812: This is the catalytic core domain of leucyl tRNA synthetase (LeuRS). 
This class I enzyme is a monomer, which aminoacylates the 2'-OH of the nucleotide at the 3' end of the appropriate tRNA. The core domain is based on the Rossmann fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. It contains the characteristic class I HIGH and KMSKS motifs, which are involved in ATP binding. In Aquifex aeolicus, the gene encoding LeuRS is split in two, just before the KMSKS motif. Consequently, LeuRS is a heterodimer, which likely superimposes with the LeuRS monomer found in most other organisms. LeuRS has an insertion in the core domain, which is subject to both deletions and rearrangements and thus differs between prokaryotic LeuRS and archaeal/eukaryotic LeuRS. This editing region hydrolyzes mischarged cognate tRNAs and thus prevents the incorporation of chemically similar amino acids. 58348 cd00814: This is the catalytic core domain of methionine tRNA synthetase (MetRS). This class I enzyme aminoacylates the 2'-OH of the nucleotide at the 3' end of the appropriate tRNA. MetRS, which consists of the core domain and an anti-codon binding domain, functions as a monomer. However, in some species, the anti-codon binding domain is followed by an EMAP domain. In this case, MetRS functions as a homodimer. The core domain is based on the Rossmann fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. It contains the characteristic class I HIGH and KMSKS motifs, which are involved in ATP binding. As a result of a deletion event, MetRS has a significantly shorter core domain insertion than IleRS, ValRS, and LeuRS. Consequently, the MetRS insertion lacks the editing function. 58349 cd00817: This is the catalytic core domain of valine amino-acyl tRNA synthetases (ValRS). This enzyme is a monomer, which aminoacylates the 2'-OH of the nucleotide at the 3' end of the appropriate tRNA. 
The core domain is based on the Rossmann fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. It contains the characteristic class I HIGH and KMSKS motifs, which are involved in ATP binding. ValRS has an insertion in the core domain, which is subject to both deletions and rearrangements. This editing region hydrolyzes mischarged cognate tRNAs and thus prevents the incorporation of chemically similar amino acids. 58350 cd00818: This is the catalytic core domain of isoleucine amino-acyl tRNA synthetases (IleRS). This class I enzyme is a monomer, which aminoacylates the 2'-OH of the nucleotide at the 3' end of the appropriate tRNA. The core domain is based on the Rossmann fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. It contains the characteristic class I HIGH and KMSKS motifs, which are involved in ATP binding. IleRS has an insertion in the core domain, which is subject to both deletions and rearrangements. This editing region hydrolyzes mischarged cognate tRNAs and thus prevents the incorporation of chemically similar amino acids. 58351 cd00820: Phosphoenolpyruvate carboxykinase (PEPCK), a critical gluconeogenic enzyme, catalyzes the first committed step in the diversion of tricarboxylic acid cycle intermediates toward gluconeogenesis. It catalyzes the reversible decarboxylation and phosphorylation of oxaloacetate to yield phosphoenolpyruvate and carbon dioxide, using a nucleotide molecule (ATP or GTP) for the phosphoryl transfer, and has a strict requirement for divalent metal ions for activity. PEPCKs separate into two phylogenetic groups based on their nucleotide substrate specificity (the ATP- and GTP-dependent groups). HprK/P, the bifunctional histidine-containing protein kinase/phosphatase, controls the phosphorylation state of the phosphocarrier protein HPr and regulates the utilization of carbon sources by gram-positive bacteria. 
It catalyzes both the ATP-dependent phosphorylation of HPr and its dephosphorylation by phosphorolysis. PEPCK and the C-terminal catalytic domain of HprK/P are structurally similar with conserved active site residues suggesting that these two phosphotransferases have related functions. 58352 cd00825: decarboxylating condensing enzymes; Family of enzymes that catalyze the formation of a new carbon-carbon bond by a decarboxylating Claisen-like condensation reaction. Members are involved in the synthesis of fatty acids and polyketides, a diverse group of natural products. Both pathways are an iterative series of additions of small carbon units, usually acetate, to a nascent acyl group. There are 2 classes of decarboxylating condensing enzymes, which can be distinguished by sequence similarity, type of active site residues and type of primer units (acetyl CoA or acyl carrier protein (ACP) linked units). 58353 cd00826: nondecarboxylating condensing enzymes; In general, thiolases catalyze the reversible thiolytic cleavage of 3-ketoacyl-CoA into acyl-CoA and acetyl-CoA, a 2-step reaction involving a covalent intermediate formed with a catalytic cysteine. There are 2 functionally different classes: thiolase-I (3-ketoacyl-CoA thiolase) and thiolase-II (acetoacetyl-CoA thiolase). Thiolase-I can cleave longer fatty acid molecules and plays an important role in the beta-oxidative degradation of fatty acids. Thiolase-II has a high substrate specificity. Although it can cleave acetoacetyl-CoA, its main function is the synthesis of acetoacetyl-CoA from two molecules of acetyl-CoA, which gives it importance in several biosynthetic pathways. 
58354 cd00867: Trans-Isoprenyl Diphosphate Synthases (Trans_IPPS) of class 1 isoprenoid biosynthesis enzymes which either synthesize geranyl/farnesyl diphosphates (GPP/FPP) or longer chained products from isoprene precursors, isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP), or use geranyl (C10)-, farnesyl (C15)-, or geranylgeranyl (C20)-diphosphate as substrate. These enzymes produce a myriad of precursors for such end products as steroids, cholesterol, sesquiterpenes, heme, carotenoids, retinoids, diterpenes, ubiquinone, and archaeal ether linked lipids; and are widely distributed among archaea, bacteria, and eukarya. The enzymes in this family share the same 'isoprenoid synthase fold' and include the head-to-tail (HT) IPPS which catalyze the successive 1'-4 condensation of the 5-carbon IPP to the growing isoprene chain to form linear, all-trans, C10-, C15-, C20-, C25-, C30-, C35-, C40-, C45-, or C50-isoprenoid diphosphates. The head-to-head (HH) IPPS catalyze the successive 1'-1 condensation of 2 farnesyl or 2 geranylgeranyl isoprenoid diphosphates. Isoprenoid chain elongation reactions proceed via electrophilic alkylations in which a new carbon-carbon single bond is generated through interaction between a highly reactive electron-deficient allylic carbocation and an electron-rich carbon-carbon double bond. The catalytic site consists of a large central cavity formed by mostly antiparallel alpha helices with two aspartate-rich regions located on opposite walls. These residues mediate binding of prenyl phosphates via bridging Mg2+ ions, inducing proposed conformational changes that close the active site to solvent, stabilizing reactive carbocation intermediates. Mechanistically and structurally distinct, cis-IPPS are not included in this CD. 58355 cd00892: Proteins related to phosphoinositide 3-kinase (PI3K), catalytic domain; All of the members have been found to possess lipid kinase activity. Many show Ser/Thr protein kinase activity. 
Many PI3K-related proteins are involved in cell-cycle checkpoints. They share two additional domains: FATC, at the very C-terminus, and FAT, N-terminal to the PI3K-like domain. 58356 cd00985: Maf_Ham1. Maf, a nucleotide binding protein, has been implicated in inhibition of septum formation in eukaryotes, bacteria and archaea. A Ham1-related protein from Methanococcus jannaschii is a novel NTPase that has been shown to hydrolyze nonstandard nucleotides, such as hypoxanthine/xanthine NTP, but not standard nucleotides. 58357 cd01059: CCC1_like: This protein family includes the proteins related to CCC1, a yeast vacuole transmembrane protein responsible for iron and manganese transport from the cytosol into the vacuole. It also includes the proteins similar to nodulin-21, a plant nodule-specific protein that may be involved in symbiotic nitrogen fixation. 58358 cd01089: Related to aminopeptidase M, this family contains proliferation-associated protein 2G4. Family members have been implicated in cell cycle control. 58359 cd01095: Nitrilotriacetate monooxygenase oxidizes nitrilotriacetate utilizing reduced flavin mononucleotide (FMNH2) and oxygen. The FMNH2 is provided by an NADH:flavin mononucleotide (FMN) oxidoreductase that uses NADH to reduce FMN to FMNH2. 58360 cd01097: N5,N10-methylenetetrahydromethanopterin reductase (Mer) catalyzes the reduction of N5,N10-methylenetetrahydromethanopterin with reduced coenzyme F420 to N5-methyltetrahydromethanopterin and oxidized coenzyme F420. 58361 cd01115: Permease SLC13 (solute carrier 13). The sodium/dicarboxylate cotransporter NaDC-1 has been shown to translocate Krebs cycle intermediates such as succinate, citrate, and alpha-ketoglutarate across plasma membranes in rabbit, human, and rat kidney. It is related to renal and intestinal Na+/sulfate cotransporters and a few putative bacterial permeases. 
The SLC13-type proteins belong to the ArsB/NhaD superfamily of permeases that translocate sodium and various anions across biological membranes in all three kingdoms of life. A typical ArsB/NhaD permease is composed of 8-13 transmembrane helices. 58362 cd01150: Peroxisomal acyl-CoA oxidases (AXO) catalyze the first step in the peroxisomal fatty acid beta-oxidation, the alpha,beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. In a second oxidative half-reaction, the reduced FAD is reoxidized by molecular oxygen. AXO is generally a homodimer, but it has been reported to form a different type of oligomer in yeast. There are several subtypes of AXOs, based on substrate specificity. Palmitoyl-CoA oxidase acts on straight-chain fatty acids and prostanoids, whereas the closely related Trihydroxycoprostanoyl-CoA oxidase has the greatest activity for 2-methyl branched side chains of bile precursors. Pristanoyl-CoA oxidase acts on 2-methyl branched fatty acids. AXO has an additional domain, C-terminal to the region with similarity to acyl-CoA dehydrogenases, which is included in this alignment. 58363 cd01228: BCR (breakpoint cluster region)-related pleckstrin homology (PH) domain. The BCR-related protein has a RhoGEF(DH) domain followed by a PH domain, a C2 domain and a RhoGAP domain. PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinases, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. 
58364 cd01242: Rok (Rho-associated kinase) pleckstrin homology (PH) domain. Rok is a serine/threonine kinase that binds GTP-rho. It consists of a kinase domain, a coiled coil region and a PH domain. The Rok PH domain is interrupted by a C1 domain. PH domains share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 58365 cd01269: Pollux (PLX) Phosphotyrosine-binding (PTB) domain. PLX is a calmodulin-binding protein containing a TBC domain, which is conserved from yeast to man, but it only has an N-terminal PTB domain in mammals. PTB domains have a PH-like fold and are found in various eukaryotic signaling molecules. They were initially identified based upon their ability to recognize phosphorylated tyrosine residues. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. More recent studies have found that some types of PTB domains can bind to peptides which are not tyrosine phosphorylated or lack tyrosine residues altogether. 58366 cd01291: PseudoU_synth: Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). Pseudouridine synthases contain the RsuA/RluD, TruA, TruB and TruD families. This group consists of eukaryotic, bacterial and archaeal pseudouridine synthases. Some psi sites such as psi55, 13, 38 and 39 in tRNA are highly conserved, being in the same position in eubacteria, archaebacteria and eukaryotes. 
Other psi sites occur in a more restricted fashion, for example psi2604 in 23S RNA made by E. coli RluF has only been detected in E. coli. Human dyskerin with the help of guide RNAs makes the hundreds of pseudouridines present in rRNA and small nuclear RNAs (snRNAs). Mutations in human dyskerin cause X-linked dyskeratosis congenita. Missense mutation in human PUS1 causes mitochondrial myopathy and sideroblastic anemia (MLASA). 58367 cd01295: Adenine deaminase (AdeC) directly deaminates adenine to form hypoxanthine. This reaction is part of one of the adenine salvage pathways, as well as the degradation pathway. It is important for adenine utilization as a purine, as well as a nitrogen source in bacteria and archaea. 58368 cd01297: D-aminoacylases (N-acyl-D-Amino acid amidohydrolases) catalyze the hydrolysis of N-acyl-D-amino acids to produce the corresponding D-amino acids, which are used as intermediates in the synthesis of pesticides, bioactive peptides, and antibiotics. 58369 cd01302: Cyclic amidohydrolases, including hydantoinase, dihydropyrimidinase, allantoinase, and dihydroorotase, are involved in the metabolism of pyrimidines and purines, sharing the property of hydrolyzing the cyclic amide bond of each substrate to the corresponding N-carbamyl amino acids. Allantoinases catalyze the degradation of purines, while dihydropyrimidinases and hydantoinases, a microbial counterpart of dihydropyrimidinase, are involved in pyrimidine degradation. Dihydroorotase participates in the de novo synthesis of pyrimidines. 58370 cd01321: Adenosine deaminase-related growth factors (ADGF), a novel family of secreted growth factors with sequence similarity to adenosine deaminase. 58371 cd01345: Porin superfamily. These outer membrane channels share a beta-barrel structure but differ in strand and shear number. Classical (gram-negative) porins are non-specific channels for small hydrophilic molecules and form 16 beta-stranded barrels (16,20), which associate as trimers. 
Maltoporin-like channels have specificities for various sugars and form 18 beta-stranded barrels (18,22), which associate as trimers. Ligand-gated protein channels cooperate with a TonB associated inner membrane complex to actively transport ligands via the proton motive force and they form monomeric, (22,24) barrels. The 150-200 N-terminal residues form a plug that blocks the channel from the periplasmic end. 58372 cd01347: TonB dependent/Ligand-Gated channels are created by a monomeric 22 strand (22,24) anti-parallel beta-barrel. Ligands apparently bind to the large extracellular loops. The N-terminal 150-200 residues form a plug from the periplasmic end of the barrel. Energy (proton-motive force) and TonB-dependent conformational alteration of the channel (parts of the plug, and loops 7 and 8) allow passage of the ligand. FepA residues 12-18 form the TonB box, which mediates the interaction with the TonB-containing inner membrane complex. TonB preferentially interacts with ligand-bound receptors. Transport through the channel may resemble passage through an air lock. In this model, ligand binding leads to closure of the extracellular end of the pore, then a TonB-mediated signal facilitates opening of the interior side of the pore, deforming the N-terminal plug and allowing passage of the ligand to the periplasm. Such a mechanism would prevent the free diffusion of small molecules through the pore. 58373 cd01360: Adenylsuccinate lyase_1: Adenylsuccinate lyase (ASL)_subgroup 1. This subgroup contains bacterial and archaeal proteins similar to ASL, a member of the Lyase class I family. Members of this family for the most part catalyze similar beta-elimination reactions in which a C-N or C-O bond is cleaved with the release of fumarate as one of the products. These proteins are active as tetramers. The four active sites of the homotetrameric enzyme are each formed by residues from three different subunits.
ASL catalyzes two steps in de novo purine biosynthesis: the conversion of 5-aminoimidazole-(N-succinylocarboxamide) ribotide (SAICAR) into 5-aminoimidazole-4-carboxamide ribotide (AICAR), and the conversion of adenylsuccinate (SAMP) into adenosine monophosphate (AMP). 58374 cd01363: Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. 58375 cd01368: Kinesin motor domain, KIF23-like subgroup. Members of this group may play a role in mitosis. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. In most kinesins, the motor domain is found at the N-terminus (N-type). N-type kinesins are (+) end-directed motors, i.e. they transport cargo towards the (+) end of the microtubule. Kinesin motor domains hydrolyze ATP at a rate of about 80 per second, and move along the microtubule at a speed of about 6400 Angstroms per second. To achieve that, kinesin head groups work in pairs. Upon replacing ADP with ATP, a kinesin motor domain increases its affinity for microtubule binding and locks in place. Also, the neck linker binds to the motor domain, which repositions the other head domain through the coiled-coil domain close to a second tubulin dimer, about 80 Angstroms along the microtubule. Meanwhile, ATP hydrolysis takes place, and when the second head domain binds to the microtubule, the first domain again replaces ADP with ATP, triggering a conformational change that pulls the first domain forward. 58376 cd01379: Myosin motor domain, type III myosins. Myosin III has been shown to play a role in the vision process in insects and in hearing in mammals. Myosin III, an unconventional myosin, does not form dimers.
This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. 58377 cd01385: Myosin motor domain, type IX myosins. Myosin IX is a processive single-headed motor, which might play a role in signalling. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. 
ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. 58378 cd01483: Superfamily of activating enzymes (E1) of the ubiquitin-like proteins. This family includes classical ubiquitin-activating enzymes E1, ubiquitin-like (ubl) activating enzymes and other mechanistic homologs, like MoeB, Thif1 and others. The common reaction mechanism catalyzed by MoeB, ThiF and the E1 enzymes begins with a nucleophilic attack of the C-terminal carboxylate of MoaD, ThiS and ubiquitin, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS. 58379 cd01484: Ubiquitin activating enzyme (E1), repeat 2-like. E1, a highly conserved small protein present universally in eukaryotic cells, is part of a cascade to attach ubiquitin (Ub) covalently to substrate proteins. This cascade consists of activating (E1), conjugating (E2), and/or ligating (E3) enzymes and then targets them for degradation by the 26S proteasome. E1 activates ubiquitin by C-terminal adenylation, and subsequently forms a highly reactive thioester bond between its catalytic cysteine and ubiquitin's C-terminus. E1 also associates with E2 and promotes ubiquitin transfer to the E2's catalytic cysteine. A set of novel molecules with a structural similarity to Ub, called Ub-like proteins (Ubls), have similar conjugation cascades. In contrast to ubiquitin-E1, which is a single-chain protein with a weakly conserved two-fold repeat, many of the Ubls-E1 are heterodimers where each subunit corresponds to one half of a single-chain E1. This CD represents the family homologous to the second repeat of Ub-E1. 58380 cd01485: Ubiquitin activating enzyme (E1), repeat 1-like.
E1, a highly conserved small protein present universally in eukaryotic cells, is part of a cascade to attach ubiquitin (Ub) covalently to substrate proteins. This cascade consists of activating (E1), conjugating (E2), and/or ligating (E3) enzymes and then targets them for degradation by the 26S proteasome. E1 activates ubiquitin by C-terminal adenylation, and subsequently forms a highly reactive thioester bond between its catalytic cysteine and ubiquitin's C-terminus. The E1 also associates with E2 and promotes ubiquitin transfer to the E2's catalytic cysteine. A set of novel molecules with a structural similarity to Ub, called Ub-like proteins (Ubls), have similar conjugation cascades. In contrast to ubiquitin-E1, which is a single-chain protein with a weakly conserved two-fold repeat, many of the Ubls-E1 are heterodimers where each subunit corresponds to one half of a single-chain E1. This CD represents the family homologous to the first repeat of Ub-E1. 58381 cd01489: Ubiquitin activating enzyme (E1) subunit UBA2. UBA2 is part of the heterodimeric activating enzyme (E1), specific for the SUMO family of ubiquitin-like proteins (Ubls). E1 enzymes are part of a conjugation cascade to attach Ub or Ubls covalently to substrate proteins, consisting of activating (E1), conjugating (E2), and/or ligating (E3) enzymes. E1 activates ubiquitin by C-terminal adenylation, and subsequently forms a highly reactive thioester bond between its catalytic cysteine and the Ubl's C-terminus. The E1 also associates with E2 and promotes ubiquitin transfer to the E2's catalytic cysteine. Post-translational modification by the SUMO family of ubiquitin-like proteins (Ublps) is involved in cell division, nuclear transport, the stress response and signal transduction. UBA2 contains both the nucleotide-binding motif involved in adenylation and the catalytic cysteine involved in the thioester intermediate and Ublp transfer to E2. 58382 cd01490: Ubiquitin activating enzyme (E1), repeat 2.
E1, a highly conserved small protein present universally in eukaryotic cells, is part of a cascade to attach ubiquitin (Ub) covalently to substrate proteins. This cascade consists of activating (E1), conjugating (E2), and/or ligating (E3) enzymes and then targets them for degradation by the 26S proteasome. E1 activates ubiquitin by C-terminal adenylation, and subsequently forms a highly reactive thioester bond between its catalytic cysteine and ubiquitin's C-terminus. E1 also associates with E2 and promotes ubiquitin transfer to the E2's catalytic cysteine. Ubiquitin-E1 is a single-chain protein with a weakly conserved two-fold repeat. This CD represents the second repeat of Ub-E1. 58383 cd01491: Ubiquitin activating enzyme (E1), repeat 1. E1, a highly conserved small protein present universally in eukaryotic cells, is part of a cascade to attach ubiquitin (Ub) covalently to substrate proteins. This cascade consists of activating (E1), conjugating (E2), and/or ligating (E3) enzymes and then targets them for degradation by the 26S proteasome. E1 activates ubiquitin by C-terminal adenylation, and subsequently forms a highly reactive thioester bond between its catalytic cysteine and ubiquitin's C-terminus. E1 also associates with E2 and promotes ubiquitin transfer to the E2's catalytic cysteine. Ubiquitin-E1 is a single-chain protein with a weakly conserved two-fold repeat. This CD represents the first repeat of Ub-E1. 58384 cd01492: Ubiquitin activating enzyme (E1) subunit Aos1. Aos1 is part of the heterodimeric activating enzyme (E1), specific for the SUMO family of ubiquitin-like proteins (Ubls). E1 enzymes are part of a conjugation cascade to attach Ub or Ubls covalently to substrate proteins, consisting of activating (E1), conjugating (E2), and/or ligating (E3) enzymes. E1 activates ubiquitin by C-terminal adenylation, and subsequently forms a highly reactive thioester bond between its catalytic cysteine and the Ubl's C-terminus.
The E1 also associates with E2 and promotes ubiquitin transfer to the E2's catalytic cysteine. Post-translational modification by the SUMO family of ubiquitin-like proteins (Ublps) is involved in cell division, nuclear transport, the stress response and signal transduction. Aos1 contains part of the adenylation domain. 58385 cd01493: Ubiquitin activating enzyme (E1) subunit APPBP1. APPBP1 is part of the heterodimeric activating enzyme (E1), specific for the Rub family of ubiquitin-like proteins (Ubls). E1 enzymes are part of a conjugation cascade to attach Ub or Ubls covalently to substrate proteins, consisting of activating (E1), conjugating (E2), and/or ligating (E3) enzymes. E1 activates ubiquitin(-like) by C-terminal adenylation, and subsequently forms a highly reactive thioester bond between its catalytic cysteine and the Ubl's C-terminus. E1 also associates with E2 and promotes ubiquitin transfer to the E2's catalytic cysteine. Post-translational modification by the Rub family of ubiquitin-like proteins (Ublps) activates SCF ubiquitin ligases and is involved in cell cycle control, signaling and embryogenesis. APPBP1 contains part of the adenylation domain. 58386 cd01517: PAP-phosphatase_like domains. PAP-phosphatase is a member of the inositol monophosphatase family, and catalyses the hydrolysis of 3'-phosphoadenosine-5'-phosphate (PAP) to AMP. In Saccharomyces cerevisiae, HAL2 (MET22) is involved in methionine biosynthesis and provides increased salt tolerance when over-expressed. Bacterial members of this domain family may differ in their substrate specificity and dephosphorylate different targets, as the substrate binding site does not appear to be conserved in that sub-set. 58387 cd01553: This domain family includes the Enolpyruvate transferase (EPT) family and the RNA 3' phosphate cyclase family (RTPC).
These 2 families differ in that EPT is formed by 3 repeats of an alpha-beta structural domain while RTPC has 3 similar repeats with a 4th slightly different domain inserted between the 2nd and 3rd repeat. They evidently share the same active site location, although the catalytic residues differ. 58388 cd01577: Aconitase-like swivel domain of 3-isopropylmalate dehydratase and related uncharacterized proteins. 3-isopropylmalate dehydratase catalyzes the isomerization between 2-isopropylmalate and 3-isopropylmalate, via the formation of 2-isopropylmaleate. IPMI is involved in fungal and bacterial leucine biosynthesis and is also found in eukaryotes. This is the aconitase-like swivel domain, which is believed to undergo a swivelling conformational change in the enzyme mechanism. 58389 cd01582: Homoaconitase catalytic domain. Homoaconitase and other uncharacterized proteins of the Aconitase family. Homoaconitase is part of an unusual lysine biosynthesis pathway found only in filamentous fungi, in which lysine is synthesized via the alpha-aminoadipate pathway. In this pathway, homoaconitase catalyzes the conversion of cis-homoaconitic acid into homoisocitric acid. The reaction mechanism is believed to be similar to that of other aconitases. 58390 cd01586: Aconitase A catalytic domain. This is the major form of the TCA cycle enzyme aconitate hydratase, also known as aconitase and citrate hydrolyase. It includes bacterial and archaeal aconitase A, and the eukaryotic cytosolic form of aconitase. This group also includes sequences that have been shown to act as an iron-responsive element (IRE) binding protein in animals and may have the same role in other eukaryotes. 58391 cd01594: Lyase class I_like. Lyase class I_like superfamily of enzymes that catalyze beta-elimination reactions and are active as homotetramers. The four active sites of the homotetrameric enzyme are each formed by residues from three different subunits.
This superfamily contains the lyase class I family, histidine ammonia-lyase and phenylalanine ammonia-lyase. The lyase class I family comprises proteins similar to class II fumarase, aspartase, adenylosuccinate lyase, argininosuccinate lyase, and 3-carboxy-cis,cis-muconate lactonizing enzyme which, for the most part, catalyze similar beta-elimination reactions in which a C-N or C-O bond is cleaved with the release of fumarate as one of the products. Histidine and phenylalanine ammonia-lyase catalyze a beta-elimination of ammonia from histidine and phenylalanine, respectively. 58392 cd01595: Adenylsuccinate lyase_like (ASL_like): This subgroup contains proteins similar to ASL and prokaryotic-type 3-carboxy-cis,cis-muconate cycloisomerase (pCMLE). These proteins are members of the Lyase class I family. Members of this family for the most part catalyze similar beta-elimination reactions in which a C-N or C-O bond is cleaved with the release of fumarate as one of the products. These proteins are active as tetramers. The four active sites of the homotetrameric enzyme are each formed by residues from three different subunits. ASL catalyzes two steps in de novo purine biosynthesis: the conversion of 5-aminoimidazole-(N-succinylocarboxamide) ribotide (SAICAR) into 5-aminoimidazole-4-carboxamide ribotide (AICAR), and the conversion of adenylsuccinate (SAMP) into adenosine monophosphate (AMP). pCMLE catalyzes the cyclization of 3-carboxy-cis,cis-muconate (3CM) to 4-carboxy-muconolactone, in the beta-ketoadipate pathway. ASL deficiency has been linked to several pathologies including psychomotor retardation with autistic features, epilepsy and muscle wasting. 58393 cd01636: FIG, FBPase/IMPase/glpX-like domain. A superfamily of metal-dependent phosphatases with various substrates. Fructose-1,6-bisphosphatases (both the major and the glpX-encoded variant) hydrolyze fructose-1,6-bisphosphate to fructose-6-phosphate in gluconeogenesis.
Inositol-monophosphatases and inositol polyphosphatases play vital roles in eukaryotic signalling, as they participate in metabolizing the messenger molecule inositol-1,4,5-trisphosphate. Many of these enzymes are inhibited by Li+. 58394 cd01653: Type 1 glutamine amidotransferase (GATase1)-like domain. This group includes proteins similar to Class I glutamine amidotransferases, the intracellular PH1704 from Pyrococcus horikoshii, the C-terminal of the large catalase: Escherichia coli HP-II, Sinorhizobium meliloti Rm1021 ThuA, and the A4 beta-galactosidase middle domain. The majority of proteins in this group have a reactive Cys found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. For Class I glutamine amidotransferases, proteins which transfer ammonia from the amide side chain of glutamine to an acceptor substrate, this Cys forms a Cys-His-Glu catalytic triad in the active site. Glutamine amidotransferase activity can be found in a range of biosynthetic enzymes included in this cd: glutamine amidotransferase, formylglycinamide ribonucleotide, GMP synthetase, anthranilate synthase component II, glutamine-dependent carbamoyl phosphate synthase, cytidine triphosphate synthetase, gamma-glutamyl hydrolase, imidazole glycerol phosphate synthase, and cobyric acid synthase. For Pyrococcus horikoshii PH1704, the Cys of the nucleophile elbow together with a different His and a Glu from an adjacent monomer form a catalytic triad different from the typical GATase1 triad. The E. coli HP-II C-terminal domain, S. meliloti Rm1021 ThuA and the A4 beta-galactosidase middle domain lack the catalytic triad typical of GATase1 domains. GATase1-like domains can occur either as single polypeptides, as in Class I glutamine amidotransferases, or as domains in a much larger multifunctional synthase protein, such as CPSase. 58395 cd01679: RNR, class I.
Ribonucleotide reductase (RNR) catalyzes the reductive synthesis of deoxyribonucleotides from their corresponding ribonucleotides. It provides the precursors necessary for DNA synthesis. RNRs are separated into three classes based on their metallocofactor usage. Class I RNRs, found in eukaryotes, bacteria, and many viruses, use a diiron-tyrosyl radical. Class II RNRs, found in bacteria and bacteriophages, use coenzyme B12 (adenosylcobalamin, AdoCbl). Class III RNRs, found in anaerobic bacteria, bacteriophages, and archaea, use an FeS cluster and S-adenosylmethionine to generate a glycyl radical. Many organisms have more than one class of RNR present in their genomes. All three RNRs have a ten-stranded alpha-beta barrel domain that is structurally similar to the domain of PFL (pyruvate formate lyase). Class I RNR is oxygen-dependent and can be subdivided into classes Ia (eukaryotes, prokaryotes, viruses and phages) and Ib (which is found in prokaryotes only). It is a tetrameric enzyme of two alpha and two beta subunits; this model covers the major part of the alpha or large subunit, called R1 in class Ia and R1E in class Ib. 58396 cd01702: Pol eta is a member of the DNA polymerase Y-family. Unlike other Y-family members, Pol eta can efficiently and accurately replicate DNA past UV-induced cis-syn cyclobutane thymine-thymine (T-T) lesions. It synthesizes AA opposite a TT dimer. Pol eta is able to replicate through a variety of other distorting DNA lesions as well. 58397 cd01741: This group contains a subgroup of proteins having the Type 1 glutamine amidotransferase (GATase1) domain. GATase activity catalyses the transfer of ammonia from the amide side chain of glutamine to an acceptor substrate. Glutamine amidotransferases (GATase) include the triad family of amidotransferases which have a conserved Cys-His-Glu catalytic triad in the glutaminase active site. In this subgroup this triad is conserved.
GATase activity can be found in a range of biosynthetic enzymes, including: glutamine amidotransferase, formylglycinamide ribonucleotide, GMP synthetase, anthranilate synthase component II, glutamine-dependent carbamoyl phosphate synthase, cytidine triphosphate synthetase, gamma-glutamyl hydrolase, imidazole glycerol phosphate synthase, and cobyric acid synthase. Glutamine amidotransferase (GATase) domains can occur either as single polypeptides, as in glutamine amidotransferases, or as domains in a much larger multifunctional synthase protein, such as CPSase. 58398 cd01745: This group contains a subgroup of proteins having the Type 1 glutamine amidotransferase (GATase1) domain. GATase activity catalyses the transfer of ammonia from the amide side chain of glutamine to an acceptor substrate. Glutamine amidotransferases (GATase) include the triad family of amidotransferases which have a conserved Cys-His-Glu catalytic triad in the glutaminase active site. In this subgroup this triad is conserved. GATase activity can be found in a range of biosynthetic enzymes, including: glutamine amidotransferase, formylglycinamide ribonucleotide, GMP synthetase, anthranilate synthase component II, glutamine-dependent carbamoyl phosphate synthase, cytidine triphosphate synthetase, gamma-glutamyl hydrolase, imidazole glycerol phosphate synthase, and cobyric acid synthase. Glutamine amidotransferase (GATase) domains can occur either as single polypeptides, as in glutamine amidotransferases, or as domains in a much larger multifunctional synthase protein, such as CPSase. 58402 cd01907: Glutamine amidotransferases class-II (Gn-AT)_GlxB-type. GlxB is a glutamine amidotransferase-like protein of unknown function found in bacteria and archaea.
GlxB has a structural fold similar to that of other class II glutamine amidotransferases including glucosamine-fructose 6-phosphate synthase (GLMS or GFAT), glutamine phosphoribosylpyrophosphate (Prpp) amidotransferase (GPATase), asparagine synthetase B (AsnB), beta lactam synthetase (beta-LS) and glutamate synthase (GltS). The GlxB fold is also somewhat similar to the Ntn (N-terminal nucleophile) hydrolase fold of the proteasomal alpha and beta subunits. 58403 cd01914: Hybrid cluster protein (HCP), formerly known as prismane, is thought to play a role in nitrogen metabolism but its specific function is unknown. HCP has three structural domains: an N-terminal alpha-helical domain, and two similar domains comprising a central beta-sheet flanked by alpha-helices. HCP contains two iron-sulfur clusters, one of which is a [Fe4-S4] cubane cluster similar to that of carbon monoxide dehydrogenase (CODH). The second cluster, referred to as the hybrid cluster, is a hybrid [Fe4-S2-O2] center located at the interface of the three domains. Although the hybrid cluster is buried within the protein, it is accessible through a large hydrophobic cavity. 58404 cd01935: Choloylglycine hydrolase (CGH)_like. This family of choloylglycine hydrolase-like proteins includes conjugated bile acid hydrolase (CBAH), penicillin V acylase (PVA), acid ceramidase (AC), and N-acylethanolamine-hydrolyzing acid amidase (NAAA) which cleave non-peptide carbon-nitrogen bonds in bile salt constituents. These enzymes have an N-terminal nucleophilic cysteine, as do other members of the Ntn hydrolase family to which they belong. This nucleophilic cysteine is exposed by post-translational processing of the precursor protein. 58405 cd01936: Cephalosporin acylase (CA) belongs to a family of beta-lactam acylases that includes penicillin G acylase (PGA) and aculeacin A acylase.
PGA and CA are crucial for the production of backbone chemicals like 6-aminopenicillanic acid and 7-aminocephalosporanic acid (7-ACA), which can be used to synthesize semi-synthetic penicillins and cephalosporins, respectively. While both PGA and CA have a conserved Ntn (N-terminal nucleophile) hydrolase fold and the structural similarity at their active sites is very high, their sequence similarity to other Ntn hydrolases is low. 58406 cd01983: The Fer4_NifH superfamily contains a variety of proteins which share a common ATP-binding domain. Functionally, proteins in this superfamily use the energy from hydrolysis of NTP to transfer electrons or ions. 58407 cd01984: Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases, Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to adenosine nucleotide. 58408 cd01986: Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases and ATP sulphurylases. The domain forms an alpha/beta/alpha fold which binds to the adenosine group. 58409 cd01995: ExsB is a transcription regulator related protein. It is a subfamily of an adenosine nucleotide binding superfamily of proteins. This protein family is represented by a single member in nearly every completed large (> 1000 genes) prokaryotic genome. In Rhizobium meliloti, a species in which the exo genes make succinoglycan, a symbiotically important exopolysaccharide, exsB is located nearby and affects succinoglycan levels, probably through polar effects on exsA expression or the same polycistronic mRNA. In Arthrobacter viscosus, the homologous gene is designated ALU1 and is associated with an aluminum tolerance phenotype. The function is unknown. 58410 cd02007: Thiamine pyrophosphate (TPP) family, DXS subfamily, TPP-binding module; 1-Deoxy-D-xylulose-5-phosphate synthase (DXS) is a regulatory enzyme of the mevalonate-independent pathway involved in terpenoid biosynthesis.
Terpenoids are plant natural products with important pharmaceutical activity. DXS catalyzes a transketolase-type condensation of pyruvate with D-glyceraldehyde-3-phosphate to form 1-deoxy-D-xylulose-5-phosphate (DXP) and carbon dioxide. The formation of DXP leads to the formation of the terpene precursor IPP (isopentenyl diphosphate) and to the formation of thiamine (vitamin B1) and pyridoxal (vitamin B6). 58411 cd02019: Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate. 58412 cd02020: Cytidine monophosphate kinase (CMPK) catalyzes the reversible phosphorylation of cytidine monophosphate (CMP) to produce cytidine diphosphate (CDP), using ATP as the preferred phosphoryl donor. 58413 cd02034: The accessory protein CooC, which contains a nucleotide-binding domain (P-loop) near the N-terminus, participates in the maturation of the nickel center of carbon monoxide dehydrogenase (CODH). CODH from Rhodospirillum rubrum catalyzes the reversible oxidation of CO to CO2. CODH contains a nickel-iron-sulfur cluster (C-center) and an iron-sulfur cluster (B-center). CO oxidation occurs at the C-center. Three accessory proteins encoded by cooCTJ genes are involved in nickel incorporation into a nickel site. CooC functions as a nickel insertase that mobilizes nickel to apoCODH using energy released from ATP hydrolysis. CooC is a homodimer and has NTPase activities. Mutation at the P-loop abolishes its function. 58414 cd02035: ArsA ATPase functions as an efflux pump located on the inner membrane of the cell.
This ATP-driven oxyanion pump catalyzes the extrusion of arsenite, antimonite and arsenate. Maintenance of a low intracellular concentration of oxyanion produces resistance to the toxic agents. The pump is composed of two subunits, the catalytic ArsA subunit and the membrane subunit ArsB, which are encoded by arsA and arsB genes respectively. Arsenic efflux in bacteria is catalyzed by either ArsB alone or by the ArsAB complex. The ATP-coupled pump, however, is more efficient. ArsA is composed of two homologous halves, A1 and A2, connected by a short linker sequence. 58415 cd02036: Bacterial cell division requires the formation of a septum at mid-cell. The site is determined by the min operon products MinC, MinD and MinE. MinC is a nonspecific inhibitor of the septum protein FtsZ. MinE is the suppressor of MinC. MinD plays a pivotal role, selecting the mid-cell over other sites through the activation and regulation of MinC and MinE. MinD is a membrane-associated ATPase, related to nitrogenase iron protein. More distantly related proteins include flagellar biosynthesis proteins and ParA chromosome partitioning proteins. MinD is a monomer. 58416 cd02037: MRP (Multiple Resistance and pH adaptation) is a homologue of the Fer4_NifH superfamily. Like the other members of the superfamily, MRP contains an ATP-binding domain at the N-terminus. It is found in bacteria as a membrane-spanning protein and functions as a Na+/H+ antiporter. 58417 cd02038: FleN is a member of the Fer4_NifH superfamily. It shares the common function as an ATPase, with the ATP-binding domain at the N-terminus. In Pseudomonas aeruginosa, the FleN gene is involved in regulating the number of flagella and chemotactic motility by influencing FleQ activity. 58418 cd02042: ParA and ParB of Caulobacter crescentus belong to a conserved family of bacterial proteins implicated in chromosome segregation.
ParB binds to DNA sequences adjacent to the origin of replication and localizes to opposite cell poles shortly following the initiation of DNA replication. ParB regulates the ParA ATPase activity by promoting nucleotide exchange in a fashion reminiscent of the exchange factors of eukaryotic G proteins. ADP-bound ParA binds single-stranded DNA, whereas the ATP-bound form dissociates ParB from its DNA binding sites. Increasing the fraction of ParA-ADP in the cell inhibits cell division, suggesting that this simple nucleotide switch may regulate cytokinesis. ParA shares sequence similarity to a conserved and widespread family of ATPases which includes the repA protein of the repABC operon in R. etli Sym plasmid. This operon is involved in plasmid replication and partition. 58419 cd02062: Proteins of this family catalyze the reduction of flavin or nitrocompounds using NAD(P)H as electron donor in an obligatory two-electron transfer, utilizing FMN or FAD as cofactor. They are often found to be homodimers. Enzymes of this family are described as NAD(P)H:FMN oxidoreductases, oxygen-insensitive nitroreductase, flavin reductase P, dihydropteridine reductase, NADH oxidase or NADH dehydrogenase. 58420 cd02136: Nitroreductase family. Members of this family utilize FMN as a cofactor and catalyze reduction of a variety of nitroaromatic compounds, including nitrofurans, nitrobenzenes, nitrophenol, nitrobenzoate and quinones by using either NADH or NADPH as a source of reducing equivalents in an obligatory two-electron transfer mechanism. The enzyme is typically a homodimer. Members of this family are also called NADH dehydrogenase, oxygen-insensitive NAD(P)H nitroreductase or dihydropteridine reductase. 58421 cd02137: Nitroreductase-like family 1. A subfamily of the nitroreductase family containing uncharacterized proteins that are similar to nitroreductase. Nitroreductase catalyzes the reduction of nitroaromatic compounds such as nitrotoluenes, nitrofurans and nitroimidazoles.
This process requires NAD(P)H as electron donor in an obligatory two-electron transfer and uses FMN as cofactor. The enzyme is typically a homodimer. Members of this family are also called NADH dehydrogenase, oxygen-insensitive NAD(P)H nitroreductase or dihydropteridine reductase. 58422 cd02143: Nitroreductase family. Members of this family utilize FMN as a cofactor. This family is involved in the reduction of flavin or nitroaromatic compounds by using NAD(P)H as electron donor in an obligatory two-electron transfer. Nitroreductase is a homodimer. Each subunit contains one FMN molecule. Members of this family are also called NADH dehydrogenase, oxygen-insensitive NAD(P)H nitroreductase or dihydropteridine reductase. 58423 cd02149: NAD(P)H:FMN oxidoreductase family. This domain catalyzes the reduction of flavins, nitrocompounds, quinones and azo compounds using NADH or NADPH as an electron donor. The enzyme is a homodimer, and each monomer binds one FMN as cofactor. This family includes FRase I in Vibrio fischeri, which reduces FMN into FMNH2 as part of the bioluminescent reaction. The family also includes oxygen-insensitive nitroreductases that use NADH or NADPH as an electron donor in a ping-pong bi-bi mechanism. This type of nitroreductase can be used in cancer chemotherapy to activate a range of prodrugs. 58424 cd02189: The tubulin superfamily includes five distinct families, the alpha-, beta-, gamma-, delta-, and epsilon-tubulins and a sixth family (zeta-tubulin) which is present only in kinetoplastid protozoa. The alpha- and beta-tubulins are the major components of microtubules, while gamma-tubulin plays a major role in the nucleation of microtubule assembly. The delta- and epsilon-tubulins are widespread but unlike the alpha-, beta-, and gamma-tubulins they are not ubiquitous among eukaryotes. Delta-tubulin plays an essential role in forming the triplet microtubules of centrioles and basal bodies. 
58425 cd02190: The tubulin superfamily includes five distinct families, the alpha-, beta-, gamma-, delta-, and epsilon-tubulins and a sixth family (zeta-tubulin) which is present only in kinetoplastid protozoa. The epsilon-tubulins, which are widespread but not ubiquitous among eukaryotes, play a role in basal body/centriole morphogenesis. 58426 cd02202: FtsZ is a GTPase that is similar to the eukaryotic tubulins and is essential for cell division in prokaryotes. FtsZ is capable of polymerizing in a GTP-driven process into structures similar to those formed by tubulin. FtsZ forms a ring-shaped septum at the site of bacterial cell division, which is required for constriction of cell membrane and cell envelope to yield two daughter cells. 58427 cd02203: Formylglycinamide ribonucleotide amidotransferase (FGAR-AT) catalyzes the ATP-dependent conversion of formylglycinamide ribonucleotide (FGAR) and glutamine to formylglycinamidine ribonucleotide (FGAM), ADP, phosphate, and glutamate in the fourth step of the purine biosynthetic pathway. In eukaryotes and Gram-negative bacteria, FGAR-AT is encoded by the purL gene as a multidomain protein with a molecular mass of about 140 kDa. In Gram-positive bacteria and archaea, FGAR-AT is a complex of three proteins: PurS, PurL, and PurQ. PurL itself contains two tandem N- and C-terminal domains (four domains altogether). The N-terminal domains bind ATP and are related to the ATP-binding domains of HypE, ThiL, SelD and PurM. 58428 cd02257: Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. 
The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome. 58429 cd02552: PseudoU_synth_TruD_like: Pseudouridine synthase, TruD family. This group consists of eukaryotic, bacterial and archaeal pseudouridine synthases similar to Escherichia coli TruD and Saccharomyces cerevisiae Pus7. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). E. coli TruD and S. cerevisiae Pus7 make psi13 in cytoplasmic tRNAs. In addition, S. cerevisiae Pus7 makes psi35 in U2 small nuclear RNA (U2 snRNA) and psi35 in pre-tRNATyr. Psi35 in U2 snRNA and psi13 in tRNAs are highly phylogenetically conserved. Psi34 is the mammalian U2 snRNA counterpart of yeast U2 snRNA psi35. 58430 cd02557: PseudoU_synth_ScRIB2_like: Pseudouridine synthase, Saccharomyces cerevisiae RIB2_like. This group is comprised of eukaryotic and bacterial proteins similar to Saccharomyces cerevisiae RIB2, S. cerevisiae Pus6p and human hRPUDSD2. S. cerevisiae RIB2 displays two distinct catalytic activities. The N-terminal domain of RIB2 is an RNA:psi-synthase which makes psi32 on cytoplasmic tRNAs. Psi32 is highly phylogenetically conserved. The C-terminal domain of RIB2 has a DRAP deaminase activity which catalyses the formation of 5-amino-6-ribitylamino-2,4(1H,3H)-pyrimidinedione 5'-phosphate from 2,5-diamino-6-ribitylamino-4(3H)-pyrimidinone 5'-phosphate during riboflavin biosynthesis. S. cerevisiae Pus6p makes the psi31 of cytoplasmic and mitochondrial tRNAs. 58431 cd02568: PseudoU_synth_PUS1_PUS2: Pseudouridine synthase, PUS1/PUS2 like. This group consists of eukaryotic pseudouridine synthases similar to Saccharomyces cerevisiae Pus1p, S. cerevisiae Pus2p, Caenorhabditis elegans Pus1p and human PUS1. 
Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). No cofactors are required. S. cerevisiae Pus1p catalyzes the formation of psi34 and psi36 in the intron-containing tRNAIle, psi35 in the intron-containing tRNATyr, psi27 and/or psi28 in several yeast cytoplasmic tRNAs, and psi44 in U2 small nuclear RNA (U2 snRNA). The presence of the intron is required for the formation of psi 34, 35 and 36. In addition, S. cerevisiae PUS1 makes psi 26, 65 and 67. C. elegans Pus1p does not modify psi44 in U2 snRNA. Mouse Pus1p makes psi27/28 in pre-tRNASer, tRNAVal and tRNAIle, psi 34/36 in tRNAIle, and psi 32 and potentially 67 in tRNAVal. Psi44 in U2 snRNA and psi32 in tRNAs are highly phylogenetically conserved. Psi 26, 27, 28, 34, 35, 36, 65 and 67 in tRNAs are less highly conserved. Mouse Pus1p regulates nuclear receptor activity through pseudouridylation of Steroid Receptor RNA Activator. Missense mutation in human PUS1 causes mitochondrial myopathy and sideroblastic anemia (MLASA). 58432 cd02575: PseudoU_synth_EcTruD: Pseudouridine synthase, TruD family. This group consists of bacterial pseudouridine synthases similar to Escherichia coli TruD. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). E. coli TruD makes the highly phylogenetically conserved psi13 in tRNAs. 58433 cd02576: PseudoU_synth_ScPUS7: Pseudouridine synthase, TruD family. This group consists of eukaryotic pseudouridine synthases similar to Saccharomyces cerevisiae Pus7. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). Saccharomyces cerevisiae Pus7 makes psi35 in U2 small nuclear RNA (U2 snRNA), psi13 in cytoplasmic tRNAs and psi35 in pre-tRNATyr. Psi35 in yeast U2 snRNA and psi13 in tRNAs are highly phylogenetically conserved. 
Psi34 is the mammalian U2 snRNA counterpart of yeast U2 snRNA psi35. 58434 cd02577: PSTD1: Pseudouridine synthase, a subgroup of the TruD family. This group consists of several hypothetical archaeal pseudouridine synthases assigned to the TruD family of pseudouridine synthases. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). The TruD family is comprised of proteins related to Escherichia coli TruD. 58435 cd02648: NH_1: A subgroup of nucleoside hydrolases. This group contains fungal proteins similar to nucleoside hydrolases. Nucleoside hydrolases cleave the N-glycosidic bond in nucleosides, generating ribose and the respective base. These enzymes vary in their substrate specificity. 58436 cd02657: A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome. 58437 cd02662: A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. 
The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome. 58438 cd02663: A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome. 58439 cd02664: A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome. 58440 cd02665: A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. 
The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome. 58441 cd02666: A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome. 58442 cd02667: A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome. 58443 cd02670: A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. 
They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome. 58444 cd02671: A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome. 58445 cd02672: A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome. 58446 cd02673: A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. 
They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome. 58447 cd02674: A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome. 58448 cd02750: Respiratory nitrate reductase A (NarGHI), alpha chain (NarG) and related proteins. Under anaerobic conditions in the presence of nitrate, E. coli synthesizes the cytoplasmic membrane-bound quinol-nitrate oxidoreductase (NarGHI), which reduces nitrate to nitrite and forms part of a redox loop generating a proton-motive force. Found in prokaryotes and some archaea, NarGHI usually functions as a heterotrimer. The alpha chain contains the molybdenum cofactor-containing Mo-bisMGD catalytic subunit. Members of the MopB_Nitrate-R-NarG-like CD belong to the molybdopterin_binding (MopB) superfamily of proteins. 
58449 cd02752: Formate dehydrogenase N, alpha subunit (Formate-Dh-Na) is a major component of nitrate respiration in bacteria such as in the E. coli formate dehydrogenase N (Fdh-N). Fdh-N is a membrane protein that is a complex of three different subunits and is the major electron donor to the nitrate respiratory chain. Also included in this CD is the Desulfovibrio gigas tungsten formate dehydrogenase, DgW-FDH. In contrast to Fdh-N, which is a functional heterotrimer, DgW-FDH is a heterodimer. The DgW-FDH complex is composed of a large subunit carrying the W active site and one [4Fe-4S] center, and a small subunit that harbors a series of three [4Fe-4S] clusters as well as a putative vacant binding site for a fourth cluster. The smaller subunit is not included in this alignment. Members of the MopB_Formate-Dh-Na-like CD belong to the molybdopterin_binding (MopB) superfamily of proteins. 58450 cd02753: Formate dehydrogenase H (Formate-Dh-H) catalyzes the reversible oxidation of formate to CO2 with the release of a proton and two electrons. It is a component of the anaerobic formate hydrogen lyase complex. The E. coli formate dehydrogenase H (Fdh-H) is a monomer composed of a single polypeptide chain with a Mo active site region and a [4Fe-4S] center. Members of the MopB_Formate-Dh-H CD belong to the molybdopterin_binding (MopB) superfamily of proteins. 58451 cd02755: The MopB_Thiosulfate-R-like CD contains thiosulfate-, sulfur-, and polysulfide-reductases, and other related proteins. Thiosulfate reductase catalyzes the cleavage of sulfur-sulfur bonds in thiosulfate. Polysulfide reductase is a membrane-bound enzyme that catalyzes the reduction of polysulfide using either hydrogen or formate as the electron donor. Members of the MopB_Thiosulfate-R-like CD belong to the molybdopterin_binding (MopB) superfamily of proteins. 
58452 cd02756: Arsenite oxidase (Arsenite-Ox) oxidizes arsenite to the less toxic arsenate; it transfers the electrons obtained from the oxidation of arsenite towards the soluble periplasmic electron carriers cytochrome c and/or amicyanin. Arsenite oxidase is a heterodimeric enzyme containing a large and a small subunit. The large catalytic subunit harbors the molybdopterin cofactor and the [3Fe-4S] cluster; and the small subunit belongs to the structural class of the Rieske proteins. The small subunit is not included in this alignment. Members of MopB_Arsenite-Ox CD belong to the molybdopterin_binding (MopB) superfamily of proteins. 58453 cd02757: This CD includes the respiratory arsenate reductase, As(V), catalytic subunit (ArrA) and other related proteins. These members belong to the molybdopterin_binding (MopB) superfamily of proteins. 58454 cd02758: The MopB_Tetrathionate-Ra CD contains tetrathionate reductase, subunit A, (TtrA) and other related proteins. The Salmonella enterica tetrathionate reductase catalyses the reduction of trithionate but not sulfur or thiosulfate. Members of this CD belong to the molybdopterin_binding (MopB) superfamily of proteins. 58455 cd02759: The MopB_Acetylene-hydratase CD contains acetylene hydratase (Ahy) and other related proteins. The acetylene hydratase of Pelobacter acetylenicus is a tungsten iron-sulfur protein involved in the fermentation of acetylene to ethanol and acetate. Members of this CD belong to the molybdopterin_binding (MopB) superfamily of proteins. 58456 cd02762: The MopB_1 CD includes a group of related uncharacterized bacterial molybdopterin-binding oxidoreductase-like domains with a putative N-terminal iron-sulfur [4Fe-4S] cluster binding site and molybdopterin cofactor binding site. These members belong to the molybdopterin_binding (MopB) superfamily of proteins. 
58457 cd02763: The MopB_2 CD includes a group of related uncharacterized bacterial molybdopterin-binding oxidoreductase-like domains with a putative N-terminal iron-sulfur [4Fe-4S] cluster binding site and molybdopterin cofactor binding site. These members belong to the molybdopterin_binding (MopB) superfamily of proteins. 58458 cd02765: The MopB_4 CD includes a group of related uncharacterized bacterial and archaeal molybdopterin-binding oxidoreductase-like domains with a putative N-terminal iron-sulfur [4Fe-4S] cluster binding site and molybdopterin cofactor binding site. These members belong to the molybdopterin_binding (MopB) superfamily of proteins. 58459 cd02766: The MopB_3 CD includes a group of related uncharacterized bacterial and archaeal molybdopterin-binding oxidoreductase-like domains with a putative N-terminal iron-sulfur [4Fe-4S] cluster binding site and molybdopterin cofactor binding site. These members belong to the molybdopterin_binding (MopB) superfamily of proteins. 58460 cd02809: Glycolate oxidase-like (GOX-like) FMN-binding domain. This protein family includes a widespread family of homologous FMN-dependent alpha-hydroxyacid oxidizing enzymes. This family occurs in both prokaryotes and eukaryotes. Members of this family include flavocytochrome b2 (FCB2), glycolate oxidase (GOX), lactate monooxygenase (LMO), mandelate dehydrogenase (MDH), and long chain hydroxyacid oxidase (LCHAO). In green plants, glycolate oxidase is one of the key enzymes in photorespiration, where it oxidizes glycolate to glyoxylate. LMO catalyzes the oxidation of L-lactate to acetate and carbon dioxide. MDH oxidizes (S)-mandelate to phenylglyoxalate. It is an enzyme in the mandelate pathway that occurs in several strains of Pseudomonas which converts (R)-mandelate to benzoate. 58461 cd02868: PseudoU_synth_hTRUB2_Like: Pseudouridine synthase, human TRUB2_like. 
This group consists of eukaryotic pseudouridine synthases similar to human TruB pseudouridine synthase homolog 2 (TRUB2). Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). 58462 cd02888: RNR, class I_like family. Ribonucleotide reductase (RNR) catalyzes the reductive synthesis of deoxyribonucleotides from their corresponding ribonucleotides. It provides the precursors necessary for DNA synthesis. RNRs are separated into three classes based on their metallocofactor usage. Class I RNRs, found in eukaryotes, bacteria, and many viruses, use a diiron-tyrosyl radical. Class II RNRs, found in bacteria and bacteriophages, use coenzyme B12 (adenosylcobalamin, AdoCbl). Class III RNRs, found in anaerobic bacteria, bacteriophages, and archaea, use an FeS cluster and S-adenosylmethionine to generate a glycyl radical. Many organisms have more than one class of RNR present in their genomes. All three RNRs have a ten-stranded alpha-beta barrel domain that is structurally similar to the domain of PFL (pyruvate formate lyase). This family appears similar to class I RNRs, as judged by sequence similarity and the predicted active site. 58463 cd02889: Squalene cyclase (SQCY) domain; found in class II terpene cyclases that have an alpha 6 - alpha 6 barrel fold. Squalene cyclase (SQCY) and 2,3-oxidosqualene cyclase (OSQCY) are integral membrane proteins that catalyze a cationic cyclization cascade converting linear triterpenes to fused ring compounds. Bacterial SQCY catalyzes the conversion of squalene to hopene or diplopterol. Eukaryotic OSQCY transforms the 2,3-epoxide of squalene to compounds such as lanosterol (a metabolic precursor of cholesterol and steroid hormones) in mammals and fungi, or cycloartenol in plants. Deletion of a single glycine residue of Alicyclobacillus acidocaldarius SQCY alters its substrate specificity into that of eukaryotic OSQCY. 
Both enzymes have a second minor domain, which forms an alpha-alpha barrel that is inserted into the major domain. This group also contains SQCY-like archaeal sequences and some bacterial SQCYs which lack this minor domain. 58464 cd02972: DsbA family; consists of DsbA and DsbA-like proteins, including DsbC, DsbG, glutathione (GSH) S-transferase kappa (GSTK), 2-hydroxychromene-2-carboxylate (HCCA) isomerase, an oxidoreductase (FrnE) presumed to be involved in frenolicin biosynthesis, a 27-kDa outer membrane protein, and similar proteins. Members of this family contain a redox active CXXC motif (except GSTK and HCCA isomerase) embedded in a TRX fold, and an alpha helical insert of about 75 residues (shorter in DsbC and DsbG) relative to TRX. DsbA is involved in the oxidative protein folding pathway in prokaryotes, catalyzing disulfide bond formation of proteins secreted into the bacterial periplasm. DsbC and DsbG function as protein disulfide isomerases and chaperones to correct non-native disulfide bonds formed by DsbA and prevent aggregation of incorrectly folded proteins. 58465 cd02987: Phosducin (Phd)-like family, Phd subfamily; Phd is a cytosolic regulator of G protein functions. It specifically binds G protein betagamma (Gbg)-subunits with high affinity, resulting in the solubilization of Gbg from the plasma membrane. This impedes the formation of a functional G protein trimer (G protein alphabetagamma), thereby inhibiting G protein-mediated signal transduction. Phd also inhibits the GTPase activity of G protein alpha. Phd can be phosphorylated by protein kinase A and G protein-coupled receptor kinase 2, leading to its inactivation. Phd was originally isolated from the retina, where it is highly expressed and has been implicated in light adaptation. It is also found in the pineal gland, liver, spleen, striated muscle and the brain. The C-terminal domain of Phd adopts a thioredoxin fold, but it does not contain a CXXC motif. 
Phd interacts with G protein beta mostly through the N-terminal helical domain. 58466 cd03108: Adenylosuccinate synthetase (AdSS) catalyzes the first step in the de novo biosynthesis of AMP. IMP and L-aspartate are conjugated in a two-step reaction accompanied by the hydrolysis of GTP to GDP in the presence of Mg2+. In the first step, the gamma-phosphate group of GTP is transferred to the 6-oxygen atom of IMP. An aspartate then displaces this 6-phosphate group to form the product adenylosuccinate. Because of its critical role in purine biosynthesis, AdSS is a target of antibiotics, herbicides and antitumor drugs. 58467 cd03109: Dethiobiotin synthetase (DTBS) is the penultimate enzyme in the biotin biosynthesis pathway in Escherichia coli and other microorganisms. The enzyme catalyzes formation of the ureido ring of dethiobiotin from (7R,8S)-7,8-diaminononanoic acid (DAPA) and carbon dioxide. The enzyme utilizes carbon dioxide instead of hydrogen carbonate as substrate and is dependent on ATP and divalent metal ions as cofactors. 58468 cd03110: This protein family's function is unknown. It contains a nucleotide-binding site. It uses NTP as an energy source to transfer electrons or ions. 58469 cd03111: This protein family consists of proteins similar to the cpaE protein of the Caulobacter pilus assembly and the orf4 protein of the Actinobacillus pilus formation gene cluster. The function of these proteins is unknown. The Caulobacter pilus assembly contains 7 genes: pilA, cpaA, cpaB, cpaC, cpaD, cpaE and cpaF. These genes are clustered together on the chromosome. 58470 cd03143: A4 beta-galactosidase middle domain: a type 1 glutamine amidotransferase (GATase1)-like domain. This group includes proteins similar to beta-galactosidase from Thermus thermophilus. Beta-galactosidase hydrolyzes the beta-1,4-D-galactosidic linkage of lactose, as well as those of related chromogens, o-nitrophenyl-beta-D-galactopyranoside (ONP-Gal) and 5-bromo-4-chloro-3-indolyl-beta-D-galactoside (X-gal). 
This A4 beta-galactosidase middle domain lacks the catalytic triad of typical GATase1 domains. The reactive Cys residue found in the sharp turn between a beta strand and an alpha helix, termed the nucleophile elbow in typical GATase1 domains, is not conserved in this group. 58471 cd03144: Type 1 glutamine amidotransferase (GATase1)-like domain found in proteins similar to Saccharomyces cerevisiae biotin-apoprotein ligase (ScBLP). Biotin-apoprotein ligase modifies proteins by covalently attaching biotin. ScBLP is known to biotinylate acetyl-CoA carboxylase and pyruvate carboxylase. The catalytic triad typical of GATase1 domains is not conserved in this GATase1-like domain. However, the Cys residue found in the sharp turn between a beta strand and an alpha helix, termed the nucleophile elbow in a typical GATase1 domain, is conserved. 58472 cd03193: GST_C family, Metaxin subfamily; composed of metaxins and related proteins. Metaxin 1 is a component of a preprotein import complex of the mitochondrial outer membrane. It extends to the cytosol and is anchored to the mitochondrial membrane through its C-terminal domain. In mice, metaxin is required for embryonic development. In humans, alterations in the metaxin gene may be associated with Gaucher disease. Metaxin 2 binds to metaxin 1 and may also play a role in protein translocation into the mitochondria. Genome sequencing shows that a third metaxin gene also exists in zebrafish, Xenopus, chicken, and mammals. Sequence analysis suggests that all three metaxins share a common ancestry and that they possess similarity to GSTs. Also included in the subfamily are uncharacterized proteins with similarity to metaxins, including a novel GST from Rhodococcus with toluene o-monooxygenase and glutamylcysteine synthetase activities. Other members are the cadmium-inducible lysosomal protein CDR-1 and its homologs from C. elegans, and the failed axon connections (fax) protein from Drosophila. 
CDR-1 is an integral membrane protein that functions to protect against cadmium toxicity and may also have a role in osmoregulation to maintain salt balance in C. elegans. The fax gene of Drosophila was identified as a genetic modifier of Abelson (Abl) tyrosine kinase. The fax protein is localized in cellular membranes and is expressed in embryonic mesoderm and axons of the central nervous system. 58473 cd03314: Methylaspartate ammonia lyase (3-methylaspartase, MAL) is a homodimeric enzyme, catalyzing the magnesium-dependent reversible alpha,beta-elimination of ammonia from L-threo-(2S,3S)-3-methylaspartic acid to mesaconic acid. This reaction is part of the main catabolic pathway for glutamate. MAL belongs to the enolase superfamily of enzymes, characterized by the presence of an enolate anion intermediate which is generated by abstraction of the alpha-proton of the carboxylate substrate by an active site residue and is stabilized by coordination to the essential Mg2+ ion. 58474 cd03315: Muconate lactonizing enzyme (MLE) like subgroup of the enolase superfamily. Enzymes of this subgroup share three conserved carboxylate ligands for the essential divalent metal ion (usually Mg2+), two aspartates and a glutamate, and residues that can function as general acid/base catalysts, a Lys-X-Lys motif and another conserved lysine. Despite these conserved residues, the members of the MLE subgroup, like muconate lactonizing enzyme, o-succinylbenzoate synthase (OSBS) and N-acylamino acid racemase (NAAAR), catalyze different reactions. 58475 cd03320: o-Succinylbenzoate synthase (OSBS) catalyzes the conversion of 2-succinyl-6-hydroxy-2,4-cyclohexadiene-1-carboxylate (SHCHC) to 4-(2'-carboxyphenyl)-4-oxobutyrate (o-succinylbenzoate or OSB), a reaction in the menaquinone biosynthetic pathway. Menaquinone is an essential cofactor for anaerobic growth in eubacteria and some archaea. 
OSBS belongs to the enolase superfamily of enzymes, characterized by the presence of an enolate anion intermediate which is generated by abstraction of the alpha-proton of the carboxylate substrate by an active site residue and is stabilized by coordination to the essential Mg2+ ion. 58476 cd03322: The starvation sensing protein RpsA from E. coli and its homologs are lactonizing enzymes whose putative targets are homoserine lactone (HSL) derivatives. They are part of the mandelate racemase (MR)-like subfamily of the enolase superfamily. Enzymes of this subfamily share three conserved carboxylate ligands for the essential divalent metal ion (usually Mg2+), two aspartates and a glutamate, and catalytic residues, a partially conserved Lys-X-Lys motif and a conserved histidine-aspartate dyad. 58477 cd03327: Mandelate racemase (MR)-like subfamily of the enolase superfamily, subgroup 2. Enzymes of this subgroup share three conserved carboxylate ligands for the essential divalent metal ion (usually Mg2+), two aspartates and a glutamate, and conserved catalytic residues, a Lys-X-Lys motif and a conserved histidine-aspartate dyad. This subgroup's function is unknown. 58478 cd03332: L-Lactate 2-monooxygenase (LMO) FMN-binding domain. LMO (AKA lactate oxidase, lactate oxidative decarboxylase, lactate oxygenase, lactic oxidase, lactic oxygenase) is an FMN-containing enzyme that catalyzes the conversion of L-lactate and oxygen to acetate, carbon dioxide, and water. LMO is a member of the family of alpha-hydroxy acid oxidases. It is thought to be a homooctamer with two- and four-fold axes in the center of the octamer. 58479 cd03333: chaperonin_like superfamily. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings, each composed of 7-9 subunits. There are 2 main chaperonin groups. 
The symmetry of type I is seven-fold and they are found in eubacteria (GroEL) and in organelles of eubacterial descent (hsp60 and RBP). The symmetry of type II is eight- or nine-fold and they are found in archaea (thermosome), thermophilic bacteria (TF55) and in the eukaryotic cytosol (CTT). Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. This superfamily also contains related domains from Fab1-like phosphatidylinositol 3-phosphate (PtdIns3P) 5-kinases that only contain the intermediate and apical domains. 58480 cd03337: TCP-1 (CTT or eukaryotic type II) chaperonin family, gamma subunit. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings. In contrast to bacterial group I chaperonins (GroEL), each ring of the eukaryotic cytosolic chaperonin (CTT) consists of eight different, but homologous subunits. Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. The best studied in vivo substrates of CTT are actin and tubulin. 58481 cd03341: TCP-1 (CTT or eukaryotic type II) chaperonin family, theta subunit. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings. In contrast to bacterial group I chaperonins (GroEL), each ring of the eukaryotic cytosolic chaperonin (CTT) consists of eight different, but homologous subunits. Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. The best studied in vivo substrates of CTT are actin and tubulin. 58482 cd03342: TCP-1 (CTT or eukaryotic type II) chaperonin family, zeta subunit. Chaperonins are involved in productive folding of proteins. 
They share a common general morphology, a double toroid of 2 stacked rings. In contrast to bacterial group I chaperonins (GroEL), each ring of the eukaryotic cytosolic chaperonin (CTT) consists of eight different, but homologous subunits. Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. The best studied in vivo substrates of CTT are actin and tubulin. 58483 cd03370: NADPH_oxidase. Nitroreductase family containing NADH oxidase and other, uncharacterized proteins that are similar to nitroreductase. Nitroreductase catalyzes the reduction of nitroaromatic compounds such as nitrotoluenes, nitrofurans and nitroimidazoles. This process requires NAD(P)H as electron donor in an obligatory two-electron transfer and uses FMN as cofactor. The enzyme is typically a homodimer. Members of this family are also called NADH dehydrogenase, oxygen-insensitive NAD(P)H nitrogenase or dihydropteridine reductase. 58484 cd03375: Thiamine pyrophosphate (TPP family), 2-oxoglutarate ferredoxin oxidoreductase (OGFOR) subfamily, TPP-binding module; OGFOR catalyzes the oxidative decarboxylation of 2-oxo-acids, with ferredoxin acting as an electron acceptor. In the TCA cycle, OGFOR catalyzes the oxidative decarboxylation of 2-oxoglutarate to succinyl-CoA. In the reductive tricarboxylic acid cycle found in the anaerobic autotroph Hydrogenobacter thermophilus, OGFOR catalyzes the reductive carboxylation of succinyl-CoA to produce 2-oxoglutarate. Thauera aromatica OGFOR has been shown to provide reduced ferredoxin to benzoyl-CoA reductase, a key enzyme in the anaerobic metabolism of aromatic compounds. OGFOR is dependent on TPP and a divalent metal cation for activity. 
58485 cd03378: Carbonic anhydrases (CA) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism in which the nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide is followed by the regeneration of an active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. CAs are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionarily distinct families of CAs (the alpha-, beta-, and gamma-CAs) which show no significant sequence identity or structural similarity. Within the beta-CA family there are four evolutionarily distinct clades (A through D). The beta-CAs are multimeric enzymes (forming dimers, tetramers, hexamers and octamers) which are present in higher plants, algae, fungi, archaea and prokaryotes. 58486 cd03747: Penicillin G acylase (PGA) belongs to a family of beta-lactam acylases that includes cephalosporin acylase (CA) and aculeacin A acylase. PGA and CA are crucial for the production of backbone chemicals like 6-aminopenicillanic acid and 7-aminocephalosporanic acid (7-ACA), which can be used to synthesize semi-synthetic penicillins and cephalosporins, respectively. While both PGA and CA have a conserved Ntn (N-terminal nucleophile) hydrolase fold and the structural similarity at their active sites is very high, their sequence similarity is low. 58487 cd03748: Penicillin G acylase (PGA) is the key enzyme in the industrial production of beta-lactam antibiotics. PGA hydrolyzes the side chain of penicillin G and related beta-lactam antibiotics releasing 6-aminopenicillanic acid (6-APA), a building block in the production of semisynthetic penicillins. PGA is widely distributed among microorganisms, including bacteria, yeast and filamentous fungi but its in vivo role remains unclear. 
58488 cd00300: L-lactate dehydrogenases (LDH); member of the family of NAD-dependent 2-hydroxycarboxylate dehydrogenases. LDHs are tetrameric enzymes catalyzing the last step of glycolysis in which pyruvate is converted to L-lactate. Vertebrate LDHs are non-allosteric, but some bacterial LDHs are activated by an allosteric effector, fructose-1,6-bisphosphate. L-2-hydroxyisocaproate dehydrogenases and tetrameric LDH-like MDHs are also included in this group. 58489 cd00650: NAD-dependent 2-hydroxycarboxylate dehydrogenases. Members of this family include ubiquitous enzymes such as L-lactate dehydrogenases (LDH) and malate dehydrogenases (MDH). LDH catalyzes the last step of glycolysis in which pyruvate is converted to L-lactate. MDH is one of the key enzymes in the citric acid cycle, facilitating both the conversion of malate to oxaloacetate and replenishing levels of oxaloacetate by reductive carboxylation of pyruvate. L-2-hydroxyisocaproate dehydrogenases are also members of the family. 58490 cd00704: malate dehydrogenases (MDH); member of the family of NAD-dependent 2-hydroxycarboxylate dehydrogenases. MDH is one of the key enzymes in the citric acid cycle, facilitating both the conversion of malate to oxaloacetate and replenishing levels of oxaloacetate by reductive carboxylation of pyruvate. 58491 cd01336: malate dehydrogenases (MDH) cytoplasmic and cytosolic; member of the family of NAD-dependent 2-hydroxycarboxylate dehydrogenases. MDH is one of the key enzymes in the citric acid cycle, facilitating both the conversion of malate to oxaloacetate and replenishing levels of oxaloacetate by reductive carboxylation of pyruvate. Members of this cd are localized to the cytoplasm and cytosol. 58492 cd01337: malate dehydrogenases (MDH) glycosomal and mitochondrial; member of the family of NAD-dependent 2-hydroxycarboxylate dehydrogenases. 
MDH is one of the key enzymes in the citric acid cycle, facilitating both the conversion of malate to oxaloacetate and replenishing levels of oxaloacetate by reductive carboxylation of pyruvate. Members of this cd are localized to the glycosome and mitochondria. 58493 cd01338: malate dehydrogenases (MDH) chloroplast; member of the family of NAD-dependent 2-hydroxycarboxylate dehydrogenases. MDH is one of the key enzymes in the citric acid cycle, facilitating both the conversion of malate to oxaloacetate and replenishing levels of oxaloacetate by reductive carboxylation of pyruvate. Members of this cd are localized to the chloroplasts. In C4 plants, an NADP-dependent MDH located in the chloroplasts of mesophyll cells catalyzes the reduction of oxaloacetate to malate. The malate is shuttled to chloroplasts of bundle-sheath cells and provides reducing equivalents utilized in the photosynthetic fixation of carbon dioxide. The NADP-dependent form of MDH is regulated by light and is inactive in the dark, while NAD-dependent forms are active all the time. 58494 cd01339: LDH-like structure and MDH enzymatic activity; member of the family of NAD-dependent 2-hydroxycarboxylate dehydrogenases. Tetrameric malate dehydrogenases (MDHs), including those from phototrophic bacteria, have a higher similarity to lactate dehydrogenases (LDHs) than to other MDHs. LDH catalyzes the last step of glycolysis in which pyruvate is converted to L-lactate. MDH is one of the key enzymes in the citric acid cycle, facilitating both the conversion of malate to oxaloacetate and replenishing levels of oxaloacetate by reductive carboxylation of pyruvate. L-2-hydroxyisocaproate dehydrogenases are also members of the family. 58495 cd04510: malate dehydrogenases (MDH) cytoplasmic and cytosolic, isozyme B; member of the family of NAD-dependent 2-hydroxycarboxylate dehydrogenases. 
MDH is one of the key enzymes in the citric acid cycle, facilitating both the conversion of malate to oxaloacetate and replenishing levels of oxaloacetate by reductive carboxylation of pyruvate. Members of this cd are localized to the cytoplasm and cytosol. 58496 cd00229: SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid. 58497 cd01820: PAF_acetylhydrolase (PAF-AH)_like subfamily of SGNH-hydrolases. Platelet-activating factor (PAF) and PAF-AH are key players in inflammation and in atherosclerosis. PAF-AH is a calcium independent phospholipase A2 which exhibits strong substrate specificity towards PAF, hydrolyzing an acetyl ester at the sn-2 position. PAF-AH also degrades a family of oxidized PAF-like phospholipids with short sn-2 residues. In addition, PAF and PAF-AH are associated with neural migration and mammalian reproduction. 58498 cd01821: Rhamnogalacturan_acetylesterase_like subgroup of SGNH-hydrolases. Rhamnogalacturonan acetylesterase removes acetyl esters from rhamnogalacturonan substrates, and renders them susceptible to degradation by rhamnogalacturonases. Rhamnogalacturonans are highly branched regions in pectic polysaccharides, consisting of repeating -(1,2)-L-Rha-(1,4)-D-GalUA disaccharide units, with many rhamnose residues substituted by neutral oligosaccharides such as arabinans, galactans and arabinogalactans. Extracellular enzymes participating in the degradation of plant cell wall polymers, such as rhamnogalacturonan acetylesterase, would typically be found in saprophytic and plant pathogenic fungi and bacteria. 58499 cd01822: Lysophospholipase L1-like subgroup of SGNH-hydrolases. The best characterized member in this family is TesA, an E. 
coli periplasmic protein with thioesterase, esterase, arylesterase, protease and lysophospholipase activity. 58500 cd01823: SEST_like. A family of secreted SGNH-hydrolases similar to the esterase (SEST) of Streptomyces scabies, a causal agent of potato scab disease; SEST hydrolyzes a specific ester bond in suberin, a plant lipid. The tertiary fold of this enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles two of the three components of the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid. 58501 cd01824: Phospholipase-B_like. This subgroup of the SGNH-family of lipolytic enzymes may have both esterase and phospholipase-A/lysophospholipase activity. Its members may be involved in the conversion of phosphatidylcholine to fatty acids and glycerophosphocholine, perhaps in the context of dietary lipid uptake. Members may be membrane proteins. The tertiary fold of the SGNH-hydrolases is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles two of the three components of the typical Ser-His-Asp(Glu) triad from other serine hydrolases. 58502 cd01825: SGNH_peri1; putative periplasmic member of the SGNH-family of hydrolases, a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases. 58503 cd01826: Acyloxyacyl-hydrolase like subfamily of the SGNH-hydrolase family. Acyloxyacyl-hydrolase is a leukocyte-secreted enzyme that deacylates bacterial lipopolysaccharides. 58504 cd01827: sialate O-acetylesterase_like family of the SGNH hydrolases, a diverse family of lipases and esterases. 
The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases. 58505 cd01828: sialate_O-acetylesterase_like subfamily of the SGNH-hydrolases, a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases. 58506 cd01829: SGNH_peri2; putative periplasmic member of the SGNH-family of hydrolases, a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases. 58507 cd01830: SGNH_hydrolase subfamily, similar to the putative arylesterase/acylhydrolase from the rumen anaerobe Prevotella bryantii XynE. The P. bryantii XynE gene is located in a xylanase gene cluster. SGNH hydrolases are a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases. 58508 cd01831: Endoglucanase E-like members of the SGNH hydrolase family; Endoglucanase E catalyzes the endohydrolysis of 1,4-beta-glucosidic linkages in cellulose, lichenin and cereal beta-D-glucans. 58509 cd01832: Members of the SGNH-hydrolase superfamily, a diverse family of lipases and esterases. 
The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid. Myxobacterial members of this subfamily have been reported to be involved in adventurous gliding motility. 58510 cd01833: SGNH_hydrolase subfamily, similar to Ruminococcus flavefaciens XynB. Most likely a secreted hydrolase with xylanase activity. SGNH hydrolases are a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases. 58511 cd01834: SGNH_hydrolase subfamily. SGNH hydrolases are a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases. 58512 cd01835: SGNH_hydrolase subfamily. SGNH hydrolases are a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases. 58513 cd01836: SGNH_hydrolase subfamily, FeeA, FeeB and similar esterases/lipases. FeeA and FeeB are part of a biosynthetic gene cluster and may participate in the biosynthesis of long-chain N-acyltyrosines by providing saturated and unsaturated fatty acids, which in turn are loaded onto the acyl carrier protein FeeL. SGNH hydrolases are a diverse family of lipases and esterases. 
The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases. 58514 cd01837: SGNH_plant_lipase_like, a plant specific subfamily of the SGNH-family of hydrolases, a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases. 58515 cd01838: Isoamyl-acetate hydrolyzing esterase-like proteins. SGNH_hydrolase subfamily similar to the Saccharomyces cerevisiae IAH1. IAH1 may be the major esterase that hydrolyses isoamyl acetate in sake mash. The SGNH-family of hydrolases is a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases. 58516 cd01839: SGNH_hydrolase subfamily, similar to arylesterase (7-aminocephalosporanic acid-deacetylating enzyme) of A. tumefaciens. SGNH hydrolases are a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases. 58517 cd01840: yrhL-like subfamily of SGNH-hydrolases, a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases. 
Most members of this sub-family appear to co-occur with N-terminal acyltransferase domains. Might be involved in lipid metabolism. 58518 cd01841: NnaC (CMP-NeuNAc synthetase) _like subfamily of SGNH_hydrolases, a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles two of the three components of typical Ser-His-Asp(Glu) triad from other serine hydrolases. E. coli NnaC appears to be involved in polysaccharide synthesis. 58519 cd01842: SGNH_hydrolase subfamily. SGNH hydrolases are a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases. 58520 cd01844: SGNH_hydrolase subfamily. SGNH hydrolases are a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases. 58521 cd01846: Fatty acyltransferase-like subfamily of the SGNH hydrolases, a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases. Might catalyze fatty acid transfer between phosphatidylcholine and sterols. 58522 cd01847: Triacylglycerol lipase-like subfamily of the SGNH hydrolases, a diverse family of lipases and esterases. 
The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases. Members of this subfamily might hydrolyze triacylglycerol into diacylglycerol and fatty acid anions. 58523 cd04501: Members of the SGNH-hydrolase superfamily, a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid. 58524 cd04502: Members of the SGNH-hydrolase superfamily, a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid. 58525 cd04506: Members of the SGNH-hydrolase superfamily, a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid. This subfamily contains sequences similar to Bacillus YpmR. 58526 cd00595: Nucleoside diphosphate kinases (NDP kinases, NDPks): NDP kinases, responsible for the synthesis of nucleoside triphosphates (NTPs), are involved in numerous regulatory processes associated with proliferation, development, and differentiation. They are vital for DNA/RNA synthesis, cell division, macromolecular metabolism and growth. 
The enzymes generate NTPs or their deoxy derivatives by terminal (gamma) phosphotransfer from an NTP such as ATP or GTP to any nucleoside diphosphate (NDP) or its deoxy derivative. The sequence of NDPk has been highly conserved through evolution. There is a single histidine residue conserved in all known NDK isozymes, which is involved in the catalytic mechanism. The first confirmed metastasis suppressor gene was nm23, which encodes an NDP kinase. Unicellular organisms generally possess only one gene encoding NDP kinase, while most multicellular organisms possess not only an ortholog that provides most of the NDP kinase enzymatic activity but also multiple divergent paralogous genes. The human genome codes for at least nine NDP kinases, which can be classified into two groups, Groups I and II, according to their genomic architecture and distinct enzymatic activity. Group I isoforms (A-D) are well-conserved, catalytically active, and share 58-88% identity with each other, while Group II are more divergent, with only NDPk6 shown to be active. NDP kinases exist in two different quaternary structures; all known eukaryotic enzymes are hexamers, while some bacterial enzymes are tetramers, as in Myxococcus. The hexamer can be viewed as a trimer of dimers, while tetramers are dimers of dimers, with the dimerization interface conserved. 58527 cd04412: Nucleoside diphosphate kinase 7 domain B (NDPk7B): The nm23-H7 class of nucleoside diphosphate kinase (NDPk7) consists of an N-terminal DM10 domain and two functional catalytic NDPk modules, NDPk7A and NDPk7B. The function of the DM10 domain, which also occurs in multiple copies in other proteins, is unknown. NDPk7 is predominantly expressed in testes, although appreciable amounts are also found in liver, heart, brain, ovary, small intestine and spleen. The nm23-H7 gene is located in or near the hereditary prostate cancer susceptibility locus. 
Nm23-H7 may be involved in the development of colon and gastric carcinoma, the latter possibly in a type-specific manner. 58528 cd04413: Nucleoside diphosphate kinase Group I (NDPk_I)-like: NDP kinase domains are present in a large family of structurally and functionally conserved proteins from bacteria to humans that generally catalyze the transfer of gamma-phosphates of a nucleoside triphosphate (NTP) donor onto a nucleoside diphosphate (NDP) acceptor through a phosphohistidine intermediate. The mammalian nm23/NDP kinase gene family can be divided into two distinct groups. The group I genes encode proteins that generally have highly homologous counterparts in other organisms and possess the classic enzymatic activity of a kinase. This group includes vertebrate NDP kinases A-D (Nm23-H1 to -H4), and their counterparts in bacteria, archaea and other eukaryotes. NDP kinases exist in two different quaternary structures; all known eukaryotic enzymes are hexamers, while some bacterial enzymes are tetramers, as in Myxococcus. They possess the NDP kinase active site motif (NXXH[G/A]SD) and the nine residues that are most essential for catalysis. 58529 cd04414: Nucleoside diphosphate kinase 6 (NDP kinase 6, NDPk6, NM23-H6; NME6; Inhibitor of p53-induced apoptosis-alpha, IPIA-alpha): The nm23-H6 gene encoding NDPk6 is expressed mainly in mitochondria, but also found at a lower level in most tissues. NDPk6 has all nine residues considered crucial for enzyme structure and activity, and has been found to have NDP kinase activity. It may play a role in cell growth and cell cycle progression. The nm23-H6 gene locus has been implicated in a variety of malignant tumors. 58530 cd04415: Nucleoside diphosphate kinase 7 domain A (NDPk7A): The nm23-H7 class of nucleoside diphosphate kinase (NDPk7) consists of an N-terminal DM10 domain and two functional catalytic NDPk modules, NDPk7A and NDPk7B. 
The function of the DM10 domain, which also occurs in multiple copies in other proteins, is unknown. NDPk7 is predominantly expressed in testes, although appreciable amounts are also found in liver, heart, brain, ovary, small intestine and spleen. The nm23-H7 gene is located in or near the hereditary prostate cancer susceptibility locus. Nm23-H7 may be involved in the development of colon and gastric carcinoma, the latter possibly in a type-specific manner. 58531 cd04416: NDP kinase domain of thioredoxin domain-containing proteins (TXNDC3 and TXNDC6): Txl-2 (TXNDC6) and Sptrx-2 (TXNDC3) are fusion proteins of Group II N-terminal thioredoxin domains followed by one or three NDP kinase domains, respectively. Sptrx-2, which has a tissue specific distribution in human testis, has been considered as a member of the nm23 family (nm23-H8) and exhibits a high homology with sea urchin IC1 (intermediate chain-1) protein, a component of the sperm axonemal outer dynein arm complex. Txl-2 is mainly represented in close association with microtubules within tissues with cilia and flagella such as seminiferous epithelium (spermatids) and lung airway epithelium, suggesting a possible role in control of microtubule stability and maintenance. 58532 cd04418: Nucleoside diphosphate kinase homolog 5 (NDP kinase homolog 5, NDPk5, NM23-H5; Inhibitor of p53-induced apoptosis-beta, IPIA-beta): In human, mRNA for NDPk5 is almost exclusively found in testis, especially in the flagella of spermatids and spermatozoa, in association with axoneme microtubules, and may play a role in spermatogenesis by increasing the ability of late-stage spermatids to eliminate reactive oxygen species. It belongs to the nm23 Group II genes and appears to differ from the other human NDPks in that it lacks two important catalytic site residues, and thus does not appear to possess NDP kinase activity. 
NDPk5 confers protection from cell death by Bax and alters the cellular levels of several antioxidant enzymes, including glutathione peroxidase 5 (Gpx5). 58533 cd00101: Insulin/insulin-like growth factor/relaxin family; insulin family of proteins. Members include a number of active peptides which are evolutionarily related including insulin, relaxin, prorelaxin, insulin-like growth factors I and II, mammalian Leydig cell-specific insulin-like peptide (gene INSL3), early placenta insulin-like peptide (ELIP; gene INSL4), insect prothoracicotropic hormone (bombyxin), locust insulin-related peptide (LIRP), molluscan insulin-related peptides 1 to 5 (MIP), and C. elegans insulin-like peptides. Typically, the active forms of these peptide hormones are composed of two chains (A and B) linked by two disulfide bonds; the arrangement of four cysteines is conserved in the "A" chain: Cys1 is linked by a disulfide bond to Cys3, Cys2 and Cys4 are linked by interchain disulfide bonds to cysteines in the "B" chain. This alignment contains both chains, plus the intervening linker region, arranged as found in the propeptide form. Propeptides are cleaved to yield two separate chains linked covalently by the two disulfide bonds. 58534 cd04365: IlGF_like family, relaxin_like subgroup, specific to vertebrates. Members include a number of active peptides including (pro)relaxin, mammalian Leydig cell-specific insulin-like peptide (gene INSL3), early placenta insulin-like peptide (ELIP; gene INSL4), and insulin-like peptides 5 (INSL5) and 6 (INSL6). Members of this subgroup are widely expressed in testes (INSL3, INSL6), decidua, placenta, prostate, corpus luteum, brain (various relaxins), GI tract, and kidney (INSL5) where they serve a variety of functions in parturition and development. 
Typically, the active forms of these peptide hormones are composed of two chains (A and B) linked by two disulfide bonds; the arrangement of four cysteines is conserved in the "A" chain: Cys1 is linked by a disulfide bond to Cys3, Cys2 and Cys4 are linked by interchain disulfide bonds to cysteines in the "B" chain. This alignment contains both chains, plus the intervening linker region, arranged as found in the propeptide form. Propeptides are cleaved to yield two separate chains linked covalently by the two disulfide bonds. 58535 cd04366: IlGF_like family, insulin_bombyxin_like subgroup. Members include a number of peptides including insulin, insulin-like growth factors I and II, insect prothoracicotropic hormone (bombyxin), locust insulin-related peptide (LIRP), molluscan insulin-related peptides 1 to 5 (MIP), and C. elegans insulin-like peptides. With the exception of insulin-like growth factors, the active forms of these peptide hormones are composed of two chains (A and B) linked by two disulfide bonds; the arrangement of four cysteines is conserved in the "A" chain: Cys1 is linked by a disulfide bond to Cys3, Cys2 and Cys4 are linked by interchain disulfide bonds to cysteines in the "B" chain. This alignment contains both chains, plus the intervening linker region, arranged as found in the propeptide form. Propeptides are cleaved to yield two separate chains linked covalently by the two disulfide bonds. 58536 cd04367: IlGF_like family, insulin_like subgroup, specific to vertebrates. Members include a number of peptides including insulin and insulin-like growth factors I and II, which play a variety of roles in controlling processes such as metabolism, growth and differentiation, and reproduction. On a cellular level they affect cell cycle, apoptosis, cell migration, and differentiation. 
With the exception of the insulin-like growth factors, the active forms of these peptide hormones are composed of two chains (A and B) linked by two disulfide bonds; the arrangement of four cysteines is conserved in the "A" chain: Cys1 is linked by a disulfide bond to Cys3, and Cys2 and Cys4 are linked by interchain disulfide bonds to cysteines in the "B" chain. This alignment contains both chains, plus the intervening linker region, arranged as found in the propeptide form. Propeptides are cleaved to yield two separate chains linked covalently by the two disulfide bonds. 58537 cd04368: IlGF, insulin_like growth factors; specific to vertebrates. Members include a number of peptides including insulin-like growth factors I and II, which play a variety of roles in controlling processes such as growth, differentiation, and reproduction. On a cellular level they affect cell cycle, apoptosis, cell migration, proliferation, and differentiation. Typically, the active forms of these peptide hormones are single chains cross-linked by three disulfide bonds. 58538 cd03467: Rieske domain; a [2Fe-2S] cluster binding domain commonly found in Rieske non-heme iron oxygenase (RO) systems such as naphthalene and biphenyl dioxygenases, as well as in plant/cyanobacterial chloroplast b6f and mitochondrial cytochrome bc(1) complexes. The Rieske domain can be divided into two subdomains, with an incomplete six-stranded, antiparallel beta-barrel at one end, and an iron-sulfur cluster binding subdomain at the other. The Rieske iron-sulfur center contains a [2Fe-2S] cluster, which is involved in electron transfer, and is liganded to two histidine and two cysteine residues present in conserved sequences called Rieske motifs. In RO systems, the N-terminal Rieske domain of the alpha subunit acts as an electron shuttle that accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron in the alpha subunit C-terminal domain to be used for catalysis. 
58539 cd03469: Rieske non-heme iron oxygenase (RO) family, N-terminal Rieske domain of the oxygenase alpha subunit; The RO family comprise a large class of aromatic ring-hydroxylating dioxygenases found predominantly in microorganisms. These enzymes enable microorganisms to tolerate and even exclusively utilize aromatic compounds for growth. ROs consist of two or three components: reductase, oxygenase, and ferredoxin (in some cases) components. The oxygenase component may contain alpha and beta subunits, with the beta subunit having a purely structural function. Some oxygenase components contain only an alpha subunit. The oxygenase alpha subunit has two domains, an N-terminal Rieske domain with an [2Fe-2S] cluster and a C-terminal catalytic domain with a mononuclear Fe(II) binding site. The Rieske [2Fe-2S] cluster accepts electrons from the reductase or ferredoxin component and transfers them to the mononuclear iron for catalysis. Reduced pyridine nucleotide is used as the initial source of two electrons for dioxygen activation. 58540 cd03470: Iron-sulfur protein (ISP) component of the bc(1) complex family, Rieske domain; The Rieske domain is a [2Fe-2S] cluster binding domain involved in electron transfer. The bc(1) complex is a multisubunit enzyme found in many different organisms including uni- and multi-cellular eukaryotes, plants (in their mitochondria) and bacteria. The cytochrome bc(1) and b6f complexes are central components of the respiratory and photosynthetic electron transport chains, respectively, which carry out similar core electron and proton transfer steps. The bc(1) and b6f complexes share a common core structure of three catalytic subunits: cyt b, the Rieske ISP, and either a cyt c1 in the bc(1) complex or cyt f in the b6f complex, which are arranged in an integral membrane-bound dimeric complex. 
While the core of the b6f complex is similar to that of the bc(1) complex, the domain arrangement outside the core and the complement of prosthetic groups are strikingly different. 58541 cd03471: Iron-sulfur protein (ISP) component of the b6f complex family, Rieske domain; The Rieske domain is a [2Fe-2S] cluster binding domain involved in electron transfer. The cytochrome b6f complex from Mastigocladus laminosus, a thermophilic cyanobacterium, contains four large subunits, including cytochrome f, cytochrome b6, the Rieske ISP, and subunit IV; as well as four small hydrophobic subunits, PetG, PetL, PetM, and PetN. Rieske ISP, one of the large subunits of the cytochrome bc-type complexes, is involved in respiratory and photosynthetic electron transfer. The core of the chloroplast b6f complex is similar to the analogous respiratory cytochrome bc(1) complex, but the domain arrangement outside the core and the complement of prosthetic groups are strikingly different. 58542 cd03472: Rieske non-heme iron oxygenase (RO) family, Biphenyl dioxygenase (BPDO)-like subfamily, N-terminal Rieske domain of the oxygenase alpha subunit; composed of the oxygenase alpha subunits of BPDO and similar proteins including cumene dioxygenase (CumDO), nitrobenzene dioxygenase (NBDO), alkylbenzene dioxygenase (AkbDO) and dibenzofuran 4,4a-dioxygenase (DFDO). ROs comprise a large class of aromatic ring-hydroxylating dioxygenases that enable microorganisms to tolerate and utilize aromatic compounds for growth. The oxygenase alpha subunit contains an N-terminal Rieske domain with an [2Fe-2S] cluster and a C-terminal catalytic domain with a mononuclear Fe(II) binding site. The Rieske [2Fe-2S] cluster accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron for catalysis. 
BPDO degrades biphenyls and polychlorinated biphenyls (PCBs) while CumDO degrades cumene (isopropylbenzene), an aromatic hydrocarbon that is intermediate in size between ethylbenzene and biphenyl. NBDO catalyzes the initial reaction in nitrobenzene degradation, oxidizing the aromatic rings of mono- and dinitrotoluenes to form catechol and nitrite. NBDO belongs to the naphthalene subfamily of ROs. AkbDO is involved in alkylbenzene catabolism, converting o-xylene to 2,3- and 3,4-dimethylphenol and ethylbenzene to cis-dihydrodiol. DFDO is involved in dibenzofuran degradation. 58543 cd03473: Cytidine monophosphate-N-acetylneuraminic acid (CMP Neu5Ac) hydroxylase family, N-terminal Rieske domain; The Rieske domain is a [2Fe-2S] cluster binding domain involved in electron transfer. CMP Neu5Ac hydroxylase is the key enzyme for the synthesis of N-glycolylneuraminic acid (NeuGc) from N-acetylneuraminic acid (Neu5Ac). NeuGc and Neu5Ac are members of a family of cell surface sugars called sialic acids. All mammals except humans have both NeuGc variants on their cell surfaces. In humans, the gene encoding CMP Neu5Ac hydroxylase has a mutation within its coding region that abolishes NeuGc production. 58544 cd03474: Toluene-4-monooxygenase effector protein complex (T4mo), Rieske ferredoxin subunit; The Rieske domain is a [2Fe-2S] cluster binding domain involved in electron transfer. T4mo is a four-protein complex that catalyzes the NADH- and O2-dependent hydroxylation of toluene to form p-cresol. T4mo consists of an NADH oxidoreductase (T4moF), a diiron hydroxylase (T4moH), a catalytic effector protein (T4moD), and a Rieske ferredoxin (T4moC). T4moC contains a Rieske domain and functions as an obligate electron carrier between T4moF and T4moH. Rieske ferredoxins are found as subunits of membrane oxidase complexes, cis-dihydrodiol-forming aromatic dioxygenases, bacterial assimilatory nitrite reductases, and arsenite oxidase. 
Rieske ferredoxins are also found as soluble electron carriers in bacterial dioxygenase and monooxygenase complexes. 58545 cd03475: SoxF and SoxL family, Rieske domain; The Rieske domain is a [2Fe-2S] cluster binding domain involved in electron transfer. SoxF is a subunit of the terminal oxidase supercomplex SoxM in the plasma membrane of Sulfolobus acidocaldarius that combines features of a cytochrome bc(1) complex and a cytochrome. The Rieske domain of SoxF has a 12 residue insertion which is not found in eukaryotic and bacterial Rieske proteins and is thought to influence the redox properties of the iron-sulfur cluster. SoxL is a Rieske protein which may be part of an archaeal bc-complex homologue whose physiological function is still unknown. SoxL has two features not seen in other Rieske proteins; (i) a significantly greater distance between the two cluster-binding sites and (ii) an unexpected Pro -> Asp substitution at one of the cluster binding sites. SoxF and SoxL are found in archaea and in bacteria. 58546 cd03476: Small subunit of Arsenite oxidase (ArOX) family, Rieske domain; ArOX is a molybdenum/iron protein involved in the detoxification of arsenic, oxidizing it to arsenate. It consists of two subunits, a large subunit similar to members of the DMSO reductase family of molybdenum enzymes and a small subunit with a Rieske-type [2Fe-2S] cluster. The large subunit of ArOX contains the molybdenum site at which the oxidation of arsenite occurs. The small subunit contains a domain homologous to the Rieske domains of the cytochrome bc(1) and cytochrome b6f complexes as well as naphthalene 1,2-dioxygenase. The Rieske domain is a [2Fe-2S] cluster binding domain involved in electron transfer. 58547 cd03477: YhfW family, C-terminal Rieske domain; YhfW is a protein of unknown function with an N-terminal DadA-like (glycine/D-amino acid dehydrogenase) domain and a C-terminal Rieske domain. 
The Rieske domain is a [2Fe-2S] cluster binding domain involved in electron transfer. It is commonly found in Rieske non-heme iron oxygenase (RO) systems such as naphthalene and biphenyl dioxygenases, as well as in plant/cyanobacterial chloroplast b6f and mitochondrial cytochrome bc(1) complexes. YhfW is found in bacteria, some eukaryotes and archaea. 58548 cd03478: AIFL (apoptosis-inducing factor like) family, N-terminal Rieske domain; members of this family show similarity to human AIFL, containing an N-terminal Rieske domain and a C-terminal pyridine nucleotide-disulfide oxidoreductase domain (Pyr_redox). The Rieske domain is a [2Fe-2S] cluster binding domain involved in electron transfer. AIFL shares 35% homology with human AIF (apoptosis-inducing factor), mainly in the Pyr_redox domain. AIFL is predominantly localized to the mitochondria. AIFL induces apoptosis in a caspase-dependent manner. 58549 cd03479: Rieske non-heme iron oxygenase (RO) family, Phthalate 4,5-dioxygenase (PhDO)-like subfamily, N-terminal Rieske domain of the oxygenase alpha subunit; composed of the oxygenase alpha subunits of PhDO and similar proteins including 3-chlorobenzoate 3,4-dioxygenase (CBDO), phenoxybenzoate dioxygenase (POB-dioxygenase) and 3-nitrobenzoate oxygenase (MnbA). ROs comprise a large class of aromatic ring-hydroxylating dioxygenases that enable microorganisms to tolerate and utilize aromatic compounds for growth. The oxygenase alpha subunit contains an N-terminal Rieske domain with an [2Fe-2S] cluster and a C-terminal catalytic domain with a mononuclear Fe(II) binding site. The Rieske [2Fe-2S] cluster accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron for catalysis. PhDO and CBDO are two-component RO systems, containing oxygenase and reductase components. PhDO catalyzes the dihydroxylation of phthalate to form the 4,5-dihydro-cis-dihydrodiol of phthalate (DHD). 
CBDO, together with CbaC dehydrogenase, converts the environmental pollutant 3CBA to protocatechuate (PCA) and 5-Cl-PCA, which are then metabolized by the chromosomal PCA meta (extradiol) ring fission pathway. POB-dioxygenase catalyzes the initial catabolic step in the angular dioxygenation of phenoxybenzoate, converting mono- and dichlorinated phenoxybenzoates to protocatechuate and chlorophenols. These phenoxybenzoates are metabolic products formed during the degradation of pyrethroid insecticides. 58550 cd03480: Rieske non-heme iron oxygenase (RO) family, Pheophorbide a oxygenase (PaO) subfamily, N-terminal Rieske domain of the oxygenase alpha subunit; composed of the oxygenase alpha subunits of a small subfamily of enzymes found in plants as well as oxygenic cyanobacterial photosynthesizers including LLS1 (lethal leaf spot 1, also known as PaO) and ACD1 (accelerated cell death 1). ROs comprise a large class of aromatic ring-hydroxylating dioxygenases that enable microorganisms to tolerate and utilize aromatic compounds for growth. The oxygenase alpha subunit contains an N-terminal Rieske domain with an [2Fe-2S] cluster and a C-terminal catalytic domain with a mononuclear Fe(II) binding site. The Rieske [2Fe-2S] cluster accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron for catalysis. PaO expression increases upon physical wounding of plant leaves and is thought to catalyze a key step in chlorophyll degradation. The Arabidopsis-accelerated cell death gene ACD1 is involved in oxygenation of PaO. 58551 cd03528: Rieske non-heme iron oxygenase (RO) family, Rieske ferredoxin component; composed of the Rieske ferredoxin component of some three-component RO systems including biphenyl dioxygenase (BPDO) and carbazole 1,9a-dioxygenase (CARDO). The RO family comprise a large class of aromatic ring-hydroxylating dioxygenases found predominantly in microorganisms. 
These enzymes enable microorganisms to tolerate and even exclusively utilize aromatic compounds for growth. ROs consist of two or three components: reductase, oxygenase, and ferredoxin (in some cases) components. The ferredoxin component contains either a plant-type or Rieske-type [2Fe-2S] cluster. The Rieske ferredoxin component in this family carries an electron from the RO reductase component to the terminal RO oxygenase component. BPDO degrades biphenyls and polychlorinated biphenyls. BPDO ferredoxin (BphF) has structural features consistent with a minimal and perhaps archetypical Rieske protein in that the insertions that give other Rieske proteins unique structural features are missing. CARDO catalyzes dihydroxylation at the C1 and C9a positions of carbazole. Rieske ferredoxins are found as subunits of membrane oxidase complexes, cis-dihydrodiol-forming aromatic dioxygenases, bacterial assimilatory nitrite reductases, and arsenite oxidase. Rieske ferredoxins are also found as soluble electron carriers in bacterial dioxygenase and monooxygenase complexes. 58552 cd03529: Assimilatory nitrite reductase (NirD) family, Rieske domain; Assimilatory nitrate and nitrite reductases convert nitrate through nitrite to ammonium. Members include bacterial and fungal proteins. The bacterial NirD contains a single Rieske domain while fungal proteins have a C-terminal Rieske domain in addition to several other domains. The fungal NirD is involved in nutrient acquisition, functioning at the soil/fungus interface to control nutrient exchange between the fungus and the host plant. The Rieske domain is a [2Fe-2S] cluster binding domain involved in electron transfer. The Rieske [2Fe-2S] cluster is liganded to two histidine and two cysteine residues present in conserved sequences called Rieske motifs. In this family, only a few members contain these residues. Other members may have lost the ability to bind the Rieske [2Fe-2S] cluster. 
58553 cd03530: Small subunit of nitrite reductase (NirD) family, Rieske domain; composed of proteins similar to the Bacillus subtilis small subunit of assimilatory nitrite reductase containing a Rieske domain. The Rieske domain is a [2Fe-2S] cluster binding domain involved in electron transfer. Assimilatory nitrate and nitrite reductases convert nitrate through nitrite to ammonium. 58554 cd03531: The alignment model represents the N-terminal Rieske iron-sulfur domain of KshA, the oxygenase component of 3-ketosteroid 9-alpha-hydroxylase (KSH). The terminal oxygenase component of KSH is a key enzyme in the microbial steroid degradation pathway, catalyzing the 9 alpha-hydroxylation of 4-androstene-3,17-dione (AD) and 1,4-androstadiene-3,17-dione (ADD). KSH is a two-component class IA monooxygenase, with terminal oxygenase (KshA) and oxygenase reductase (KshB) components. KSH activity has been found in many actino- and proteobacterial genera including Rhodococcus, Nocardia, Arthrobacter, Mycobacterium, and Burkholderia. 58555 cd03532: Rieske non-heme iron oxygenase (RO) family, Vanillate-O-demethylase oxygenase (VanA) and dicamba O-demethylase oxygenase (DdmC) subfamily, N-terminal Rieske domain of the oxygenase alpha subunit; ROs comprise a large class of aromatic ring-hydroxylating dioxygenases that enable microorganisms to tolerate and utilize aromatic compounds for growth. The oxygenase alpha subunit contains an N-terminal Rieske domain with an [2Fe-2S] cluster and a C-terminal catalytic domain with a mononuclear Fe(II) binding site. The Rieske [2Fe-2S] cluster accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron for catalysis. Vanillate-O-demethylase is a heterodimeric enzyme consisting of terminal oxygenase (VanA) and reductase (VanB) components. This enzyme reductively catalyzes the conversion of vanillate into protocatechuate and formaldehyde. 
Protocatechuate and vanillate are important intermediate metabolites in the degradation pathway of lignin-derived compounds such as ferulic acid and vanillin by soil microbes. DdmC is the oxygenase component of a three-component dicamba O-demethylase found in Pseudomonas maltophilia that catalyzes the conversion of the widely used herbicide dicamba (2-methoxy-3,6-dichlorobenzoic acid) to DCSA (3,6-dichlorosalicylic acid). 58556 cd03535: Rieske non-heme iron oxygenase (RO) family, Naphthalene 1,2-dioxygenase (NDO) subfamily, N-terminal Rieske domain of the oxygenase alpha subunit; ROs comprise a large class of aromatic ring-hydroxylating dioxygenases that enable microorganisms to tolerate and utilize aromatic compounds for growth. The oxygenase alpha subunit contains an N-terminal Rieske domain with an [2Fe-2S] cluster and a C-terminal catalytic domain with a mononuclear Fe(II) binding site. The Rieske [2Fe-2S] cluster accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron for catalysis. NDO is a three-component RO system consisting of a reductase, a ferredoxin, and a hetero-hexameric alpha-beta subunit oxygenase component. NDO catalyzes the oxidation of naphthalene to cis-(1R,2S)-dihydroxy-1,2-dihydronaphthalene (naphthalene cis-dihydrodiol) with the consumption of O2 and NAD(P)H. NDO has a relaxed substrate specificity and can oxidize almost 100 substrates. Included in its varied activities are the enantiospecific cis-dihydroxylation of polycyclic aromatic hydrocarbons and benzocycloalkenes, benzylic hydroxylation, N- and O-dealkylation, sulfoxidation and desaturation reactions. 58557 cd03536: This alignment model represents the N-terminal Rieske domain of the oxygenase alpha subunit (DitA) of diterpenoid dioxygenase (DTDO). DTDO is a novel aromatic-ring-hydroxylating dioxygenase found in Pseudomonas and other proteobacteria that degrades dehydroabietic acid (DhA). 
Specifically, DitA hydroxylates 7-oxodehydroabietic acid to 7-oxo-11,12-dihydroxy-8,13-abietadien acid. The ditA1 and ditA2 genes encode the alpha and beta subunits of the oxygenase component of DTDO while the ditA3 gene encodes the ferredoxin component of DTDO. The organization of the genes encoding the various diterpenoid dioxygenase components, the phylogenetic distinctiveness of both the alpha subunit and the ferredoxin component, and the unusual iron-sulfur cluster of the ferredoxin all suggest that this enzyme belongs to a new class of aromatic ring-hydroxylating dioxygenases. 58558 cd03537: This alignment model represents the N-terminal Rieske domain of the oxygenase alpha subunit of aminopyrrolnitrin oxygenase (PrnD). PrnD is a novel Rieske N-oxygenase that catalyzes the final step in the pyrrolnitrin biosynthetic pathway, the oxidation of the amino group in aminopyrrolnitrin to a nitro group, forming the antibiotic pyrrolnitrin. The biosynthesis of pyrrolnitrin is one of the best examples of enzyme-catalyzed arylamine oxidation. Although arylamine oxygenases are widely distributed within the microbial world and used in a variety of metabolic reactions, PrnD represents one of only two known examples of arylamine oxygenases or N-oxygenases involved in arylnitro group formation, the other being AurF involved in aureothin biosynthesis. 58559 cd03538: Rieske non-heme iron oxygenase (RO) family, Anthranilate 1,2-dioxygenase (AntDO) subfamily, N-terminal Rieske domain of the oxygenase alpha subunit; ROs comprise a large class of aromatic ring-hydroxylating dioxygenases that enable microorganisms to tolerate and utilize aromatic compounds for growth. The oxygenase alpha subunit contains an N-terminal Rieske domain with an [2Fe-2S] cluster and a C-terminal catalytic domain with a mononuclear Fe(II) binding site. The Rieske [2Fe-2S] cluster accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron for catalysis. 
AntDO converts anthranilate to catechol, a naturally occurring compound formed through tryptophan degradation and an important intermediate in the metabolism of many N-heterocyclic compounds such as indole, o-nitrobenzoate, carbazole, and quinaldine. 58560 cd03539: This alignment model represents the N-terminal Rieske iron-sulfur domain of the oxygenase alpha subunit (NagG) of salicylate 5-hydroxylase (S5H). S5H converts salicylate (2-hydroxybenzoate), a metabolic intermediate of phenanthrene, to gentisate (2,5-dihydroxybenzoate) as part of an alternate pathway for naphthalene catabolism. S5H is a multicomponent enzyme made up of NagGH (the oxygenase components), NagAa (the ferredoxin reductase component), and NagAb (the ferredoxin component). The oxygenase component is made up of alpha (NagG) and beta (NagH) subunits. 58561 cd03541: Rieske non-heme iron oxygenase (RO) family, Choline monooxygenase (CMO) subfamily, N-terminal Rieske domain of the oxygenase alpha subunit; ROs comprise a large class of aromatic ring-hydroxylating dioxygenases that enable microorganisms to tolerate and utilize aromatic compounds for growth. The oxygenase alpha subunit contains an N-terminal Rieske domain with an [2Fe-2S] cluster and a C-terminal catalytic domain with a mononuclear Fe(II) binding site. The Rieske [2Fe-2S] cluster accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron for catalysis. CMO is a novel RO found in certain plants which catalyzes the first step in betaine synthesis. CMO is not found in animals or bacteria. In these organisms, the first step in betaine synthesis is catalyzed by either the membrane-bound choline dehydrogenase (CDH) or the soluble choline oxidase (COX). 
58562 cd03542: Rieske non-heme iron oxygenase (RO) family, 2-Halobenzoate 1,2-dioxygenase (HBDO) subfamily, N-terminal Rieske domain of the oxygenase alpha subunit; ROs comprise a large class of aromatic ring-hydroxylating dioxygenases that enable microorganisms to tolerate and utilize aromatic compounds for growth. The oxygenase alpha subunit contains an N-terminal Rieske domain with an [2Fe-2S] cluster and a C-terminal catalytic domain with a mononuclear Fe(II) binding site. The Rieske [2Fe-2S] cluster accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron for catalysis. HBDO catalyzes the double hydroxylation of 2-halobenzoates with concomitant release of halogenide and carbon dioxide, yielding catechol. 58563 cd03545: Rieske non-heme iron oxygenase (RO) family, Ortho-halobenzoate-1,2-dioxygenase (OHBDO)-like subfamily, N-terminal Rieske domain of the oxygenase alpha subunit; composed of the oxygenase alpha subunits of OHBDO, salicylate 5-hydroxylase (S5H), terephthalate 1,2-dioxygenase system (TERDOS) and similar proteins. ROs comprise a large class of aromatic ring-hydroxylating dioxygenases that enable microorganisms to tolerate and utilize aromatic compounds for growth. The oxygenase alpha subunit contains an N-terminal Rieske domain with an [2Fe-2S] cluster and a C-terminal catalytic domain with a mononuclear Fe(II) binding site. The Rieske [2Fe-2S] cluster accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron for catalysis. OHBDO converts 2-chlorobenzoate (2-CBA) to catechol as well as 2,4-dCBA and 2,5-dCBA to 4-chlorocatechol, as part of the chlorobenzoate degradation pathway. Although ortho-substituted chlorobenzoates appear to be particularly recalcitrant to biodegradation, several strains utilize 2-CBA and the dCBA derivatives as a sole carbon and energy source. 
S5H converts salicylate (2-hydroxybenzoate), a metabolic intermediate of phenanthrene, to gentisate (2,5-dihydroxybenzoate) as part of an alternate pathway for naphthalene catabolism. S5H is a multicomponent enzyme made up of NagGH (the oxygenase components), NagAa (the ferredoxin reductase component), and NagAb (the ferredoxin component). The oxygenase component is made up of alpha (NagG) and beta (NagH) subunits. TERDOS is present in gram-positive bacteria and proteobacteria where it converts terephthalate (1,4-dicarboxybenzene) to protocatechuate as part of the terephthalate degradation pathway. The oxygenase component of TERDOS, called TerZ, is a hetero-hexamer with 3 alpha (TerZalpha) and 3 beta (TerZbeta) subunits. 58564 cd03548: Rieske non-heme iron oxygenase (RO) family, 2-Oxoquinoline 8-monooxygenase (OMO) and Carbazole 1,9a-dioxygenase (CARDO) subfamily, N-terminal Rieske domain of the oxygenase alpha subunit; ROs comprise a large class of aromatic ring-hydroxylating dioxygenases that enable microorganisms to tolerate and utilize aromatic compounds for growth. The oxygenase alpha subunit contains an N-terminal Rieske domain with an [2Fe-2S] cluster and a C-terminal catalytic domain with a mononuclear Fe(II) binding site. The Rieske [2Fe-2S] cluster accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron for catalysis. OMO catalyzes the NADH-dependent oxidation of the N-heterocyclic aromatic compound 2-oxoquinoline to 8-hydroxy-2-oxoquinoline, the second step in the bacterial degradation of quinoline. OMO consists of a reductase component (OMR) and an oxygenase component (OMO) that together function to shuttle electrons from the reduced pyridine nucleotide to the active site of OMO, where O2 activation and 2-oxoquinoline hydroxylation occurs. 
CARDO, which contains oxygenase (CARDO-O), ferredoxin (CARDO-F) and ferredoxin reductase (CARDO-R) components, catalyzes the dihydroxylation at the C1 and C9a positions of carbazole. The oxygenase components of OMO and CARDO contain only alpha subunits arranged in a trimeric structure. 58565 cd04337: Cao (chlorophyll a oxygenase) is a Rieske non-heme iron-sulfur protein located within the plastid-envelope inner and thylakoid membranes that catalyzes the conversion of chlorophyllide a to chlorophyllide b. CAO is found not only in plants but also in chlorophytes and prochlorophytes. This domain represents the N-terminal Rieske domain of the oxygenase alpha subunit. ROs comprise a large class of aromatic ring-hydroxylating dioxygenases that enable microorganisms to tolerate and utilize aromatic compounds for growth. The oxygenase alpha subunit contains an N-terminal Rieske domain with an [2Fe-2S] cluster and a C-terminal catalytic domain with a mononuclear Fe(II) binding site. The Rieske [2Fe-2S] cluster accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron for catalysis. Cao is closely related to several other plant ROs including Tic 55, a 55 kDa protein associated with protein transport through the inner chloroplast membrane; Ptc 52, a novel 52 kDa protein isolated from chloroplasts; and LLS1/Pao (Lethal-leaf spot 1/pheophorbide a oxygenase). 58566 cd04338: Tic55 is a 55 kDa LLS1-related non-heme iron oxygenase associated with protein transport through the plant inner chloroplast membrane. This domain represents the N-terminal Rieske domain of the Tic55 oxygenase alpha subunit. Tic55 is closely related to the oxygenase alpha subunits of a small subfamily of enzymes found in plants as well as oxygenic cyanobacterial photosynthesizers including LLS1 (lethal leaf spot 1, also known as PaO), Ptc52, and ACD1 (accelerated cell death 1). 
ROs comprise a large class of aromatic ring-hydroxylating dioxygenases that enable microorganisms to tolerate and utilize aromatic compounds for growth. The oxygenase alpha subunit contains an N-terminal Rieske domain with an [2Fe-2S] cluster and a C-terminal catalytic domain with a mononuclear Fe(II) binding site. The Rieske [2Fe-2S] cluster accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron for catalysis. 58567 cd00203: Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease. 58568 cd04267: Zinc-dependent metalloprotease, ADAM_like or reprolysin_like subgroup. The adamalysin_like or ADAM family of metalloproteases contains proteolytic domains from snake venoms, proteases from the mammalian reproductive tract, and the tumor necrosis factor alpha convertase, TACE. ADAMs (A Disintegrin And Metalloprotease) are glycoproteins, which play roles in cell signaling, cell fusion, and cell-cell interactions. 58569 cd04268: Zinc-dependent metalloprotease, MMP_like subfamily. This group contains matrix metalloproteinases (MMPs), serralysins, and the astacin_like family of proteases. 58570 cd04269: Zinc-dependent metalloprotease; adamalysin_II_like subfamily. Adamalysin II is a snake venom zinc endopeptidase. This subfamily contains other snake venom metalloproteinases, as well as membrane-anchored metalloproteases belonging to the ADAM family. ADAMs (A Disintegrin And Metalloprotease) are glycoproteins, which play roles in cell signaling, cell fusion, and cell-cell interactions. 58571 cd04270: Zinc-dependent metalloprotease; TACE_like subfamily. TACE, the tumor-necrosis factor-alpha converting enzyme, releases soluble TNF-alpha from transmembrane pro-TNF-alpha. 
58572 cd04271: Zinc-dependent metalloprotease, ADAM_fungal subgroup. The adamalysin_like or ADAM (A Disintegrin And Metalloprotease) family of metalloproteases are integral membrane proteases acting on a variety of extracellular targets. They are involved in shedding soluble peptides or proteins from the cell surface. This subfamily contains fungal ADAMs, whose precise function has yet to be determined. 58573 cd04272: Zinc-dependent metalloprotease, salivary_gland_MPs. Metalloproteases secreted by the salivary glands of arthropods. 58574 cd04273: Zinc-dependent metalloprotease, ADAMTS_like subgroup. ADAMs (A Disintegrin And Metalloprotease) are glycoproteins, which play roles in cell signaling, cell fusion, and cell-cell interactions. This particular subfamily represents domain architectures that combine ADAM-like metalloproteinases with thrombospondin type-1 repeats. ADAMTS (a disintegrin and metalloproteinase with thrombospondin motifs) proteinases are inhibited by TIMPs (tissue inhibitors of metalloproteinases), and they play roles in coagulation, angiogenesis, development and progression of arthritis. They hydrolyze the von Willebrand factor precursor and various components of the extracellular matrix. 58575 cd04275: Zinc-dependent metalloprotease, pappalysin_like subfamily. The pregnancy-associated plasma protein A (PAPP-A or pappalysin-1) cleaves insulin-like growth factor-binding proteins 4 and 5, thereby promoting cell growth by releasing bound growth factor. This model includes pappalysins and related metalloprotease domains from all three kingdoms of life. The three-dimensional structure of an archaeal representative, ulilysin, has been solved. 58576 cd04276: Zinc-dependent metalloprotease; MMP_like sub-family 2. A group of bacterial metalloproteinase domains similar to matrix metalloproteinases and astacin. 58577 cd04277: Zinc-dependent metalloprotease, serralysin_like subfamily. 
Serralysins and related proteases are important virulence factors in pathogenic bacteria. They may be secreted into the medium via a mechanism found in Gram-negative bacteria that does not require the N-terminal signal sequences that are cleaved after transmembrane translocation. A calcium-binding domain C-terminal to the metalloprotease domain, which contains multiple tandem repeats of a nine-residue motif including the pattern GGxGxD and which forms a parallel beta roll, may be involved in the translocation mechanism and/or substrate binding. Serralysin family members may each have a broad spectrum of substrates, including host immunoglobulins, complement proteins, cell matrix and cytoskeletal proteins, as well as antimicrobial peptides. 58578 cd04278: Zinc-dependent metalloprotease, matrix metalloproteinase (MMP) sub-family. MMPs are responsible for a great deal of pericellular proteolysis of extracellular matrix and cell surface molecules, playing crucial roles in morphogenesis, cell fate specification, cell migration, tissue repair, tumorigenesis, gain or loss of tissue-specific functions, and apoptosis. In many instances, they are anchored to cell membranes via trans-membrane domains, and their activity is controlled via TIMPs (tissue inhibitors of metalloproteinases). 58579 cd04279: Zinc-dependent metalloprotease; MMP_like sub-family 1. A group of bacterial, archaeal, and fungal metalloproteinase domains similar to matrix metalloproteinases and astacin. 58580 cd04280: Zinc-dependent metalloprotease, astacin_like subfamily or peptidase family M12A, a group of zinc-dependent proteolytic enzymes with a HExxH zinc-binding site/active site. Members of this family may have an amino terminal propeptide, which is cleaved to yield the active protease domain, which is consequently always found at the N-terminus in multi-domain architectures. 
This family includes: astacin, a digestive enzyme from crayfish; meprin, a multiple-domain membrane component that is constructed from homologous alpha and beta chains; proteins involved in (bone) morphogenesis; tolloid from Drosophila; and the sea urchin SPAN protein, which may also play a role in development. 58581 cd04281: Zinc-dependent metalloprotease; BMP1/TLD-like subfamily. BMP1 (Bone morphogenetic protein 1) and TLD (tolloid)-like metalloproteases play vital roles in extracellular matrix formation, by cleaving precursor proteins such as enzymes, structural proteins, and proteins involved in the mineralization of the extracellular matrix. The Drosophila protein tolloid and its Xenopus homologue xolloid cleave and inactivate Sog and chordin, respectively, which are inhibitors of Dpp (the Drosophila decapentaplegic gene product) and its homologue BMP4, involved in dorso-ventral patterning. 58582 cd04282: Zinc-dependent metalloprotease, meprin_like subfamily. Meprins are membrane-bound or secreted extracellular proteases, which cleave a variety of targets, including peptides such as parathyroid hormone, gastrin, and cholecystokinin, cytokines such as osteopontin, and proteins such as collagen IV, fibronectin, casein and gelatin. Meprins may also be able to release proteins from the cell surface. Closely related meprin alpha- and beta-subunits form homo- and hetero-oligomers; these complexes are found on epithelial cells of the intestine, for example, and are also expressed in certain cancer cells. 58583 cd04283: Zinc-dependent metalloprotease, hatching enzyme-like subfamily. Hatching enzymes are secreted by teleost embryos to digest the egg envelope or chorion. In some teleosts, the hatching enzyme may be a system consisting of two evolutionarily related metalloproteases, high choriolytic enzyme and low choriolytic enzyme (HCE and LCE), which may have different substrate specificities and cooperatively digest the chorion. 
58584 cd04327: Zinc-dependent metalloprotease; MMP_like sub-family 3. A group of bacterial and fungal metalloproteinase domains similar to matrix metalloproteinases and astacin. 58585 cd04100: Asp_Lys_Asn_RS_N: N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). This domain is a beta-barrel domain (OB fold) involved in binding the tRNA anticodon stem-loop. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases (AspRS, AsnRS, and LysRS). aaRSs catalyze the specific attachment of amino acids (AAs) to their cognate tRNAs during protein biosynthesis. This 2-step reaction involves i) the activation of the AA by ATP in the presence of magnesium ions, followed by ii) the transfer of the activated AA to the terminal ribose of tRNA. In the case of the class 2b aaRSs, the activated AA is attached to the 3'OH of the terminal ribose. Eukaryotes contain 2 sets of aaRSs, both of which are encoded by the nuclear genome. One set is concerned with cytoplasmic protein synthesis, whereas the other is concerned exclusively with mitochondrial protein synthesis. Included in this group are archaeal and archaeal-like AspRSs which are non-discriminating and can charge both tRNAAsp and tRNAAsn. E. coli cells have two isoforms of LysRS (LysS and LysU) encoded by two distinct genes, which are differentially regulated. The cytoplasmic and the mitochondrial isoforms of human LysRS are encoded by a single gene. Yeast cytoplasmic and mitochondrial LysRSs participate in mitochondrial import of cytoplasmic tRNAlysCUU. In addition to its housekeeping role, human LysRS may function as a signaling molecule that activates immune cells. Tomato LysRS may participate in a process possibly connected to oxidative-stress conditions or heavy metal uptake. It is known that human tRNAlys and LysRS are specifically packaged into HIV-1, suggesting a role for LysRS in tRNA packaging. 
AsnRS is an immunodominant antigen of the filarial nematode Brugia malayi and is of interest as a target for anti-parasitic drug design. Human AsnRS has been shown to be a pro-inflammatory chemokine which interacts with CCR3 chemokine receptors on T cells, immature dendritic cells and macrophages. 58586 cd04316: ND_PkAspRS_like_N: N-terminal, anticodon recognition domain of the type found in the homodimeric non-discriminating (ND) Pyrococcus kodakaraensis aspartyl-tRNA synthetase (AspRS). This domain is a beta-barrel domain (OB fold) involved in binding the tRNA anticodon stem-loop. P. kodakaraensis AspRS is a class 2b aaRS. aaRSs catalyze the specific attachment of amino acids (AAs) to their cognate tRNAs during protein biosynthesis. This 2-step reaction involves i) the activation of the AA by ATP in the presence of magnesium ions, followed by ii) the transfer of the activated AA to the terminal ribose of tRNA. In the case of the class 2b aaRSs, the activated AA is attached to the 3'OH of the terminal ribose. P. kodakaraensis ND-AspRS can charge both tRNAAsp and tRNAAsn. Some of the enzymes in this group may be discriminating, based on the presence of homologs of asparaginyl-tRNA synthetase (AsnRS) in their completed genomes. 58587 cd04317: EcAspRS_like_N: N-terminal, anticodon recognition domain of the type found in Escherichia coli aspartyl-tRNA synthetase (AspRS), the human mitochondrial (mt) AspRS-2, the discriminating (D) Thermus thermophilus AspRS-1, and the nondiscriminating (ND) Helicobacter pylori AspRS. These homodimeric enzymes are class 2b aminoacyl-tRNA synthetases (aaRSs). This domain is a beta-barrel domain (OB fold) involved in binding the tRNA anticodon stem-loop. aaRSs catalyze the specific attachment of amino acids (AAs) to their cognate tRNAs during protein biosynthesis. This 2-step reaction involves i) the activation of the AA by ATP in the presence of magnesium ions, followed by ii) the transfer of the activated AA to the terminal ribose of tRNA. 
In the case of the class 2b aaRSs, the activated AA is attached to the 3'OH of the terminal ribose. Eukaryotes contain 2 sets of aaRSs, both of which are encoded by the nuclear genome. One set is concerned with cytoplasmic protein synthesis, whereas the other is concerned exclusively with mitochondrial protein synthesis. Human mtAspRS participates in mitochondrial biosynthesis; this enzyme has been shown to charge E. coli native tRNAAsp in addition to in vitro transcribed human mitochondrial tRNAAsp. T. thermophilus is rare among bacteria in having both a D_AspRS and a ND_AspRS. H. pylori ND-AspRS can charge both tRNAAsp and tRNAAsn; it is fractionally more efficient at aminoacylating tRNAAsp than tRNAAsn. The H. pylori genome does not contain AsnRS. 58588 cd04318: EcAsnRS_like_N: N-terminal, anticodon recognition domain of the type found in Escherichia coli asparaginyl-tRNA synthetase (AsnRS) and in Arabidopsis thaliana and Saccharomyces cerevisiae mitochondrial (mt) AsnRS. This domain is a beta-barrel domain (OB fold) involved in binding the tRNA anticodon stem-loop. The enzymes in this group are homodimeric class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids (AAs) to their cognate tRNAs during protein biosynthesis. This 2-step reaction involves i) the activation of the AA by ATP in the presence of magnesium ions, followed by ii) the transfer of the activated AA to the terminal ribose of tRNA. In the case of the class 2b aaRSs, the activated AA is attached to the 3'OH of the terminal ribose. Eukaryotes contain 2 sets of aaRSs, both of which are encoded by the nuclear genome. One set is concerned with cytoplasmic protein synthesis, whereas the other is concerned exclusively with mitochondrial protein synthesis. S. cerevisiae mtAsnRS can charge E. coli tRNA with asparagine. Mutations in the gene for S. 
cerevisiae mtAsnRS have been found to induce a "petite" phenotype typical for a mutation in a nuclear gene that results in a non-functioning mitochondrial protein synthesis system. 58589 cd04319: PhAsnRS_like_N: N-terminal, anticodon recognition domain of the type found in Pyrococcus horikoshii asparaginyl-tRNA synthetase (AsnRS). This domain is a beta-barrel domain (OB fold) involved in binding the tRNA anticodon stem-loop. The archaeal enzymes in this group are homodimeric class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids (AAs) to their cognate tRNAs during protein biosynthesis. This 2-step reaction involves i) the activation of the AA by ATP in the presence of magnesium ions, followed by ii) the transfer of the activated AA to the terminal ribose of tRNA. In the case of the class 2b aaRSs, the activated AA is attached to the 3'OH of the terminal ribose. 58590 cd04320: AspRS_cyto_N: N-terminal, anticodon recognition domain of the type found in Saccharomyces cerevisiae and human cytoplasmic aspartyl-tRNA synthetase (AspRS). This domain is a beta-barrel domain (OB fold) involved in binding the tRNA anticodon stem-loop. The enzymes in this group are homodimeric class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids (AAs) to their cognate tRNAs during protein biosynthesis. This 2-step reaction involves i) the activation of the AA by ATP in the presence of magnesium ions, followed by ii) the transfer of the activated AA to the terminal ribose of tRNA. In the case of the class 2b aaRSs, the activated AA is attached to the 3'OH of the terminal ribose. Eukaryotes contain 2 sets of aaRSs, both of which are encoded by the nuclear genome. One set is concerned with cytoplasmic protein synthesis, whereas the other is concerned exclusively with mitochondrial protein synthesis. 
58591 cd04321: ScAspRS_mt_like_N: N-terminal, anticodon recognition domain of the type found in Saccharomyces cerevisiae mitochondrial (mt) aspartyl-tRNA synthetase (AspRS). This domain is a beta-barrel domain (OB fold) involved in binding the tRNA anticodon stem-loop. The enzymes in this fungal group are homodimeric class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids (AAs) to their cognate tRNAs during protein biosynthesis. This 2-step reaction involves i) the activation of the AA by ATP in the presence of magnesium ions, followed by ii) the transfer of the activated AA to the terminal ribose of tRNA. In the case of the class 2b aaRSs, the activated AA is attached to the 3'OH of the terminal ribose. Eukaryotes contain 2 sets of aaRSs, both of which are encoded by the nuclear genome. One set is concerned with cytoplasmic protein synthesis, whereas the other is concerned exclusively with mitochondrial protein synthesis. Mutations in the gene for S. cerevisiae mtAspRS result in a "petite" phenotype typical for a mutation in a nuclear gene that results in a non-functioning mitochondrial protein synthesis system. 58592 cd04322: LysRS_N: N-terminal, anticodon recognition domain of lysyl-tRNA synthetases (LysRS). These enzymes are homodimeric class 2b aminoacyl-tRNA synthetases (aaRSs). This domain is a beta-barrel domain (OB fold) involved in binding the tRNA anticodon stem-loop. aaRSs catalyze the specific attachment of amino acids (AAs) to their cognate tRNAs during protein biosynthesis. This 2-step reaction involves i) the activation of the AA by ATP in the presence of magnesium ions, followed by ii) the transfer of the activated AA to the terminal ribose of tRNA. In the case of the class 2b aaRSs, the activated AA is attached to the 3'OH of the terminal ribose. Included in this group are E. coli LysS and LysU. These two isoforms of LysRS are encoded by distinct genes which are differently regulated. 
Eukaryotes contain 2 sets of aaRSs, both of which are encoded by the nuclear genome. One set is concerned with cytoplasmic protein synthesis, whereas the other is concerned exclusively with mitochondrial protein synthesis. Saccharomyces cerevisiae cytoplasmic and mitochondrial LysRSs have been shown to participate in the mitochondrial import of the only nuclear-encoded tRNA of S. cerevisiae (tRNAlysCUU). The gene for human LysRS encodes both the cytoplasmic and the mitochondrial isoforms of LysRS. In addition to its housekeeping role, human LysRS may function as a signaling molecule that activates immune cells, and tomato LysRS may participate in a root-specific process possibly connected to oxidative-stress conditions or heavy metal uptake. It is known that human tRNAlys and LysRS are specifically packaged into HIV-1, suggesting a role for LysRS in tRNA packaging. 58593 cd04323: AsnRS_cyto_like_N: N-terminal, anticodon recognition domain of the type found in human and Saccharomyces cerevisiae cytoplasmic asparaginyl-tRNA synthetase (AsnRS), in Brugia malayi AsnRS, and in various putative bacterial AsnRSs. This domain is a beta-barrel domain (OB fold) involved in binding the tRNA anticodon stem-loop. The enzymes in this group are homodimeric class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids (AAs) to their cognate tRNAs during protein biosynthesis. This 2-step reaction involves i) the activation of the AA by ATP in the presence of magnesium ions, followed by ii) the transfer of the activated AA to the terminal ribose of tRNA. In the case of the class 2b aaRSs, the activated AA is attached to the 3'OH of the terminal ribose. Eukaryotes contain 2 sets of aaRSs, both of which are encoded by the nuclear genome. One set is concerned with cytoplasmic protein synthesis, whereas the other is concerned exclusively with mitochondrial protein synthesis. AsnRS is an immunodominant antigen of the filarial nematode B. 
malayi and is of interest as a target for anti-parasitic drug design. Human AsnRS has been shown to be a pro-inflammatory chemokine which interacts with CCR3 chemokine receptors on T cells, immature dendritic cells and macrophages. 58594 cd03173: DUF619-like: This CD includes the DUF619 domain of various N-acetylglutamate synthases (NAGS) of the urea cycle found in humans and fish, the DUF619 domain of the NAGS of the fungal arginine-biosynthetic pathway (FABP), as well as the DUF619 domain present C-terminal of a NAG kinase-like domain seen in a limited number of predicted NAGSs found in bacteria and Dictyostelium. Ureogenic NAGS is a mitochondrial enzyme catalyzing the formation of NAG from acetylcoenzyme A and L-glutamate. NAG is an essential allosteric activator of carbamylphosphate synthase I, the first and rate-limiting enzyme of the urea cycle. Domain architecture of ureogenic and fungal NAGS consists of an N-terminal NAG kinase-like domain and a C-terminal DUF619 domain. This CD also includes the DUF619 domain of the FABP N-acetylglutamate kinase (NAGK). The nuclear-encoded mitochondrial polyprotein precursor (ARG5,6) consists of an N-terminal NAGK (ArgB) domain, a central DUF619 domain, and a C-terminal reductase domain (ArgC, N-acetylglutamate phosphate reductase). The DUF619 domain is yet to be characterized. 58595 cd04263: DUF619-NAGK-FABP: DUF619 domain of N-acetylglutamate kinase (NAGK) of the fungal arginine-biosynthetic pathway (FABP). The nuclear-encoded, mitochondrial polyprotein precursor (ARG5,6) consists of an N-terminal NAGK (ArgB) domain, a central DUF619 domain, and a C-terminal reductase domain (ArgC, N-Acetylglutamate Phosphate Reductase, NAGPR). The precursor is cleaved into two distinct enzymes (NAGK-DUF619 and NAGPR) in the mitochondria. Native molecular weights of these proteins indicate that the kinase is an octamer whereas the reductase is a dimer. The DUF619 domain is yet to be characterized. 
58596 cd04264: DUF619-NAGS: This CD includes the DUF619 domain of various N-acetylglutamate Synthases (NAGS) of the urea cycle found in humans and fish, the DUF619 domain of the NAGS of the fungal arginine-biosynthetic pathway (FABP), as well as the DUF619 domain present C-terminal of a NAG kinase-like domain seen in a limited number of predicted NAGSs found in bacteria and Dictyostelium. Ureogenic NAGS is a mitochondrial enzyme catalyzing the formation of NAG from acetylcoenzyme A and L-glutamate. NAG is an essential allosteric activator of carbamylphosphate synthase I, the first and rate limiting enzyme of the urea cycle. Domain architecture of ureogenic and fungal NAGS consists of an N-terminal NAG kinase-like domain and a C-terminal DUF619 domain. The DUF619 domain is yet to be characterized. 58597 cd04265: DUF619-NAGS-U: This CD includes the DUF619 domain of various N-acetylglutamate Synthases (NAGS) of the urea (U) cycle found in humans and fish, as well as the DUF619 domain present C-terminal of a NAG kinase-like domain seen in a limited number of bacterial and Dictyostelium predicted NAGSs. Ureogenic NAGS is a mitochondrial enzyme catalyzing the formation of NAG from acetylcoenzyme A and L-glutamate. NAG is an essential allosteric activator of carbamylphosphate synthase I, the first and rate limiting enzyme of the urea cycle. Domain architecture of ureogenic NAGS consists of an N-terminal NAG kinase-like domain and a C-terminal DUF619 domain. The DUF619 domain is yet to be characterized. 58598 cd04266: DUF619-NAGS-FABP: DUF619 domain of N-acetylglutamate synthase (NAGS) of the fungal arginine-biosynthetic pathway (FABP). This NAGS (ARG2) consists of an N-terminal NAG kinase-like domain and a C-terminal DUF619 domain. The DUF619 domain, yet to be characterized, is predicted to function in NAG synthase association in fungi. 
58599 cd02115: Amino Acid Kinases (AAK) superfamily, catalytic domain; present in enzymes such as N-acetylglutamate kinase (NAGK), carbamate kinase (CK), aspartokinase (AK), glutamate-5-kinase (G5K) and UMP kinase (UMPK). The AAK superfamily includes kinases that phosphorylate a variety of amino acid substrates. These kinases catalyze the formation of phosphoric anhydrides, generally with a carboxylate, use ATP as the source of the phosphoryl group, and are involved in amino acid biosynthesis. Some of these kinases control the process via allosteric feed-back inhibition. 58600 cd04234: AAK_AK: Amino Acid Kinase Superfamily (AAK), Aspartokinase (AK); this CD includes the N-terminal catalytic domain of aspartokinase (4-L-aspartate-4-phosphotransferase). AK is the first enzyme in the biosynthetic pathway of the aspartate family of amino acids (lysine, threonine, methionine, and isoleucine) and the bacterial cell wall component, meso-diaminopimelate. It catalyzes the conversion of aspartate and ATP to aspartylphosphate and ADP. One mechanism for the regulation of this pathway is by the production of several isoenzymes of aspartokinase with different repressors and allosteric inhibitors. Pairs of ACT domains are proposed to specifically bind amino acids leading to allosteric regulation of the enzyme. In Escherichia coli, three different aspartokinase isoenzymes are regulated specifically by lysine, methionine, and threonine. AK-HSDHI (ThrA) and AK-HSDHII (MetL) are bifunctional enzymes that consist of an N-terminal AK and a C-terminal homoserine dehydrogenase (HSDH). ThrA and MetL are involved in threonine and methionine biosynthesis, respectively. The third isoenzyme, AKIII (LysC), is monofunctional and is involved in lysine synthesis. The three Bacillus subtilis isoenzymes, AKI (DapG), AKII (LysC), and AKIII (YclM), are feedback-inhibited by meso-diaminopimelate, lysine, and lysine plus threonine, respectively. The E. 
coli lysine-sensitive AK is described as a homodimer, whereas the B. subtilis lysine-sensitive AK is described as a heterodimeric complex of alpha- and beta-subunits that are formed from two in-frame overlapping genes. A single AK enzyme type has been described in Pseudomonas, Amycolatopsis, and Corynebacterium. The fungal aspartate pathway is regulated at the AK step, with L-Thr being an allosteric inhibitor of the Saccharomyces cerevisiae AK (Hom3). At least two distinct AK isoenzymes can occur in higher plants: one is a monofunctional lysine-sensitive isoenzyme, which is involved in the overall regulation of the pathway and can be synergistically inhibited by S-adenosylmethionine. The other isoenzyme is a bifunctional, threonine-sensitive AK-HSDH protein. Also included in this CD is the catalytic domain of the Methylomicrobium alcaliphilum ectoine AK, the first enzyme of the ectoine biosynthetic pathway, found in this bacterium and several other halophilic/halotolerant bacteria. 58601 cd04235: AAK_CK: Carbamate kinase (CK) catalyzes both the ATP-phosphorylation of carbamate and carbamoyl phosphate (CP) utilization with the production of ATP from ADP and CP. Both CK (this CD) and nonhomologous CP synthetase synthesize carbamoyl phosphate, an essential precursor of arginine and pyrimidine bases, in the presence of ATP, bicarbonate, and ammonia. CK is a homodimer of 33 kDa subunits and is a member of the Amino Acid Kinase Superfamily (AAK). 58602 cd04236: AAK_NAGS-Urea: N-acetylglutamate (NAG) kinase-like domain of the NAG Synthase (NAGS) of the urea cycle found in animals. Ureogenic NAGS is a mitochondrial enzyme catalyzing the formation of NAG from acetylcoenzyme A and L-glutamate; NAG is an essential allosteric activator of carbamylphosphate synthase I, the first and rate-limiting enzyme of the urea cycle. Ureogenic NAGS activity is dependent on the concentration of glutamate (substrate) and arginine (activator). 
Domain architecture of ureogenic NAGS consists of an N-terminal NAG kinase-like (ArgB) domain (this CD) and a C-terminal DUF619 domain. Members of this CD belong to the Amino Acid Kinase Family (AAKF) protein superfamily. 58603 cd04237: AAK_NAGS-ABP: N-acetylglutamate (NAG) kinase-like domain of the NAG Synthase (NAGS) of the arginine-biosynthesis pathway (ABP) found in gamma- and beta-proteobacteria and higher plant chloroplasts. Domain architecture of these NAGS consists of an N-terminal NAG kinase-like (ArgB) domain (this CD) and a C-terminal NAG synthase, acetyltransferase (ArgA) domain. Both bacterial and plant sequences in this CD have a conserved N-terminal extension; a similar sequence in the NAG kinases of the cyclic arginine-biosynthesis pathway has been implicated in feedback inhibition sensing. Plant sequences also have an N-terminal chloroplast transit peptide and an insert (approx. 70 residues) in the C-terminal region of ArgB. Members of this CD belong to the Amino Acid Kinase Superfamily (AAK). 58604 cd04238: AAK_NAGK-like: N-Acetyl-L-glutamate kinase (NAGK)-like. Included in this CD are the Escherichia coli and Pseudomonas aeruginosa type NAGKs which catalyze the phosphorylation of N-acetyl-L-glutamate (NAG) by ATP in the second step of arginine biosynthesis found in bacteria and photosynthetic organisms using either the acetylated, noncyclic (NC), or non-acetylated, cyclic (C) route of ornithine biosynthesis. Also included in this CD is a distinct group of uncharacterized (UC) bacterial and archaeal NAGKs. Members of this CD belong to the Amino Acid Kinase Superfamily (AAK). 58605 cd04239: AAK_UMPK-like: UMP kinase (UMPK)-like, the microbial/chloroplast uridine monophosphate kinase (uridylate kinase) enzyme that catalyzes UMP phosphorylation and plays a key role in pyrimidine nucleotide biosynthesis. 
Regulation of this process is via feed-back control and via gene repression of carbamoyl phosphate synthetase (the first enzyme of the pyrimidine biosynthesis pathway). The UMP kinases of E. coli (Ec) and Pyrococcus furiosus (Pf) are known to function as homohexamers, with GTP and UTP being allosteric effectors. Like other related enzymes (carbamate kinase, aspartokinase, and N-acetylglutamate kinase), the E. coli and most bacterial UMPKs have a conserved, N-terminal lysine residue proposed to function in the catalysis of the phosphoryl group transfer, whereas most archaeal UMPKs appear to lack this residue, and the Pyrococcus furiosus structure has an additional Mg ion bound to the ATP molecule which is proposed to function in catalysis instead. Also included in this CD are the alpha and beta subunits of the Mo storage protein (MosA and MosB) characterized as an alpha4-beta4 octamer containing an ATP-dependent, polynuclear molybdenum-oxide cluster. These and related sequences in this CD are members of the Amino Acid Kinase Superfamily (AAK). 58606 cd04240: AAK_UC: Uncharacterized (UC) amino acid kinase-like proteins found mainly in archaea and a few bacteria. Sequences in this CD are members of the Amino Acid Kinase (AAK) superfamily. 58607 cd04241: AAK_FomA-like: This CD includes a fosfomycin biosynthetic gene product, FomA, and similar proteins found in a wide range of organisms. Together, the fomA and fomB genes in the fosfomycin biosynthetic gene cluster of Streptomyces wedmorensis confer high-level fosfomycin resistance. FomA and FomB proteins converted fosfomycin to fosfomycin monophosphate and fosfomycin diphosphate in the presence of ATP and a magnesium ion, indicating that FomA and FomB catalyzed phosphorylations of fosfomycin and fosfomycin monophosphate, respectively. FomA and related sequences in this CD are members of the Amino Acid Kinase Superfamily (AAK). 
58608 cd04242: AAK_G5K_ProB: Glutamate-5-kinase (G5K) catalyzes glutamate-dependent ATP cleavage; G5K transfers the terminal phosphoryl group of ATP to the gamma-carboxyl group of glutamate, in the first and controlling step of proline (and, in mammals, ornithine) biosynthesis. G5K is subject to feedback allosteric inhibition by proline or ornithine. In microorganisms and plants, proline plays an important role as an osmoprotectant and, in mammals, ornithine biosynthesis is crucial for proper ammonia detoxification, since a G5K mutation has been shown to cause human hyperammonaemia. Microbial G5K generally consists of two domains: a catalytic G5K domain and one PUA (pseudo uridine synthases and archaeosine-specific transglycosylases) domain, and some lack the PUA domain. G5K requires free Mg for activity, it is tetrameric, and it aggregates to higher forms in a proline-dependent way. G5K lacking the PUA domain remains tetrameric, active, and proline-inhibitable, but the Mg requirement and the proline-triggered aggregation are greatly diminished and abolished, respectively, and more proline is needed for inhibition. Although plant and animal G5Ks are part of a bifunctional polypeptide, delta 1-pyrroline-5-carboxylate synthetase (P5CS), composed of an N-terminal G5K (ProB) and a C-terminal glutamyl 5- phosphate reductase (G5PR; ProA); bacterial and yeast G5Ks are monofunctional single-polypeptide enzymes. In this CD, all three domain architectures are present: G5K, G5K+PUA, and G5K+G5PR. 58609 cd04243: AAK_AK-HSDH-like: Amino Acid Kinase Superfamily (AAK), AK-HSDH-like; this family includes the N-terminal catalytic domain of aspartokinase (AK) of the bifunctional enzyme AK- homoserine dehydrogenase (HSDH). These aspartokinases are found in such bacteria as E. coli (AKI-HSDHI, ThrA and AKII-HSDHII, MetL) and in higher plants (Z. mays AK-HSDH). AK and HSDH are the first and third enzymes in the biosynthetic pathway of the aspartate family of amino acids. 
AK catalyzes the phosphorylation of Asp to P-aspartyl phosphate. HSDH catalyzes the NADPH-dependent conversion of Asp 3-semialdehyde to homoserine. ThrA and MetL are involved in threonine and methionine biosynthesis, respectively. In E. coli, ThrA is subject to allosteric regulation by the end product L-threonine and the native enzyme is reported to be tetrameric. As with bacteria, plant AK and HSDH are feedback inhibited by pathway end products. Maize AK-HSDH is a Thr-sensitive 180-kD enzyme. Arabidopsis AK-HSDH is an alanine-activated, threonine-sensitive enzyme whose ACT domains, located C-terminal to the AK catalytic domain, were shown to be involved in allosteric activation. Also included in this CD is the catalytic domain of the aspartokinase (AK) of the lysine-sensitive aspartokinase isoenzyme AKIII, a monofunctional class enzyme (LysC) found in some bacteria such as E. coli. In E. coli, LysC is reported to be a homodimer of 50 kD subunits. Also included in this CD is the catalytic domain of aspartokinase (AK) of the bifunctional enzyme AK - DAP decarboxylase (DapDC) found in some bacteria. DapDC, which is the lysA gene product, catalyzes the decarboxylation of DAP to lysine. 58610 cd04244: AAK_AK-LysC-like: Amino Acid Kinase Superfamily (AAK), AK-LysC-like; this CD includes the N-terminal catalytic aspartokinase (AK) domain of the lysine-sensitive AK isoenzyme found in higher plants. The lysine-sensitive AK isoenzyme is a monofunctional protein. It is involved in the overall regulation of the aspartate pathway and can be synergistically inhibited by S-adenosylmethionine. Also included in this CD is an uncharacterized LysC-like AK found in Euryarchaeota and some bacteria. AK catalyzes the conversion of aspartate and ATP to aspartylphosphate and ADP. 
58611 cd04245: AAK_AKiii-YclM-BS: Amino Acid Kinase Superfamily (AAK), AKiii-YclM-BS; this CD includes the N-terminal catalytic aspartokinase (AK) domain of the lysine plus threonine-sensitive aspartokinase isoenzyme AKIII, a monofunctional class enzyme found in Bacilli (Bacillus subtilis YclM) and Clostridia species. Aspartokinase is the first enzyme in the aspartate metabolic pathway and catalyzes the conversion of aspartate and ATP to aspartylphosphate and ADP. In Bacillus subtilis (BS), YclM is reported to be a single polypeptide of 50 kD. The Bacillus subtilis 168 AKIII is induced by lysine and repressed by threonine, and it is synergistically inhibited by lysine and threonine. 58612 cd04246: AAK_AK-DapG-like: Amino Acid Kinase Superfamily (AAK), AK-DapG-like; this CD includes the N-terminal catalytic aspartokinase (AK) domain of the diaminopimelate-sensitive aspartokinase isoenzyme AKI (DapG), a monofunctional enzyme found in Bacilli (Bacillus subtilis 168), Clostridia, and Actinobacteria bacterial species, as well as the catalytic AK domain of the lysine-sensitive aspartokinase isoenzyme AKII of Bacillus subtilis 168, the lysine plus threonine-sensitive aspartokinase of Corynebacterium glutamicum, and related isoenzymes. In Bacillus subtilis, the regulation of the diaminopimelate-lysine biosynthetic pathway involves dual control by diaminopimelate and lysine, effected through separate diaminopimelate- and lysine-sensitive aspartokinase isoenzymes. The role of the AKI isoenzyme is most likely to provide a constant level of aspartyl-beta-phosphate for the biosynthesis of diaminopimelate for peptidoglycan synthesis and dipicolinate during sporulation. The B. subtilis 168 AKII is induced by methionine, and repressed and inhibited by lysine. In Corynebacterium glutamicum and various other Gram-positive bacteria, the DAP-lysine pathway is feedback regulated by the concerted action of lysine and threonine. 
Also included in this CD are the aspartokinases of the extreme thermophile, Thermus thermophilus HB27, the Gram-negative obligate methylotroph, Methylophilus methylotrophus AS1, and those single aspartokinase isoenzyme types found in Pseudomonas, C. glutamicum, and Amycolatopsis lactamdurans. The B. subtilis AKI is tetrameric, consisting of two alpha and two beta subunits; the alpha (43 kD) and beta (17 kD) subunits are formed by two in-phase overlapping genes. The alpha subunit contains the AK catalytic domain and two ACT domains. The beta subunit contains two ACT domains. The B. subtilis 168 AKII aspartokinase is also described as tetrameric, consisting of two alpha and two beta subunits. Some archaeal aspartokinases in this group lack recognizable ACT domains. 58613 cd04247: AAK_AK-Hom3: Amino Acid Kinase Superfamily (AAK), AK-Hom3; this CD includes the N-terminal catalytic domain of the aspartokinase HOM3, a monofunctional class enzyme found in Saccharomyces cerevisiae and other related AK domains. Aspartokinase, the first enzyme in the aspartate metabolic pathway, catalyzes the conversion of aspartate and ATP to aspartylphosphate and ADP, and in fungi, is responsible for the production of threonine, isoleucine and methionine. S. cerevisiae has a single aspartokinase isoenzyme type, which is regulated by feedback, allosteric inhibition by L-threonine. Recent studies show that the allosteric transition triggered by binding of threonine to AK involves a large change in the conformation of the native hexameric enzyme that is converted to an inactive one of different shape and substantially smaller hydrodynamic size. 58614 cd04248: AAK_AK-Ectoine: Amino Acid Kinase Superfamily (AAK), AK-Ectoine; this CD includes the N-terminal catalytic domain of the aspartokinase of the ectoine (1,4,5,6-tetrahydro-2-methyl pyrimidine-4-carboxylate) biosynthetic pathway found in Methylomicrobium alcaliphilum, Vibrio cholerae, and other various halotolerant or halophilic bacteria. 
Bacteria exposed to hyperosmotic stress accumulate organic solutes called 'compatible solutes' of which ectoine, a heterocyclic amino acid, is one. Apart from its osmotic function, ectoine also exhibits a protective effect on proteins, nucleic acids and membranes against a variety of stress factors. De novo synthesis of ectoine starts with the phosphorylation of L-aspartate and shares its first two enzymatic steps with the biosynthesis of amino acids of the aspartate family: aspartokinase and L-aspartate-semialdehyde dehydrogenase. The M. alcaliphilum and the V. cholerae aspartokinases are encoded on the ectABCask operon. 58615 cd04249: AAK_NAGK-NC: N-Acetyl-L-glutamate kinase - noncyclic (NAGK-NC) catalyzes the phosphorylation of the gamma-COOH group of N-acetyl-L-glutamate (NAG) by ATP in the second step of microbial arginine biosynthesis using the acetylated, noncyclic route of ornithine biosynthesis. There are two variants of this pathway. In one, typified by the pathway in Escherichia coli, glutamate is acetylated by acetyl-CoA and acetylornithine is deacylated hydrolytically. In this pathway, feedback inhibition by arginine occurs at the initial acetylation of glutamate and not at the phosphorylation of NAG by NAGK. Homodimeric NAGK-NC are members of the Amino Acid Kinase Superfamily (AAK). 58616 cd04250: AAK_NAGK-C: N-Acetyl-L-glutamate kinase - cyclic (NAGK-C) catalyzes the phosphorylation of the gamma-COOH group of N-acetyl-L-glutamate (NAG) by ATP in the second step of arginine biosynthesis found in some bacteria and photosynthetic organisms using the non-acetylated, cyclic route of ornithine biosynthesis. In this pathway, glutamate is first N-acetylated and then phosphorylated by NAGK to give phosphoryl NAG, which is converted to NAG-ornithine. There are two variants of this pathway. 
In one, typified by the pathway in Thermotoga maritima and Pseudomonas aeruginosa, the acetyl group is recycled by reversible transacetylation from acetylornithine to glutamate. The phosphorylation of NAG by NAGK is feedback inhibited by arginine. In photosynthetic organisms, NAGK is the target of the nitrogen-signaling protein PII. Hexameric formation of NAGK domains appears to be essential to both arginine inhibition and NAGK-PII complex formation. NAGK-C are members of the Amino Acid Kinase Superfamily (AAK). 58617 cd04251: AAK_NAGK-UC: N-Acetyl-L-glutamate kinase - uncharacterized (NAGK-UC). This domain is similar to Escherichia coli and Pseudomonas aeruginosa NAGKs which catalyze the phosphorylation of the gamma-COOH group of N-acetyl-L-glutamate (NAG) by ATP in the second step of microbial arginine biosynthesis. These uncharacterized domain sequences are found in some bacteria (Deinococci and Chloroflexi) and archaea and belong to the Amino Acid Kinase Superfamily (AAK). 58618 cd04252: AAK_NAGK-fArgBP: N-Acetyl-L-glutamate kinase (NAGK) of the fungal arginine-biosynthetic pathway (fArgBP). The nuclear-encoded, mitochondrial polyprotein precursor with an N-terminal NAGK (ArgB) domain (this CD), a central DUF619 domain, and a C-terminal reductase domain (ArgC, N-Acetylglutamate Phosphate Reductase, NAGPR). The precursor is cleaved in the mitochondria into two distinct enzymes (NAGK-DUF619 and NAGPR). Native molecular weights of these proteins indicate that the kinase is an octamer whereas the reductase is a dimer. This CD also includes some gamma-proteobacteria (Xanthomonas and Xylella) NAG kinases with an N-terminal NAGK (ArgB) domain (this CD) and a C-terminal DUF619 domain. The DUF619 domain is described as a putative distant homolog of the acetyltransferase, ArgA, predicted to function in NAG synthase association in fungi. Eukaryotic sequences have an N-terminal mitochondrial transit peptide. 
Members of this NAG kinase domain CD belong to the Amino Acid Kinase Superfamily (AAK). 58619 cd04253: AAK_UMPK-PyrH-Pf: UMP kinase (UMPK)-Pf, the mostly archaeal uridine monophosphate kinase (uridylate kinase) enzymes that catalyze UMP phosphorylation and play a key role in pyrimidine nucleotide biosynthesis; regulation of this process is via feedback control and via gene repression of carbamoyl phosphate synthetase (the first enzyme of the pyrimidine biosynthesis pathway). The UMP kinase of Pyrococcus furiosus (Pf) is known to function as a homohexamer, with GTP and UTP being allosteric effectors. Like other related enzymes (carbamate kinase, aspartokinase, and N-acetylglutamate kinase) the E. coli and most bacterial UMPKs have a conserved, N-terminal, lysine residue proposed to function in the catalysis of the phosphoryl group transfer, whereas most archaeal UMPKs (this CD) appear to lack this residue and the Pyrococcus furiosus structure has an additional Mg ion bound to the ATP molecule which is proposed to function in catalysis instead. Members of this CD belong to the Amino Acid Kinase Superfamily (AAK). 58620 cd04254: UMP kinase (UMPK)-Ec, the microbial/chloroplast uridine monophosphate kinase (uridylate kinase) enzyme that catalyzes UMP phosphorylation and plays a key role in pyrimidine nucleotide biosynthesis; regulation of this process is via feedback control and via gene repression of carbamoyl phosphate synthetase (the first enzyme of the pyrimidine biosynthesis pathway). The UMP kinase of E. coli (Ec) is known to function as a homohexamer, with GTP and UTP being allosteric effectors. Like other related enzymes (carbamate kinase, aspartokinase, and N-acetylglutamate kinase) the E. 
coli and most bacterial and chloroplast UMPKs (this CD) have a conserved, N-terminal, lysine residue proposed to function in the catalysis of the phosphoryl group transfer, whereas most archaeal UMPKs appear to lack this residue and the Pyrococcus furiosus structure has an additional Mg ion bound to the ATP molecule which is proposed to function in catalysis instead. Members of this CD belong to the Amino Acid Kinase Superfamily (AAK). 58621 cd04255: AAK_UMPK-MosAB: This CD includes the alpha and beta subunits of the Mo storage protein (MosA and MosB) which are related to uridine monophosphate kinase (UMPK) enzymes that catalyze the phosphorylation of UMP by ATP, yielding UDP, and playing a key role in pyrimidine nucleotide biosynthesis. The Mo storage protein from the nitrogen-fixing bacterium, Azotobacter vinelandii, is characterized as an alpha4-beta4 octamer containing a polynuclear molybdenum-oxide cluster; binding of Mo is ATP-dependent and release of Mo is pH-dependent. These and related bacterial sequences in this CD are members of the Amino Acid Kinase Superfamily (AAK). 58622 cd04256: AAK_P5CS_ProBA: Glutamate-5-kinase (G5K) domain of the bifunctional delta 1-pyrroline-5-carboxylate synthetase (P5CS), composed of an N-terminal G5K (ProB) and a C-terminal glutamyl 5-phosphate reductase (G5PR, ProA), the first and second enzymes catalyzing proline (and, in mammals, ornithine) biosynthesis. G5K transfers the terminal phosphoryl group of ATP to the gamma-carboxyl group of glutamate, and is subject to feedback allosteric inhibition by proline or ornithine. In plants, proline plays an important role as an osmoprotectant and, in mammals, ornithine biosynthesis is crucial for proper ammonia detoxification, since a G5K mutation has been shown to cause human hyperammonaemia. 
58623 cd04257: AAK_AK-HSDH: Amino Acid Kinase Superfamily (AAK), AK-HSDH; this CD includes the N-terminal catalytic domain of aspartokinase (AK) of the bifunctional enzyme AK - homoserine dehydrogenase (HSDH). These aspartokinases are found in bacteria (E. coli AKI-HSDHI, ThrA and E. coli AKII-HSDHII, MetL) and higher plants (Z. mays AK-HSDH). AK and HSDH are the first and third enzymes in the biosynthetic pathway of the aspartate family of amino acids. AK catalyzes the phosphorylation of Asp to P-aspartyl phosphate. HSDH catalyzes the NADPH-dependent conversion of Asp 3-semialdehyde to homoserine. ThrA and MetL are involved in threonine and methionine biosynthesis, respectively. In E. coli, ThrA is subject to allosteric regulation by the end product L-threonine and the native enzyme is reported to be tetrameric. As with bacteria, plant AK and HSDH are feedback inhibited by pathway end products. Maize AK-HSDH is a Thr-sensitive 180-kD enzyme. Arabidopsis AK-HSDH is an alanine-activated, threonine-sensitive enzyme whose ACT domains, located C-terminal to the AK catalytic domain, were shown to be involved in allosteric activation. 58624 cd04258: AAK_AKiii-LysC-EC: Amino Acid Kinase Superfamily (AAK), AKiii-LysC-EC: this CD includes the N-terminal catalytic aspartokinase (AK) domain of the lysine-sensitive aspartokinase isoenzyme AKIII. AKIII is a monofunctional class enzyme (LysC) found in some bacteria such as E. coli. Aspartokinase is the first enzyme in the aspartate metabolic pathway and catalyzes the conversion of aspartate and ATP to aspartylphosphate and ADP. In E. coli, LysC is reported to be a homodimer of 50 kD subunits. 58625 cd04259: AAK_AK-DapDC: Amino Acid Kinase Superfamily (AAK), AK-DapDC; this CD includes the N-terminal catalytic aspartokinase (AK) domain of the bifunctional enzyme AK - DAP decarboxylase (DapDC) found in some bacteria. 
Aspartokinase is the first enzyme in the aspartate metabolic pathway and catalyzes the conversion of aspartate and ATP to aspartylphosphate and ADP. DapDC, which is the lysA gene product, catalyzes the decarboxylation of DAP to lysine. 58626 cd04260: AAK_AKi-DapG-BS: Amino Acid Kinase Superfamily (AAK), AKi-DapG; this CD includes the N-terminal catalytic aspartokinase (AK) domain of the diaminopimelate-sensitive aspartokinase isoenzyme AKI (DapG), a monofunctional class enzyme found in Bacilli (Bacillus subtilis 168), Clostridia, and Actinobacteria bacterial species. In Bacillus subtilis, the regulation of the diaminopimelate-lysine biosynthetic pathway involves dual control by diaminopimelate and lysine, effected through separate diaminopimelate- and lysine-sensitive aspartokinase isoenzymes. AKI activity is invariant during the exponential and stationary phases of growth and is not altered by addition of amino acids to the growth medium. The role of this isoenzyme is most likely to provide a constant level of aspartyl-beta-phosphate for the biosynthesis of diaminopimelate for peptidoglycan synthesis and dipicolinate during sporulation. The B. subtilis AKI is tetrameric, consisting of two alpha and two beta subunits; the alpha (43 kD) and beta (17 kD) subunits are formed by two in-phase overlapping genes. The alpha subunit contains the AK catalytic domain and two ACT domains. The beta subunit contains two ACT domains. 58627 cd04261: AAK_AKii-LysC-BS: Amino Acid Kinase Superfamily (AAK), AKii; this CD includes the N-terminal catalytic aspartokinase (AK) domain of the lysine-sensitive aspartokinase isoenzyme AKII of Bacillus subtilis 168, and the lysine plus threonine-sensitive aspartokinase of Corynebacterium glutamicum, and related sequences. In B. subtilis 168, the regulation of the diaminopimelate (Dap)-lysine biosynthetic pathway involves dual control by Dap and lysine, effected through separate Dap- and lysine-sensitive aspartokinase isoenzymes. The B. 
subtilis 168 AKII is induced by methionine, and repressed and inhibited by lysine. Although Corynebacterium glutamicum is known to contain a single aspartokinase isoenzyme type, both the succinylase and dehydrogenase variant pathways of DAP-lysine synthesis operate simultaneously in this organism. In this organism and other various Gram-positive bacteria, the DAP-lysine pathway is feedback regulated by the concerted action of lysine and threonine. Also included in this CD are the aspartokinases of the extreme thermophile, Thermus thermophilus HB27, the Gram-negative obligate methylotroph, Methylophilus methylotrophus AS1, and those single aspartokinases found in Pseudomonas, C. glutamicum, and Amycolatopsis lactamdurans. B. subtilis 168 AKII, and the C. glutamicum, Streptomyces clavuligerus and A. lactamdurans aspartokinases are described as tetramers consisting of two alpha and two beta subunits; the alpha (44 kD) and beta (18 kD) subunits are formed by two in-phase overlapping polypeptides. 58628 cd03687: Dehydratase large subunit. This family contains the large (alpha) subunit of B12-dependent glycerol dehydratases (GDHs) and B12-dependent diol dehydratases (DDHs). GDH is isofunctional with DDH. These enzymes can each catalyze the conversion of 1,2-propanediol, glycerol, and 1,2-ethanediol to the corresponding aldehydes via a coenzyme B12 (adenosylcobalamin)-dependent radical mechanism. Both enzymes exhibit a subunit composition of alpha2beta2gamma2. The enzymes differ in substrate specificity; glycerol is the preferred substrate for GDH and 1,2-propanediol for DDH. GDH shows almost equal affinity for both (R) and (S)-isomers while DDH prefers the (S) isomer. GDH plays a key role in the dihydroxyacetone (DHA) pathway and DDH in the anaerobic degradation of 1,2-diols. 
The radical mechanism has been well studied for Klebsiella oxytoca DDH and involves binding of 1,2-propanediol to the enzyme to induce homolytic cleavage of the Co-C5' bond of the coenzyme to form cob(II)alamin and the adenosyl radical. Hydrogen abstraction from the substrate follows, producing a substrate-generated radical and 5'-deoxyadenosine. Rearrangement to the product radical is then followed by abstraction of a hydrogen atom from 5'-deoxyadenosine to produce the hydrated propionaldehyde and regenerate the adenosyl radical. After the Co-C5' bond is reformed and the hydrated aldehyde dehydrated, the process is complete. GDH has a higher affinity for coenzyme B12 than DDH. Both GDH and DDH are activated by various monovalent cations with K+, NH4+, and Rb+ being the most effective. However, DDH differs from GDH in that it is partially active with Cs+ and Na+. In general, the alpha and beta subunits for both enzymes are on different chains. However, for a subset of the GDHs, alpha and beta subunits appear to be on a single chain. 58629 cd03523: NTR_like domain; a beta barrel with an oligosaccharide/oligonucleotide-binding fold found in netrins, complement proteins, tissue inhibitors of metalloproteases (TIMP), and procollagen C-proteinase enhancers (PCOLCE), amongst others. In netrins, the domain plays a role in controlling axon branching in neural development, while the common function of these modules in TIMPs appears to be binding to metzincins. A subset of this family is also known as the C345C domain because it occurs as a C-terminal domain in complement C3, C4 and C5. In C5, the domain interacts with various partners during the formation of the membrane attack complex. 58630 cd03574: NTR/C345C domain; The NTR domains that are found in the C-termini of complement C3, C4 and C5, are also called C345C domains. 
In C5, the domain interacts with various partners during the formation of the membrane attack complex, a fundamental process in the mammalian defense against infection. Its role in components C3 and C4 is not well understood. 58631 cd03575: NTR domain, WFIKKN subfamily; WFIKKN proteins contain a C-terminal NTR domain and are putative secreted proteins which may be multivalent protease inhibitors that act on serine proteases as well as metalloproteases. Human WFIKKN and a related protein sharing the same domain architecture were observed to have distinct tissue expression patterns. WFIKKN is also referred to as growth and differentiation factor-associated serum protein-1 (GASP-1). It inhibits the activity of mature myostatin, a specific regulator of skeletal muscle mass and a member of the TGFbeta superfamily. 58632 cd03576: NTR domain, PCOLCE subfamily; Procollagen C-endopeptidase enhancers (PCOLCEs) are extracellular matrix proteins that enhance the activity of procollagen C-proteases, by binding to the procollagen I C-peptide. They contain a C-terminal NTR domain, which has been suggested to possess inhibitory functions towards specific serine proteases but not towards metzincins, which are inhibited by the related TIMPs. 58633 cd03577: NTR domain, TIMP-like subfamily; TIMPs, or tissue inhibitors of metalloproteases, are essential regulators of extracellular matrix turnover and remodeling. They form complexes with matrix metalloproteases (MMPs) and inactivate them irreversibly by non-covalently binding their active zinc-binding sites. This group contains domains similar to the TIMP NTR domain, which binds MMPs. Members of this group may or may not function as MMP inhibitors. 58634 cd03578: NTR domain, Netrin-4-like subfamily; composed of the C-terminal NTR domains of netrin-4 (beta netrin) and similar proteins. Netrins are secreted proteins that function as tropic cues in the direction of axon growth and cell migration during neural development. 
Netrin-4 is a basement membrane component that is important in neural, kidney and vascular development. It may also be involved in regulating the outgrowth and shape of epithelial cells during lung branching morphogenesis. 58635 cd03579: NTR domain, Netrin-1-like subfamily; The C-terminal NTR domain of netrins is also called domain C in the context of C. elegans netrin UNC-6. Netrins are secreted proteins that function as tropic cues in the direction of axon growth and cell migration during neural development. These proteins may be chemoattractive to some neurons and chemorepellant for others. In the case of netrin-1, attraction and repulsion responses are mediated by the DCC and UNC-5 receptor families. The biological activities of C. elegans UNC-6, which may either attract or repel migrating cells or axons, are mediated by its different domains. The C-terminal NTR domain of UNC-6 has been shown to inhibit axon branching activity. 58636 cd03580: NTR domain, Secreted frizzled-related protein (Sfrp) 1-like subfamily; composed of proteins similar to human Sfrp1, Sfrp2 and Sfrp5. Sfrps are soluble proteins containing an NTR domain C-terminal to a cysteine-rich Frizzled domain. They show diverse functions and are thought to work in Wnt signaling indirectly, as modulators or antagonists by binding Wnt ligands, and directly, via the Wnt receptor, Frizzled. They participate in regulating the patterning along the anteroposterior axis in vertebrates. Human Sfrp1 has been found frequently to be downregulated in breast cancer and is associated with disease progression and poor prognosis. 58637 cd03581: NTR domain, Secreted frizzled-related protein (Sfrp) 3-like subfamily; composed of proteins similar to human Sfrp3 and Sfrp4. Sfrps are soluble proteins containing an NTR domain C-terminal to a cysteine-rich Frizzled domain. 
They show diverse functions and are thought to work in Wnt signaling indirectly, as modulators or antagonists by binding Wnt ligands, and directly, via the Wnt receptor, Frizzled. They participate in regulating the patterning along the anteroposterior axis in vertebrates. Human Sfrp3 may suppress the growth and invasiveness of androgen-independent prostate cancer cells. 58638 cd03582: NTR/C345C domain, complement C5 subfamily; The NTR domain found in complement C5 is also known as C345C because it occurs at the C-terminus of complement C3, C4 and C5. Complement C5 is activated by C5 convertase, which itself is a complex between C3b and C3 convertase. The small cleavage fragment, C5a, is the most important small peptide mediator of inflammation, and the larger active fragment, C5b, initiates late events of complement activation. The NTR/C345C domain is important in the function of C5 as it interacts with enzymes that convert C5 to the active form, C5b. The domain has also been found to bind to complement components C6 and C7, and may specifically interact with their factor I modules. 58639 cd03583: NTR/C345C domain, complement C3 subfamily; The NTR domain found in complement C3 is also known as the C345C domain because it occurs at the C-terminus of complement C3, C4 and C5. Complement C3 plays a pivotal role in the activation of the complement systems, as all pathways (classical, alternative, and lectin) result in the processing of C3 by C3 convertase. The larger fragment, activated C3b, contains the NTR/C345C domain and binds covalently, via a reactive thioester, to cell surface carbohydrates including components of bacterial cell walls and immune aggregates. The smaller cleavage product, C3a, acts independently as a diffusible signal to mediate local inflammatory processes. The structure of C3 shows that the NTR/C345C domain is located in an exposed position relative to the rest of the molecule. The function of the domain in complement C3 is poorly understood. 
58640 cd03584: NTR/C345C domain, complement C4 subfamily; The NTR domain found in complement C4 is also known as the C345C domain because it occurs at the C-terminus of complement C3, C4 and C5. Complement C4 is a key player in the activation of the complement classical pathway. C4 is cleaved by activated C1 to yield C4a anaphylatoxin, and the larger fragment C4b, an essential component of the C3- and C5-convertase enzymes. C4b binds covalently to the surface of pathogens through a reactive thioester. The role of the NTR/C345C domain in C4 (C4b) is unclear. 58641 cd03585: NTR domain, TIMP subfamily; TIMPs, or tissue inhibitors of metalloproteases, are essential regulators of extracellular matrix turnover and remodeling. They form complexes with matrix metalloproteases (MMPs) and inactivate them irreversibly by non-covalently binding their active zinc-binding sites. The levels of activated membrane-type MMPs, MMPs, and free TIMPs determine the balance between matrix degradation and matrix formation or stabilization. Consequently, TIMPs play roles in processes that require the remodeling and degradation of connective tissue, such as development, morphogenesis, wound healing, as well as in various diseases and pathological states such as tumor cell metastasis, arthritis, and atherosclerosis. Most TIMPs bind to a variety of MMPs. TIMP-1 and TIMP-2 appear to be multifunctional proteins with diverse biological action. They may exhibit growth factor-like activity and can inhibit angiogenesis. TIMP-3 has been implicated in apoptosis. 58642 cd00307: Ribulose bisphosphate carboxylase/oxygenase (Rubisco), small subunit and related proteins. Rubisco is a bifunctional enzyme that catalyzes the initial steps of two opposing metabolic pathways: photosynthetic carbon fixation and the competing process of photorespiration. Rubisco Form I, present in plants and green algae, is composed of eight large and eight small subunits. 
The nearly identical small subunits are encoded by a family of nuclear genes. After translation, the small subunits are translocated across the chloroplast membrane, where an N-terminal signal peptide is cleaved off. While the large subunits contain the catalytic activities, it has been shown that the small subunits are important for catalysis by enhancing the catalytic rate through inducing conformational changes in the large subunits. This superfamily also contains specific proteins from cyanobacteria. CcmM plays a role in a CO2 concentrating mechanism, which cyanobacteria need to overcome the low specificity of their Rubisco and fusions to Rubisco activase, a type of chaperone, which promotes and maintains the catalytic activity of Rubisco. CcmM contains an N-terminal carbonic anhydrase fused to four copies of the Rubisco-small subunit domain. 58643 cd03527: Ribulose bisphosphate carboxylase/oxygenase (Rubisco), small subunit. Rubisco is a bifunctional enzyme that catalyzes the initial steps of two opposing metabolic pathways: photosynthetic carbon fixation and the competing process of photorespiration. Rubisco Form I, present in plants and green algae, is composed of eight large and eight small subunits. The nearly identical small subunits are encoded by a family of nuclear genes. After translation, the small subunits are translocated across the chloroplast membrane, where an N-terminal signal peptide is cleaved off. While the large subunits contain the catalytic activities, it has been shown that the small subunits are important for catalysis by enhancing the catalytic rate through inducing conformational changes in the large subunits. 58644 cd00003: Pyridoxine 5'-phosphate (PNP) synthase domain; pyridoxal 5'-phosphate is the active form of vitamin B6 that acts as an essential, ubiquitous coenzyme in amino acid metabolism. In bacteria, formation of pyridoxine 5'-phosphate is a step in the biosynthesis of vitamin B6. 
PNP synthase, a homooctameric enzyme, catalyzes the final step in PNP biosynthesis, the condensation of 1-amino-acetone 3-phosphate and 1-deoxy-D-xylulose 5-phosphate. PNP synthase adopts a TIM barrel topology; intersubunit contacts are mediated by three "extra" helices, generating a tetramer of symmetric dimers with shared active sites; the open state has been proposed to accept substrates and to release products, while most of the catalytic events are likely to occur in the closed state; a hydrophilic channel running through the center of the barrel was identified as the essential structural feature that enables PNP synthase to release water molecules produced during the reaction from the closed, solvent-shielded active site. 58645 cd00351: Thymidylate synthase and pyrimidine hydroxymethylase: Thymidylate synthase (TS) and deoxycytidylate hydroxymethylase (dCMP-HMase) are homologs that catalyze analogous alkylation of C5 of pyrimidine nucleotides. Both enzymes are involved in the biosynthesis of DNA precursors and are active as homodimers. However, they exhibit distinct pyrimidine base specificities and differ in the details of their catalyzed reactions. TS is biologically ubiquitous and catalyzes the conversion of dUMP and methylene-tetrahydrofolate (CH2THF) to dTMP and dihydrofolate (DHF). It also acts as a regulator of its own expression by binding and inactivating its own RNA. Due to its key role in the de novo pathway for thymidylate synthesis and, hence, DNA synthesis, it is one of the most conserved enzymes across species and phyla. TS is a well-recognized target for anticancer chemotherapy, as well as a valuable new target against infectious diseases. Interestingly, in several protozoa, a single polypeptide chain codes for both dihydrofolate reductase (DHFR) and thymidylate synthase (TS), forming a bifunctional enzyme (DHFR-TS), possibly through gene fusion at a single evolutionary point. DHFR-TS is also active as a dimer. 
Virus encoded dCMP-HMase catalyzes the reversible conversion of dCMP and CH2THF to hydroxymethyl-dCMP and THF. This family also includes dUMP hydroxymethylase, which is encoded by several bacteriophages that infect Bacillus subtilis, for their own protection against the host restriction system, and contain hydroxymethyl-dUMP instead of dTMP in their DNA. 58646 cd00633: Secretoglobins are relatively small, secreted, disulphide-bridged dimeric proteins with encoding genes sharing substantial sequence similarity. Their family subunits may be grouped into five subfamilies, A-E. Uteroglobin (subfamily A), which is identical to Clara cell protein (CC10), forms a globular shaped homodimer with a large hydrophobic pocket located between the two dimers. The uteroglobin monomer structure is composed of four alpha helices that do not form a canonical four helix-bundle motif but rather a boomerang-shaped structure in which helices H1, H3, and H4 are able to bind a homodimeric partner. The hydrophobic pocket binds steroids, particularly progesterone, with high specificity. However, the true biological function of uteroglobin is poorly understood. In mammals, uteroglobin has immunosuppressive and anti-inflammatory properties through the inhibition of phospholipase A2. The other four main subfamilies of secretoglobins are found in heterodimeric combinations, with B and C subfamilies disulphide-bridged to the E and D subfamilies, respectively. [See review by Laukaitis C.M. & Karn R.C. (2005). Biological Journal of the Linnean Society 84, 493]. These include rat prostatic steroid-binding protein (PBP or prostatein), human mammaglobin (or heteroglobin), lipophilins, major cat allergen Fel dI, the hamster Harderian gland proteins and mouse salivary androgen-binding protein (ABP). Example of such a heterodimer: ABPalpha-like sequences are closely related to cat Fel dI chain 1, whereas ABPbeta-gamma-like sequences are closely related to Fel dI chain 2. 
Thus, the heterodimeric structure of ABPalpha-beta and ABPalpha-gamma is recapitulated by the sequence-similar Fel dI chains 1 and 2. This conservation of primary and quaternary structure indicates that the genome of the eutherian common ancestor of cats, rodents, and primates contained a similar gene pair. 58647 cd00756: MoaE family. Members of this family are involved in biosynthesis of the molybdenum cofactor (Moco), an essential cofactor for a diverse group of redox enzymes. Moco biosynthesis is an evolutionarily conserved pathway present in eubacteria, archaea and eukaryotes. Moco contains a tricyclic pyranopterin, termed molybdopterin (MPT), which carries the cis-dithiolene group responsible for molybdenum ligation. This dithiolene group is generated by MPT synthase in the second major step in Moco biosynthesis. MPT synthase is a heterotetramer consisting of two large (MoaE) and two small (MoaD) subunits. 58648 cd00922: Cytochrome c oxidase subunit IV. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Found only in eukaryotes, subunit IV is the largest of the nuclear-encoded subunits. It binds ATP at the matrix side, leading to an allosteric inhibition of enzyme activity at high intramitochondrial ATP/ADP ratios. In mammals, subunit IV has a lung-specific isoform and a ubiquitously expressed isoform. 58649 cd00923: Cytochrome c oxidase subunit Va. 
Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Found only in eukaryotes, subunit Va is one of three mammalian subunits that lack a transmembrane region. Subunit Va is located on the matrix side of the membrane and binds thyroid hormone T2, releasing allosteric inhibition caused by the binding of ATP to subunit IV and allowing high turnover at elevated intramitochondrial ATP/ADP ratios. 58650 cd00926: Cytochrome c oxidase subunit VIb. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Found only in eukaryotes, subunit VIb is one of three mammalian subunits that lack a transmembrane region. It is located on the cytosolic side of the membrane and helps form the dimer interface with the corresponding subunit on the other monomer complex. 58651 cd00927: Cytochrome c oxidase subunit VIc. 
Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. The VIc subunit is found only in eukaryotes and its specific function remains unclear. It has been reported that the relative concentrations of some nuclear encoded CcO subunits, including subunit VIc, compared to those of the mitochondrial encoded subunits, are altered significantly during the progression of prostate cancer. 58652 cd00928: Cytochrome c oxidase subunit VIIa. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Found only in eukaryotes, subunit VIIa has two tissue-specific isoforms that are expressed in a developmental manner. VIIa-H is expressed in heart and skeletal muscle but not smooth muscle. VIIa-L is expressed in liver and non-muscle tissues. 58653 cd00929: Cytochrome c oxidase subunit VIIc. 
Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. The VIIc subunit is found only in eukaryotes and its specific function remains unclear. Peroxide inactivation of bovine CcO coincides with the direct oxidation of tryptophan (W19) within subunit VIIc, along with other structural changes in other subunits. 58654 cd00930: Cytochrome c oxidase subunit VIII. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Found only in eukaryotes, subunit VIII is the smallest of the nuclear-encoded subunits. It exists in muscle-specific and non-muscle-specific isoforms that are differentially expressed in different species, suggesting species-specific regulation of energy metabolism. 58655 cd01083: Glycosaminoglycan (GAG) polysaccharide lyase family. 
This family consists of a group of secreted bacterial lyase enzymes capable of acting on glycosaminoglycans, such as hyaluronan and chondroitin, in the extracellular matrix of host tissues, contributing to the invasive capacity of the pathogen. These are broad-specificity glycosaminoglycan lyases which recognize uronyl residues in polysaccharides and cleave their glycosidic bonds via a beta-elimination reaction to form a double bond between C-4 and C-5 of the non-reducing terminal uronyl residues of released products. Substrates include chondroitin, chondroitin 4-sulfate, chondroitin 6-sulfate, and hyaluronic acid. Family members include chondroitin AC lyase, chondroitin ABC lyase, xanthan lyase, and hyaluronate lyase. 58656 cd01403: Cytochrome C oxidase chain VIIb. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. The VIIb subunit is found only in eukaryotes and its specific function remains unclear. A rare polymorphism of the CcO VIIb gene may be associated with a high risk of nasopharyngeal carcinoma in a Cantonese family. 63836 cd00652: TATA box binding protein (TBP): Present in archaea and eukaryotes, TBPs are transcription factors that recognize promoters and initiate transcription. TBP has been shown to be an essential component of three different transcription initiation complexes: SL1, TFIID and TFIIIB, directing transcription by RNA polymerases I, II and III, respectively. 
TBP binds directly to the TATA box promoter element, where it nucleates polymerase assembly, thus defining the transcription start site. TBP's binding in the minor groove induces a dramatic DNA bending while its own structure barely changes. The conserved core domain of TBP, which binds to the TATA box, has a bipartite structure, with intramolecular symmetry generating a saddle-shaped structure that sits astride the DNA. New members of the TBP family, called TBP-like proteins (TBLP, TLF, TLP) or TBP-related factors (TRF1, TRF2, TRP), are similar to the core domain of TBPs, with identical or chemically similar amino acids at many equivalent positions, suggesting similar structure. However, TLFs contain distinct, conserved amino acids at several positions that distinguish them from TBP. 63837 cd04516: eukaryotic TATA box binding protein (TBP): Present in archaea and eukaryotes, TBPs are transcription factors that recognize promoters and initiate transcription. TBP has been shown to be an essential component of three different transcription initiation complexes: SL1, TFIID and TFIIIB, directing transcription by RNA polymerases I, II and III, respectively. TBP binds directly to the TATA box promoter element, where it nucleates polymerase assembly, thus defining the transcription start site. TBP's binding in the minor groove induces a dramatic DNA bending while its own structure barely changes. The conserved core domain of TBP, which binds to the TATA box, has a bipartite structure, with intramolecular symmetry generating a saddle-shaped structure that sits astride the DNA. 63838 cd04517: TBP-like factors (TLF; also called TLP, TRF, TRP), which are found in most metazoans. TLFs and TBPs have well-conserved core domains; however, they only share about 60% similarity. TLFs, like TBPs, interact with TFIIA and TFIIB, which are part of the basal transcription machinery. 
Yet, in contrast to TBPs, TLFs seem not to interact with the TATA-box and even have a negative effect on the transcription of TATA-containing promoters. Recent results indicate that TLFs are involved in the transcription via TATA-less promoters. 63839 cd04518: archaeal TATA box binding protein (TBP): TBPs are transcription factors present in archaea and eukaryotes, that recognize promoters and initiate transcription. TBP has been shown to be an essential component of three different transcription initiation complexes: SL1, TFIID and TFIIIB, directing transcription by RNA polymerases I, II and III, respectively. TBP binds directly to the TATA box promoter element, where it nucleates polymerase assembly, thus defining the transcription start site. TBP's binding in the minor groove induces a dramatic DNA bending while its own structure barely changes. The conserved core domain of TBP, which binds to the TATA box, has a bipartite structure, with intramolecular symmetry generating a saddle-shaped structure that sits astride the DNA. 63840 cd00159: RhoGAP: GTPase-activator protein (GAP) for Rho-like GTPases; GAPs towards Rho/Rac/Cdc42-like small GTPases. Small GTPases (G proteins) cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when bound to GDP. The Rho family of small G proteins, which includes Cdc42Hs, activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. G proteins generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. The RhoGAPs are one of the major classes of regulators of Rho G proteins. 63841 cd04372: RhoGAP_chimaerin: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of chimaerins. 
Chimaerins are a family of phorbol ester- and diacylglycerol-responsive GAPs specific for the Rho-like GTPase Rac. Chimaerins exist in two alternative splice forms that each contain a C-terminal GAP domain, and a central C1 domain which binds phorbol esters, inducing a conformational change that activates the protein; one splice form lacks the N-terminal Src homology-2 (SH2) domain. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 63842 cd04373: RhoGAP_p190: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of p190-like proteins. p190, also named RhoGAP5, plays a role in neuritogenesis and axon branch stability. p190 shows a preference for Rho, over Rac and Cdc42, and consists of an N-terminal GTPase domain and a C-terminal GAP domain. The central portion of p190 contains important regulatory phosphorylation sites. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 
63843 cd04374: RhoGAP_Graf: GTPase-activator protein (GAP) domain for Rho-like GTPases found in GRAF (GTPase regulator associated with focal adhesion kinase); Graf is a multi-domain protein, containing SH3 and PH domains, that binds focal adhesion kinase and influences cytoskeletal changes mediated by Rho proteins. Graf exhibits GAP activity toward RhoA and Cdc42, but only weakly activates Rac1. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 63844 cd04375: RhoGAP_DLC1: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of DLC1-like proteins. DLC1 shows in vitro GAP activity towards RhoA and CDC42. Besides its C-terminal GAP domain, DLC1 also contains a SAM (sterile alpha motif) and a START (StAR-related lipid transfer action) domain. DLC1 has tumor suppressor activity in cell culture. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 63845 cd04376: RhoGAP_ARHGAP6: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of ArhGAP6-like proteins. 
ArhGAP6 shows GAP activity towards RhoA, but not towards Cdc42 and Rac1. ArhGAP6 is often deleted in microphthalmia with linear skin defects syndrome (MLS); MLS is a severe X-linked developmental disorder. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 63846 cd04377: RhoGAP_myosin_IX: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain present in class IX myosins. Class IX myosins contain a characteristic head domain, a neck domain, a tail domain which contains a C6H2-zinc binding motif and a RhoGAP domain. Class IX myosins are single-headed, processive myosins that are partly cytoplasmic, and partly associated with membranes and the actin cytoskeleton. Class IX myosins are implicated in the regulation of neuronal morphogenesis and function of sensory systems, like the inner ear. There are two major isoforms, myosin IXA and IXB with several splice variants, which are both expressed in developing neurons. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 
63847 cd04378: RhoGAP_GMIP_PARG1: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of GMIP (Gem interacting protein) and PARG1 (PTPL1-associated RhoGAP1). GMIP plays important roles in neurite growth and axonal guidance, and interacts with Gem, a member of the RGK subfamily of the Ras small GTPase superfamily, through the N-terminal half of the protein. GMIP contains a C-terminal RhoGAP domain. GMIP inhibits RhoA function, but is inactive towards Rac1 and Cdc42. PARG1 interacts with Rap2, also a member of the Ras small GTPase superfamily whose exact function is unknown, and shows strong preference for Rho. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 63848 cd04379: RhoGAP_SYD1: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain present in SYD-1-like proteins. Syd-1, first identified and best studied in C. elegans, has been shown to play an important role in neuronal development by specifying axonal properties. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 
63849 cd04380: RhoGAP_OCRL1: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain present in OCRL1-like proteins. OCRL1 (oculocerebrorenal syndrome of Lowe 1)-like proteins contain two conserved domains: a central inositol polyphosphate 5-phosphatase domain and a C-terminal Rho GAP domain; this GAP domain lacks the catalytic residue and therefore may be inactive. OCRL-like proteins are type II inositol polyphosphate 5-phosphatases that can hydrolyze lipid PI(4,5)P2 and PI(3,4,5)P3 and soluble Ins(1,4,5)P3 and Ins(1,3,4,5)P4, but their individual specificities vary. The functionality of the RhoGAP domain is still unclear. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 63850 cd04381: RhoGap_RalBP1: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain present in RalBP1 proteins, also known as RLIP, RLIP76 or cytocentrin. RalBP1 plays an important role in endocytosis during interphase. During mitosis, RalBP1 transiently associates with the centromere and has been shown to play an essential role in the proper assembly of the mitotic apparatus. RalBP1 is an effector of the Ral GTPase which itself is an effector of Ras. RalBP1 contains a RhoGAP domain, which shows weak activity towards Rac1 and Cdc42, but not towards Ral, and a Ral effector domain binding motif. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. 
The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 63851 cd04382: RhoGAP_MgcRacGAP: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain present in MgcRacGAP proteins. MgcRacGAP plays an important dual role in cytokinesis: i) it is part of the centralspindlin complex, together with the mitotic kinesin MKLP1, which is critical for the structure of the central spindle by promoting microtubule bundling; ii) after phosphorylation by aurora B, MgcRacGAP becomes an effective regulator of RhoA and plays an important role in the assembly of the contractile ring and the initiation of cytokinesis. MgcRacGAP-like proteins contain an N-terminal C1-like domain, and a C-terminal RhoGAP domain. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 63852 cd04383: RhoGAP_srGAP: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain present in srGAPs. srGAPs are components of the intracellular part of the Slit-Robo signalling pathway that is important for axon guidance and cell migration. 
srGAPs contain an N-terminal FCH domain, a central RhoGAP domain and a C-terminal SH3 domain; this SH3 domain interacts with the intracellular proline-rich tail of the Roundabout receptor (Robo). This interaction with Robo then activates the RhoGAP domain, which in turn inhibits Cdc42 activity. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 63853 cd04384: RhoGAP_CdGAP: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of CdGAP-like proteins; CdGAP contains an N-terminal RhoGAP domain and a C-terminal proline-rich region, and it is active on both Cdc42 and Rac1 but not RhoA. CdGAP is recruited to focal adhesions via the interaction with the scaffold protein actopaxin (alpha-parvin). Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 63854 cd04385: RhoGAP_ARAP: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain present in ARAPs. ARAPs (also known as centaurin deltas) contain, besides the RhoGAP domain, an ArfGAP, ankyrin repeat, Ras-associating, and PH domains. 
Since their ArfGAP activity is PIP3-dependent, ARAPs are considered integration points for phosphoinositide, Arf and Rho signaling. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 63855 cd04386: RhoGAP_nadrin: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of Nadrin-like proteins. Nadrin, also named Rich-1, has been shown to be involved in the regulation of Ca2+-dependent exocytosis in neurons and recently has been implicated in tight junction maintenance in mammalian epithelium. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 63856 cd04387: RhoGAP_Bcr: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of Bcr (breakpoint cluster region protein)-like proteins. Bcr is a multidomain protein with a variety of enzymatic functions. It contains a RhoGAP and a Rho GEF domain, a Ser/Thr kinase domain, an N-terminal oligomerization domain, and a C-terminal PDZ binding domain, in addition to PH and C2 domains. 
Bcr is a negative regulator of: i) Rac GTPase, via the Rho GAP domain, ii) the Ras-Raf-MEK-ERK pathway, via phosphorylation of the Ras binding protein AF-6, and iii) the Wnt signaling pathway through binding beta-catenin. Bcr can form a complex with beta-catenin and Tcf1. The Wnt signaling pathway is involved in cell proliferation, differentiation, and cell renewal. Bcr was discovered as a fusion partner of Abl. The Bcr-Abl fusion is characteristic of a large majority of chronic myelogenous leukemias (CML). Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 63857 cd04388: RhoGAP_p85: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain present in the p85 isoforms of the regulatory subunit of the class IA PI3K (phosphatidylinositol 3'-kinase). This domain is also called Bcr (breakpoint cluster region protein) homology (BH) domain. Class IA PI3Ks are heterodimers, containing a regulatory subunit (p85) and a catalytic subunit (p110) and are activated by growth factor receptor tyrosine kinases (RTKs); this activation is mediated by the p85 subunit. p85 isoforms, alpha and beta, contain a C-terminal p110-binding domain flanked by two SH2 domains, an N-terminal SH3 domain, and a RhoGAP domain flanked by two proline-rich regions. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. 
The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 63858 cd04389: RhoGAP_KIAA1688: GTPase-activator protein (GAP) domain for Rho-like GTPases found in KIAA1688-like proteins; KIAA1688 is a protein of unknown function that contains a RhoGAP domain and a myosin tail homology 4 (MyTH4) domain. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 63859 cd04390: RhoGAP_ARHGAP22_24_25: GTPase-activator protein (GAP) domain for Rho-like GTPases found in ARHGAP22, 24 and 25-like proteins; longer isoforms of these proteins contain an additional N-terminal pleckstrin homology (PH) domain. ARHGAP25 (KIAA0053) has been identified as a GAP for Rac1 and Cdc42. Short isoforms (without the PH domain) of ARHGAP24, called RC-GAP72 and p73RhoGAP, and of ARHGAP22, called p68RacGAP, have been shown to be involved in angiogenesis and endothelial cell capillary formation. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. 
The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 63860 cd04391: RhoGAP_ARHGAP18: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of ArhGAP18-like proteins. The function of ArhGAP18 is unknown. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 63861 cd04392: RhoGAP_ARHGAP19: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of ArhGAP19-like proteins. The function of ArhGAP19 is unknown. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 63862 cd04393: RhoGAP_FAM13A1a: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of FAM13A1, isoform a-like proteins. The function of FAM13A1a is unknown. 
Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 63863 cd04394: RhoGAP-ARHGAP11A: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of ArhGAP11A-like proteins. The mouse homolog of human ArhGAP11A has been detected as a gene exclusively expressed in immature ganglion cells, potentially playing a role in retinal development. The exact function of ArhGAP11A is unknown. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 63864 cd04395: RhoGAP_ARHGAP21: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of ArhGAP21-like proteins. ArhGAP21 is a multi-domain protein, containing RhoGAP, PH and PDZ domains, and is believed to play a role in the organization of the cell-cell junction complex. It has been shown to function as a GAP of Cdc42 and RhoA, and to interact with alpha-catenin and Arf6. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. 
The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 63865 cd04396: RhoGAP_fSAC7_BAG7: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of fungal SAC7 and BAG7-like proteins. Both proteins are GTPase activating proteins of Rho1, but differ functionally in vivo: SAC7, but not BAG7, is involved in the control of Rho1-mediated activation of the PKC-MPK1 pathway. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 63866 cd04397: RhoGAP_fLRG1: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of fungal LRG1-like proteins. Yeast Lrg1p is required for efficient cell fusion, and mother-daughter cell separation, possibly through acting as a RhoGAP specifically regulating 1,3-beta-glucan synthesis. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. 
GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 63867 cd04398: RhoGAP_fRGD1: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of fungal RGD1-like proteins. Yeast Rgd1 is a GAP protein for Rho3 and Rho4 and plays a role in low-pH response. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 63868 cd04399: RhoGAP_fRGD2: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of fungal RGD2-like proteins. Yeast Rgd2 is a GAP protein for Cdc42 and Rho5. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 63869 cd04400: RhoGAP_fBEM3: RhoGAP (GTPase-activator [GAP] protein for Rho-like small GTPases) domain of fungal BEM3-like proteins. Bem3 is a GAP protein of Cdc42, and is specifically involved in the control of the initial assembly of the septin ring in yeast bud formation. 
Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 63870 cd04401: RhoGAP_fMSB1: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of fungal MSB1-like proteins. Msb1 was originally identified as a multicopy suppressor of a temperature-sensitive cdc42 mutation. Msb1 is a positive regulator of the Pkc1p-MAPK pathway and 1,3-beta-glucan synthesis, both of which involve Rho1 regulation. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 63871 cd04402: RhoGAP_ARHGAP20: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of ArhGAP20-like proteins. ArhGAP20, also known as KIAA1391 and RA-RhoGAP, contains a RhoGAP, a RA, and a PH domain, and ANXL repeats. ArhGAP20 is activated by Rap1 and induces inactivation of Rho, which in turn leads to neurite outgrowth. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. 
The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 63872 cd04403: RhoGAP_ARHGAP27_15_12_9: GTPase-activator protein (GAP) domain for Rho-like GTPases found in ARHGAP27 (also called CAMGAP1), ARHGAP15, 12 and 9-like proteins; This subgroup of ARHGAPs are multidomain proteins that contain RhoGAP, PH, SH3 and WW domains. Most members that are studied show GAP activity towards Rac1, some additionally show activity towards Cdc42. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 63873 cd04404: RhoGAP-p50rhoGAP: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of p50RhoGAP-like proteins; p50RhoGAP, also known as RhoGAP-1, contains a C-terminal RhoGAP domain and an N-terminal Sec14 domain which binds phosphatidylinositol 3,4,5-trisphosphate (PtdIns(3,4,5)P3). It is ubiquitously expressed and preferentially active on Cdc42. This subgroup also contains closely related ARHGAP8. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. 
The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 63874 cd04405: RhoGAP_BRCC3-like: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of BRCC3-like proteins. This subgroup also contains two groups of closely related proteins, BRCC3 and DEPDC7, which both contain a C-terminal RhoGAP-like domain and an N-terminal DEP (Disheveled, Egl-10, and Pleckstrin) domain. The function(s) of BRCC3 and DEPDC7 are unknown. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 63875 cd04406: RhoGAP_myosin_IXA: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain present in myosins IXA. Class IX myosins contain a characteristic head domain, a neck domain and a tail domain which contains a C6H2-zinc binding motif and a Rho-GAP domain. Class IX myosins are single-headed, processive myosins that are partly cytoplasmic, and partly associated with membranes and the actin cytoskeleton. Class IX myosins are implicated in the regulation of neuronal morphogenesis and function of sensory systems, like the inner ear. There are two major isoforms, myosin IXA and IXB with several splice variants, which are both expressed in developing neurons. 
Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 63876 cd04407: RhoGAP_myosin_IXB: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain present in myosins IXB. Class IX myosins contain a characteristic head domain, a neck domain and a tail domain which contains a C6H2-zinc binding motif and a Rho-GAP domain. Class IX myosins are single-headed, processive myosins that are partly cytoplasmic, and partly associated with membranes and the actin cytoskeleton. Class IX myosins are implicated in the regulation of neuronal morphogenesis and function of sensory systems, like the inner ear. There are two major isoforms, myosin IXA and IXB with several splice variants, which are both expressed in developing neurons. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 63877 cd04408: RhoGAP_GMIP: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of GMIP (Gem interacting protein). 
GMIP plays important roles in neurite growth and axonal guidance, and interacts with Gem, a member of the RGK subfamily of the Ras small GTPase superfamily, through the N-terminal half of the protein. GMIP contains a C-terminal RhoGAP domain. GMIP inhibits RhoA function, but is inactive towards Rac1 and Cdc42. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 63878 cd04409: RhoGAP_PARG1: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of PARG1 (PTPL1-associated RhoGAP1). PARG1 was originally cloned as an interaction partner of PTPL1, an intracellular protein-tyrosine phosphatase. PARG1 interacts with Rap2, also a member of the Ras small GTPase superfamily whose exact function is unknown, and shows strong preference for Rho. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 63879 cd00594: Ku-core domain; includes the central DNA-binding beta-barrels, polypeptide rings, and the C-terminal arm of Ku proteins. 
The Ku protein consists of two tightly associated homologous subunits, Ku70 and Ku80, and was originally identified as an autoantigen recognized by the sera of patients with an autoimmune disease. In eukaryotes, the Ku heterodimer contributes to genomic integrity through its ability to bind DNA double-strand breaks and facilitate repair by non-homologous end-joining. The bacterial Ku homologs do not contain the conserved N-terminal extension that is present in the eukaryotic Ku protein. 63880 cd00788: Ku-core domain, Ku70 subfamily; Ku70 is a subunit of the Ku protein, which plays a key role in multiple nuclear processes such as DNA repair, chromosome maintenance, transcription regulation, and V(D)J recombination. The mechanism underlying the regulation of all the diverse functions of Ku is still unclear, although it seems that Ku is a multifunctional protein that works in the nuclei. In mammalian cells, the Ku heterodimer recruits the catalytic subunit of DNA-dependent protein kinase (DNA-PK), which is dependent on its association with the Ku70/80 heterodimer bound to DNA for its protein kinase activity. 63881 cd00789: Ku-core domain, Ku-like subfamily; composed of prokaryotic homologs of the eukaryotic DNA binding protein Ku. The alignment includes the core domain shared by the prokaryotic YkoV-like proteins and the eukaryotic Ku70 and Ku80. The prokaryotic Ku homologs are predicted to form homodimers. It is proposed that the Ku homologs are functionally associated with ATP-dependent DNA ligase and the eukaryotic-type primase, probably as components of a double-strand break repair system. 63882 cd00873: Ku-core domain, Ku80 subfamily; Ku80 is a subunit of the Ku protein, which plays a key role in multiple nuclear processes such as DNA repair, chromosome maintenance, transcription regulation, and V(D)J recombination. 
The mechanism underlying the regulation of all the diverse functions of Ku is still unclear, although it seems that Ku is a multifunctional protein that works in nuclei. In mammalian cells, the Ku heterodimer recruits the catalytic subunit of DNA-dependent protein kinase (DNA-PK), which is dependent on its association with the Ku70/80 heterodimer bound to DNA for its protein kinase activity. 63883 cd00390: Urease gamma-subunit; Urease is a nickel-dependent metalloenzyme that catalyzes the hydrolysis of urea to form ammonia and carbon dioxide. Nickel-dependent ureases are found in bacteria, archaea, fungi and plants. Their primary role is to allow the use of external and internally-generated urea as a nitrogen source. The enzyme consists of three subunits, alpha, beta and gamma, which can exist as separate proteins or can be fused on a single protein chain. The alpha-beta-gamma heterotrimer forms multimers, mainly trimers. The large alpha subunit is the catalytic domain containing an active site with a bi-nickel center complexed by a carbamylated lysine. The beta and gamma subunits play a role in subunit association to form the higher order trimers. 63884 cd00407: Urease beta-subunit; Urease is a nickel-dependent metalloenzyme that catalyzes the hydrolysis of urea to form ammonia and carbon dioxide. Nickel-dependent ureases are found in bacteria, archaea, fungi and plants. Their primary role is to allow the use of external and internally-generated urea as a nitrogen source. The enzyme consists of three subunits, alpha, beta and gamma, which can exist as separate proteins or can be fused on a single protein chain. The alpha-beta-gamma heterotrimer forms multimers, mainly trimers. The large alpha subunit is the catalytic domain containing an active site with a bi-nickel center complexed by a carbamylated lysine. The beta and gamma subunits play a role in subunit association to form the higher order trimers. 63885 cd00466: Dehydroquinase (DHQase), type II. 
Dehydroquinase (or 3-dehydroquinate dehydratase) catalyzes the reversible dehydration of 3-dehydroquinate to form 3-dehydroshikimate. This reaction is part of two metabolic pathways: the biosynthetic shikimate pathway and the catabolic quinate pathway. There are two types of DHQases, which are distinct from each other in amino acid sequence and three-dimensional structure. Type I enzymes usually catalyze the biosynthetic reaction using a syn elimination mechanism. In contrast, type II enzymes, found in the quinate pathway of fungi and in the shikimate pathway of many bacteria, are dodecameric enzymes that employ an anti elimination reaction mechanism. 63886 cd00578: L-fucose isomerase (FucIase) and L-arabinose isomerase (AI) family; composed of FucIase, AI and similar proteins. FucIase converts L-fucose, an aldohexose, to its ketose form, which prepares it for aldol cleavage (similar to the isomerization of glucose in glycolysis). L-fucose (or 6-deoxy-L-galactose) is found in various oligo- and polysaccharides in mammals, bacteria and plants. AI catalyzes the isomerization of L-arabinose to L-ribulose, the first reaction in its conversion to D-xylulose-5-phosphate, an intermediate in the pentose phosphate pathway, which allows L-arabinose to be used as a carbon source. AI can also convert D-galactose to D-tagatose at elevated temperatures in the presence of divalent metal ions. D-tagatose, rarely found in nature, is of commercial interest as a low-calorie sugar substitute. 63887 cd03556: L-fucose isomerase (FucIase); FucIase converts L-fucose, an aldohexose, to its ketose form, which prepares it for aldol cleavage (similar to the isomerization of glucose during glycolysis). L-fucose (or 6-deoxy-L-galactose) is found in blood group determinants as well as in various oligo- and polysaccharides, and glycosides in mammals, bacteria and plants. 
63888 cd03557: L-Arabinose isomerase (AI) catalyzes the isomerization of L-arabinose to L-ribulose, the first reaction in its conversion into D-xylulose-5-phosphate, an intermediate in the pentose phosphate pathway, which allows L-arabinose to be used as a carbon source. AI can also convert D-galactose to D-tagatose at elevated temperatures in the presence of divalent metal ions. D-tagatose, rarely found in nature, is of commercial interest as a low-calorie sugar substitute. 63889 cd03829: Seven in absentia (Sina) protein family, C-terminal substrate binding domain; composed of the Drosophila Sina protein, the mammalian Sina homolog (Siah), the plant protein SINAT5, and similar proteins. Sina, Siah and SINAT5 are RING-containing proteins that function as E3 ubiquitin ligases, acting either as single proteins or as a part of multiprotein complexes. Sina is expressed in many cells in the developing eye but is essential specifically for R7 photoreceptor cell development. Sina cooperates with Phyllopod (Phyl), Ebi and the E2 ubiquitin-conjugating enzyme Ubcd1 to catalyze the ubiquitination and subsequent degradation of Tramtrack (Ttk88); Ttk88 is a transcriptional repressor that blocks photoreceptor differentiation. Similarly, the mammalian homologue Siah1 cooperates with SIP (Siah-interacting protein), Ebi and the adaptor protein Skp1, to target beta-catenin for ubiquitination and degradation via a p53-dependent mechanism. SINAT5 targets NAC1 for ubiquitin-mediated degradation resulting in the downregulation of auxin, a hormone that controls many aspects of plant development. Other targets of Sina family proteins include c-Myb, synaptophysin, group 1 glutamate receptors, promyelocytic leukemia protein, alpha-synuclein, synphilin-1 and alpha-ketoglutarate dehydrogenase, among others. Sina proteins also bind proteins that are not targets for ubiquitination such as Phyl, adenomatous polyposis coli, VAV, BAG-1 and Dab-1. 
Siah binds to a consensus motif, PXAXVXP, which is present in Siah-binding proteins. Siah is a dimeric protein consisting of an N-terminal RING domain, two zinc finger motifs and a C-terminal substrate-binding domain (SBD); this SBD contains an eight-stranded antiparallel beta-sandwich fold similar to the MATH (meprin and TRAF-C homology) domain.