
         UniProtKB/Swiss-Prot protein knowledgebase release 57.4 statistics


1.  INTRODUCTION

Release 57.4 of UniProtKB/Swiss-Prot, dated 16-Jun-09, contains 470369 sequence
entries, comprising 166709888 amino acids abstracted from 180531 references.

Since release 57.3, 1563 sequences have been added, the sequence data of
362 existing entries have been updated, and the annotations of
430466 entries have been revised.

Number of fragments: 8407
Number of additional sequences produced by alternative splicing, initiation or promoter usage, or ribosomal frameshifting: 27711


Protein existence (PE):           entries     %

1: Evidence at protein level        65026   13.8%
2: Evidence at transcript level     65985   14.0%
3: Inferred from homology          323911   68.9%
4: Predicted                        13990    3.0%
5: Uncertain                         1457    0.3%

[Figure: growth of the database]


2.  TAXONOMIC ORIGIN

   Total number of species represented in this release of UniProtKB/Swiss-Prot: 11798

   The twenty most represented species account for 105168 sequences, or 22.4%
   of the total number of entries.
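
   The share quoted above can be reproduced from the frequencies listed for
   ranks 1 to 20 in table 2.2. The following is a minimal sketch, not part of
   the release tooling; the names TOP_20_FREQUENCIES and share_of_total are
   illustrative assumptions.

```python
# Total number of entries in release 57.4 (from the introduction).
TOTAL_ENTRIES = 470369

# Frequencies of the twenty most represented species (table 2.2, ranks 1-20).
TOP_20_FREQUENCIES = [
    20330, 16140, 8338, 7384, 6552, 5672, 4957, 4341, 3805, 3793,
    3239, 3060, 2989, 2498, 2210, 2196, 2125, 1984, 1782, 1773,
]


def share_of_total(counts, total):
    """Return (sum of counts, percentage of total rounded to one decimal)."""
    subtotal = sum(counts)
    return subtotal, round(100.0 * subtotal / total, 1)


top_20_sequences, top_20_percent = share_of_total(TOP_20_FREQUENCIES, TOTAL_ENTRIES)
print(top_20_sequences, top_20_percent)
```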


   2.1 Table of the frequency of occurrence of species

        Species represented 1x: 5202
                            2x: 1714
                            3x:  889
                            4x:  567
                            5x:  414
                            6x:  319
                            7x:  238
                            8x:  195
                            9x:  174
                           10x:  102
                       11- 20x:  556
                       21- 50x:  357
                       51-100x:  180
                         >100x:  891


   2.2  Table of the most represented species

  ------  ---------  --------------------------------------------
  Number  Frequency  Species
  ------  ---------  --------------------------------------------
       1      20330  Homo sapiens (Human)
       2      16140  Mus musculus (Mouse)
       3       8338  Arabidopsis thaliana (Mouse-ear cress)
       4       7384  Rattus norvegicus (Rat)
       5       6552  Saccharomyces cerevisiae (Baker's yeast)
       6       5672  Bos taurus (Bovine)
       7       4957  Schizosaccharomyces pombe (Fission yeast)
       8       4341  Escherichia coli (strain K12)
       9       3805  Bacillus subtilis
      10       3793  Dictyostelium discoideum (Slime mold)
      11       3239  Caenorhabditis elegans
      12       3060  Xenopus laevis (African clawed frog)
      13       2989  Drosophila melanogaster (Fruit fly)
      14       2498  Danio rerio (Zebrafish) (Brachydanio rerio)
      15       2210  Pongo abelii (Sumatran orangutan)
      16       2196  Oryza sativa subsp. japonica (Rice)
      17       2125  Gallus gallus (Chicken)
      18       1984  Escherichia coli O157:H7
      19       1782  Methanocaldococcus jannaschii (Methanococcus jannaschii)
      20       1773  Haemophilus influenzae
      21       1744  Salmonella typhimurium
      22       1661  Escherichia coli O6
      23       1656  Shigella flexneri
      24       1466  Mycobacterium tuberculosis
      25       1403  Xenopus tropicalis (Western clawed frog) (Silurana tropicalis)
      26       1353  Sus scrofa (Pig)
      27       1331  Salmonella typhi
      28       1266  Pseudomonas aeruginosa
      29       1202  Mycobacterium bovis
      30       1151  Macaca fascicularis (Crab eating macaque) (Cynomolgus monkey)
      31       1012  Synechocystis sp. (strain PCC 6803)
      32        990  Archaeoglobus fulgidus
      33        987  Yersinia pestis
      34        933  Vibrio cholerae
      35        915  Salmonella paratyphi A
      36        912  Staphylococcus aureus (strain N315)
      37        912  Staphylococcus aureus (strain Mu50 / ATCC 700699)
      38        909  Acanthamoeba polyphaga mimivirus (APMV)
      39        906  Rhizobium meliloti (Sinorhizobium meliloti)
      40        886  Staphylococcus aureus (strain COL)
      41        883  Oryctolagus cuniculus (Rabbit)
      42        883  Staphylococcus aureus (strain MW2)
      43        878  Staphylococcus aureus (strain MSSA476)
      44        875  Staphylococcus aureus (strain MRSA252)
      45        863  Salmonella choleraesuis
      46        861  Escherichia coli O6:K15:H31 (strain 536 / UPEC)
      47        852  Yersinia pseudotuberculosis
      48        851  Shigella sonnei (strain Ss046)
      49        812  Escherichia coli O9:H4 (strain HS)
      50        806  Shigella boydii serotype 4 (strain Sb227)
      51        803  Escherichia coli O139:H28 (strain E24377A / ETEC)
      52        800  Ashbya gossypii (Yeast) (Eremothecium gossypii)
      53        799  Escherichia coli (strain UTI89 / UPEC)
      54        787  Vibrio parahaemolyticus
      55        784  Shigella dysenteriae serotype 1 (strain Sd197)
      56        782  Escherichia coli (strain ATCC 8739 / DSM 1576 / Crooks)
      57        781  Candida albicans (Yeast)
      58        773  Kluyveromyces lactis (Yeast) (Candida sphaerica)
      59        769  Pasteurella multocida
      60        765  Aquifex aeolicus
      61        762  Erwinia carotovora subsp. atroseptica (Pectobacterium atrosepticum)
      62        761  Canis familiaris (Dog)
      63        756  Neurospora crassa
      64        746  Staphylococcus epidermidis (strain ATCC 35984 / RP62A)
      65        745  Staphylococcus epidermidis (strain ATCC 12228)
      66        732  Shigella flexneri serotype 5b (strain 8401)
      67        730  Candida glabrata (Yeast) (Torulopsis glabrata)
      68        730  Streptomyces coelicolor
      69        726  Photorhabdus luminescens subsp. laumondii
      70        725  Vibrio vulnificus
      71        718  Bacillus halodurans
      72        710  Vibrio vulnificus (strain YJ016)
      73        706  Bacillus anthracis
      74        706  Yersinia enterocolitica serotype O:8 / biotype 1B (strain 8081)
      75        703  Escherichia coli (strain SMS-3-5 / SECEC)
      76        699  Yersinia pestis bv. Antiqua (strain Nepal516)
      77        697  Staphylococcus aureus (strain NCTC 8325)
      78        692  Yersinia pestis bv. Antiqua (strain Antiqua)
      79        691  Yersinia pseudotuberculosis serotype O:1b (strain IP 31758)
      80        688  Salmonella paratyphi B (strain ATCC BAA-1250 / SPB7)
      81        687  Mycoplasma pneumoniae
      82        686  Escherichia coli (strain DH10B)
      83        684  Escherichia coli O1:K1 / APEC
      84        682  Pan troglodytes (Chimpanzee)
      85        677  Enterobacter sp. (strain 638)
      86        675  Pseudomonas syringae pv. tomato
      87        673  Klebsiella pneumoniae subsp. pneumoniae (strain ATCC 700721 / MGH 78578)
      88        672  Anabaena sp. (strain PCC 7120)
      89        665  Pseudomonas putida (strain KT2440)
      90        654  Mycobacterium leprae
      91        652  Staphylococcus aureus (strain USA300)
      92        651  Escherichia coli O45:K1 (strain S88 / ExPEC)
      93        650  Escherichia coli O8 (strain IAI1)
      94        649  Yersinia pestis (strain Pestoides F)
      95        648  Escherichia coli (strain SE11)
      96        648  Escherichia coli O157:H7 (strain EC4115 / EHEC)
      97        647  Escherichia coli O17:K52:H18 (strain UMN026 / ExPEC)
      98        647  Citrobacter koseri (strain ATCC BAA-895 / CDC 4225-83 / SGSC4696)
      99        645  Escherichia coli O7:K1 (strain IAI39 / ExPEC)
     100        642  Zea mays (Maize)
     101        641  Escherichia coli
     102        641  Bradyrhizobium japonicum
     103        640  Salmonella enteritidis PT4 (strain P125109)
     104        634  Salmonella heidelberg (strain SL476)
     105        633  Salmonella paratyphi A (strain AKU_12601)
     106        630  Salmonella newport (strain SL254)
     107        629  Staphylococcus aureus (strain bovine RF122 / ET3-1)
     108        629  Salmonella schwarzengrund (strain CVM19633)
     109        628  Serratia proteamaculans (strain 568)
     110        628  Salmonella agona (strain SL483)
     111        625  Bacillus cereus (strain ATCC 14579 / DSM 31)
     112        623  Salmonella dublin (strain CT_02021853)
     113        617  Shigella boydii serotype 18 (strain CDC 3083-94 / BS512)
     114        614  Treponema pallidum
     115        613  Agrobacterium tumefaciens (strain C58 / ATCC 33970)
     116        612  Shewanella oneidensis
     117        607  Salmonella arizonae (strain ATCC BAA-731 / CDC346-86 / RSK2980)
     118        606  Salmonella gallinarum (strain 287/91 / NCTC 13346)
     119        605  Ralstonia solanacearum (Pseudomonas solanacearum)
     120        600  Methanobacterium thermoautotrophicum
     121        598  Rhizobium loti (Mesorhizobium loti)
     122        596  Staphylococcus haemolyticus (strain JCSC1435)
     123        594  Escherichia fergusonii (strain ATCC 35469 / DSM 13698 / CDC 0568-73)
     124        591  Staphylococcus saprophyticus subsp. saprophyticus 
     125        590  Photobacterium profundum (Photobacterium sp. (strain SS9))
     126        588  Listeria monocytogenes
     127        588  Klebsiella pneumoniae (strain 342)
     128        586  Enterobacter sakazakii (strain ATCC BAA-894)
     129        584  Emericella nidulans (Aspergillus nidulans)
     130        584  Xanthomonas campestris pv. campestris
     131        584  Rickettsia prowazekii
     132        583  Yersinia pseudotuberculosis serotype O:3 (strain YPIII)
     133        583  Helicobacter pylori (Campylobacter pylori)
     134        580  Listeria innocua
     135        576  Lactococcus lactis subsp. lactis (Streptococcus lactis)
     136        576  Yersinia pseudotuberculosis serotype IB (strain PB1/+)
     137        576  Yarrowia lipolytica (Candida lipolytica)
     138        575  Debaryomyces hansenii (Yeast) (Torulaspora hansenii)
     139        575  Neisseria meningitidis serogroup B
     140        573  Bacillus cereus (strain ATCC 10987)
     141        572  Buchnera aphidicola subsp. Acyrthosiphon pisum 
     142        567  Brucella melitensis
     143        566  Brucella suis
     144        564  Helicobacter pylori J99 (Campylobacter pylori J99)
     145        562  Buchnera aphidicola subsp. Schizaphis graminum
     146        552  Bacillus thuringiensis subsp. konkukian
     147        551  Neisseria meningitidis serogroup A
     148        548  Xanthomonas axonopodis pv. citri (Citrus canker)
     149        544  Bacillus cereus (strain ZK / E33L)
     150        544  Pseudomonas syringae pv. syringae (strain B728a)
     151        541  Pseudomonas aeruginosa (strain UCBPP-PA14)
     152        540  Bacillus licheniformis (strain DSM 13 / ATCC 14580)
     153        540  Oceanobacillus iheyensis
     154        540  Vibrio fischeri (strain ATCC 700601 / ES114)
     155        539  Yersinia pestis bv. Antiqua (strain Angola)
     156        539  Caulobacter crescentus (Caulobacter vibrioides)
     157        539  Clostridium acetobutylicum
     158        533  Pseudomonas fluorescens (strain Pf0-1)
     159        531  Pseudomonas fluorescens (strain Pf-5 / ATCC BAA-477)
     160        524  Pseudomonas syringae pv. phaseolicola (strain 1448A / Race 6)
     161        522  Listeria monocytogenes serotype 4b (strain F2365)
     162        517  Bordetella bronchiseptica (Alcaligenes bronchisepticus)
     163        515  Xylella fastidiosa
     164        513  Streptococcus pneumoniae
     165        507  Buchnera aphidicola subsp. Baizongia pistaciae
     166        507  Vibrio cholerae serotype O1 (strain ATCC 39541 / Ogawa 395 / O395)
     167        506  Xylella fastidiosa (strain Temecula1 / ATCC 700964)
     168        505  Thermotoga maritima
     169        503  Bordetella parapertussis
     170        503  Sodalis glossinidius (strain morsitans)
     171        502  Chromobacterium violaceum
     172        501  Bordetella pertussis
     173        498  Haemophilus ducreyi
     174        495  Rickettsia conorii
     175        494  Brucella abortus
     176        491  Pseudomonas aeruginosa (strain PA7)
     177        488  Staphylococcus aureus (strain Newman)
     178        488  Deinococcus radiodurans
     179        488  Pseudomonas entomophila (strain L48)
     180        485  Clostridium perfringens
     181        484  Geobacillus kaustophilus
     182        483  Mycoplasma genitalium
     183        483  Haemophilus influenzae (strain 86-028NP)
     184        482  Xanthomonas campestris pv. campestris (strain 8004)
     185        482  Bacillus clausii (strain KSM-K16)
     186        480  Vibrio harveyi (strain ATCC BAA-1116 / BB120)
     187        479  Corynebacterium glutamicum (Brevibacterium flavum)
     188        478  Burkholderia pseudomallei (Pseudomonas pseudomallei)
     189        478  Shewanella sp. (strain MR-7)
     190        477  Mannheimia succiniciproducens (strain MBEL55E)
     191        476  Streptomyces avermitilis
     192        475  Shewanella sp. (strain MR-4)
     193        473  Methanosarcina acetivorans
     194        472  Oryza sativa subsp. indica (Rice)
     195        469  Synechococcus elongatus (strain PCC 7942) (Anacystis nidulans R2)
     196        468  Staphylococcus aureus (strain Mu3 / ATCC 700698)
     197        467  Brucella abortus (strain 2308)
     198        467  Thermosynechococcus elongatus (strain BP-1)
     199        462  Bacillus amyloliquefaciens (strain FZB42)
     200        462  Aspergillus fumigatus (Sartorya fumigata)
     201        461  Pyrococcus horikoshii
     202        461  Enterococcus faecalis (Streptococcus faecalis)
     203        458  Burkholderia sp. (strain 383) (Burkholderia cepacia 
     204        458  Pseudomonas putida (strain F1 / ATCC 700007)
     205        457  Pyrococcus abyssi
     206        456  Burkholderia mallei (Pseudomonas mallei)
     207        456  Erwinia tasmaniensis (strain DSM 17950 / Et1/99)
     208        455  Acinetobacter sp. (strain ADP1)
     209        454  Rhodopseudomonas palustris
     210        454  Anabaena variabilis (strain ATCC 29413 / PCC 7937)
     211        454  Methanosarcina mazei (Methanosarcina frisia)
     212        453  Xanthomonas campestris pv. vesicatoria (strain 85-10)
     213        453  Shewanella sp. (strain ANA-3)
     214        452  Shewanella frigidimarina (strain NCIMB 400)
     215        452  Halobacterium salinarium (Halobacterium halobium)
     216        450  Streptococcus pneumoniae (strain ATCC BAA-255 / R6)
     217        450  Lactobacillus plantarum
     218        450  Rickettsia felis (Rickettsia azadi)
     219        450  Pseudomonas putida (strain GB-1)
     220        448  Ralstonia eutropha (strain JMP134) (Alcaligenes eutrophus)
     221        445  Ralstonia eutropha  (Cupriavidus necator 
     222        444  Thermoanaerobacter tengcongensis
     223        444  Streptococcus mutans
     224        442  Ovis aries (Sheep)
     225        442  Shewanella baltica (strain OS185)
     226        442  Xanthomonas oryzae pv. oryzae (strain MAFF 311018)
     227        439  Staphylococcus aureus (strain JH1)
     228        439  Rhodobacter sphaeroides (strain ATCC 17023 / 2.4.1 / NCIB 8253 / DSM 158)
     229        439  Chlamydia trachomatis
     230        438  Pyrococcus furiosus
     231        437  Streptococcus pyogenes serotype M6
     232        437  Rickettsia bellii (strain RML369-C)
     233        436  Methylococcus capsulatus
     234        434  Nicotiana tabacum (Common tobacco)
     235        434  Hahella chejuensis (strain KCTC 2396)
     236        434  Aeromonas hydrophila subsp. hydrophila (strain ATCC 7966 / NCIB 9240)
     237        433  Staphylococcus aureus (strain JH9)
     238        433  Caenorhabditis briggsae
     239        431  Pseudomonas aeruginosa (strain LESB58)
     240        431  Campylobacter jejuni
     241        430  Pseudomonas mendocina (strain ymp)
     242        430  Pseudoalteromonas haloplanktis (strain TAC 125)
     243        428  Shewanella baltica (strain OS195)
     244        427  Borrelia burgdorferi (Lyme disease spirochete)
     245        427  Colwellia psychrerythraea (strain 34H / ATCC BAA-681) (Vibrio psychroerythus)
     246        426  Shewanella sp. (strain W3-18-1)
     247        426  Aeromonas salmonicida (strain A449)
     248        426  Shewanella putrefaciens (strain CN-32 / ATCC BAA-453)
     249        425  Proteus mirabilis (strain HI4320)
     250        424  Mycobacterium paratuberculosis


   
   2.3  Taxonomic distribution of the sequences

   

   Kingdom        sequences (% of the database)
    Archaea           16278 (  3%)
    Bacteria         286134 ( 61%)
    Eukaryota        153806 ( 33%)
    Viruses           14151 (  3%)


   Within Eukaryota:

   

    Category            sequences (% of Eukaryota) (% of the complete database)
     Human                  20331 ( 13%)           (  4%)
     Other Mammalia         44210 ( 29%)           (  9%)
     Other Vertebrata       15433 ( 10%)           (  3%)
     Viridiplantae          27824 ( 18%)           (  6%)
     Fungi                  23846 ( 16%)           (  5%)
     Insecta                 6574 (  4%)           (  1%)
     Nematoda                3932 (  3%)           (  1%)
     Other                  11656 (  8%)           (  2%)
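
   The percentages in the two tables above are ratios of sequence counts,
   rounded to whole numbers. The sketch below recomputes a few of them and
   checks that the counts add up; the dictionary and function names are
   illustrative assumptions, and only the counts are taken from the tables.

```python
# Total number of entries in release 57.4 (from the introduction).
TOTAL_ENTRIES = 470369

# Sequence counts per kingdom, as listed above.
KINGDOM_COUNTS = {
    "Archaea": 16278,
    "Bacteria": 286134,
    "Eukaryota": 153806,
    "Viruses": 14151,
}

# Sequence counts per category within Eukaryota, as listed above.
EUKARYOTA_COUNTS = {
    "Human": 20331,
    "Other Mammalia": 44210,
    "Other Vertebrata": 15433,
    "Viridiplantae": 27824,
    "Fungi": 23846,
    "Insecta": 6574,
    "Nematoda": 3932,
    "Other": 11656,
}


def percent(part, whole):
    """Percentage of `whole` represented by `part`, rounded to an integer."""
    return round(100.0 * part / whole)


# The kingdom counts cover the whole database, and the category counts
# cover all of Eukaryota.
kingdoms_cover_database = sum(KINGDOM_COUNTS.values()) == TOTAL_ENTRIES
categories_cover_eukaryota = (
    sum(EUKARYOTA_COUNTS.values()) == KINGDOM_COUNTS["Eukaryota"]
)

for name, count in EUKARYOTA_COUNTS.items():
    within_eukaryota = percent(count, KINGDOM_COUNTS["Eukaryota"])
    within_database = percent(count, TOTAL_ENTRIES)
    print(f"{name:<18}{count:>7} ({within_eukaryota:>3}%) ({within_database:>3}%)")
```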



3.  SEQUENCE SIZE

   Distribution of the sequences by size (excluding fragments)

               From   To  Number             From   To   Number
                  1-  50    7782             1001-1100     3222
                 51- 100   35975             1101-1200     2194
                101- 150   50830             1201-1300     1721
                151- 200   50701             1301-1400     1688
                201- 250   49822             1401-1500     1326
                251- 300   43243             1501-1600      606
                301- 350   42826             1601-1700      480
                351- 400   37550             1701-1800      395
                401- 450   30612             1801-1900      380
                451- 500   24950             1901-2000      313
                501- 550   17626             2001-2100      187
                551- 600   12695             2101-2200      257
                601- 650   10711             2201-2300      265
                651- 700    7541             2301-2400      166
                701- 750    6355             2401-2500      126
                751- 800    4511             >2500          983
                801- 850    3840
                851- 900    4430
                901- 950    3319
                951-1000    2334

   


   The average sequence length in UniProtKB/Swiss-Prot is 354 amino acids.

   The shortest sequence is   GWA_SEPOF (P83570):     2 amino acids.
   The longest sequence is  TITIN_MOUSE (A2ASS6): 35213 amino acids.
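
   The average length follows from the totals given in the introduction.
   The sketch below assumes the average is taken over all entries, fragments
   included; that assumption reproduces the quoted value, whereas dividing by
   the number of non-fragment entries would not. The variable names are
   illustrative.

```python
# Totals from the introduction of this release.
TOTAL_AMINO_ACIDS = 166709888
TOTAL_ENTRIES = 470369
FRAGMENTS = 8407

# Average over all entries (assumption: fragments are included).
average_length = round(TOTAL_AMINO_ACIDS / TOTAL_ENTRIES)

# Number of non-fragment entries, i.e. the entries covered by the size table.
complete_sequences = TOTAL_ENTRIES - FRAGMENTS

print(average_length, complete_sequences)
```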


4.  JOURNAL CITATIONS

   Note: the following citation statistics reflect the number of distinct
         journal citations.

   Total number of journals cited in this release of UniProtKB/Swiss-Prot: 1997


   4.1 Table of the frequency of journal citations

        Journals cited 1x:  646
                       2x:  280
                       3x:  131
                       4x:  107
                       5x:   76
                       6x:   60
                       7x:   36
                       8x:   45
                       9x:   34
                      10x:   23
                  11- 20x:  156
                  21- 50x:  157
                  51-100x:   93
                    >100x:  153


   4.2  List of the most cited journals in UniProtKB/Swiss-Prot

   Nb    Citations   Journal name
   --    ---------   -------------------------------------------------------------
    1        17051   Journal of Biological Chemistry
    2         7948   Proceedings of the National Academy of Sciences of the U.S.A.
    3         4895   Journal of Bacteriology
    4         4458   Gene
    5         4340   Biochemical and Biophysical Research Communications
    6         4240   Nucleic Acids Research
    7         3849   FEBS Letters
    8         3662   Biochemistry
    9         3607   The EMBO Journal
   10         3255   Molecular and Cellular Biology
   11         3104   Nature
   12         3058   European Journal of Biochemistry
   13         2916   Biochimica et Biophysica Acta
   14         2881   Journal of Molecular Biology
   15         2527   Cell
   16         2459   Genomics
   17         2097   Biochemical Journal
   18         1997   Science
   19         1878   Journal of Virology
   20         1690   Molecular Microbiology
   21         1498   Journal of Cell Biology
   22         1461   Plant Molecular Biology
   23         1302   Virology
   24         1298   Molecular and General Genetics
   25         1285   Genes and Development
   26         1270   Nature Genetics
   27         1252   Human Molecular Genetics
   28         1213   Plant Physiology
   29         1171   The American Journal of Human Genetics
   30         1137   Oncogene
   31         1137   Journal of Biochemistry
   32         1070   Development
   33          993   Human Mutation
   34          952   Journal of Immunology
   35          947   Molecular Biology of the Cell
   36          941   Genetics
   37          845   Infection and Immunity
   38          841   Structure
   39          832   Journal of General Virology
   40          797   The Plant Cell
   41          792   Archives of Biochemistry and Biophysics
   42          747   Yeast
   43          743   Blood
   44          724   Molecular Cell
   45          717   Microbiology
   46          674   Developmental Biology
   47          668   The Plant Journal
   48          661   Journal of Cell Science
   49          636   FEMS Microbiology Letters
   50          629   Cancer Research
   51          585   Human Genetics
   52          583   Current Biology
   53          576   Nature Structural Biology
   54          563   Mechanisms of Development
   55          520   Current Genetics
   56          507   Acta Crystallographica, Section D
   57          501   Applied and Environmental Microbiology
   58          501   Journal of Neuroscience
   59          498   Protein Science
   60          487   Toxicon
   61          485   Journal of Clinical Investigation
   62          476   Neuron
   63          466   Mammalian Genome
   64          434   American Journal of Physiology
   65          429   Immunogenetics
   66          426   The Journal of Experimental Medicine
   67          424   Molecular Endocrinology
   68          413   Molecular and Biochemical Parasitology
   69          391   Journal of Neurochemistry
   70          372   Endocrinology
   71          371   The Journal of Clinical Endocrinology and Metabolism
   72          369   Journal of Molecular Evolution
   73          361   DNA and Cell Biology
   74          352   DNA Sequence
   75          346   Molecular Biology and Evolution
   76          339   Bioscience, Biotechnology, and Biochemistry
   77          329   Journal of Medical Genetics
   78          321   Proteins
   79          310   Brain Research. Molecular Brain Research
   80          289   Biological Chemistry Hoppe-Seyler
   81          273   Cytogenetics and Cell Genetics
   82          273   Peptides
   83          272   Comparative Biochemistry and Physiology
   84          267   Journal of Investigative Dermatology
   85          267   Plant and Cell Physiology
   86          267   Antimicrobial Agents and Chemotherapy
   87          255   Nature Cell Biology
   88          253   Molecular Pharmacology
   89          253   Experimental Cell Research
   90          248   Biology of Reproduction
   91          245   Journal of General Microbiology
   92          236   Genome Research
   93          228   Virus Research
   94          227   Neurology
   95          223   RNA
   96          218   Developmental Dynamics
   97          215   Hoppe-Seyler's Zeitschrift fur Physiologische Chemie
   98          201   DNA Research
   99          199   Developmental Cell
  100          198   Molecular Plant-Microbe Interactions
  101          197   European Journal of Immunology
  102          194   Biochimie
  103          187   Annals of Neurology
  104          187   Planta
  105          184   European Journal of Human Genetics
  106          183   Tissue Antigens
  107          180   Eukaryotic Cell
  108          178   Genes to Cells
  109          175   Journal of Human Genetics
  110          170   Immunity
  111          166   Molecular and Cellular Endocrinology
  112          165   The New England Journal of Medicine
  113          164   American Journal of Medical Genetics
  114          163   Molecular Phylogenetics and Evolution
  115          161   Archives of Microbiology
  116          159   DNA
  117          153   Hemoglobin
  118          152   Insect Biochemistry and Molecular Biology
  119          148   Bioorganicheskaia Khimiia
  120          147   Investigative Ophthalmology and Visual Science
  121          146   Diabetes
  122          145   Molecular Reproduction and Development
  123          140   Glycobiology
  124          139   Molecular Immunology
  125          136   Archives of Virology
  126          135   Animal Genetics
  127          135   EMBO Reports
  128          133   General and Comparative Endocrinology
  129          130   International Journal of Cancer
  130          129   Clinical Genetics
  131          128   Nature Structural and Molecular Biology
  132          128   The FASEB Journal
  133          128   Molecular and Cellular Neuroscience
  134          123   Molecular Genetics and Metabolism
  135          122   British Journal of Haematology
  136          121   The FEBS Journal
  137          119   Agricultural and Biological Chemistry
  138          117   Molecular Genetics and Genomics
  139          116   Journal of Cellular Biochemistry
  140          114   Biological Chemistry
  141          113   Journal of Protein Chemistry
  142          112   Thrombosis and Haemostasis
  143          111   Journal of Lipid Research
  144          110   American Journal of Medical Genetics. Part A
  145          108   Journal of the American Chemical Society
  146          107   Journal of Neuroscience Research
  147          106   Nature Immunology
  148          105   Neuroscience Letters
  149          104   Circulation Research
  150          104   Journal of Molecular Endocrinology


5.  STATISTICS FOR SOME LINE TYPES

The following tables summarize, for selected UniProtKB/Swiss-Prot line types,
the total number of lines, the number of entries with at least one such line,
and the average number of lines per entry.

                                      Total    Number of  Average
   Line type / subtype                number   entries    per entry  Rank
------------------------------------  -------- ---------  ---------  ----
References (RL)                       835238                 1.78
   Journal                            663060     355628      1.41       1
   Submitted to EMBL/GenBank/DDBJ     159845     148651      0.34       2
   Submitted to other databases        10320       9048      0.02       3
   Book citation                         624        613     <0.01       4
   Plant Gene Register                   557        545     <0.01       5
   Thesis                                390        388     <0.01       6
   Unpublished observations              290        286     <0.01       7
   Patent                                146        144     <0.01       8
   Worm Breeder's Gazette                  6          6     <0.01       9

Total number of distinct authors cited in UniProtKB/Swiss-Prot: 275280

                                      Total    Number of  Average
   Line type / subtype                number   entries    per entry  Rank
------------------------------------  -------- ---------  ---------  ----
Comments (CC)                        1959656                 4.17                                         
   ALLERGEN                              455        455     <0.01      26                                 
   ALTERNATIVE PRODUCTS                18180      18180      0.04      12                                 
   BIOPHYSICOCHEMICAL PROPERTIES        2605       2605      0.01      22                                 
   BIOTECHNOLOGY                         245        243     <0.01      28                                 
   CATALYTIC ACTIVITY                 192237     175775      0.41       5                                 
   CAUTION                              6334       6205      0.01      19                                 
   COFACTOR                            85129      78245      0.18       7                                 
   DEVELOPMENTAL STAGE                  8197       8197      0.02      16                                 
   DISEASE                              4583       3146      0.01      20                                 
   DISRUPTION PHENOTYPE                 1842       1842     <0.01      23                                 
   DOMAIN                              27968      24791      0.06      11                                 
   ENZYME REGULATION                    7175       7175      0.02      18                                 
   FUNCTION                           343696     329686      0.73       2                                 
   INDUCTION                           10499      10499      0.02      15                                 
   INTERACTION                         11587      11587      0.02      14                                 
   MASS SPECTROMETRY                    4096       3100      0.01      21                                 
   MISCELLANEOUS                       28503      26255      0.06      10                                 
   PATHWAY                            110216     100774      0.23       6                                 
   PHARMACEUTICAL                         81         81     <0.01      29                                 
   POLYMORPHISM                          740        712     <0.01      24                                 
   PTM                                 32782      26635      0.07       8                                 
   RNA EDITING                           576        576     <0.01      25                                 
   SEQUENCE CAUTION                    12048      12048      0.03      13                                 
   SIMILARITY                         545626     446020      1.16       1                                 
   SUBCELLULAR LOCATION               269535     264836      0.57       3                                 
   SUBUNIT                            194877     194877      0.41       4                                 
   TISSUE SPECIFICITY                  31398      31398      0.07       9                                 
   TOXIC DOSE                            400        392     <0.01      27                                 
   WEB RESOURCE                         8046       6355      0.02      17                                 

Total number of comment topics: 29
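
The "Average per entry" column is consistent with dividing the total number
of lines by the total number of entries in the release (470369), and the rank
with ordering topics by their total number. The sketch below illustrates both
on a few rows of the comments table. The function name format_average and the
rule for printing "<0.01" are assumptions inferred from the table, not
documented behaviour.

```python
# Total number of entries in release 57.4 (from the introduction).
TOTAL_ENTRIES = 470369

# Total number of lines for the five most frequent comment topics (table above).
COMMENT_TOTALS = {
    "CATALYTIC ACTIVITY": 192237,
    "FUNCTION": 343696,
    "SIMILARITY": 545626,
    "SUBCELLULAR LOCATION": 269535,
    "SUBUNIT": 194877,
}


def format_average(total_lines, total_entries):
    """Average lines per entry, formatted as in the table.

    Assumption: values that round to less than 0.01 are shown as "<0.01".
    """
    average = round(total_lines / total_entries, 2)
    return "<0.01" if average < 0.01 else f"{average:.2f}"


# Rank topics by total number of lines, most frequent first.
ranked = sorted(COMMENT_TOTALS.items(), key=lambda item: item[1], reverse=True)

for rank, (topic, total) in enumerate(ranked, start=1):
    print(f"{topic:<22}{total:>8}  {format_average(total, TOTAL_ENTRIES):>6}  {rank:>3}")

# A rare topic: ALLERGEN has 455 lines in this release.
print(format_average(455, TOTAL_ENTRIES))
```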


                                      Total    Number of  Average
   Line type / subtype                number   entries    per entry  Rank
------------------------------------  -------- ---------  ---------  ----
Features (FT)                        2897364                 6.16                                         
   ACT_SITE                           115573      68595      0.25       8                                 
   BINDING                            164635      47568      0.35       4                                 
   CA_BIND                              3604       1459      0.01      35                                 
   CARBOHYD                            92024      23768      0.20      12                                 
   CHAIN                              476839     466069      1.01       1                                 
   COILED                              17004      11453      0.04      26                                 
   COMPBIAS                            45373      24295      0.10      18                                 
   CONFLICT                           113517      39694      0.24       9                                 
   CROSSLNK                             4317       2838      0.01      34                                 
   DISULFID                            90166      23711      0.19      13                                 
   DNA_BIND                            10359       9526      0.02      29                                 
   DOMAIN                             133849      78185      0.28       6                                 
   HELIX                              112915      11601      0.24      10                                 
   INIT_MET                            13171      13171      0.03      27                                 
   LIPID                               10209       6532      0.02      30                                 
   METAL                              234870      58055      0.50       3                                 
   MOD_RES                            141455      50909      0.30       5                                 
   MOTIF                               29496      19071      0.06      22                                 
   MUTAGEN                             27311       6556      0.06      24                                 
   NON_CONS                             1568        629     <0.01      36                                 
   NON_STD                               345        270     <0.01      38                                 
   NON_TER                             11421       8665      0.02      28                                 
   NP_BIND                             88134      59561      0.19      14                                 
   PEPTIDE                              8053       5008      0.02      32                                 
   PROPEP                              10037       8411      0.02      31                                 
   REGION                              78174      43539      0.17      16                                 
   REPEAT                              83858      12445      0.18      15                                 
   SIGNAL                              32087      32077      0.07      21                                 
   SITE                                33081      19390      0.07      20                                 
   STRAND                             116480      10970      0.25       7                                 
   TOPO_DOM                           109856      22435      0.23      11                                 
   TRANSIT                              6070       5984      0.01      33                                 
   TRANSMEM                           313713      64226      0.67       2                                 
   TURN                                27868       9326      0.06      23                                 
   UNSURE                               1053        344     <0.01      37                                 
   VAR_SEQ                             37918      16179      0.08      19                                 
   VARIANT                             73817      15813      0.16      17                                 
   ZN_FING                             27144      11567      0.06      25                                 

Total number of feature keys: 38
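
The "Average per entry" column in the tables above appears to be the total
number of lines of a given type divided by the number of entries in the
release (470369, from the introduction). The sketch below reproduces a few
table values under that assumption. The function name average_per_entry and
the "<0.01" display rule are inferred from the tables for illustration; they
are not documented by the source.

```python
TOTAL_ENTRIES = 470369  # entries in release 57.4, as stated in the introduction


def average_per_entry(total_number: int, total_entries: int = TOTAL_ENTRIES) -> str:
    """Format total/entries the way the tables appear to (assumed rule)."""
    average = total_number / total_entries
    # Values that would round to 0.00 seem to be displayed as "<0.01".
    if average < 0.005:
        return "<0.01"
    return f"{average:.2f}"


if __name__ == "__main__":
    # (label, total number) pairs copied from the tables above.
    for label, total in [("FUNCTION", 343696), ("Features (FT)", 2897364),
                         ("CHAIN", 476839), ("NON_STD", 345)]:
        print(f"{label:<15} {average_per_entry(total)}")
```

Because the divisor is the whole database rather than the "Number of entries"
column, a key such as CHAIN can average slightly above 1 even though not
every entry carries it.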



                                      Total    Number of  Average
   Line type / subtype                number   entries    per entry  Rank      Category
------------------------------------  -------- ---------  ---------  ----      -------------------------------------------
Cross-references (DR)               10074921                21.42                                                           
   2DBase-Ecoli                           84         84     <0.01     104      2D gel databases                             
   Aarhus/Ghent-2DPAGE                   126         96     <0.01     101      2D gel databases                             
   AGD                                   806        800     <0.01      79      Organism-specific databases                  
   ANU-2DPAGE                             23         23     <0.01     111      2D gel databases                             
   ArrayExpress                        54662      54662      0.12      31      Gene expression databases                    
   Bgee                                35635      35582      0.08      36      Gene expression databases                    
   BindingDB                             297        297     <0.01      95      Other                                        
   BioCyc                             159611     151459      0.34      16      Enzyme and pathway databases                 
   BRENDA                              65131      62338      0.14      28      Enzyme and pathway databases                 
   BuruList                              312        312     <0.01      94      Organism-specific databases                  
   CAZy                                 5492       4884      0.01      54      Protein family/group databases               
   CGD                                   535        533     <0.01      84      Organism-specific databases                  
   CleanEx                             30246      29597      0.06      38      Gene expression databases                    
   COMPLUYEAST-2DPAGE                     59         59     <0.01     106      2D gel databases                             
   Cornea-2DPAGE                          67         67     <0.01     105      2D gel databases                             
   CYGD                                 6628       6522      0.01      52      Organism-specific databases                  
   dictyBase                            3908       3793      0.01      65      Organism-specific databases                  
   DIP                                  9026       8976      0.02      47      Protein-protein interaction databases        
   DisProt                               397        394     <0.01      88      3D structure databases                       
   DOSAC-COBS-2DPAGE                     150        150     <0.01     100      2D gel databases                             
   DrugBank                             5317       1626      0.01      55      Other                                        
   EchoBASE                             4159       4124      0.01      62      Organism-specific databases                  
   ECO2DBASE                             351        299     <0.01      92      2D gel databases                             
   EcoGene                              4330       4327      0.01      61      Organism-specific databases                  
   EMBL                               782022     461034      1.66       3      Sequence databases                           
   Ensembl                             68530      67194      0.15      27      Genome annotation databases                  
   euHCVdb                                55         44     <0.01     107      Organism-specific databases                  
   FlyBase                              4716       4343      0.01      58      Organism-specific databases                  
   Gene3D                             210339     174523      0.45      13      Family and domain databases                  
   GeneCards                           21175      19892      0.05      39      Organism-specific databases                  
   GeneDB_Spombe                        5003       4954      0.01      57      Organism-specific databases                  
   GeneFarm                             2571       2550      0.01      71      Organism-specific databases                  
   GeneID                             421861     402743      0.90       6      Genome annotation databases                  
   GenomeReviews                      322889     304098      0.69       9      Genome annotation databases                  
   GermOnline                          41953      41343      0.09      34      Gene expression databases                    
   GlycoSuiteDB                          280        280     <0.01      96      PTM databases                                
   GO                                1954928     438072      4.16       1      Ontologies                                   
   Gramene                              4131       4131      0.01      63      Organism-specific databases                  
   H-InvDB                             11258       9564      0.02      46      Organism-specific databases                  
   HAMAP                              269327     269190      0.57      11      Family and domain databases                  
   HGNC                                19432      19262      0.04      41      Organism-specific databases                  
   HOGENOM                            208250     208250      0.44      14      Phylogenomic databases                       
   HOVERGEN                            76755      76755      0.16      25      Phylogenomic databases                       
   HPA                                  6106       4958      0.01      53      Organism-specific databases                  
   HSC-2DPAGE                             85         85     <0.01     103      2D gel databases                             
   HSSP                                84961      84961      0.18      24      3D structure databases                       
   IntAct                              20453      20453      0.04      40      Protein-protein interaction databases        
   InterPro                          1259651     443283      2.68       2      Family and domain databases                  
   IPI                                 86672      62450      0.18      23      Sequence databases                           
   KEGG                               394044     372718      0.84       8      Genome annotation databases                  
   LegioList                             743        741     <0.01      80      Organism-specific databases                  
   Leproma                               657        654     <0.01      83      Organism-specific databases                  
   ListiList                            1169       1161     <0.01      77      Organism-specific databases                  
   MaizeGDB                              469        464     <0.01      86      Organism-specific databases                  
   MEROPS                               8382       8124      0.02      49      Protein family/group databases               
   MGI                                 16021      15970      0.03      43      Organism-specific databases                  
   MIM                                 15699      12391      0.03      45      Organism-specific databases                  
   MypuList                              202        202     <0.01      99      Organism-specific databases                  
   NextBio                             48401      48399      0.10      33      Other                                        
   NMPDR                              125956     125926      0.27      17      Genome annotation databases                  
   OGP                                   378        378     <0.01      90      2D gel databases                             
   OMA                                321967     321967      0.68      10      Phylogenomic databases                       
   Orphanet                             3443       2030      0.01      68      Organism-specific databases                  
   PANTHER                            171548     157826      0.36      15      Family and domain databases                  
   Pathway_Interaction_DB               4569       1666      0.01      60      Enzyme and pathway databases                 
   PDB                                 60284      14654      0.13      30      3D structure databases                       
   PDBsum                              60284      14654      0.13      29      3D structure databases                       
   PeptideAtlas                         5167       5167      0.01      56      Proteomic databases                          
   PeroxiBase                            668        656     <0.01      82      Protein family/group databases               
   Pfam                               619250     432244      1.32       4      Family and domain databases                  
   PharmGKB                            15839      15827      0.03      44      Organism-specific databases                  
   PHCI-2DPAGE                           245        245     <0.01      98      2D gel databases                             
   PhosphoSite                         19335      19335      0.04      42      PTM databases                                
   PhosSite                              267        267     <0.01      97      PTM databases                                
   PhotoList                             726        726     <0.01      81      Organism-specific databases                  
   PIR                                113996     104145      0.24      21      Sequence databases                           
   PIRSF                               70983      70983      0.15      26      Family and domain databases                  
   PMAP-CutDB                           1396       1396     <0.01      74      Other                                        
   PMMA-2DPAGE                            52         52     <0.01     108      2D gel databases                             
   PptaseDB                               34         34     <0.01     109      Protein family/group databases               
   PRIDE                               36154      36154      0.08      35      Proteomic databases                          
   PRINTS                             121792     104599      0.26      18      Family and domain databases                  
   ProDom                             118627     115367      0.25      19      Family and domain databases                  
   ProMEX                                434        434     <0.01      87      Proteomic databases                          
   PROSITE                            418146     265669      0.89       7      Family and domain databases                  
   PseudoCAP                            1205       1196     <0.01      75      Organism-specific databases                  
   Rat-heart-2DPAGE                       28         28     <0.01     110      2D gel databases                             
   Reactome                             4621       2750      0.01      59      Enzyme and pathway databases                 
   REBASE                                354        345     <0.01      91      Protein family/group databases               
   RefSeq                             437641     403015      0.93       5      Sequence databases                           
   REPRODUCTION-2DPAGE                  1030        942     <0.01      78      2D gel databases                             
   RGD                                  7270       7266      0.02      50      Organism-specific databases                  
   SagaList                              384        383     <0.01      89      Organism-specific databases                  
   SGD                                  6640       6537      0.01      51      Organism-specific databases                  
   Siena-2DPAGE                          102        102     <0.01     102      2D gel databases                             
   SMART                              115729      89183      0.25      20      Family and domain databases                  
   SMR                                 51251      51251      0.11      32      3D structure databases                       
   SubtiList                            3742       3740      0.01      66      Organism-specific databases                  
   SWISS-2DPAGE                         1182       1182     <0.01      76      2D gel databases                             
   TAIR                                 8421       8307      0.02      48      Organism-specific databases                  
   TCDB                                 3107       3072      0.01      70      Protein family/group databases               
   TIGR                                33270      32519      0.07      37      Genome annotation databases                  
   TIGRFAMs                           248186     231929      0.53      12      Family and domain databases                  
   TubercuList                          1494       1458     <0.01      73      Organism-specific databases                  
   UniGene                             86835      79839      0.18      22      Sequence databases                           
   VectorBase                            349        338     <0.01      93      Genome annotation databases                  
   World-2DPAGE                          503        503     <0.01      85      2D gel databases                             
   WormBase                             3728       3643      0.01      67      Organism-specific databases                  
   WormPep                              3965       3230      0.01      64      Organism-specific databases                  
   Xenbase                              3372       3302      0.01      69      Organism-specific databases                  
   ZFIN                                 2430       2414      0.01      72      Organism-specific databases                  

Total number of cross-referenced databases: 111
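
The cross-reference table above is whitespace-aligned, and the database names
in it contain no spaces, so each row can be split into six fields. The sketch
below parses a handful of rows copied from the table and sums the total number
of cross-references per category. The names parse_row and totals_by_category
are illustrative, and the sample covers only five rows, not the full table.

```python
from collections import defaultdict

# A few rows copied from the cross-reference table (not the full table).
SAMPLE_ROWS = """\
   EMBL                               782022     461034      1.66       3      Sequence databases
   PDB                                 60284      14654      0.13      30      3D structure databases
   PIR                                113996     104145      0.24      21      Sequence databases
   RefSeq                             437641     403015      0.93       5      Sequence databases
   SMR                                 51251      51251      0.11      32      3D structure databases
"""


def parse_row(line):
    """Split one table row into fields; assumes the name contains no spaces."""
    name, total, entries, average, rank, category = line.strip().split(None, 5)
    return {
        "name": name,
        "total": int(total),
        "entries": int(entries),
        "average": average,  # kept as text because it may be "<0.01"
        "rank": int(rank),
        "category": category,
    }


def totals_by_category(text):
    """Sum the 'Total number' column for each category."""
    totals = defaultdict(int)
    for line in text.splitlines():
        if line.strip():
            row = parse_row(line)
            totals[row["category"]] += row["total"]
    return dict(totals)


if __name__ == "__main__":
    for category, total in sorted(totals_by_category(SAMPLE_ROWS).items()):
        print(f"{category}: {total}")
```

Limiting the split to five separators keeps multi-word categories such as
"3D structure databases" intact in the last field.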

6.  AMINO ACID COMPOSITION

   6.1  Composition in percent for the complete database

   Ala (A) 8.22   Gln (Q) 3.95   Leu (L) 9.67   Ser (S) 6.56
   Arg (R) 5.52   Glu (E) 6.75   Lys (K) 5.86   Thr (T) 5.33
   Asn (N) 4.06   Gly (G) 7.06   Met (M) 2.42   Trp (W) 1.08
   Asp (D) 5.43   His (H) 2.27   Phe (F) 3.87   Tyr (Y) 2.92
   Cys (C) 1.38   Ile (I) 5.97   Pro (P) 4.72   Val (V) 6.85

   Asx (B) 0.000  Glx (Z) 0.000  Xaa (X) 0.00

   

   [Figure not reproduced: graphical representation of the composition above]

   Legend: gray = aliphatic, red = acidic, green = small hydroxy,
           blue = basic, black = aromatic, white = amide, yellow = sulfur


   6.2  Classification of the amino acids by their frequency
        (most frequent first)

   Leu, Ala, Gly, Val, Glu, Ser, Ile, Lys, Arg, Asp, Thr, Pro, Asn, Gln,
   Phe, Tyr, Met, His, Cys, Trp
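
The ordering in 6.2 can be derived from the percentages in 6.1 by sorting
them in decreasing order. A minimal sketch, with the values copied from the
table in 6.1; the names COMPOSITION and rank_by_frequency are illustrative:

```python
# Percent composition for the complete database, copied from section 6.1.
COMPOSITION = {
    "Ala": 8.22, "Gln": 3.95, "Leu": 9.67, "Ser": 6.56,
    "Arg": 5.52, "Glu": 6.75, "Lys": 5.86, "Thr": 5.33,
    "Asn": 4.06, "Gly": 7.06, "Met": 2.42, "Trp": 1.08,
    "Asp": 5.43, "His": 2.27, "Phe": 3.87, "Tyr": 2.92,
    "Cys": 1.38, "Ile": 5.97, "Pro": 4.72, "Val": 6.85,
}


def rank_by_frequency(composition):
    """Return residue names ordered from most to least frequent."""
    return sorted(composition, key=composition.get, reverse=True)


if __name__ == "__main__":
    print(", ".join(rank_by_frequency(COMPOSITION)))
```

No two residues share the same percentage in this release, so the ordering
is unambiguous.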


7.  MISCELLANEOUS STATISTICS

4442 entries are encoded on a mitochondrion, and 3503 are encoded on a plasmid.

12079 entries are encoded on a plastid, of which 21 are encoded on apicoplasts,
11522 on chloroplasts, 43 on organellar chromatophores, 145 on cyanelles,
149 on non-photosynthetic plastids and 199 on unspecified types of plastid.
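
As a quick consistency check, the plastid subtypes listed above should add up
to the stated plastid total. A small sketch with the figures copied from the
text; the names PLASTID_BREAKDOWN and breakdown_matches are illustrative:

```python
PLASTID_TOTAL = 12079  # entries encoded on a plastid, as stated above

# Per-subtype counts copied from the sentence above.
PLASTID_BREAKDOWN = {
    "apicoplast": 21,
    "chloroplast": 11522,
    "organellar chromatophore": 43,
    "cyanelle": 145,
    "non-photosynthetic plastid": 149,
    "unspecified plastid": 199,
}


def breakdown_matches(total, breakdown):
    """Return True when the subtype counts sum exactly to the total."""
    return sum(breakdown.values()) == total


if __name__ == "__main__":
    print(breakdown_matches(PLASTID_TOTAL, PLASTID_BREAKDOWN))
```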

Number of entries with at least one sequence correction: 66353

