Pfam

1. simple introduction

The Pfam database is a large collection of protein families, each represented by multiple sequence alignments and hidden Markov models (HMMs).
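To make the "family = alignment + HMM" idea concrete, here is a minimal sketch of reading one family's alignment. It assumes the alignment is in Stockholm format, the format Pfam uses to distribute its alignments. The record below is a made-up toy (the accession PF99999.1, the name Toy_domain, and the sequences are not real Pfam data), and FamilyAlignment and parse_stockholm are hypothetical helper names rather than part of any Pfam tool. It reads only the ID and AC annotations and the aligned rows, not the full format.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class FamilyAlignment:
    """Minimal view of one family: accession, short name, aligned rows."""
    accession: str = ""
    identifier: str = ""
    sequences: Dict[str, str] = field(default_factory=dict)


def parse_stockholm(text: str) -> FamilyAlignment:
    """Parse a single Stockholm record (hypothetical helper, not a Pfam API)."""
    family = FamilyAlignment()
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, the format header, and the record terminator.
        if not line or line == "//" or line.startswith("# STOCKHOLM"):
            continue
        if line.startswith("#=GF"):
            # Per-file annotation: "#=GF <tag> <value>"
            parts = line.split(None, 2)
            if len(parts) == 3:
                _, tag, value = parts
                if tag == "AC":
                    family.accession = value
                elif tag == "ID":
                    family.identifier = value
            continue
        if line.startswith("#"):
            continue  # other annotation lines are ignored in this sketch
        parts = line.split()
        if len(parts) != 2:
            continue  # not a "<name> <aligned residues>" row
        name, aligned = parts
        # Interleaved files repeat names across blocks, so append.
        family.sequences[name] = family.sequences.get(name, "") + aligned
    return family


# Toy record for illustration only; not real Pfam data.
TOY_STOCKHOLM = """# STOCKHOLM 1.0
#=GF ID   Toy_domain
#=GF AC   PF99999.1
seqA/1-8   MKV.LAGT
seqB/3-10  MRVALA-T
//
"""

family = parse_stockholm(TOY_STOCKHOLM)
print(family.accession, family.identifier, len(family.sequences))
```

Each row is keyed by its name/start-end label, and gap characters ("." and "-") are kept so that all rows have the same alignment width.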

Pfam has two components: Pfam-A and Pfam-B. Pfam-A entries are high-quality, manually curated families. Pfam-B families are of lower quality, but can be useful for identifying functionally conserved regions when no Pfam-A entries are found.

Pfam also generates higher-level groupings of related families, known as clans. A clan is a collection of Pfam-A entries that are related by similarity of sequence, structure or profile-HMM.

2. citation

Bateman A, Coin L, Durbin R, et al. The Pfam protein families database. Nucleic Acids Research, 2004, 32(suppl 1): D138-D141.

At the time of writing, the latest version of Pfam is release 27.0, which contains 14,831 manually curated (Pfam-A) protein families.

3.