1. simple introduction
The Pfam database is a large collection of protein families, each represented by multiple sequence alignments and hidden Markov models (HMMs).
Pfam has two components: Pfam-A and Pfam-B. Pfam-A entries are high-quality, manually curated families. Pfam-B families are of lower quality, but can be useful for identifying functionally conserved regions when no Pfam-A entries are found.
Pfam also generates higher-level groupings of related families, known as clans. A clan is a collection of Pfam-A entries that are related by similarity of sequence, structure or profile-HMM.
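As a rough illustration of the family-to-clan relationship, the sketch below groups Pfam-A families by the clan they belong to. The accessions (`PF_EXAMPLE_1`, `CL_EXAMPLE_A`, and so on) and the function name `group_by_clan` are hypothetical placeholders, not real Pfam data or part of any Pfam tool. It assumes each family belongs to at most one clan and that some families belong to none.

```python
from collections import defaultdict

# Hypothetical (family, clan) pairs; placeholders only, not real Pfam accessions.
# A clan of None means the family is assumed not to belong to any clan.
FAMILY_TO_CLAN = [
    ("PF_EXAMPLE_1", "CL_EXAMPLE_A"),
    ("PF_EXAMPLE_2", "CL_EXAMPLE_A"),
    ("PF_EXAMPLE_3", None),
]


def group_by_clan(pairs):
    """Map each clan to the sorted list of its member families."""
    clans = defaultdict(list)
    for family, clan in pairs:
        if clan is not None:  # skip families without a clan
            clans[clan].append(family)
    return {clan: sorted(members) for clan, members in clans.items()}


clans = group_by_clan(FAMILY_TO_CLAN)
```

Here `clans` holds one entry per clan, and families without a clan are left out.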
2. citation
The latest version of Pfam is release 27.0, which contains 14,831 manually curated protein families.