PIR format description

  • A sequence in PIR format consists of:
    • One line starting with
      • a ">" (greater-than) sign, followed by
      • a two-letter code describing the sequence type (P1, F1, DL, DC, RL, RC, or XX), followed by
      • a semicolon, followed by
      • the sequence identification code (the database ID-code).
    • One line containing a textual description of the sequence.
    • One or more lines containing the sequence itself. The end of the sequence is marked by a "*" (asterisk) character.
    • Optionally, one or more further lines describing the sequence may follow the terminating asterisk. Software that reads only the sequence should ignore them.
  • A file in PIR format may comprise more than one sequence.

  • The PIR format is also often referred to as the NBRF format.
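
The rules above can be sketched as a small reader. This is only a minimal illustration, not a reference implementation: the names PirRecord and parse_pir are hypothetical, and it assumes that every entry is terminated by "*" and that any optional lines after the asterisk do not themselves start with ">".

```python
from typing import Iterator, NamedTuple


class PirRecord(NamedTuple):
    """Hypothetical container for one PIR entry (name chosen for this sketch)."""
    seq_type: str      # two-letter code, e.g. "P1"
    identifier: str    # sequence identification code (database ID-code)
    description: str   # the textual description line
    sequence: str      # residues with whitespace and the "*" removed


def parse_pir(text: str) -> Iterator[PirRecord]:
    """Yield one PirRecord per entry found in `text`."""
    lines = text.splitlines()
    i = 0
    while i < len(lines):
        line = lines[i].strip()
        # A header looks like ">P1;CRAB_ANAPL": ">", two-letter code, ";", ID.
        # Anything else here (blank lines, optional trailing lines) is skipped.
        if not (line.startswith(">") and len(line) > 3 and line[3] == ";"):
            i += 1
            continue
        seq_type, identifier = line[1:3], line[4:].strip()
        # The line after the header is the textual description.
        description = lines[i + 1].strip() if i + 1 < len(lines) else ""
        i += 2
        # Collect sequence lines up to and including the one holding "*".
        chunks = []
        terminated = False
        while i < len(lines) and not terminated:
            body = lines[i]
            if body.lstrip().startswith(">"):
                break  # next header reached before any "*"
            if "*" in body:
                body = body.split("*", 1)[0]
                terminated = True
            chunks.append("".join(body.split()))  # drop spaces between blocks
            i += 1
        if not terminated:
            raise ValueError(f"entry {identifier!r} has no terminating '*'")
        yield PirRecord(seq_type, identifier, description, "".join(chunks))
```

Calling list(parse_pir(text)) on the contents of a PIR file returns one record per sequence, so a file comprising several sequences needs no special handling.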

PIR format example

Copy and paste the sequences below into the appropriate input window.


>P1;CRAB_ANAPL
ALPHA CRYSTALLIN B CHAIN (ALPHA(B)-CRYSTALLIN).
  MDITIHNPLI RRPLFSWLAP SRIFDQIFGE HLQESELLPA SPSLSPFLMR 
  SPIFRMPSWL ETGLSEMRLE KDKFSVNLDV KHFSPEELKV KVLGDMVEIH 
  GKHEERQDEH GFIAREFNRK YRIPADVDPL TITSSLSLDG VLTVSAPRKQ 
  SDVPERSIPI TREEKPAIAG AQRK*

>P1;CRAB_BOVIN
ALPHA CRYSTALLIN B CHAIN (ALPHA(B)-CRYSTALLIN).
  MDIAIHHPWI RRPFFPFHSP SRLFDQFFGE HLLESDLFPA STSLSPFYLR 
  PPSFLRAPSW IDTGLSEMRL EKDRFSVNLD VKHFSPEELK VKVLGDVIEV 
  HGKHEERQDE HGFISREFHR KYRIPADVDP LAITSSLSSD GVLTVNGPRK 
  QASGPERTIP ITREEKPAVT AAPKK*

>P1;CRAB_CHICK
ALPHA CRYSTALLIN B CHAIN (ALPHA(B)-CRYSTALLIN).
  MDITIHNPLV RRPLFSWLTP SRIFDQIFGE HLQESELLPT SPSLSPFLMR 
  SPFFRMPSWL ETGLSEMRLE KDKFSVNLDV KHFSPEELKV KVLGDMIEIH 
  GKHEERQDEH GFIAREFSRK YRIPADVDPL TITSSLSLDG VLTVSAPRKQ 
  SDVPERSIPI TREEKPAIAG SQRK*

>P1;CRAB_HUMAN
ALPHA CRYSTALLIN B CHAIN (ALPHA(B)-CRYSTALLIN) (ROSENTHAL FIBER).
  MDIAIHHPWI RRPFFPFHSP SRLFDQFFGE HLLESDLFPT STSLSPFYLR 
  PPSFLRAPSW FDTGLSEMRL EKDRFSVNLD VKHFSPEELK VKVLGDVIEV 
  HGKHEERQDE HGFISREFHR KYRIPADVDP LTITSSLSSD GVLTVNGPRK 
  QVSGPERTIP ITREEKPAVT AAPKK*

>P1;CRAB_MESAU
ALPHA CRYSTALLIN B CHAIN (ALPHA(B)-CRYSTALLIN).
  MDIAIHHPWI RRPFFPFHSP SRLFDQFFGE HLLESDLFST ATSLSPFYLR 
  PPSFLRAPSW IDTGLSEMRM EKDRFSVNLD VKHFSPEELK VKVLGDVVEV 
  HGKHEERQDE HGFISREFHR KYRIPADVDP LTITSSLSSD GVLTVNGPRK 
  QASGPERTIP ITREEKPAVT AAPKK*

>P1;CRAB_MOUSE
ALPHA CRYSTALLIN B CHAIN (ALPHA(B)-CRYSTALLIN) (P23).
  MDIAIHHPWI RRPFFPFHSP SRLFDQFFGE HLLESDLFST ATSLSPFYLR 
  PPSFLRAPSW IDTGLSEMRL EKDRFSVNLD VKHFSPEELK VKVLGDVIEV 
  HGKHEERQDE HGFISREFHR KYRIPADVDP LAITSSLSSD GVLTVNGPRK 
  QVSGPERTIP ITREEKPAVA AAPKK*

>P1;CRAB_RABIT
ALPHA CRYSTALLIN B CHAIN (ALPHA(B)-CRYSTALLIN).
  MDIAIHHPWI RRPFFPFHSP SRLFDQFFGE HLLESDLFPT STSLSPFYLR 
  PPSFLRAPSW IDTGLSEMRL EKDRFSVNLD VKHFSPEELK VKVLGDVIEV 
  HGKHEERQDE HGFISREFHR KYRIPADVDP LTITSSLSSD GVLTVNGPRK 
  QAPGPERTIP ITREEKPAVT AAPKK*

>P1;CRAB_RAT
ALPHA CRYSTALLIN B CHAIN (ALPHA(B)-CRYSTALLIN).
  MDIAIHHPWI RRPFFPFHSP SRLFDQFFGE HLLESDLFST ATSLSPFYLR 
  PPSFLRAPSW IDTGLSEMRM EKDRFSVNLD VKHFSPEELK VKVLGDVIEV 
  HGKHEERQDE HGFISREFHR KYRIPADVDP LTITSSLSSD GVLTVNGPRK 
  QASGPERTIP ITREEKPAVT AAPKK*

>P1;CRAB_SQUAC
ALPHA CRYSTALLIN B CHAIN (ALPHA(B)-CRYSTALLIN).
  MDIAIQHPWL RRPLFPSSIF PSRIFDQNFG EHFDPDLFPS FSSMLSPFYW 
  RMGAPMARMP SWAQTGLSEL RLDKDKFAIH LDVKHFTPEE LRVKILGDFI 
  EVQAQHEERQ DEHGYVSREF HRKYKVPAGV DPLVITCSLS ADGVLTITGP 
  RKVADVPERS VPISRDEKPA VAGPQQK*
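
The entries above share one layout: sequence lines indented by two spaces, residues grouped in blocks of 10, five blocks per line, and the "*" placed directly after the last residue. The sketch below writes an entry in that layout. The function name format_pir and its parameters are hypothetical, and the block sizes are taken from the examples on this page rather than from any requirement of the format.

```python
def format_pir(seq_type: str, identifier: str, description: str,
               sequence: str, block: int = 10, blocks_per_line: int = 5) -> str:
    """Return one PIR entry laid out like the examples on this page."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    # Cut the residues into fixed-size blocks (the last one may be shorter).
    blocks = [sequence[i:i + block] for i in range(0, len(sequence), block)]
    # Join the blocks with single spaces, a fixed number of blocks per line,
    # each line indented by two spaces as in the examples.
    lines = [
        "  " + " ".join(blocks[i:i + blocks_per_line])
        for i in range(0, len(blocks), blocks_per_line)
    ]
    # The asterisk marks the end of the sequence.
    lines[-1] += "*"
    header = f">{seq_type};{identifier}"
    return "\n".join([header, description, *lines]) + "\n"
```

For instance, format_pir("P1", "CRAB_ANAPL", "ALPHA CRYSTALLIN B CHAIN (ALPHA(B)-CRYSTALLIN).", seq), with seq holding the unbroken residue string, produces an entry in the same shape as the first example, apart from trailing spaces at the ends of lines.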