Query         lcl|NC_019400.1_cdsid_YP_006987037.1 [gene=GAP31_201] [protein=putative major head protein] [protein_id=YP_006987037.1] [location=98279..99292]
Match_columns 337
No_of_seqs    112 out of 187
Neff          8.3
Searched_HMMs 1612
Date          Thu Nov 7 17:55:12 2013
Command       /home/guerois/workspace/virfam/python/lib/hhsearch//hhsearch2 -i .//seq/seq_204 -d /home/guerois/workspace/virfam/python/profile_database/capsid_neck_tail.hhm -glob -cpu 7 -o .//seq/HHR/seq_204_vs_rec_db.hhr

 No Hit                             Prob  E-value  P-value  Score    SS Cols Query HMM Template HMM
  1 protein:vir:95258 Length: 368  100.0   8E-109   5E-112  613.1  34.0  336     1-337     2-366 (368)
  2 protein:vir:10324 Length: 320  100.0    1E-99   6E-103  563.4  31.3  314    18-337     1-317 (320)
  3 protein:vir:6378 Length: 346 # 100.0  1.4E-62  8.9E-66  359.6  32.2  319     6-335     1-346 (346)
  4 protein:vir:3424 Length: 341 # 100.0  9.1E-61  5.6E-64  349.8  32.7  315     1-335     1-341 (341)
  5 protein:vir:393 Length: 341 #  100.0  6.2E-60  3.8E-63  345.2  31.9  314     1-335     1-341 (341)
  6 protein:vir:96490 Length: 348  100.0  2.4E-52  1.5E-55  303.6  29.4  323     1-337     1-347 (348)
  7 protein:vir:4902 Length: 348 # 100.0  1.1E-50  6.6E-54  294.5  29.5  323     1-337     1-347 (348)
  8 protein:vir:2736 Length: 348 # 100.0  6.9E-50  4.3E-53  290.1  29.6  323     1-337     1-347 (348)
  9 protein:vir:106590 Length: 349 100.0  3.5E-46  2.2E-49  269.8  27.0  316     1-335     8-349 (349)
 10 protein:vir:98480 Length: 348  100.0  6.9E-45  4.3E-48  262.7  29.3  319     1-336     1-348 (348)
 11 protein:vir:79503 Length: 409  100.0  5.2E-31  3.2E-34  186.6  25.1  332     1-337     5-393 (409)
 12 protein:vir:78006 Length: 409  100.0  5.2E-31  3.2E-34  186.6  25.1  332     1-337     5-393 (409)
 13 protein:vir:79078 Length: 307   99.2  1.4E-12  8.6E-16   85.6  17.6  293     1-335     1-307 (307)
 14 protein:vir:107882 Length: 307  99.2    5E-12  3.1E-15   82.5  18.2  294     1-335     1-307 (307)
 15 protein:vir:99888 Length: 309   98.9    4E-10  2.5E-13   72.1  18.8  297     4-336     1-309 (309)
 16 protein:vir:108211 Length: 318  97.4    3E-05  1.8E-08   45.4  15.0  293     1-337    19-317 (318)
 17 protein:vir:98819 Length: 437   94.0   0.0053  3.3E-06   33.0  11.7  322     1-337     2-419 (437)
 18 protein:vir:94711 Length: 347   86.5    0.027  1.6E-05   29.2   8.2  302     1-337    23-346 (347)
 19 protein:vir:6324 Length: 335 #  81.2    0.088  5.4E-05   26.4  16.9  294     1-337    19-328 (335)
 20 protein:vir:103323 Length: 364  81.2    0.088  5.5E-05   26.4  20.6  301     1-337    19-339 (364)
 21 protein:vir:78935 Length: 335   79.8      0.1  6.3E-05   26.0  15.7  293     1-337    19-328 (335)
 22 protein:vir:10450 Length: 344   73.3     0.17  0.00011   24.8  12.3  302     1-335    25-344 (344)
 23 protein:vir:106647 Length: 303  71.7     0.19  0.00012   24.5  14.1  275     1-337     1-296 (303)
 24 protein:vir:99675 Length: 324   69.0     0.23  0.00014   24.1   9.0  272    33-337     1-298 (324)
 25 protein:vir:97031 Length: 402   67.1     0.26  0.00016   23.8  12.4  300     1-337    19-337 (402)
 26 protein:vir:94576 Length: 347   63.8     0.31  0.00019   23.4  18.0  307     1-337    24-347 (347)
 27 protein:vir:2201 Length: 345 #  63.3     0.32   0.0002   23.3  13.6  300     1-337    25-345 (345)
 28 protein:vir:7019 Length: 401 #  59.7     0.39  0.00024   22.8  12.9  298     1-337    19-333 (401)
 29 protein:vir:105645 Length: 400  56.7     0.45  0.00028   22.5  13.6  299     1-337    19-333 (400)
 30 protein:vir:100057 Length: 375  49.4     0.64   0.0004   21.6  16.2  304     1-337    27-370 (375)
 31 protein:vir:8885 Length: 347 #  45.8     0.76  0.00047   21.3  15.7  306     1-337    24-346 (347)
 32 protein:vir:80213 Length: 334   42.6     0.88  0.00055   20.9  16.1  296     1-337    21-332 (334)
 33 protein:vir:1886 Length: 385 #  37.8      1.1  0.00068   20.4  20.1  273     1-337   105-384 (385)
 34 protein:vir:191 Length: 385 #   37.8      1.1  0.00068   20.4  20.1  273     1-337   105-384 (385)
 35 protein:vir:78148 Length: 123   30.4      1.6  0.00098   19.5   6.6  117   202-337     1-123 (123)
 36 protein:vir:9875 Length: 296 #  24.4      2.2   0.0014   18.7  15.7  274     1-337     5-295 (296)
 37 protein:vir:9927 Length: 295 #  23.4      2.3   0.0014   18.6  16.3  272     3-337     1-288 (295)

No 1
>protein:vir:95258 Length: 368 # NCBI annotation: Phage conserved protein # Family: family:all:570 # MgeID: mge:1561 # MgeName: Felix 01 # Cross-refs: genbank:acc:NP_944891;genbank:gi:38707831;genbank:GeneID:2744044
Probab=100.00 E-value=8.4e-109 Score=613.10 Aligned_cols=336 Identities=21%
Similarity=0.250 Sum_probs=305.7 Q ss_pred CCCCCCCccCHHHHHHHHHhcCCCccchhhcccccCcccccceEEEEEEcCceeEeeeccCCC-CcccccCCceeEEEEe Q lcl|NC_019400. 1 MAVVRTNDFQIVDLGATLEIVPRQYRLITNMDLFTAYHGVTTVAQIERVDEVVTDFPARRRQG-ERNYVGTEKAQLKNFN 79 (337) Q Consensus 1 ~d~~~~d~Fs~~~Lt~~i~~~p~~~~~l~~l~~F~~~~~~t~~v~ie~~~~~~~l~p~v~~g~-~~~~~~~~~~~~~~f~ 79 (337) |||||+|+||+++||++||++|++|++|++||||++++++|++|.||++++.++|+|+++||+ ++++.++++++++.|+ T Consensus 2 ~d~f~~d~Fs~~~LT~ain~~p~~p~~l~~lglF~~~~v~t~~v~iE~~~~~l~Lvp~~~rg~~~~~~~~~~~r~~~~f~ 81 (368) T protein:vir:95 2 LTNSEKSRFFLADLTGEVQSIPNTYGYISNLGLFRSAPITQTTFLMDLTDWDVSLLDAVDRDSRKAETSAPERVRQISFP 81 (368) T ss_pred cccccCCcccHHHHHHHHHhcCCCcceecccccccCCCccceEEEEEEEcCeEEEccccCCCCCCcccccCCceeEEEEe Confidence 999999999999999999999999999999999999999999999999999999999999998 5567788899999999 Q ss_pred ccccccCccccHHHHhccccCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCCEEecCCCceEeechhhcCCC Q lcl|NC_019400. 80 IPFFPLDRQITAADVQNFRKYFTADAPKSVEDVVARVVRRIRISHEQLKEKAMLQAIMGKSWAPQDPTAQYNYFTEWGVT 159 (337) Q Consensus 80 ~p~i~~~~~v~a~dlq~~R~~G~~~~~~~~~~~v~~~l~~~~~~i~~t~E~m~a~AL~g~i~~~~~g~v~~d~~~~fG~~ 159 (337) +|||++++.|+|+||||+|+||+++++++++.++++||++||++|+.|+||||+|||+|+|+|+ ||++++|||++||++ T Consensus 82 ~ph~~~~d~I~a~eiQg~RafG~~~~l~~v~~~v~~kl~~~r~~~d~T~E~~r~gAL~G~ilDa-dGtvl~dly~eFGit 160 (368) T protein:vir:95 82 MMYFKEVESITPDEIQGVRQPGTANELTTEAVVRAKKLMKIRTKFDITREFLFMQALKGKVVDA-RGTLYADLYKQFDVE 160 (368) T ss_pred cceeccccccchHHHccccCCCChhHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhcCeeECC-CCcEEecchhhhCCc Confidence 9999999999999999999999999999999999999999999999999999999999999998 679999999999999 Q ss_pred cceEEEecCCCCcchHHHHHHHHHHHHHHhhc-cccccccEEEEEChHHHHHHhcCHHHHHHHHhhccccc--ccccccc Q lcl|NC_019400. 160 QHTANIDFTDVATDPTDIIEADARAYIIDNAG-DNGNNYGIVVLASRKWFSALIAHPLVMNAYQYYSSTQE--PLRRRLG 236 (337) Q Consensus 160 ~~~~~~~l~~~~~d~~~~~~~~~~~~~~~~~~-~~~~~~~v~~l~g~~~~~al~~h~~v~~~~~~~~~~~~--~~~~~~~ 236 (337) |++++|+|+++++|+.++|.++.+++.+++++ .....+++++|||++||++|++||+|+++|+||++++. 
.++.+++ T Consensus 161 ~~~v~f~l~~~~tdv~~~~~~~~~~i~d~l~g~~~~~~~~v~alcg~~Ffd~L~~h~~Vkeay~~~~~a~~~~~lr~~~r 240 (368) T protein:vir:95 161 KKTIYFDLDNPNADIDASIEELRMHMEDEAKTGTVINGEEIHVVVDRVFFSKLTKHPKIRDAYLAQQTPLAWQQITGSLR 240 (368) T ss_pred cceEEEEeCCCCcCHHHHHHHHHHHHHHhhcccccccccceEEEEChHHHHHhhcChhHHHHHHHHHhhhhhhhhccccc Confidence 99999999999999999999998876555543 33456789999999999999999999999999987654 3444444 Q ss_pred cc-----------eeccceeEEEeccEE-Eec--------CceeeecCCeeEEEEec-----chhhheEEeccccchhhc Q lcl|NC_019400. 237 QG-----------QENANNRMFVHKNVT-YIE--------DISNYIPDGEAYILPQG-----IDDMFQIHYAPADDVREA 291 (337) Q Consensus 237 ~~-----------~~~~~~~~~~~~~~~-~~~--------~~~~~i~~~~~~~~p~g-----~~~~f~~~~ap~d~~~~~ 291 (337) .+ |.++|+.|++|++.. ..+ +..-+||+|+|++||.| ++++|++||||+|++|.+ T Consensus 241 ~g~~~~~~~~~~~F~fgGi~f~eYrg~~~~~~g~~~~~v~~d~v~I~~gea~~~P~G~~~~~~~~~F~~~~aPad~~e~v 320 (368) T protein:vir:95 241 TGGADGVQAHMNTFYYGGVKFVQYNGKFKDKRGKVHTLVSIDSVADTVGVGHAFPNVAMLGEANNIFEVAYGPCPKMGYA 320 (368) T ss_pred cccccccccccceeEecCEEEEEcceeecCCCcceeeeecCCceeeccCceEEEeecccccccCcceEEEecCCCcHhhc Confidence 33 778888888877632 222 22347889999999999 579999999999999999 Q ss_pred cccCcceeeeEEEccCCCEEEEEEeecccccccCcceEEEEEEeeC Q lcl|NC_019400. 292 NTPAQELYLWYKSSAYLREEKVESETSFLTVNTRPELVVRSTGTFA 337 (337) Q Consensus 292 n~~~~~~y~k~~~~~~~~~~~l~~eS~PLpi~~rP~~l~~~t~~aa 337 (337) |+.|+|+|+|+|++++++|++|++||||||||+||++|+++|++|. 
T Consensus 321 Nt~g~p~Ya~~~~~~~~~g~~le~qSnpLpic~RP~~lv~~~~~a~ 366 (368) T protein:vir:95 321 NTLGQELYVFEYEKDRDEGIDFEAHSYMLPYCTRPQLLVDVRADAK 366 (368) T ss_pred CCCcccccceeeeccCCCeeEEEEeecccchhcccceeEEEEecCC Confidence 9999999999999999999999999999999999999999999999 No 2 >protein:vir:10324 Length: 320 # NCBI annotation: ORF26 # Family: family:all:570 # MgeID: mge:182 # MgeName: VHML # Cross-refs: genbank:acc:NP_758919;genbank:gi:27311193;genbank:GeneID:956155 Probab=100.00 E-value=9.9e-100 Score=563.36 Aligned_cols=314 Identities=19% Similarity=0.263 Sum_probs=292.2 Q ss_pred HHhcCCCccchhhcccc-cCcccccceEEEEEEcCceeEeeeccCCCCcccccCCceeEEEEeccccccCccccHHHHhc Q lcl|NC_019400. 18 LEIVPRQYRLITNMDLF-TAYHGVTTVAQIERVDEVVTDFPARRRQGERNYVGTEKAQLKNFNIPFFPLDRQITAADVQN 96 (337) Q Consensus 18 i~~~p~~~~~l~~l~~F-~~~~~~t~~v~ie~~~~~~~l~p~v~~g~~~~~~~~~~~~~~~f~~p~i~~~~~v~a~dlq~ 96 (337) ||.+|+.++++ ||+| ++++++|++|.||++++.++|+|+|+||++|++.++++++++.|+||||+++++|+|+|||+ T Consensus 1 i~~~P~~~g~~--~glff~~~~v~T~~V~ie~~~~~l~lip~v~rg~~g~~~~~~~~~~~~f~~p~~~~~d~i~a~eiq~ 78 (320) T protein:vir:10 1 MNLLPVNYGDS--RALFAREKKVRTRTILVEEKNGVLTLIQSREPGSTENVAKRGKRKVRSFVIPHLPLEDVILPDEYEG 78 (320) T ss_pred CCcCCchhhhh--hhhccCCCCcccceEEEEEecCceeeeeccCCCCCceeecCCcceEEEEecceeccCCccCHHHHcC Confidence 99999988876 4555 67799999999999999999999999999999999999999999999999999999999999 Q ss_pred cccCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCCEEecCCCceEeechhhcCCCcceEEEecCCCCcchHH Q lcl|NC_019400. 
97 FRKYFTADAPKSVEDVVARVVRRIRISHEQLKEKAMLQAIMGKSWAPQDPTAQYNYFTEWGVTQHTANIDFTDVATDPTD 176 (337) Q Consensus 97 ~R~~G~~~~~~~~~~~v~~~l~~~~~~i~~t~E~m~a~AL~g~i~~~~~g~v~~d~~~~fG~~~~~~~~~l~~~~~d~~~ 176 (337) +|+||+ ++++++++++++++.+||++|++|+||||+|||+|+|+|+ ||++++|||++||++++++.|+|+++++|+.+ T Consensus 79 ~Ra~G~-~~~~~~~~~v~~~l~~lr~~~~~T~E~m~~~AL~G~ilda-dGtv~~d~y~~fGi~~~~i~~~l~~a~~dv~~ 156 (320) T protein:vir:10 79 LRGFGT-TALAAKSELVKERXETMKSSHDITHEHLRMGAKKGQILDA-DGTVLYDLYAEFGITKKTIYFGLDNKDANVAE 156 (320) T ss_pred cccCCC-chHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcCeEEcC-CCcEEEechhhhCCccceeEEecCCCCccHHH Confidence 999997 7899999999999999999999999999999999999987 67899999999999999999999999999999 Q ss_pred HHHHHHHHHHHHhhccccccccEEEEEChHHHHHHhcCHHHHHHHHhhcccccccccccccceeccceeEEEeccE-EEe Q lcl|NC_019400. 177 IIEADARAYIIDNAGDNGNNYGIVVLASRKWFSALIAHPLVMNAYQYYSSTQEPLRRRLGQGQENANNRMFVHKNV-TYI 255 (337) Q Consensus 177 ~~~~~~~~~~~~~~~~~~~~~~v~~l~g~~~~~al~~h~~v~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~ 255 (337) +|.++.+++.+.++ +..++++++|||++||++|++||+|+|+|++++.+.+.++.+...++.++|+.|++|++. +.. T Consensus 157 ~~~~~~~~i~~~l~--g~~~t~v~al~g~~f~~al~~h~~Vke~y~~~~~~~~~l~~~~~~~f~~gGi~~~~Y~g~~~d~ 234 (320) T protein:vir:10 157 SCRQVLRHVEDNLR--GDVMKDVSVDVSEEFFDKFIKHASVKEVFLNHEAAVNRLGGDTRKGFKFGGLIFNENRARHVDE 234 (320) T ss_pred HHHHHHHHHHHHhc--cCCCCceEEEEChHHHHHHhcCHHHHHHHHhhhhhhhhccccccceEEecCEEEEEcccEEEcC Confidence 99998887655444 346789999999999999999999999999999988889999999999999999998763 333 Q ss_pred cCc-eeeecCCeeEEEEecchhhheEEeccccchhhccccCcceeeeEEEccCCCEEEEEEeecccccccCcceEEEEEE Q lcl|NC_019400. 
256 EDI-SNYIPDGEAYILPQGIDDMFQIHYAPADDVREANTPAQELYLWYKSSAYLREEKVESETSFLTVNTRPELVVRSTG 334 (337) Q Consensus 256 ~~~-~~~i~~~~~~~~p~g~~~~f~~~~ap~d~~~~~n~~~~~~y~k~~~~~~~~~~~l~~eS~PLpi~~rP~~l~~~t~ 334 (337) +++ +++||+|+++|||.|++++|++||||+|+++.+|+.|+|||+|+|++++|+|++|++||+|||||+||++|+++|+ T Consensus 235 ~g~~~~~I~~~~~~~~p~g~~~~f~~~~apad~~e~vnt~g~p~y~k~~~~~~~~g~~l~~qS~PLpi~~rP~~lv~~~~ 314 (320) T protein:vir:10 235 EGKETRFIKAGKGHAFPTGTTNTFFTALAPADFNETAGTLGKRYYAKMEPRRMGRGFDLHSQSNVLPMCCRPGVLVELDA 314 (320) T ss_pred CCCeeEeecCCeeEEEEecCchhheeeecccCcHhhcCCcccccccccccccCCCeEEEEeeecccccccCcceEEEEEe Confidence 333 5799999999999999999999999999999999999999999999999999999999999999999999999999 Q ss_pred eeC Q lcl|NC_019400. 335 TFA 337 (337) Q Consensus 335 ~aa 337 (337) +|+ T Consensus 315 ~a~ 317 (320) T protein:vir:10 315 AAQ 317 (320) T ss_pred cCC Confidence 999 No 3 >protein:vir:6378 Length: 346 # NCBI annotation: capsid protein E # Family: family:all:1021 # MgeID: mge:133 # MgeName: BcepNazgul # Cross-refs: genbank:acc:NP_918991;genbank:gi:34610166;genbank:GeneID:2559600 Probab=100.00 E-value=1.4e-62 Score=359.62 Aligned_cols=319 Identities=12% Similarity=0.046 Sum_probs=254.6 Q ss_pred CCccCHHHHHHHHHhcCCCccchhhcccccCcccccceEEEEEEcCceeEeeeccCCCCcccccCCceeEEEEecccccc Q lcl|NC_019400. 6 TNDFQIVDLGATLEIVPRQYRLITNMDLFTAYHGVTTVAQIERVDEVVTDFPARRRQGERNYVGTEKAQLKNFNIPFFPL 85 (337) Q Consensus 6 ~d~Fs~~~Lt~~i~~~p~~~~~l~~l~~F~~~~~~t~~v~ie~~~~~~~l~p~v~~g~~~~~~~~~~~~~~~f~~p~i~~ 85 (337) -|.|++.+|+++|+++|+. 
++|.+++||+.+.+.|++|.||..++.+.++|+|+|+.++..+.++++++..|++|||++ T Consensus 1 ~d~f~~~~l~~~i~~~p~~-~~l~~~~fp~~~~~~t~~i~i~~~~g~~~la~~v~~~~~~~~~~~~g~~~~~~~~p~i~~ 79 (346) T protein:vir:63 1 MEIFDTLTLAGVIQSGPAL-SMYWQGFYPNEITFDTDEILFDLVFKDKKLAPFVAPNVQGRVIAARGYTTKTFRPAYVKP 79 (346) T ss_pred CCccCHHHHHHHHHhcCCc-cchhhhcCccccccccceEEEEEecCceeeeeeecCCCCcceecccceeeeEeecCccCc Confidence 4577789999999999975 457787666677788999999999999999999999999999999999999999999999 Q ss_pred CccccHHHHhcccc-----CCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCCEEecCCCceEeechhhcCCCc Q lcl|NC_019400. 86 DRQITAADVQNFRK-----YFTADAPKSVEDVVARVVRRIRISHEQLKEKAMLQAIMGKSWAPQDPTAQYNYFTEWGVTQ 160 (337) Q Consensus 86 ~~~v~a~dlq~~R~-----~G~~~~~~~~~~~v~~~l~~~~~~i~~t~E~m~a~AL~g~i~~~~~g~v~~d~~~~fG~~~ 160 (337) ++.|+|+|++++|. +|+.++++++...+.+++.+|+++|++|+||||+|||+|++++.+ |....++..+||+.. T Consensus 80 ~~~i~~~d~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~l~~~i~~~~E~m~~~al~~gki~~~-g~~~~~~~vdfg~~~ 158 (346) T protein:vir:63 80 KDVINPNRTLKRRAGEQPIIGGMSLQERFQAVVADSQLEQRQRIENRIEWMCAMATIYGYVDVV-GEAFPMQRVDFGRDP 158 (346) T ss_pred cceeCHHHHHHHhhhhhhccCCcCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCCEEEee-CCceeEEEEeeCCCc Confidence 99999999998664 566788899999999999999999999999999999997766653 334456666799853 Q ss_pred c-eE----EEecCCCCcchHHHHHHHHHHHHHHhhccccccccEEEEEChHHHHHHhcCHHHHHHHHhhcccccc----- Q lcl|NC_019400. 161 H-TA----NIDFTDVATDPTDIIEADARAYIIDNAGDNGNNYGIVVLASRKWFSALIAHPLVMNAYQYYSSTQEP----- 230 (337) Q Consensus 161 ~-~~----~~~l~~~~~d~~~~~~~~~~~~~~~~~~~~~~~~~v~~l~g~~~~~al~~h~~v~~~~~~~~~~~~~----- 230 (337) . .+ +..|+++++|+.++|+++.+++ ++..+ .+..+++||+++|++|++|++|+++++++...... 
T Consensus 159 ~~~~~lt~~~~W~~~~adp~~di~~~~~~~-~~~~g----~~~~~~i~~~~~~~~l~~~~~v~~~~~~~~~~~~~~~~~~ 233 (346) T protein:vir:63 159 ALTVQLTGGAAWDQATSDPLGNIQTMRTTA-WKKSN----STITRLTMGLDAWSLFSQKPAVVELLNLFYKGSTSDFNRS 233 (346) T ss_pred cceeeecccccCCCCCCCHHHHHHHHHHHH-HHccC----CceEEEEECHHHHHHHhcCHHHHHHHhhhccccccccchh Confidence 2 22 2357788999999999877764 33222 23458999999999999999999999865432110 Q ss_pred -ccc--------ccccceeccceeEEEeccEEEec--C-ceeeecCCeeEEEEecchhhheEEeccccchhhccccCcce Q lcl|NC_019400. 231 -LRR--------RLGQGQENANNRMFVHKNVTYIE--D-ISNYIPDGEAYILPQGIDDMFQIHYAPADDVREANTPAQEL 298 (337) Q Consensus 231 -~~~--------~~~~~~~~~~~~~~~~~~~~~~~--~-~~~~i~~~~~~~~p~g~~~~f~~~~ap~d~~~~~n~~~~~~ 298 (337) +.. .......++++.++.|.. +|.+ + .+++||+|+++|+|.|..| .++|||..+++. |..+.++ T Consensus 234 ~l~~~~~~~~~~~~~~~~~~~gi~i~~y~~-~y~d~~G~~~~~ip~~~v~~~p~~~~g--~~~yg~~~d~~~-~~~~~~~ 309 (346) T protein:vir:63 234 RLDDGSPVQYQGTIGGYNGMGTLELYTYHD-TYTGDDNTEQEILGSYDVVGTGPGLQG--TQCFGAIMDFKN-GLVPTRM 309 (346) T ss_pred hcccchhhhhhhhHhhhhccCCeEEEEecc-EEEcCCCceeccccCCeEEEEecCCcc--eEEEeecccccc-Cccccee Confidence 100 001112344555555443 3433 2 2579999999999988755 567777666554 7889999 Q ss_pred eeeEEEccCCCEEEEEEeecccccccCcceEEEEEEe Q lcl|NC_019400. 
299 YLWYKSSAYLREEKVESETSFLTVNTRPELVVRSTGT 335 (337) Q Consensus 299 y~k~~~~~~~~~~~l~~eS~PLpi~~rP~~l~~~t~~ 335 (337) |+++|..++|+++++++||+|||+|.+|++++.+|++ T Consensus 310 ~~~~~~~~dp~~~~~~~~s~plPv~~~p~~~~~~~V~ 346 (346) T protein:vir:63 310 FPKMWEEEDPSVAMLMTQSAPLMVPAQPNASFRMTVK 346 (346) T ss_pred eeEEEEecCCCEEEEEEeeeccceecCCCcEEEEEeC Confidence 9999999999999999999999999999999999999 No 4 >protein:vir:3424 Length: 341 # NCBI annotation: capsid component # Family: family:all:1021 # MgeID: mge:70 # MgeName: lambda # Cross-refs: genbank:acc:NP_040587;genbank:gi:9626251;genbank:GeneID:2703482 Probab=100.00 E-value=9.1e-61 Score=349.75 Aligned_cols=315 Identities=13% Similarity=0.126 Sum_probs=241.2 Q ss_pred CCCCCCCccCHHHHHHHHHhcCCCccchhhcccccCcccccceEEEEEEcCceeEeeeccCCCCcccccCCceeEEEEec Q lcl|NC_019400. 1 MAVVRTNDFQIVDLGATLEIVPRQYRLITNMDLFTAYHGVTTVAQIERVDEVVTDFPARRRQGERNYVGTEKAQLKNFNI 80 (337) Q Consensus 1 ~d~~~~d~Fs~~~Lt~~i~~~p~~~~~l~~l~~F~~~~~~t~~v~ie~~~~~~~l~p~v~~g~~~~~~~~~~~~~~~f~~ 80 (337) || .|++.+|+++++++|+.+++|.+++|+....++|++|.||++++.+.++|+|+|++++.++.++++++++|+| T Consensus 1 ~d-----~f~~~~L~~~i~~~~~~~~~l~d~~fp~~~~~~t~~v~~~~~~~~~~lap~v~~~~~~~~~~~~~~~~~~~~~ 75 (341) T protein:vir:34 1 MS-----MYTTAQLLAANEQKFKFDPLFLRLFFRESYPFTTEKVYLSQIPGLVNMALYVSPIVSGEVIRSRGGSTSEFTP 75 (341) T ss_pred CC-----CcCHHHHHHHHHhccCccchhHHhcCCcccccccceEEEEEeeCCeeEEEeecCCCCcceeccCceeeeEEec Confidence 44 5778999999999999999999997666777899999999999999999999999999999999999999999 Q ss_pred cccccCccccHHHHhccccCC-----CCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhc-CCEEecCCC--ceEeec Q lcl|NC_019400. 
81 PFFPLDRQITAADVQNFRKYF-----TADAPKSVEDVVARVVRRIRISHEQLKEKAMLQAIM-GKSWAPQDP--TAQYNY 152 (337) Q Consensus 81 p~i~~~~~v~a~dlq~~R~~G-----~~~~~~~~~~~v~~~l~~~~~~i~~t~E~m~a~AL~-g~i~~~~~g--~v~~d~ 152 (337) |||++++.|+++|++ .|.+| ..++++++.+.+.+++.+|+++|++|+||||+|||+ |+|....+| .+.+| T Consensus 76 p~i~~~~~i~~~d~~-~r~~g~~~~~~~~~~~~~~~~i~~~l~~l~~~i~~~~E~m~~qaL~~Gki~~~~~g~~~~~vD- 153 (341) T protein:vir:34 76 GYVKPKHEVNPQMTL-RRLPDEDPQNLADPAYRRRRIIMQNMRDEELAIAQVEEMQAVSAVLKGKYTMTGEAFDPVEVD- 153 (341) T ss_pred CccCccceeCHHHHH-HHhhccccccCcCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCcEEEecCCccEEEEE- Confidence 999999999999998 47776 346788899999999999999999999999999997 887543333 23455 Q ss_pred hhhcCCCcceEEEe------cCCCCcchHHHHHHHHHHHHHHhhccccccccEEEEEChHHHHHHhcCHHHHHHHHhhcc Q lcl|NC_019400. 153 FTEWGVTQHTANID------FTDVATDPTDIIEADARAYIIDNAGDNGNNYGIVVLASRKWFSALIAHPLVMNAYQYYSS 226 (337) Q Consensus 153 ~~~fG~~~~~~~~~------l~~~~~d~~~~~~~~~~~~~~~~~~~~~~~~~v~~l~g~~~~~al~~h~~v~~~~~~~~~ 226 (337) ||+. .+++++ |+++++++...+.++.+ +.++. ..++.+++||+++|++|++|++|+++|+++.. T Consensus 154 ---fg~~-~~~~~~~t~~~~W~~~~~~~~d~l~di~~-~~~~~-----g~~~~~~i~~~~~~~~l~~~~~v~~~~~~~~~ 223 (341) T protein:vir:34 154 ---MGRS-EENNITQSGGTEWSKRDKSTYDPTDDIEA-YALNA-----SGVVNIIVFDPKGWALFRSFKAVKEKLDTRRG 223 (341) T ss_pred ---eCCC-CccceEecCCccCCcCCCchHHHHHHHHH-HHHhc-----CCceEEEEeCHHHHHHHhcCHHHHHHHhhccc Confidence 5663 233332 44444445555554332 33332 23466899999999999999999999987654 Q ss_pred ccccccc---cccccee----ccceeEEEeccEEEecCc--eeeecCCeeEEEEecchhhheEEeccccchhhcc--ccC Q lcl|NC_019400. 227 TQEPLRR---RLGQGQE----NANNRMFVHKNVTYIEDI--SNYIPDGEAYILPQGIDDMFQIHYAPADDVREAN--TPA 295 (337) Q Consensus 227 ~~~~~~~---~~~~~~~----~~~~~~~~~~~~~~~~~~--~~~i~~~~~~~~p~g~~~~f~~~~ap~d~~~~~n--~~~ 295 (337) ....+.. ....+.. ++++.++.|.. +|.+++ +++||+|+++|+|.|+.| .++||+..+++..+ ... 
T Consensus 224 ~~~~~~~~~~~~~~~~~~~~~~~g~~i~~y~~-~y~ddG~~~~~ip~~~v~l~p~g~~g--~~~yg~~~d~~~~~~~~~~ 300 (341) T protein:vir:34 224 SNSELETAVKDLGKAVSYKGMYGDVAIVVYSG-QYVENGVKKNFLPDNTMVLGNTQARG--LRTYGCIQDADAQREGINA 300 (341) T ss_pred ccccccccccccccceeeeeecCCceEEEEcC-EEEECCcEEeeecCCeEEEeeCCCcc--eEEEeecccccccccceee Confidence 4332221 2222222 34555554443 444443 579999999999988754 45555544334332 244 Q ss_pred cceeeeEEEc-cCCCEEEEEEeecccccccCcceEEEEEEe Q lcl|NC_019400. 296 QELYLWYKSS-AYLREEKVESETSFLTVNTRPELVVRSTGT 335 (337) Q Consensus 296 ~~~y~k~~~~-~~~~~~~l~~eS~PLpi~~rP~~l~~~t~~ 335 (337) .++|++.|.. ++|+++++++||+|||+|.||++++.+|+. T Consensus 301 ~~~~~~~~~~~~dp~~~~~~~~s~pLPv~~~pd~~~~a~V~ 341 (341) T protein:vir:34 301 SARYPKNWVTTGDPAREFTMIQSAPLMLLADPDEFVSVQLA 341 (341) T ss_pred eeEeeeeeeecCCCcEEEEEEcccceeeeeCCCcEEEEEeC Confidence 6899999865 589999999999999999999999999999 No 5 >protein:vir:393 Length: 341 # NCBI annotation: gp8 # Family: family:all:1021 # MgeID: mge:325 # MgeName: N15 # Cross-refs: genbank:acc:NP_046903;genbank:gi:9630472;genbank:GeneID:1261647 Probab=100.00 E-value=6.2e-60 Score=345.18 Aligned_cols=314 Identities=15% Similarity=0.136 Sum_probs=239.5 Q ss_pred CCCCCCCccCHHHHHHHHHhcCCCccchhhccccc-CcccccceEEEEEEcCceeEeeeccCCCCcccccCCceeEEEEe Q lcl|NC_019400. 
1 MAVVRTNDFQIVDLGATLEIVPRQYRLITNMDLFT-AYHGVTTVAQIERVDEVVTDFPARRRQGERNYVGTEKAQLKNFN 79 (337) Q Consensus 1 ~d~~~~d~Fs~~~Lt~~i~~~p~~~~~l~~l~~F~-~~~~~t~~v~ie~~~~~~~l~p~v~~g~~~~~~~~~~~~~~~f~ 79 (337) || .|++.+|+++|+++|+.+++|.++ ||+ ...++|.+|.||.+++.++++|+|+|++++.+++++++++++|+ T Consensus 1 ~d-----~f~~~~L~~~i~~~~~~~~~l~~~-~Fp~~~~~~t~~v~~~~~~~~~~lap~v~~~~~~~~~~~~~~~~~~~~ 74 (341) T protein:vir:39 1 MS-----VYTTAQLLAVNEKKFKFDPLFLRI-FFRETYPFSTEKVYLSQIPGLVNMALYVSPIVSGKVIRSRGGSTSEFT 74 (341) T ss_pred CC-----ccCHHHHHHHHHhhcCccchhHhh-cCCcccccCcceEEEEEecCCceeeEEecCCCCcceecccceeeeeEe Confidence 44 577899999999999999999998 565 45668899999999999999999999999999999999999999 Q ss_pred ccccccCccccHHHHhccccCC-----CCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhc-CCEE-ecCCC-ceEee Q lcl|NC_019400. 80 IPFFPLDRQITAADVQNFRKYF-----TADAPKSVEDVVARVVRRIRISHEQLKEKAMLQAIM-GKSW-APQDP-TAQYN 151 (337) Q Consensus 80 ~p~i~~~~~v~a~dlq~~R~~G-----~~~~~~~~~~~v~~~l~~~~~~i~~t~E~m~a~AL~-g~i~-~~~~g-~v~~d 151 (337) ||||++++.|+++|++. |.+| +.+++++..+.+.+++.+|+++|++|+||||+|||+ |+|. +..|+ .+.+| T Consensus 75 ~p~i~~~~~i~~~d~~~-r~~g~~~~~~~~~~~~~~~~i~~~~~~l~~~i~~r~E~m~~qaL~~Gki~i~~~g~~~~~vD 153 (341) T protein:vir:39 75 PGYVKPKHEVNPLMTLR-RLPDEDPQNLADPVYRRRRIILQNMKDEELAIAQVEEKQAVAAVLSGKYTMTGEAFEPVEVD 153 (341) T ss_pred ccccCcccccCHHHHHH-HhhcccccccCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCceEEEcCCCcEEEEe Confidence 99999999999999984 6665 457888888999999999999999999999999996 8884 44333 24455 Q ss_pred chhhcCCCcceEEE------ecCCCCcchHHHHHHHHHHHHHHhhccccccccEEEEEChHHHHHHhcCHHHHHHHHhhc Q lcl|NC_019400. 152 YFTEWGVTQHTANI------DFTDVATDPTDIIEADARAYIIDNAGDNGNNYGIVVLASRKWFSALIAHPLVMNAYQYYS 225 (337) Q Consensus 152 ~~~~fG~~~~~~~~------~l~~~~~d~~~~~~~~~~~~~~~~~~~~~~~~~v~~l~g~~~~~al~~h~~v~~~~~~~~ 225 (337) ||+.. .+++ .|+++++++...+.++ +.+.++.+ .++.+++||+++|++|++|++|+++|+++. 
T Consensus 154 ----fg~~~-~~~~~lt~~~~W~~~~~~~~d~l~di-~~~~~~~g-----~~~~~ii~~~~~~~~l~~~~~v~~~~~~~~ 222 (341) T protein:vir:39 154 ----MGRSA-GNNIVQAGAAAWSSRDKETYDPTDDI-EAYALNAS-----GVVNIIVFDPKGWALFRSFKAVKEKLDTRR 222 (341) T ss_pred ----ccCCc-cceeEecCCccCCCCCCchHHHHHHH-HHHHHhcC-----CceEEEEeChHHHHHHhcCHHHHHHHhhcc Confidence 56532 2322 3555555555555543 33443322 245689999999999999999999999765 Q ss_pred ccccccc---ccccccee----ccceeEEEeccEEEecCc--eeeecCCeeEEEEecchhhheEEeccccchhhc--ccc Q lcl|NC_019400. 226 STQEPLR---RRLGQGQE----NANNRMFVHKNVTYIEDI--SNYIPDGEAYILPQGIDDMFQIHYAPADDVREA--NTP 294 (337) Q Consensus 226 ~~~~~~~---~~~~~~~~----~~~~~~~~~~~~~~~~~~--~~~i~~~~~~~~p~g~~~~f~~~~ap~d~~~~~--n~~ 294 (337) .....+. .+...+.. .++..++.|.+ +|.+++ +++||+|+++|+|.|..| .++||+..+++.. +.. T Consensus 223 ~~~~~~~~~~~~~~~~~~~~~~~~g~~i~~y~~-~y~d~g~~~~~ip~~~~~l~p~~~~g--~~~yg~~~d~~~~~~~~~ 299 (341) T protein:vir:39 223 GSNSELETALKDLGKAVSYKGMYGDVAIVVYSG-QYIENDVKKNYLPDLTMVLGNTQARG--LRTYGCILDADAQREGIN 299 (341) T ss_pred cccccccchhhhhhhHhhhhhhhcCceEEEEcc-EEEecCcEEeeecCCeEEEeeCCCcc--eEEEecccchhhccccee Confidence 4433222 22222222 24455555443 444433 689999999999988764 4555554434433 235 Q ss_pred CcceeeeEEEcc-CCCEEEEEEeecccccccCcceEEEEEEe Q lcl|NC_019400. 295 AQELYLWYKSSA-YLREEKVESETSFLTVNTRPELVVRSTGT 335 (337) Q Consensus 295 ~~~~y~k~~~~~-~~~~~~l~~eS~PLpi~~rP~~l~~~t~~ 335 (337) +.++|+++|..+ ||+++++++||+|||+|.||++++.+|+. 
T Consensus 300 ~~~~~~~~~~~~~dp~~~~~~~~s~plPv~~~p~~~~~a~V~ 341 (341) T protein:vir:39 300 ASTRYPKNWVQTGDPAREFTMIQSAPLMLLADPDEFVSVKLA 341 (341) T ss_pred eeeeeeeeeeecCCCcEEEEEEeccccceeeCCCcEEEEEeC Confidence 678999999765 89999999999999999999999999998 No 6 >protein:vir:96490 Length: 348 # NCBI annotation: head protein # Family: family:all:1083 # MgeID: mge:1620 # MgeName: 2972 # Cross-refs: genbank:acc:YP_238492;genbank:gi:66391768;genbank:GeneID:5176912 Probab=100.00 E-value=2.4e-52 Score=303.60 Aligned_cols=323 Identities=12% Similarity=0.104 Sum_probs=241.8 Q ss_pred CCCCCCCccCHHHHHHHHHhcCCC-ccchhhcccccCcccccce-EEEEEEcCceeEeeeccCCCCcccccCCceeEEEE Q lcl|NC_019400. 1 MAVVRTNDFQIVDLGATLEIVPRQ-YRLITNMDLFTAYHGVTTV-AQIERVDEVVTDFPARRRQGERNYVGTEKAQLKNF 78 (337) Q Consensus 1 ~d~~~~d~Fs~~~Lt~~i~~~p~~-~~~l~~l~~F~~~~~~t~~-v~ie~~~~~~~l~p~v~~g~~~~~~~~~~~~~~~f 78 (337) |-.. .|.|+..+|++.|+.+|+. .++|.+. +|+.+++.+.. +.++..++...++|++++++++.+.+++++++.+| T Consensus 1 M~~i-~d~f~~~~l~~~i~~~~~~~~~~l~~~-~Fp~~~~~~~~~~~~~~~~~~~~~a~~v~~~~~~~~~~r~~~~~~~~ 78 (348) T protein:vir:96 1 MGLI-YDKVTASNIAGYFNTLQENVDSTLGES-IFPARKQLGTKLSYIKGASGQSVALKAAAFDTNVTIRDRVSAEIHDE 78 (348) T ss_pred Ccch-hhccCHHHHHHHHHhcccchhhhhhhh-cCCCccccceeEEEEeecCCceeEeeeecCCCCcceecccceeeeee Confidence 5544 3589999999999999854 5677774 78877664443 44566677777899999999999999999999999 Q ss_pred eccccccCccccHHHHhcccc---CCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhc-CCEEecCCCceEeechh Q lcl|NC_019400. 79 NIPFFPLDRQITAADVQNFRK---YFTADAPKSVEDVVARVVRRIRISHEQLKEKAMLQAIM-GKSWAPQDPTAQYNYFT 154 (337) Q Consensus 79 ~~p~i~~~~~v~a~dlq~~R~---~G~~~~~~~~~~~v~~~l~~~~~~i~~t~E~m~a~AL~-g~i~~~~~g~v~~d~~~ 154 (337) +||+|++++.|++.|++.++. .|+....+++.+.+++++++|+++|++|+||||+|||+ |+|....+| . ++.. 
T Consensus 79 ~~p~i~~~~~i~~~d~~~l~~~~~~~~~~~~~~~~~~i~~d~~~l~~~i~~r~E~m~~qal~~Gki~~~~~~-~--~~~v 155 (348) T protein:vir:96 79 QMPFFKEALLVKENDRQQLNLVKDTGNEALINTIVAGIFNDDVTLINGARARLEAMRMQVLATGKIAFTSDG-V--NKDI 155 (348) T ss_pred ecCccccccccCHHHHHHHHhhhccCCchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCeeEeecCC-e--eEEE Confidence 999999999999999876654 44555678889999999999999999999999999997 777665333 2 3334 Q ss_pred hcCCCcc---eEEEecCCCCcchHHHHHHHHHHHHHHhhccccccccEEEEEChHHHHHHhcCHHHHHHHHhhccccccc Q lcl|NC_019400. 155 EWGVTQH---TANIDFTDVATDPTDIIEADARAYIIDNAGDNGNNYGIVVLASRKWFSALIAHPLVMNAYQYYSSTQEPL 231 (337) Q Consensus 155 ~fG~~~~---~~~~~l~~~~~d~~~~~~~~~~~~~~~~~~~~~~~~~v~~l~g~~~~~al~~h~~v~~~~~~~~~~~~~~ 231 (337) +||+... +.+.+|+++++|+.++|+++++. +++.+ .++-+++||+++|++|++|++|+++++++......+ T Consensus 156 dfg~~~~~~~t~~~~W~~~~adp~~di~~~~~~-~~~~G-----~~~~~~i~~~~~~~~l~~~~~v~~~~~~~~~~~~~~ 229 (348) T protein:vir:96 156 DYGVKADHKKQVSKSWAEPGATPLADLEDAIET-ARELG-----LNPERAIMNAKTFGLIRKAASTVKAIKPLAGDGSSV 229 (348) T ss_pred eccCCcccceeeccccCCCCCCHHHHHHHHHHH-HHhcC-----CcccEEEeCHHHHHHHhcCHHHHHHHhccCCccccc Confidence 6887532 34467888899999999877654 44432 233478999999999999999999998765443322 Q ss_pred ccccc-cce-eccceeEEEeccEEEec--C-ceeeecCCeeEEEEecchhhheEEeccc-cchh--h-------ccccCc Q lcl|NC_019400. 232 RRRLG-QGQ-ENANNRMFVHKNVTYIE--D-ISNYIPDGEAYILPQGIDDMFQIHYAPA-DDVR--E-------ANTPAQ 296 (337) Q Consensus 232 ~~~~~-~~~-~~~~~~~~~~~~~~~~~--~-~~~~i~~~~~~~~p~g~~~~f~~~~ap~-d~~~--~-------~n~~~~ 296 (337) ..... ... ..+++..+.| +..|.+ + .++++|+|.++|+|.|.. +.++|||. +..+ . ++..+. 
T Consensus 230 ~~~~~~~~~~~~~g~~i~~y-~~~y~d~~G~~~~~~p~~~v~l~~~~~~--G~~~yg~~~e~~~~~~~~~~~~~~~~~~~ 306 (348) T protein:vir:96 230 TKAELQNYVADNYGVEIVLE-NGTYRNEKGEVSKFFPDGHLTLIPNGPL--GNTVFGTTPEESDLFADNTVNADVEIVDS 306 (348) T ss_pred cHHHHHHHHhhhcCceEEEE-ccEEEecCCcEeccccCCeEEEEcCCCc--eeEEeccChhhhhhhhcccccccceecCC Confidence 22211 111 2234433333 334433 3 257999999999998764 46777763 2111 1 122233 Q ss_pred ceeeeEEEccCCCEEEEEEeecccccccCcceEEEEEEeeC Q lcl|NC_019400. 297 ELYLWYKSSAYLREEKVESETSFLTVNTRPELVVRSTGTFA 337 (337) Q Consensus 297 ~~y~k~~~~~~~~~~~l~~eS~PLpi~~rP~~l~~~t~~aa 337 (337) .+|.+.|.+.||+++++++||+|||++.+|++++.+|+-+| T Consensus 307 ~~~~~~~~~~dP~~~~~~~~s~plPv~~~~~~~~~a~Vl~~ 347 (348) T protein:vir:96 307 GIAVTTTKTTDPVNVQTKVSMVALPSFERLGDVYMLTVIPG 347 (348) T ss_pred eeEEEeeecCCCceEEEEEeeeeeccccCCCcEEEEEEecC Confidence 48999999999999999999999999999999999999999 No 7 >protein:vir:4902 Length: 348 # NCBI annotation: gp348 # Family: family:all:1083 # MgeID: mge:107 # MgeName: Sfi11 # Cross-refs: genbank:acc:NP_056680;genbank:gi:9635015;genbank:GeneID:1262657 Probab=100.00 E-value=1.1e-50 Score=294.54 Aligned_cols=323 Identities=14% Similarity=0.119 Sum_probs=242.6 Q ss_pred CCCCCCCccCHHHHHHHHHhcCC-CccchhhcccccCccc-ccceEEEEEEcCceeEeeeccCCCCcccccCCceeEEEE Q lcl|NC_019400. 1 MAVVRTNDFQIVDLGATLEIVPR-QYRLITNMDLFTAYHG-VTTVAQIERVDEVVTDFPARRRQGERNYVGTEKAQLKNF 78 (337) Q Consensus 1 ~d~~~~d~Fs~~~Lt~~i~~~p~-~~~~l~~l~~F~~~~~-~t~~v~ie~~~~~~~l~p~v~~g~~~~~~~~~~~~~~~f 78 (337) |-.. .|.|+..+|++.|+.+|. 
..++|.++ +|+.+.+ .++.+.++..++...++|++++++++.+..++++++.+| T Consensus 1 M~~l-~d~f~~~~l~~~v~~~~~~~~~~l~~~-~Fp~~~~~~~~~~~~~~~~~~~~~a~~v~~~~~~~~~~r~~~~~~~~ 78 (348) T protein:vir:49 1 MGLI-YDKVTASNIAGYFNALQENVDSTLGES-IFPARKQLGTKLSYITGASGQSVALKAAAFDTNVTVRDRVSAEMHDE 78 (348) T ss_pred Ccch-hhhcCHHHHHHHHHhccccchhhhHhh-cCCCccccCceeEEEEeecCceeeeeeecCCCCcceecccceeeeee Confidence 4433 367999999999999975 45678775 7887655 566788899999999999999999999999999999999 Q ss_pred eccccccCccccHHHHhccccC---CCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhc-CCEEecCCCceEeechh Q lcl|NC_019400. 79 NIPFFPLDRQITAADVQNFRKY---FTADAPKSVEDVVARVVRRIRISHEQLKEKAMLQAIM-GKSWAPQDPTAQYNYFT 154 (337) Q Consensus 79 ~~p~i~~~~~v~a~dlq~~R~~---G~~~~~~~~~~~v~~~l~~~~~~i~~t~E~m~a~AL~-g~i~~~~~g~v~~d~~~ 154 (337) +||||+++..|++.|+++++.. ++....+++...+.+++++|+++|++|.||||+|||+ |++....+| . ++.. T Consensus 79 ~~p~i~~~~~i~~~d~~~l~~~~~~~~~~~~~~~~~~i~~d~~~l~~~i~~r~E~m~~qal~~Gki~i~~~g-~--~~~v 155 (348) T protein:vir:49 79 QMPFFKEAMLVKENDRQQLNLVKDSGNAALVNTIVAGIFNDNLTLVNGARARLEAMRMQVLATGKIAFTSDG-V--NKDI 155 (348) T ss_pred ecCccccccccCHHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhCCeEEEecCC-c--eEEE Confidence 9999999999999998766554 4444456777889999999999999999999999997 887554333 2 2334 Q ss_pred hcCCCcc---eEEEecCCCCcchHHHHHHHHHHHHHHhhccccccccEEEEEChHHHHHHhcCHHHHHHHHhhccccccc Q lcl|NC_019400. 155 EWGVTQH---TANIDFTDVATDPTDIIEADARAYIIDNAGDNGNNYGIVVLASRKWFSALIAHPLVMNAYQYYSSTQEPL 231 (337) Q Consensus 155 ~fG~~~~---~~~~~l~~~~~d~~~~~~~~~~~~~~~~~~~~~~~~~v~~l~g~~~~~al~~h~~v~~~~~~~~~~~~~~ 231 (337) +||+... +.+.+|+++++|+.++|+++++. +++.+ . 
++-+++||+++|++|++|++|++++.++......+ T Consensus 156 dyg~~~~~~~t~~~~W~~~~adp~~di~~~~~~-~~~~G-~----~~~~ii~~~~~~~~l~~~~~v~~~~~~~~~~~~~i 229 (348) T protein:vir:49 156 DYGVKPDHKKQVSKSWAEPGATPLADLEDAIET-ARELG-L----NPERAVMNAKTFGLIRKAASTVKVIKPLAGDGSSV 229 (348) T ss_pred eecCCcccceeeeeccCCCCCCHHHHHHHHHHH-HHhcC-C----cccEEEeCHHHHHHHhcCHHHHHHhhccCcccccc Confidence 6787432 34567999999999999887654 44432 2 23368999999999999999999998765443333 Q ss_pred cc-cccccee-ccceeEEEeccEEEec--C-ceeeecCCeeEEEEecchhhheEEecc-ccchhhc--cc-------cCc Q lcl|NC_019400. 232 RR-RLGQGQE-NANNRMFVHKNVTYIE--D-ISNYIPDGEAYILPQGIDDMFQIHYAP-ADDVREA--NT-------PAQ 296 (337) Q Consensus 232 ~~-~~~~~~~-~~~~~~~~~~~~~~~~--~-~~~~i~~~~~~~~p~g~~~~f~~~~ap-~d~~~~~--n~-------~~~ 296 (337) .+ .+..+.. .++. .++.++.+|.+ + .++++|+|+++|+|.|..| .++||| ++..+.. +. .+- T Consensus 230 ~~~~~~~~~~~~~g~-~i~~y~~~y~d~dG~~~~~~p~~~v~l~~~~~~G--~~~yg~~~e~~~~~~~~~~~~~~~~~~~ 306 (348) T protein:vir:49 230 TKAELDNYIADNFGV-TVVLENGTYRNEKGEVSKFFPDGHLTLIPNGPLG--NTVFGTTPEESDLFADNTVNADVEIVDN 306 (348) T ss_pred cHHHHHHHHHhhcCc-eEEEEeeEEEecCCcEeeeecCCeEEEecCCCcc--eeEEecChhhhhhccccccccceeecCC Confidence 22 2222222 2333 33334444443 2 2589999999999987654 566665 3321111 11 122 Q ss_pred ceeeeEEEccCCCEEEEEEeecccccccCcceEEEEEEeeC Q lcl|NC_019400. 
297 ELYLWYKSSAYLREEKVESETSFLTVNTRPELVVRSTGTFA 337 (337) Q Consensus 297 ~~y~k~~~~~~~~~~~l~~eS~PLpi~~rP~~l~~~t~~aa 337 (337) .+|.+.|..+||.++++++||+|||++.+|++++.+|+.+| T Consensus 307 ~~~~~~~~~~dP~~~~~~~~s~~lPv~~~~~~~~~a~Vl~~ 347 (348) T protein:vir:49 307 GIAVTTTKTTDPVNVQTKVSMVALPSFERLDDVYMLTVIPA 347 (348) T ss_pred eEEEeeeecCCCceEEEEEeeeccccccCCCcEEEEEEecC Confidence 38899999999999999999999999999999999999999 No 8 >protein:vir:2736 Length: 348 # NCBI annotation: putative structural protein # Family: family:all:1083 # MgeID: mge:58 # MgeName: O1205 # Cross-refs: genbank:acc:NP_695109;genbank:gi:23455878;genbank:GeneID:955608 Probab=100.00 E-value=6.9e-50 Score=290.12 Aligned_cols=323 Identities=13% Similarity=0.111 Sum_probs=238.8 Q ss_pred CCCCCCCccCHHHHHHHHHhcCCC-ccchhhcccccCcccccce-EEEEEEcCceeEeeeccCCCCcccccCCceeEEEE Q lcl|NC_019400. 1 MAVVRTNDFQIVDLGATLEIVPRQ-YRLITNMDLFTAYHGVTTV-AQIERVDEVVTDFPARRRQGERNYVGTEKAQLKNF 78 (337) Q Consensus 1 ~d~~~~d~Fs~~~Lt~~i~~~p~~-~~~l~~l~~F~~~~~~t~~-v~ie~~~~~~~l~p~v~~g~~~~~~~~~~~~~~~f 78 (337) |-.. .|.|+..+|++.|+++|+. .++|.+. +|+.+.+.+.. +.++..++...++|++++++++.+..++++++.+| T Consensus 1 M~~i-~d~f~~~~l~~~v~~~~~~~~~~l~~~-~Fp~~~~~~~~~~~~~~~~~~~~~a~~v~~~~~~~~~~r~~~~~~~~ 78 (348) T protein:vir:27 1 MGLI-YDKVTASNIAGYFNALQENVSSTLGES-IFPARKQLGTKLSYIKGASGQSVALKAAAFDTNVTIRDRVSAEMHDE 78 (348) T ss_pred Ccch-hhhcCHHHHHHHHHhccchhhhhhHhh-cCCCccccceeEEEEeeccCceeEeeeecCCCCcceecccceeeeee Confidence 5433 4789999999999999764 5678775 78877654444 44566677777899999999999999999999999 Q ss_pred eccccccCccccHHHHhcc---ccCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhc-CCEEecCCCceEeechh Q lcl|NC_019400. 79 NIPFFPLDRQITAADVQNF---RKYFTADAPKSVEDVVARVVRRIRISHEQLKEKAMLQAIM-GKSWAPQDPTAQYNYFT 154 (337) Q Consensus 79 ~~p~i~~~~~v~a~dlq~~---R~~G~~~~~~~~~~~v~~~l~~~~~~i~~t~E~m~a~AL~-g~i~~~~~g~v~~d~~~ 154 (337) +||+|+++..|+++|++++ +..++....+++...+.+++++|+++|++|+||||+|||+ |++....+| ..+ .. 
T Consensus 79 ~~p~i~~~~~i~~~d~~~~~~~~~~~~~~~~~~~~~~i~~d~~~l~~~i~~r~E~m~~~al~~Gki~i~~~~-~~~--~v 155 (348) T protein:vir:27 79 QMPFFKEAMLVKENDRQQLNLVKDSGNAVLVNTIVAGIFNDNLTLVNGARARLEAMRMQVLATGKIAFTSDG-VNK--DI 155 (348) T ss_pred ecCccccccccCHHHHHHHHHhhccCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCeeEEecCC-eeE--EE Confidence 9999999999999998765 4344444556788899999999999999999999999997 777554333 222 23 Q ss_pred hcCCCcc---eEEEecCCCCcchHHHHHHHHHHHHHHhhccccccccEEEEEChHHHHHHhcCHHHHHHHHhhccccccc Q lcl|NC_019400. 155 EWGVTQH---TANIDFTDVATDPTDIIEADARAYIIDNAGDNGNNYGIVVLASRKWFSALIAHPLVMNAYQYYSSTQEPL 231 (337) Q Consensus 155 ~fG~~~~---~~~~~l~~~~~d~~~~~~~~~~~~~~~~~~~~~~~~~v~~l~g~~~~~al~~h~~v~~~~~~~~~~~~~~ 231 (337) +||+... +.+..|+++++|+.++|+++++. +++.+ .+.-+++||+++|++|++|++|++++++.......+ T Consensus 156 dfg~~~~~~~t~~~~W~~~~adp~~di~~~~~~-~~~~G-----~~~~~ii~~~~~~~~l~~~~~v~~~~~~~~~~~~~i 229 (348) T protein:vir:27 156 DYGVKPDHKKQVSKSWAEPGATPLADLEDAIET-ARELG-----LNPERAVMNAKTFGLIRKAASTVKVIKPLAGDGSAV 229 (348) T ss_pred eecCCcccceeeeeccCCCCCCHHHHHHHHHHH-HHhcC-----CcccEEEECHHHHHHHhcCHHHHHHhcccCcccccc Confidence 5777432 33467999999999999887764 34432 233478999999999999999999998765433222 Q ss_pred ccccc-cce-eccceeEEEeccEEEecC---ceeeecCCeeEEEEecchhhheEEecc-ccchhhc---------cccCc Q lcl|NC_019400. 232 RRRLG-QGQ-ENANNRMFVHKNVTYIED---ISNYIPDGEAYILPQGIDDMFQIHYAP-ADDVREA---------NTPAQ 296 (337) Q Consensus 232 ~~~~~-~~~-~~~~~~~~~~~~~~~~~~---~~~~i~~~~~~~~p~g~~~~f~~~~ap-~d~~~~~---------n~~~~ 296 (337) .+... ... ..++. .++.++.+|.+. .++++|+|.++|+|.|..| .+.||+ ++..+.+ ...+. 
T Consensus 230 ~~~~~~~~~~~~~g~-~i~~yd~~y~d~~G~~~~~~p~~~vvl~~~~~~G--~~~yG~~~e~~~~~~~~~~~~~~~~~~~ 306 (348) T protein:vir:27 230 TKAELENYIADNFGV-SIVLENGTYRNDKGEVSKFYPDGHLTLIPNGPLG--NTVFGTTPEESDLFADNTVNAEVEIVDN 306 (348) T ss_pred CHHHHHHHHHhhcCc-eEEEEeeEEEcCCCcCcccccCCeEEEEcCCcce--eEEeccCcchhhhhhccccccceeeeCC Confidence 22211 111 12333 333444455432 2589999999999987654 455554 4322211 11223 Q ss_pred ceeeeEEEccCCCEEEEEEeecccccccCcceEEEEEEeeC Q lcl|NC_019400. 297 ELYLWYKSSAYLREEKVESETSFLTVNTRPELVVRSTGTFA 337 (337) Q Consensus 297 ~~y~k~~~~~~~~~~~l~~eS~PLpi~~rP~~l~~~t~~aa 337 (337) .+|.+.|.++||.++++++||+|||++.+|++++.+|+.+| T Consensus 307 ~~~~~~~~~~dP~~~~~~~~s~~lPv~~~~~~~~~a~Vl~~ 347 (348) T protein:vir:27 307 GIAVTTTKTTDPVNVQTKVSMVALPSFERLDDVYMLTVIPA 347 (348) T ss_pred eeEEEeeecCCCceEEEEEeeeeeccccCCCcEEEEEEecC Confidence 38999999999999999999999999999999999999999 No 9 >protein:vir:106590 Length: 349 # NCBI annotation: putative major head protein # Family: family:all:1083 # MgeID: mge:1598 # MgeName: Lj965 # Cross-refs: genbank:acc:NP_958585;genbank:gi:41179245;genbank:GeneID:2717126 Probab=100.00 E-value=3.5e-46 Score=269.79 Aligned_cols=316 Identities=12% Similarity=0.080 Sum_probs=225.4 Q ss_pred CCCCC-----CCccCHHHHHHHHHhcCCCccchhhcccccCcccccce-EEEEEEcCceeEeeeccCCCCcccccCCcee Q lcl|NC_019400. 1 MAVVR-----TNDFQIVDLGATLEIVPRQYRLITNMDLFTAYHGVTTV-AQIERVDEVVTDFPARRRQGERNYVGTEKAQ 74 (337) Q Consensus 1 ~d~~~-----~d~Fs~~~Lt~~i~~~p~~~~~l~~l~~F~~~~~~t~~-v~ie~~~~~~~l~p~v~~g~~~~~~~~~~~~ 74 (337) ||+=+ .|.|+...|++.++.+|+ +++|.++ +|+.+.+.... ..++..++...++|++++++++.+.++++ . 
T Consensus 8 ~~~~~~~~~~~d~~~~~~l~~~~~~~~~-~~~l~~~-~Fp~~~~~~~~~~~~~~~~~~~~~a~~v~~~~~~~~~~r~~-~ 84 (349) T protein:vir:10 8 LDLQRFATPILDMFSQNTVLDYTRNRQY-PEMLGDT-LFPAVKVPTLEVDILKAGSRVPTIASVSAFDAEAEIGTREA-S 84 (349) T ss_pred HHHHHHHHHhhcccCHHHHHHHHHhcCc-chhhHhh-cCCccccccceeEEEeeccCcceeeeeecCCCCcceecccc-e Confidence 22111 357888999999999987 5788886 67766654444 34455677788899999999998777765 5 Q ss_pred EEEEeccccccCccccHHHHhccccCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhc-CCEEecCCCceEeech Q lcl|NC_019400. 75 LKNFNIPFFPLDRQITAADVQNFRKYFTADAPKSVEDVVARVVRRIRISHEQLKEKAMLQAIM-GKSWAPQDPTAQYNYF 153 (337) Q Consensus 75 ~~~f~~p~i~~~~~v~a~dlq~~R~~G~~~~~~~~~~~v~~~l~~~~~~i~~t~E~m~a~AL~-g~i~~~~~g~v~~d~~ 153 (337) ...+++|+++++..+++.|++.+|.+++.++.+++.+.+.+++.+|+++|++|+||||+|+|+ |++... ++.+.+| T Consensus 85 ~~~~~~p~ik~~~~i~e~dl~~~~~~~~~~~~~~~~~~i~~d~~~l~~~i~~r~E~m~~q~l~~Gki~~~-~~g~~vD-- 161 (349) T protein:vir:10 85 KMTAELAYVKRKMQITEEMLIKLQSPRNTAEENYLKQYVFDDIDAMVQAVKARGEKMTMEMFATGKITDK-KNGIAID-- 161 (349) T ss_pred eEEeeccccccccccCHHHHHHHhhccCcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhCCeeEEc-CCcEEEe-- Confidence 668999999999999999999999999888888999999999999999999999999999998 777765 4445555 Q ss_pred hhcCCCcc-eE----EEecCCCCcchHHHHHHHHHHHHHHhhccccccccEEEEEChHHHHHHhcCHHHHHHHHhhcccc Q lcl|NC_019400. 154 TEWGVTQH-TA----NIDFTDVATDPTDIIEADARAYIIDNAGDNGNNYGIVVLASRKWFSALIAHPLVMNAYQYYSSTQ 228 (337) Q Consensus 154 ~~fG~~~~-~~----~~~l~~~~~d~~~~~~~~~~~~~~~~~~~~~~~~~v~~l~g~~~~~al~~h~~v~~~~~~~~~~~ 228 (337) ||+.+. .+ +-.|+++++|+.++++++++ ..+. ++-+++||+++|++|++|++|++++.+...+. 
T Consensus 162 --~g~~~~~~~~lt~~~~Ws~~~adpi~Di~~~~~----~~g~-----~p~~~vm~~~~~~~l~~~~~i~~~~~~~~~~~ 230 (349) T protein:vir:10 162 --YGVPKKHQETLSGTKTWDKSDASIIDNLQDWSD----SLDV-----TPTRALTSKKVLRILMRSTEIKEAIFGKDTGR 230 (349) T ss_pred --cccCccceeEecCcccCCCCCCCHHHHHHHHHH----HhCC-----CccEEEeCHHHHHHHhcCHHHHHHhccccccc Confidence 676432 22 23577788999988876543 2221 23368899999999999999999997654332 Q ss_pred cccccccccceeccceeEEEeccEEEec--C-----ceeeecCCeeEEEEecchhhheEEeccc-cchhhc-ccc----C Q lcl|NC_019400. 229 EPLRRRLGQGQENANNRMFVHKNVTYIE--D-----ISNYIPDGEAYILPQGIDDMFQIHYAPA-DDVREA-NTP----A 295 (337) Q Consensus 229 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~--~-----~~~~i~~~~~~~~p~g~~~~f~~~~ap~-d~~~~~-n~~----~ 295 (337) ..............+...++.++-+|.+ + .++++|+|.+.|+|.+..| .++||+. +..+.. +.. . T Consensus 231 ~~~~~~~~~~l~~~~~~~i~~yd~~y~d~~~~~~~t~~~~~p~~~v~l~~~~~~G--~~~yG~~~e~~~~~~g~~~~~~~ 308 (349) T protein:vir:10 231 VVGQADLDQWMTAQGLPIIRAYDGKYRDEDSRGNLTTNSYFPEDRIVLFNDEVPG--QKIYGPTPEENRLISSNAQVSNV 308 (349) T ss_pred ccCHHHHHHHHHhcCCceEEEEeeEEEeecCCCceeecccccCCeEEEecCCCce--eEEeeccchhhhhcccccceeec Confidence 2111111111111222223333333432 2 2469999999999976544 6666653 322211 111 1 Q ss_pred cceeeeE-EEccCCCEEEEEEeecccccccCcceEEEEEEe Q lcl|NC_019400. 296 QELYLWY-KSSAYLREEKVESETSFLTVNTRPELVVRSTGT 335 (337) Q Consensus 296 ~~~y~k~-~~~~~~~~~~l~~eS~PLpi~~rP~~l~~~t~~ 335 (337) .+++++. 
+.++||.++++++||+|||++.+|++++.+|+- T Consensus 309 ~~~~~~~~~~~~dP~~~~~~~~s~~lPv~~~~~~~~~a~Vl 349 (349) T protein:vir:10 309 GNIMAKIYETSEDPIGTWILASATMLPSFASADDVFQAKVL 349 (349) T ss_pred cceEEEeeeecCCCceEEEEEeeeeeeeecCCCcEEEEEeC Confidence 2345555 568899999999999999999999999999999 No 10 >protein:vir:98480 Length: 348 # NCBI annotation: ORFp38 # Family: family:all:1083 # MgeID: mge:1589 # MgeName: VWB # Cross-refs: genbank:acc:NP_958280;genbank:gi:41057254;uniprot:Q38595;genbank:GeneID:2732864 Probab=100.00 E-value=6.9e-45 Score=262.67 Aligned_cols=319 Identities=11% Similarity=0.025 Sum_probs=231.5 Q ss_pred CCC-CCCCccCHHHHHHHHHhcC---CCccchhhcccccCcccccceEEEEEEc---CceeEeeeccCCCCcccccCCce Q lcl|NC_019400. 1 MAV-VRTNDFQIVDLGATLEIVP---RQYRLITNMDLFTAYHGVTTVAQIERVD---EVVTDFPARRRQGERNYVGTEKA 73 (337) Q Consensus 1 ~d~-~~~d~Fs~~~Lt~~i~~~p---~~~~~l~~l~~F~~~~~~t~~v~ie~~~---~~~~l~p~v~~g~~~~~~~~~~~ 73 (337) |.. .+.|.|+..+|++.|+..| +.+++|.+. +|+.+. +..+.++..+ +....++++++++++.+..+++. T Consensus 1 M~~~~~~d~~~~~~l~~~i~~~~~~~~~~~~l~~~-~fp~~~--~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~r~g~ 77 (348) T protein:vir:98 1 MSWTLDTEFIEPTQLTGLIREALRDLQVNRFRLAR-WLPNVD--VDDITFEFLRGGGGLAETASYRSWDTESKIGRREGL 77 (348) T ss_pred CcchhhhhccCHHHHHHHHHHHhhccCcchhhHHh-cCCCcc--ccceEEEEEeccCCceeeeeeecCCCccceeecccc Confidence 433 4668999999999999886 456788885 677654 4455666543 44456799999999999999999 Q ss_pred eEEEEeccccccCccccHHHHhccccCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhc-CCEEecCCCceEeec Q lcl|NC_019400. 74 QLKNFNIPFFPLDRQITAADVQNFRKYFTADAPKSVEDVVARVVRRIRISHEQLKEKAMLQAIM-GKSWAPQDPTAQYNY 152 (337) Q Consensus 74 ~~~~f~~p~i~~~~~v~a~dlq~~R~~G~~~~~~~~~~~v~~~l~~~~~~i~~t~E~m~a~AL~-g~i~~~~~g~v~~d~ 152 (337) +...++||+++++..++++|++.+| ...++++.+.+.+.+.+|++++++|.||||+|||+ |++... 
++...+ T Consensus 78 ~~~~~~~~~i~~~~~i~~~d~~~~~----~~~~~~~~~~i~~d~~~l~~~i~~r~E~m~~qal~~Gki~~~-g~~~~v-- 150 (348) T protein:vir:98 78 AKVMGELPPISEKIPLNEYDRLRLR----KLSRDEALPFIARDAQRLARNIGARFEVARGSALVNATVPVT-ELQQTV-- 150 (348) T ss_pred eeeeeeccccccccccCHHHHHHhc----CChHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhCCeEEEe-cCceEE-- Confidence 9999999999999999999998765 34667888999999999999999999999999998 777554 333334 Q ss_pred hhhcCCCcc---eEEEecC-CCCcchHHHHHHHHHHHHHHhhccccccccEEEEEChHHHHHHhcCHHHHHHHHhhcccc Q lcl|NC_019400. 153 FTEWGVTQH---TANIDFT-DVATDPTDIIEADARAYIIDNAGDNGNNYGIVVLASRKWFSALIAHPLVMNAYQYYSSTQ 228 (337) Q Consensus 153 ~~~fG~~~~---~~~~~l~-~~~~d~~~~~~~~~~~~~~~~~~~~~~~~~v~~l~g~~~~~al~~h~~v~~~~~~~~~~~ 228 (337) +||+... +.+..|+ .+++|+.++++++++.+..+. + .+.-++++|+++|++|++|++|++.++++.... T Consensus 151 --Dyg~~~~~~~t~~~~Ws~~~~adp~~di~~~~~~~~~~~-G----~~p~~~vm~~~~~~~l~~~~~i~~~~~~~~~~~ 223 (348) T protein:vir:98 151 --DFGRIGSHSVVAAVLWSVHATATPISDLESWVATYEDTN-G----QSPGVILMPKAAVSHMRQCEEVIRQVFPLAPSG 223 (348) T ss_pred --ccccCcccccccccccCCCCCCCHHHHHHHHHHHHHHcc-C----CcceEEEeCHHHHHHHhcCHHHHHHHhccCccc Confidence 4677432 2345675 467899999988777643332 2 233478999999999999999999998654321 Q ss_pred --ccccccc-ccceeccceeEEEeccEEEecCc--eeeecCCeeEEEEecc-------hhhheEEeccccchhhcc--c- Q lcl|NC_019400. 229 --EPLRRRL-GQGQENANNRMFVHKNVTYIEDI--SNYIPDGEAYILPQGI-------DDMFQIHYAPADDVREAN--T- 293 (337) Q Consensus 229 --~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~--~~~i~~~~~~~~p~g~-------~~~f~~~~ap~d~~~~~n--~- 293 (337) ..+.... .......|...++.++..+.+++ ++++|+|.+.|+|.+. ...+.++||+.......+ . 
T Consensus 224 ~~~~~~~~~~~~~~~~~g~~~i~~~d~~~~~~g~~~~~~p~~~i~l~p~~~~~~~~~~~~~G~t~~G~~~e~~~~~~~~~ 303 (348) T protein:vir:98 224 TAPMVSVEQLNTVLSSMGLPPIEVYDAKVAVDGVSTRITPANAIALLPEPGATDAAQPTELGATLLGTTAESLEDDYALA 303 (348) T ss_pred cccccCHHHHHHHHHhhCCeEEEEeeeEEEcCCceeceecCCeEEEEecCCcccccccccccceecccchhhhccccccc Confidence 1111111 11112223333444444444433 5799999999999653 234567777632111111 1 Q ss_pred --cCcceeeeEEEccCCCEEEEEEeecccccccCcceEEEEEEee Q lcl|NC_019400. 294 --PAQELYLWYKSSAYLREEKVESETSFLTVNTRPELVVRSTGTF 336 (337) Q Consensus 294 --~~~~~y~k~~~~~~~~~~~l~~eS~PLpi~~rP~~l~~~t~~a 336 (337) ....+|++.|.++||+++++++||+|||++.+|++++.+|+-| T Consensus 304 ~~~~~~i~~~~~~~~dP~~~~~~~~s~~lPv~~~~~~~~~a~Vl~ 348 (348) T protein:vir:98 304 PGEQPGIVAATWKTKDPVRLWTHAAAVGIPVLREPNLTFKAQVLA 348 (348) T ss_pred eeccCceeeeeeeecCCcEEEEEEeeeeeccccCCCcEEEEEEeC Confidence 1123799999999999999999999999999999999999999 No 11 >protein:vir:79503 Length: 409 # NCBI annotation: major head protein # Family: family:all:11999 # MgeID: mge:1870 # MgeName: P74-26 # Cross-refs: genbank:acc:YP_001468058;genbank:gi:157265500;genbank:GeneID:5600620 Probab=99.96 E-value=5.2e-31 Score=186.62 Aligned_cols=332 Identities=11% Similarity=0.016 Sum_probs=200.8 Q ss_pred CCC----------------CCCCccCHHHHHHHHHhcCCCccchhhcccccCccc-ccc-eEEEEEEcCceeEeeec--c Q lcl|NC_019400. 1 MAV----------------VRTNDFQIVDLGATLEIVPRQYRLITNMDLFTAYHG-VTT-VAQIERVDEVVTDFPAR--R 60 (337) Q Consensus 1 ~d~----------------~~~d~Fs~~~Lt~~i~~~p~~~~~l~~l~~F~~~~~-~t~-~v~ie~~~~~~~l~p~v--~ 60 (337) ++| -.-+.+++..++..+.++.+++++|.+. ||+...+ .|+ .+.++..+|..++..+. 
+ T Consensus 5 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ia~~~~~~p~~~~L~d~-~FP~~~~f~t~l~~~~~~~kg~kk~~~~~~~~ 83 (409) T protein:vir:79 5 ININNALARVRDPLSIGGLKFPTTKEIQEAVAAIADKFNQENDLVDR-FFPEDSTFASELELYLLRTQDAEQTGMTFVHQ 83 (409) T ss_pred cccchhhhhhcCcchhcceecCchHHHHHHHHHHHHhcCCccchhhc-cCCCCccccceEEEEeeeccCcccccceEeee Confidence 111 1224677777777766666666678775 6775543 443 33344456655554443 3 Q ss_pred CCCCcccccCCc----eeEEEEeccccccCccccHHHHhccccCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHh Q lcl|NC_019400. 61 RQGERNYVGTEK----AQLKNFNIPFFPLDRQITAADVQNFRKYFTADAPKSVEDVVARVVRRIRISHEQLKEKAMLQAI 136 (337) Q Consensus 61 ~g~~~~~~~~~~----~~~~~f~~p~i~~~~~v~a~dlq~~R~~G~~~~~~~~~~~v~~~l~~~~~~i~~t~E~m~a~AL 136 (337) .+..+..+.... .+..+++||+|+++..|++.|+..++..+.....+++.+.+.+++++|.++|++|+||||+|+| T Consensus 84 ~~d~~~pv~~r~~~~~~~~~t~epp~iK~k~~i~e~dl~~~~~~~n~~~~~~i~~~i~~D~~~L~~~I~~R~E~Ma~q~L 163 (409) T protein:vir:79 84 VGSTSLPVEARVAKVDLAKATWSPLAFKESRVWDEKEILYLGRLADEVQAGVINEQIAESLTWLMARMRNRRRWLTWQVM 163 (409) T ss_pred cCCccccccccceeeeeeeecccccccccccccCHHHHHHHhCCCChhHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH Confidence 344444443222 3677899999999999999999876655544556677788999999999999999999999999 Q ss_pred c-CCEEec-CCCc--eEeechhhcCCCcc-eE----EEecCCCCcchHHHHHHHHHHHHHHhhccccccccEEEEEChHH Q lcl|NC_019400. 137 M-GKSWAP-QDPT--AQYNYFTEWGVTQH-TA----NIDFTDVATDPTDIIEADARAYIIDNAGDNGNNYGIVVLASRKW 207 (337) Q Consensus 137 ~-g~i~~~-~~g~--v~~d~~~~fG~~~~-~~----~~~l~~~~~d~~~~~~~~~~~~~~~~~~~~~~~~~v~~l~g~~~ 207 (337) + |+|... .++. ...++-.+||+... .+ +-.|+++++|+.++++++++. +++.++... +.-.++++++. 
T Consensus 164 ~tGki~i~g~~~~~~~g~~~~vDyg~pa~hkvtlTgt~~W~~~~AdPi~DIe~w~~~-i~~~~g~~~--t~~~~imt~~~ 240 (409) T protein:vir:79 164 RTGRITIQPNDPYNPNGLKYVIDYGVTDIELPLPQKFDAKDGNGNSAVDPIQYFRDL-IKAATYFPD--RRPVAIIVGPG 240 (409) T ss_pred hCCeEEEEecCCCccccceEEEecCCCcccceeecccccCCCCCCChHHHHHHHHHH-HHHhcCCCC--CccEEEEcHHH Confidence 8 777542 2211 11223336788542 22 335888899999999877654 444333211 22246666777 Q ss_pred HHHHh-cCHHHHHHHHhhcccccccccc-----c-----ccce-eccceeEEEeccEEEec--Cc-eeeecCCeeEEEEe Q lcl|NC_019400. 208 FSALI-AHPLVMNAYQYYSSTQEPLRRR-----L-----GQGQ-ENANNRMFVHKNVTYIE--DI-SNYIPDGEAYILPQ 272 (337) Q Consensus 208 ~~al~-~h~~v~~~~~~~~~~~~~~~~~-----~-----~~~~-~~~~~~~~~~~~~~~~~--~~-~~~i~~~~~~~~p~ 272 (337) |++|+ .++.|++++++........... + ..++ ...|.. +..++.+|.+ +. ++++|+|+++|++. T Consensus 241 ~~~l~~~n~~ik~~l~~~~~~~~~~~~~~~~~~l~~~~~ln~~~~~~GL~-I~vYd~~Y~dedGt~k~~~Pd~~vvLl~a 319 (409) T protein:vir:79 241 FDEVLADNTFVQKYVEYEKGWVVGQNTVQPPREVYRQAALDIFKRYTGLE-VMVYDKTYRDQDGSVKYWIPVGELIVLNQ 319 (409) T ss_pred HHHHHhCcHHHHHhhhcccccccccccccchhhhcchhHhHhhhhhcCce-EEEEeeEEEecCCcccceecCCeEEEEcC Confidence 76654 6777888887543322111111 1 0111 122333 3333334443 33 57999999988853 Q ss_pred cchhhheEEeccc-c-ch--hhccccCcceeeeEEEccCCCEEEEEEeecccccccCcce--EEEEEEee---C Q lcl|NC_019400. 273 GIDDMFQIHYAPA-D-DV--REANTPAQELYLWYKSSAYLREEKVESETSFLTVNTRPEL--VVRSTGTF---A 337 (337) Q Consensus 273 g~~~~f~~~~ap~-d-~~--~~~n~~~~~~y~k~~~~~~~~~~~l~~eS~PLpi~~rP~~--l~~~t~~a---a 337 (337) ....++.++||+. + .. ..+...+--+-.+.|..+||..+.......-||+-..+|. 
..-..++- | T Consensus 320 p~g~LG~T~yGa~~~~~~~~~~v~~~g~~i~~~~~~~~dP~~~~~~~~~~~~p~l~~~~~~~~~~~~~~~~~~~ 393 (409) T protein:vir:79 320 STGPVGRFVYTAHVAGQRNGKVVYATGPYLTVKDHLQDDPPYYAIIAGFHGLPQLSGYNTEDFSFHRFKWLKYA 393 (409) T ss_pred CcccccceecccccccccchhhhccccceeEecccccCCcceeeeecceEEeeeeecCCccceeehhhhhhhhh Confidence 2222457778762 1 11 1122123224457789999999999999999999886663 33333322 2 No 12 >protein:vir:78006 Length: 409 # NCBI annotation: major head protein # Family: family:all:11999 # MgeID: mge:1843 # MgeName: P23-45 # Cross-refs: genbank:acc:YP_001467942;genbank:gi:157265383;genbank:GeneID:5600496 Probab=99.96 E-value=5.2e-31 Score=186.62 Aligned_cols=332 Identities=11% Similarity=0.016 Sum_probs=200.8 Q ss_pred CCC----------------CCCCccCHHHHHHHHHhcCCCccchhhcccccCccc-ccc-eEEEEEEcCceeEeeec--c Q lcl|NC_019400. 1 MAV----------------VRTNDFQIVDLGATLEIVPRQYRLITNMDLFTAYHG-VTT-VAQIERVDEVVTDFPAR--R 60 (337) Q Consensus 1 ~d~----------------~~~d~Fs~~~Lt~~i~~~p~~~~~l~~l~~F~~~~~-~t~-~v~ie~~~~~~~l~p~v--~ 60 (337) ++| -.-+.+++..++..+.++.+++++|.+. ||+...+ .|+ .+.++..+|..++..+. + T Consensus 5 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ia~~~~~~p~~~~L~d~-~FP~~~~f~t~l~~~~~~~kg~kk~~~~~~~~ 83 (409) T protein:vir:78 5 ININNALARVRDPLSIGGLKFPTTKEIQEAVAAIADKFNQENDLVDR-FFPEDSTFASELELYLLRTQDAEQTGMTFVHQ 83 (409) T ss_pred cccchhhhhhcCcchhcceecCchHHHHHHHHHHHHhcCCccchhhc-cCCCCccccceEEEEeeeccCcccccceEeee Confidence 111 1224677777777766666666678775 6775543 443 33344456655554443 3 Q ss_pred CCCCcccccCCc----eeEEEEeccccccCccccHHHHhccccCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHh Q lcl|NC_019400. 61 RQGERNYVGTEK----AQLKNFNIPFFPLDRQITAADVQNFRKYFTADAPKSVEDVVARVVRRIRISHEQLKEKAMLQAI 136 (337) Q Consensus 61 ~g~~~~~~~~~~----~~~~~f~~p~i~~~~~v~a~dlq~~R~~G~~~~~~~~~~~v~~~l~~~~~~i~~t~E~m~a~AL 136 (337) .+..+..+.... 
.+..+++||+|+++..|++.|+..++..+.....+++.+.+.+++++|.++|++|+||||+|+| T Consensus 84 ~~d~~~pv~~r~~~~~~~~~t~epp~iK~k~~i~e~dl~~~~~~~n~~~~~~i~~~i~~D~~~L~~~I~~R~E~Ma~q~L 163 (409) T protein:vir:78 84 VGSTSLPVEARVAKVDLAKATWSPLAFKESRVWDEKEILYLGRLADEVQAGVINEQIAESLTWLMARMRNRRRWLTWQVM 163 (409) T ss_pred cCCccccccccceeeeeeeecccccccccccccCHHHHHHHhCCCChhHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH Confidence 344444443222 3677899999999999999999876655544556677788999999999999999999999999 Q ss_pred c-CCEEec-CCCc--eEeechhhcCCCcc-eE----EEecCCCCcchHHHHHHHHHHHHHHhhccccccccEEEEEChHH Q lcl|NC_019400. 137 M-GKSWAP-QDPT--AQYNYFTEWGVTQH-TA----NIDFTDVATDPTDIIEADARAYIIDNAGDNGNNYGIVVLASRKW 207 (337) Q Consensus 137 ~-g~i~~~-~~g~--v~~d~~~~fG~~~~-~~----~~~l~~~~~d~~~~~~~~~~~~~~~~~~~~~~~~~v~~l~g~~~ 207 (337) + |+|... .++. ...++-.+||+... .+ +-.|+++++|+.++++++++. +++.++... +.-.++++++. T Consensus 164 ~tGki~i~g~~~~~~~g~~~~vDyg~pa~hkvtlTgt~~W~~~~AdPi~DIe~w~~~-i~~~~g~~~--t~~~~imt~~~ 240 (409) T protein:vir:78 164 RTGRITIQPNDPYNPNGLKYVIDYGVTDIELPLPQKFDAKDGNGNSAVDPIQYFRDL-IKAATYFPD--RRPVAIIVGPG 240 (409) T ss_pred hCCeEEEEecCCCccccceEEEecCCCcccceeecccccCCCCCCChHHHHHHHHHH-HHHhcCCCC--CccEEEEcHHH Confidence 8 777542 2211 11223336788542 22 335888899999999877654 444333211 22246666777 Q ss_pred HHHHh-cCHHHHHHHHhhcccccccccc-----c-----ccce-eccceeEEEeccEEEec--Cc-eeeecCCeeEEEEe Q lcl|NC_019400. 208 FSALI-AHPLVMNAYQYYSSTQEPLRRR-----L-----GQGQ-ENANNRMFVHKNVTYIE--DI-SNYIPDGEAYILPQ 272 (337) Q Consensus 208 ~~al~-~h~~v~~~~~~~~~~~~~~~~~-----~-----~~~~-~~~~~~~~~~~~~~~~~--~~-~~~i~~~~~~~~p~ 272 (337) |++|+ .++.|++++++........... + ..++ ...|.. +..++.+|.+ +. ++++|+|+++|++. 
T Consensus 241 ~~~l~~~n~~ik~~l~~~~~~~~~~~~~~~~~~l~~~~~ln~~~~~~GL~-I~vYd~~Y~dedGt~k~~~Pd~~vvLl~a 319 (409) T protein:vir:78 241 FDEVLADNTFVQKYVEYEKGWVVGQNTVQPPREVYRQAALDIFKRYTGLE-VMVYDKTYRDQDGSVKYWIPVGELIVLNQ 319 (409) T ss_pred HHHHHhCcHHHHHhhhcccccccccccccchhhhcchhHhHhhhhhcCce-EEEEeeEEEecCCcccceecCCeEEEEcC Confidence 76654 6777888887543322111111 1 0111 122333 3333334443 33 57999999988853 Q ss_pred cchhhheEEeccc-c-ch--hhccccCcceeeeEEEccCCCEEEEEEeecccccccCcce--EEEEEEee---C Q lcl|NC_019400. 273 GIDDMFQIHYAPA-D-DV--REANTPAQELYLWYKSSAYLREEKVESETSFLTVNTRPEL--VVRSTGTF---A 337 (337) Q Consensus 273 g~~~~f~~~~ap~-d-~~--~~~n~~~~~~y~k~~~~~~~~~~~l~~eS~PLpi~~rP~~--l~~~t~~a---a 337 (337) ....++.++||+. + .. ..+...+--+-.+.|..+||..+.......-||+-..+|. ..-..++- | T Consensus 320 p~g~LG~T~yGa~~~~~~~~~~v~~~g~~i~~~~~~~~dP~~~~~~~~~~~~p~l~~~~~~~~~~~~~~~~~~~ 393 (409) T protein:vir:78 320 STGPVGRFVYTAHVAGQRNGKVVYATGPYLTVKDHLQDDPPYYAIIAGFHGLPQLSGYNTEDFSFHRFKWLKYA 393 (409) T ss_pred CcccccceecccccccccchhhhccccceeEecccccCCcceeeeecceEEeeeeecCCccceeehhhhhhhhh Confidence 2222457778762 1 11 1122123224457789999999999999999999886663 33333322 2 No 13 >protein:vir:79078 Length: 307 # NCBI annotation: gp8 # Family: family:all:908 # MgeID: mge:1862 # MgeName: phiE255 # Cross-refs: genbank:acc:YP_001111208;genbank:gi:134288798;genbank:GeneID:4960752 Probab=99.25 E-value=1.4e-12 Score=85.59 Aligned_cols=293 Identities=13% Similarity=0.063 Sum_probs=141.5 Q ss_pred CCCCCCCccCHH-HHHHHHHhcCCCccchhhcccccCcccccceEEE-EEEcCceeEeeecc--CCCCcccccCCceeEE Q lcl|NC_019400. 1 MAVVRTNDFQIV-DLGATLEIVPRQYRLITNMDLFTAYHGVTTVAQI-ERVDEVVTDFPARR--RQGERNYVGTEKAQLK 76 (337) Q Consensus 1 ~d~~~~d~Fs~~-~Lt~~i~~~p~~~~~l~~l~~F~~~~~~t~~v~i-e~~~~~~~l~p~v~--~g~~~~~~~~~~~~~~ 76 (337) |-=++ ..|-+. .||..--... .+.++++. +|+..++......+ .+.+... .+|-.. |++..+.+...+.+.. 
T Consensus 1 m~~~~-~~~~~dp~LT~~A~gy~-n~~~Iad~-lfP~vpV~~~~~k~~~f~~e~f-~~~~t~ra~~~~~~~v~~~~~~~~ 76 (307) T protein:vir:79 1 MGRLS-KLRIVDPVLTNLAIGYT-NAEFIGQT-LMPVVEVEKEGGKIPKFGKESF-RLYQTERALRAKSNRMNPEDIDSV 76 (307) T ss_pred CCCCC-CCcccCHHHHHHHhhcc-chhhhhhh-cCCcccccccccceeeeccccc-cccccccccCCCcceeeeeccccc Confidence 33233 356555 4555544444 46689984 89988776554443 2223333 234443 3444445554444444 Q ss_pred EEeccccccCccccHHHHhccccCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCCEEecCCCceEeechhhc Q lcl|NC_019400. 77 NFNIPFFPLDRQITAADVQNFRKYFTADAPKSVEDVVARVVRRIRISHEQLKEKAMLQAIMGKSWAPQDPTAQYNYFTEW 156 (337) Q Consensus 77 ~f~~p~i~~~~~v~a~dlq~~R~~G~~~~~~~~~~~v~~~l~~~~~~i~~t~E~m~a~AL~g~i~~~~~g~v~~d~~~~f 156 (337) .+.+.-......|. .|.-|...- ...++.+..+.+.|.+++||||++.++.....+.+..+.+ - T Consensus 77 ~~~~~~~~l~~~id------~r~~~~~~~-----~~~~~Av~~l~d~I~l~~E~~~A~l~~~~~~y~~~~k~tL-----s 140 (307) T protein:vir:79 77 DVNLDEHDLEYPID------YREDQESAF-----PLEQAAVQTATDAIQLRREKMIADLSQNPSSYAAGNKKQL-----S 140 (307) T ss_pred cccccccchhhccc------chhcCCCCC-----CHHHHHHHHHHHHHHhHHHHHHHHHhccccccCCCceEEE-----c Confidence 44433222222221 243332111 1223446667889999999999999996544332222211 1 Q ss_pred CCCcceEEEecCCCCcchHHHHHHHHHHHHHHhhccccccccEEEEEChHHHHHHhcCHHHHHHHHhhcccccccccccc Q lcl|NC_019400. 157 GVTQHTANIDFTDVATDPTDIIEADARAYIIDNAGDNGNNYGIVVLASRKWFSALIAHPLVMNAYQYYSSTQEPLRRRLG 236 (337) Q Consensus 157 G~~~~~~~~~l~~~~~d~~~~~~~~~~~~~~~~~~~~~~~~~v~~l~g~~~~~al~~h~~v~~~~~~~~~~~~~~~~~~~ 236 (337) |. =.|+++++||..+++++...+... .+ ..+=++++|+++|++|+.||+|.+.+++++.+ .+.+... 
T Consensus 141 gt------~~Wsd~~sDPi~di~~~~~ai~~~-~g----~~Pn~~vlg~~a~~~l~~h~~i~~~lk~~~~g--~it~~~l 207 (307) T protein:vir:79 141 AT------EKFTAANSDPVGVIEDGKEAIRTK-IG----RRPNTMVIGASAYKTLKAHPQLIEKIKYSMKG--IVTVDLL 207 (307) T ss_pred cC------cccCCCCCCcHHHHHHHHHHHHHh-hC----CccceEEeCHHHHHHHhcCHHHHHHhcCcccc--ccCHHHH Confidence 21 148889999999998876654433 32 22336888999999999999999999876532 3333332 Q ss_pred cceeccceeEEEeccEEEecCc---eeeecCCeeEEE-Ee----cchhhheEEeccccchhhccccCcceeeeEEEccCC Q lcl|NC_019400. 237 QGQENANNRMFVHKNVTYIEDI---SNYIPDGEAYIL-PQ----GIDDMFQIHYAPADDVREANTPAQELYLWYKSSAYL 308 (337) Q Consensus 237 ~~~~~~~~~~~~~~~~~~~~~~---~~~i~~~~~~~~-p~----g~~~~f~~~~ap~d~~~~~n~~~~~~y~k~~~~~~~ 308 (337) .. ..++..+..+...|.+.. +++.+.+-+.++ |. +.+.++.--||---. ..+.++--+. .+.+ T Consensus 208 a~--l~~v~~V~vg~a~y~~~~~~~~~iw~~~~~l~y~~~~~~~~~~~~~~ps~Gyt~~-----~~g~~~~d~~--~~~~ 278 (307) T protein:vir:79 208 KE--IFEVENIAVGEAIYADDKDRFTDIWGANIVLAYVPLQRGGQQRTPYEPSYGYTLR-----KKGNPVVDTR--IEDG 278 (307) T ss_pred HH--HhCceeEEEeeeeeecccccchhcCCCceEEEecccccCCCCCcccccccceeEE-----ecCceEEecc--cCCC Confidence 22 223344444444443322 344544433332 11 111111111111000 1122211111 1233 Q ss_pred CEEEEE--EeecccccccCcceEEEEEEe Q lcl|NC_019400. 309 REEKVE--SETSFLTVNTRPELVVRSTGT 335 (337) Q Consensus 309 ~~~~l~--~eS~PLpi~~rP~~l~~~t~~ 335 (337) +++.+. -...|+-++..-..|++.-+- T Consensus 279 ~~~~vrv~~~~~~~i~~~~~G~li~~~v~ 307 (307) T protein:vir:79 279 KLELVRATDIFRPYLLGADAGYLISGING 307 (307) T ss_pred ceeEEeecccccceeeccccchhhccCCC Confidence 333332 223333222222222222222 No 14 >protein:vir:107882 Length: 307 # NCBI annotation: gp34 # Family: family:all:908 # MgeID: mge:1565 # MgeName: BcepMu # Cross-refs: genbank:acc:YP_024707;genbank:gi:48696944;genbank:GeneID:2845970 Probab=99.19 E-value=5e-12 Score=82.55 Aligned_cols=294 Identities=12% Similarity=0.054 Sum_probs=138.3 Q ss_pred CCCCCCCccCHHHHHHHHHhcCCCccchhhcccccCcccccceEEE-EEEcCceeEeeecc--CCCCcccccCCceeEEE Q lcl|NC_019400. 
1 MAVVRTNDFQIVDLGATLEIVPRQYRLITNMDLFTAYHGVTTVAQI-ERVDEVVTDFPARR--RQGERNYVGTEKAQLKN 77 (337) Q Consensus 1 ~d~~~~d~Fs~~~Lt~~i~~~p~~~~~l~~l~~F~~~~~~t~~v~i-e~~~~~~~l~p~v~--~g~~~~~~~~~~~~~~~ 77 (337) |-=++ ..|-+.=.+..|..-=..+.++.+. +|+..++......+ ++.+.... +|-.. |++..+.+..+...... T Consensus 1 m~~~~-~~~~~dp~LT~~A~gy~n~~~ia~~-l~P~vpv~~~~~k~~~f~~eaF~-~~~t~r~~~~~~~~v~~~~~~~~~ 77 (307) T protein:vir:10 1 MGRLS-KLRIVDPVLTNLAIGYTNAEFIGQS-LMPVVEVEKEGGKIPKFGKESFR-LYKTERALRARSNRMNPEDLGSID 77 (307) T ss_pred CCCCC-CCcccChhHHHHHHhhcchhhhhhh-cCCcccccccccceeeECccccc-chhhhcccCCCcceeecccccccc Confidence 33233 3566553333333333345788884 89988876555443 22233332 33332 33333333333222222 Q ss_pred EeccccccCccccHHHHhccccCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCCEEecCCCceEeechhhcC Q lcl|NC_019400. 78 FNIPFFPLDRQITAADVQNFRKYFTADAPKSVEDVVARVVRRIRISHEQLKEKAMLQAIMGKSWAPQDPTAQYNYFTEWG 157 (337) Q Consensus 78 f~~p~i~~~~~v~a~dlq~~R~~G~~~~~~~~~~~v~~~l~~~~~~i~~t~E~m~a~AL~g~i~~~~~g~v~~d~~~~fG 157 (337) +.++-.....+|. .|.-|... -...++.+..+.++|.+++||||++.++.....+.+..+.+ -| T Consensus 78 ~~~~~~~L~~~id------~r~~~~~~-----~~~~~~av~~l~d~I~l~~E~~~A~l~~~~~~y~~~~k~tL-----sG 141 (307) T protein:vir:10 78 IVLDEHDLEYPID------YREDQESA-----FPLEQAAVQTATEAIQLRREKMVADLAQNPNSYAGGNKKQL-----SA 141 (307) T ss_pred cccccccccccCC------hhhcCCCC-----CCHHHHHHHHHHHHHHHHHHHHHHHHhcCccccCCCceEEe-----cc Confidence 2222222222211 24433211 12334556667789999999999999876443332222111 12 Q ss_pred CCcceEEEecCCCCcchHHHHHHHHHHHHHHhhccccccccEEEEEChHHHHHHhcCHHHHHHHHhhccccccccccccc Q lcl|NC_019400. 158 VTQHTANIDFTDVATDPTDIIEADARAYIIDNAGDNGNNYGIVVLASRKWFSALIAHPLVMNAYQYYSSTQEPLRRRLGQ 237 (337) Q Consensus 158 ~~~~~~~~~l~~~~~d~~~~~~~~~~~~~~~~~~~~~~~~~v~~l~g~~~~~al~~h~~v~~~~~~~~~~~~~~~~~~~~ 237 (337) +=.|+++++||..+++++..++..+.+. .+=.+++|.++|++|+.||+|.+.+++++.+ .+.+.... 
T Consensus 142 ------t~~Wsd~~sDPi~di~~~~~ai~~~~g~-----~Pn~~vlg~~a~~al~~hp~i~e~lk~~~~g--~it~~~la 208 (307) T protein:vir:10 142 ------TEKFTAAGSDPVGVIEDGKEAIRTKIGR-----RPNTMVIGASAYKTLKAHPQLIEKIKYSMKG--IVTVDLLK 208 (307) T ss_pred ------ccccCCCCCCcHHHHHHHHHHHHhhhCC-----ccceEEeCHHHHHHHhcCHHHHHHhCCcccc--ccCHHHHH Confidence 1258888999999998877665443222 2236888999999999999999999876532 33333322 Q ss_pred ceeccceeEEEeccEEEecCc---eeeecCCeeEEE-Ee----cchhhheEEeccccchhhccccCcceeeeEEEccCCC Q lcl|NC_019400. 238 GQENANNRMFVHKNVTYIEDI---SNYIPDGEAYIL-PQ----GIDDMFQIHYAPADDVREANTPAQELYLWYKSSAYLR 309 (337) Q Consensus 238 ~~~~~~~~~~~~~~~~~~~~~---~~~i~~~~~~~~-p~----g~~~~f~~~~ap~d~~~~~n~~~~~~y~k~~~~~~~~ 309 (337) . ..++..+.++...+.+.. +.+.+.+-+.++ |. +.++++.--||- ++-..+.++.-+. .+..+ T Consensus 209 ~--ll~v~~i~vg~a~~~~~~~~~~~iw~~~~vl~yv~~~~~~~~~~~~epsfGy-----T~~~~g~~~~d~~--~~~~~ 279 (307) T protein:vir:10 209 E--IFEVENIAVGEAIYADDKDRFTDIWGANIVLAYVPLQRGGQQRTPYEPSYGY-----TLRKKGNPVVDTR--IEDGK 279 (307) T ss_pred H--HhCceeEEEeeeeeeccCCccceeCCCceEEEecccccCCCCCcccccccce-----eEEEcCCeEeece--ecCCc Confidence 2 223444455544443322 233444332222 11 011111100110 0000111222111 12333 Q ss_pred EEEE--EEeecccccccCcceEEEEEEe Q lcl|NC_019400. 310 EEKV--ESETSFLTVNTRPELVVRSTGT 335 (337) Q Consensus 310 ~~~l--~~eS~PLpi~~rP~~l~~~t~~ 335 (337) ++.+ .-.-+|+-++..-..|++..+- T Consensus 280 ~~~~r~~~~~~~~i~~~~~G~li~~~~~ 307 (307) T protein:vir:10 280 LELVRSTDIFRPYLLGADAGYLISGING 307 (307) T ss_pred eeEEeccccccceeecccccceeccCCC Confidence 3333 2233444444444444444333 No 15 >protein:vir:99888 Length: 309 # NCBI annotation: capsid protein # Family: family:all:908 # MgeID: mge:1480 # MgeName: B3 # Cross-refs: genbank:acc:YP_164075;genbank:gi:56692607;genbank:GeneID:3192616 Probab=98.94 E-value=4e-10 Score=72.08 Aligned_cols=297 Identities=11% Similarity=-0.012 Sum_probs=157.2 Q ss_pred CCCCccCHHHHHHHHHhcCCCccchhhcccccCcccccceEEE-EEEcCceeEeee--ccCCCCcccccCCceeEEEEec Q lcl|NC_019400. 
4 VRTNDFQIVDLGATLEIVPRQYRLITNMDLFTAYHGVTTVAQI-ERVDEVVTDFPA--RRRQGERNYVGTEKAQLKNFNI 80 (337) Q Consensus 4 ~~~d~Fs~~~Lt~~i~~~p~~~~~l~~l~~F~~~~~~t~~v~i-e~~~~~~~l~p~--v~~g~~~~~~~~~~~~~~~f~~ 80 (337) +++-.|-+.-.+..+..-=..+.++++. +|+..++......+ ++.+...-.+|- +.|++..+.+... ...+.+.+ T Consensus 1 ~~~~~~~~dp~LT~~A~gy~n~~~Ia~~-l~P~vpV~~~~~~~~~f~~~e~F~~~~t~r~~~~~~~~v~~~-~~~~~~~~ 78 (309) T protein:vir:99 1 MSNAPFPIDPELTAIAIAYRNGRMISDE-VLPRVPVGKQEFKFWKYDLAQGFTVPETLVGRKSKPNEVEFS-ATDETGST 78 (309) T ss_pred CCCCCcCcCHhHHHHHhhccChhhhhhh-cCCccccCccccceeeechhhcccccchhhccCCCcceEeec-ccCceeee Confidence 5556787664333333333456689885 89988887665544 233322333433 3556555655554 34456666 Q ss_pred cccccCccccHHHHhccccCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCCEEecCCCceEeechhhcCCCc Q lcl|NC_019400. 81 PFFPLDRQITAADVQNFRKYFTADAPKSVEDVVARVVRRIRISHEQLKEKAMLQAIMGKSWAPQDPTAQYNYFTEWGVTQ 160 (337) Q Consensus 81 p~i~~~~~v~a~dlq~~R~~G~~~~~~~~~~~v~~~l~~~~~~i~~t~E~m~a~AL~g~i~~~~~g~v~~d~~~~fG~~~ 160 (337) --.....+|.-+|+++- .+.-++ .++.+..+.+.|.+++|+++++.++.--..+.+..+.+. |.+ T Consensus 79 ~~~~L~~~i~~~~~~~a--~~~~d~-------~~~Av~~l~~~i~l~rE~~~A~lv~~~a~y~~~~k~~Ls-----gt~- 143 (309) T protein:vir:99 79 EDHGLDAPVPQADIDNA--PTNYNP-------LGHATEQTTNLILLDREARTSKLVFSPNSYAAGNKTTLS-----GAD- 143 (309) T ss_pred cccceeecCCchhhhhc--cCCCCH-------HHHHHHHHHHHHHHHHHHHHHHHhcChhhcCCCceEEec-----Ccc- Confidence 66677777777777542 222232 333445678899999999999988754333322222111 221 Q ss_pred ceEEEecCCCCcchHHHHHHHHHHHHHHhhccccccccEEEEEChHHHHHHhcCHHHHHHHHhhccccccccccccccee Q lcl|NC_019400. 161 HTANIDFTDVATDPTDIIEADARAYIIDNAGDNGNNYGIVVLASRKWFSALIAHPLVMNAYQYYSSTQEPLRRRLGQGQE 240 (337) Q Consensus 161 ~~~~~~l~~~~~d~~~~~~~~~~~~~~~~~~~~~~~~~v~~l~g~~~~~al~~h~~v~~~~~~~~~~~~~~~~~~~~~~~ 240 (337) .|+++++||...+++.+... + +.+=.+++|.++|++|+.||+|.+.+++.......+........ 
T Consensus 144 -----~wsd~~SDPi~~i~~~~~~~-----g----~~PN~~vlg~~~~~~l~~hp~i~~~ik~~~~~~g~it~~~la~l- 208 (309) T protein:vir:99 144 -----QWSDPTSNPLPVITDALDSV-----I----LRPNIGVLGRRTATILRRHPKIVKAYNGSLGDEGMVPMAFLQEL- 208 (309) T ss_pred -----ccCCCCCCcHHHHHHHHHhh-----C----CCcceEEechHHHHHHhhCHHHHHHhcCCCccccccCHHHHHHH- Confidence 37788999999997765432 2 22336788999999999999999999876544333433332221 Q ss_pred ccceeEEEeccEE------EecCceeeecCCeeEEEEecch--hhheEEecc-ccchhhccccCcceeeeEEEccCCCEE Q lcl|NC_019400. 241 NANNRMFVHKNVT------YIEDISNYIPDGEAYILPQGID--DMFQIHYAP-ADDVREANTPAQELYLWYKSSAYLREE 311 (337) Q Consensus 241 ~~~~~~~~~~~~~------~~~~~~~~i~~~~~~~~p~g~~--~~f~~~~ap-~d~~~~~n~~~~~~y~k~~~~~~~~~~ 311 (337) .++..+..+... +.+..-..|=.+.+.|...+.. ..+..-||. ..+ .. -..|.. +...+.++-.+++ T Consensus 209 -~~ve~V~vg~a~~n~a~~g~~~~~~~iwg~~~~L~y~~~~~~~~~~ps~G~t~~~-~~-r~~g~~-~d~~~~~~g~~~v 284 (309) T protein:vir:99 209 -LELDAIYIGEARLNIARPGQNPNLIRAWGPHASFIYRDRLADTRNGTTFGLTAQW-GD-RVSGSI-ADPNIGLRGGQRV 284 (309) T ss_pred -hCcceEEeecceeeccccccccccccccCCcEEEEEcCCCCCCcccccccceeec-cc-ccCCce-eeeeeccCCceEE Confidence 222222222221 1122222343444445443321 111111221 110 00 011222 2222333444555 Q ss_pred EEEEeecccccccCcceEEEEEEee Q lcl|NC_019400. 312 KVESETSFLTVNTRPELVVRSTGTF 336 (337) Q Consensus 312 ~l~~eS~PLpi~~rP~~l~~~t~~a 336 (337) ..-..-+|+.++..-..+++..+++ T Consensus 285 r~~~~~k~~i~~~d~G~li~~~va~ 309 (309) T protein:vir:99 285 RVGESVKELVTAPDLGFFFENAVAA 309 (309) T ss_pred EEeccccchhcchhcchhhhhcccC Confidence 5544556666555555566655555 No 16 >protein:vir:108211 Length: 318 # NCBI annotation: gp9 # Family: family:all:6420 # MgeID: mge:2004 # MgeName: Giles # Cross-refs: genbank:acc:YP_001552338;genbank:gi:160700658;genbank:GeneID:5758931 Probab=97.45 E-value=3e-05 Score=45.41 Aligned_cols=293 Identities=11% Similarity=0.049 Sum_probs=131.8 Q ss_pred CCCCCCCccCHHHHHHHHHhcCCCccchhhcccccCcccc-cceEEEEEEcCce--eEeeeccCCCCcccccCCceeEEE Q lcl|NC_019400. 
1 MAVVRTNDFQIVDLGATLEIVPRQYRLITNMDLFTAYHGV-TTVAQIERVDEVV--TDFPARRRQGERNYVGTEKAQLKN 77 (337) Q Consensus 1 ~d~~~~d~Fs~~~Lt~~i~~~p~~~~~l~~l~~F~~~~~~-t~~v~ie~~~~~~--~l~p~v~~g~~~~~~~~~~~~~~~ 77 (337) =|+++++.|=.+.+.+.+ .+.+|.++ ||.....+ ...+.+....... .=+.-|.+|++-.+...+....+. T Consensus 19 ~~ll~~P~~I~~~i~e~~-----~~~~iad~-lf~~~~a~~~~~v~f~~~~p~~~~~d~e~VaEggEiP~~~~~~G~~~i 92 (318) T protein:vir:10 19 RELVGNPLWIPTALKKMM-----VNQFISES-LFRNGGANPNGVVAYNEGNPSFLEDDVADVAEFGEIPVSAGARGLPRT 92 (318) T ss_pred HHhhCCchhHHHHHHHHH-----hccchhhh-hhhcccccccceeEEEecccccccCcHhhccCcccccccCCCCCchhh Confidence 223333433223332222 36777775 66654332 3344443322111 111234445554433333222222 Q ss_pred EeccccccCccccHHHHhccccCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCCEEecCCCceEeechhhcC Q lcl|NC_019400. 78 FNIPFFPLDRQITAADVQNFRKYFTADAPKSVEDVVARVVRRIRISHEQLKEKAMLQAIMGKSWAPQDPTAQYNYFTEWG 157 (337) Q Consensus 78 f~~p~i~~~~~v~a~dlq~~R~~G~~~~~~~~~~~v~~~l~~~~~~i~~t~E~m~a~AL~g~i~~~~~g~v~~d~~~~fG 157 (337) -..-....+-.|+-+-+ .| +. -..+.+.+.++.+.+.+-.+.|++.||+-..+..--.+.- |.. ++ T Consensus 93 a~~~K~G~~~~vS~Em~--~~--n~-------~~~v~r~~~~l~Nti~r~~d~~a~dal~sa~t~~~~~s~~--w~~-~~ 158 (318) T protein:vir:10 93 AFAVKKALGVRVSKEMI--DE--NR-------VGAVNDQMLQLRNTFIRANDRSAKALLQSPIVPTLAVPTA--WDN-GG 158 (318) T ss_pred hhhehhccceeccHHHH--hh--cC-------hhHHHHHHHHHHHHHHHHHHHHHHHHHhccccccccCCcC--CCC-cc Confidence 22223334444442221 12 11 1357788888889999999999999997322110000000 100 00 Q ss_pred CCcceEEEecCCCCcchHHHHHHHHHHHHHHhhccccccccEEEEEChHHHHHHhcCHHHHHHHHhhccccccccccccc Q lcl|NC_019400. 158 VTQHTANIDFTDVATDPTDIIEADARAYIIDNAGDNGNNYGIVVLASRKWFSALIAHPLVMNAYQYYSSTQEPLRRRLGQ 237 (337) Q Consensus 158 ~~~~~~~~~l~~~~~d~~~~~~~~~~~~~~~~~~~~~~~~~v~~l~g~~~~~al~~h~~v~~~~~~~~~~~~~~~~~~~~ 237 (337) -.+ . +..++-..+.+......-+........-+ +..-.+++++..|..|.+|++++++|..............+. 
T Consensus 159 ~~~-~---d~~~A~e~v~~a~~~~~~a~~~~~~~~~G-Y~pdtIVlhP~~~~~l~~n~~~~~~y~~~a~~~~~~~~~tg~ 233 (318) T protein:vir:10 159 KVR-T---DIAIAIEQISTAAPTAYPAGVGSSDEYFG-FIPDTIVMHYALLPILMDNENFMKVYERNANYVSTAPDWTGN 233 (318) T ss_pred ccc-c---cchhhhhhhhhhhhhhhhhhhhhhhhccC-ccceeeEECHHHHHHHhcchhhhhhhhccchhhhhccccccc Confidence 000 0 00000000111000000000011111112 233457789999999999999999997543211100000000 Q ss_pred -ceeccceeEEEeccEEEecCceeeecCCeeEEEEecchhhheEEeccccchhhccccCcceeee--EEEccCCCEEEEE Q lcl|NC_019400. 238 -GQENANNRMFVHKNVTYIEDISNYIPDGEAYILPQGIDDMFQIHYAPADDVREANTPAQELYLW--YKSSAYLREEKVE 314 (337) Q Consensus 238 -~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~~~p~g~~~~f~~~~ap~d~~~~~n~~~~~~y~k--~~~~~~~~~~~l~ 314 (337) .....|..+ -..+++|.|+++++-.|..|. ++ | .......++|.- .--....+.+.+. T Consensus 234 ~~g~~lGl~v----------i~s~~~p~~~alvlq~g~vG~----~~--d---~~pl~~t~~~~egg~~~g~~~~s~~~~ 294 (318) T protein:vir:10 234 FPGSVMGLNV----------IRSRTFPIDRVLIMERGTVGF----YS--D---TRPLQFTALYPEGNGPNGGPTESYRAD 294 (318) T ss_pred ccceeeceEE----------eecCccCCCeeEEEecCCcce----ee--c---cccceeeecccCCCCCCCCcchhhhee Confidence 011122222 235678999999988776652 22 2 222333444431 0000112234455 Q ss_pred EeecccccccCcceEEEEEEeeC Q lcl|NC_019400. 315 SETSFLTVNTRPELVVRSTGTFA 337 (337) Q Consensus 315 ~eS~PLpi~~rP~~l~~~t~~aa 337 (337) .--..-+...+|.+++++|---. T Consensus 295 ~~~~~~~~V~~PkA~~~itgi~~ 317 (318) T protein:vir:10 295 ASHKRALAVDQPKAALWLTGIVT 317 (318) T ss_pred hheeeeeeeeCcceeEEEeeccC Confidence 55556778899999999998777 No 17 >protein:vir:98819 Length: 437 # NCBI annotation: hypothetical protein # Family: family:all:32561 # MgeID: mge:1530 # MgeName: Ma-LMM01 # Cross-refs: genbank:acc:YP_851100;genbank:gi:117530257;genbank:GeneID:4484483 Probab=93.98 E-value=0.0053 Score=33.04 Aligned_cols=322 Identities=14% Similarity=0.127 Sum_probs=163.7 Q ss_pred CC--------------CCCCCccC----HHHHHHH-HHhcCCCccchhhcccccCcccccceEEE-EEEcCceeEeeecc Q lcl|NC_019400. 
1 MA--------------VVRTNDFQ----IVDLGAT-LEIVPRQYRLITNMDLFTAYHGVTTVAQI-ERVDEVVTDFPARR 60 (337) Q Consensus 1 ~d--------------~~~~d~Fs----~~~Lt~~-i~~~p~~~~~l~~l~~F~~~~~~t~~v~i-e~~~~~~~l~p~v~ 60 (337) -| +.++-.|- ..+|... +.++|..| |+. +|+.+.+....|.- ..++|.-.+.|.|+ T Consensus 2 sdipspnlqalisspylvdnttfprepvytelarsilaklpatp--lsa--vfpdetiaeriviaehviegvntifpvve 77 (437) T protein:vir:98 2 SDIPSPNLQALISSPYLVDNTTFPREPVYTELARSILAKLPATP--LSA--VFPDETIAERIVIAEHVIEGVNTIFPVVE 77 (437) T ss_pred CCCCCcchHhhhcCceeeccccCCccchHHHHHHHHHHhcCCcc--ccc--cccchhhhhhhhhHHHHHhhhhhhhhhhc Confidence 00 11222232 3455544 45555543 333 58777664443322 34588888999999 Q ss_pred CCCCcccccCCceeE--EEEeccccccCccccHHHHhccccCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcC Q lcl|NC_019400. 61 RQGERNYVGTEKAQL--KNFNIPFFPLDRQITAADVQNFRKYFTADAPKSVEDVVARVVRRIRISHEQLKEKAMLQAIMG 138 (337) Q Consensus 61 ~g~~~~~~~~~~~~~--~~f~~p~i~~~~~v~a~dlq~~R~~G~~~~~~~~~~~v~~~l~~~~~~i~~t~E~m~a~AL~g 138 (337) .|+|---++.+++.+ .+++|-.++++...+-.+++|--.-|+.++-...++.+.+||.++.++|..||....+..+.| T Consensus 78 wgapdlfvdddgytvyrqsyqplpirqsmymsyaqlnntvregttnerataaeqiekkltrqmqkhqltwnvfqaammlg 157 (437) T protein:vir:98 78 WGAPDLFVDDDGYTVYRQSYQPLPIRQSMYMSYAQLNNTVREGTTNERATAAEQIEKKLTRQMQKHQLTWNVFQAAMMLG 157 (437) T ss_pred cCCcceeecCCCceeeecccCCccchhhhhhhhhhhhhhhhccccchhhhhHHHHHHHHHHHHHhhhhhHHHHHHHHHhc Confidence 999999999998865 569999999999999999987655677777777788899999999999999998887777778 Q ss_pred CEEec--CCCceE--------eechhhcCCCcc-----e-----EEEecCCC--------CcchHHHHHHHHH---HHHH Q lcl|NC_019400. 139 KSWAP--QDPTAQ--------YNYFTEWGVTQH-----T-----ANIDFTDV--------ATDPTDIIEADAR---AYII 187 (337) Q Consensus 139 ~i~~~--~~g~v~--------~d~~~~fG~~~~-----~-----~~~~l~~~--------~~d~~~~~~~~~~---~~~~ 187 (337) +|... ..|.-. -+|| .|..+|. + .-++|... -+|+.-.+...-| .|.. 
T Consensus 158 ginytdprsgvrvkapayiparnff-nfnttqgyrgrnearlfrnlidlnaggtpssgipitdpqfalsnftrrlnrwfk 236 (437) T protein:vir:98 158 GINYTDPRSGVRVKAPAYIPARNFF-NFNTTQGYRGRNEARLFRNLIDLNAGGTPSSGIPITDPQFALSNFTRRLNRWFK 236 (437) T ss_pred cccccCcccceeeeccccccccccc-ccccccccccchHHHHHHHHhhccCCCCCcCCcccccchhhHHHHHHHHHHHhh Confidence 88543 222100 1222 1222221 0 01222111 1232222222111 1111 Q ss_pred H-hhccccccccEEEEEChHHHHHHhcCHHHHHH-----------HHh-h--cccccccccccccceec----------c Q lcl|NC_019400. 188 D-NAGDNGNNYGIVVLASRKWFSALIAHPLVMNA-----------YQY-Y--SSTQEPLRRRLGQGQEN----------A 242 (337) Q Consensus 188 ~-~~~~~~~~~~v~~l~g~~~~~al~~h~~v~~~-----------~~~-~--~~~~~~~~~~~~~~~~~----------~ 242 (337) + ++.. .+ ...||++.-|.++-..+.+-+ |-. . +.+.....+-.-+|... + T Consensus 237 dtnksd---it--dmymgpemrdvilmseearlaqggiiprlgavfgdstidsngsggsfgplppgglgtgmglvlgtrg 311 (437) T protein:vir:98 237 DTNKSD---IT--DMYMGPEMRDVILMSEEARLAQGGIIPRLGAVFGDSTIDSNGSGGSFGPLPPGGLGTGMGLVLGTRG 311 (437) T ss_pred cccccc---ch--hhhcCccceeeeeeccchhhhhcccchhhhhhhccccccCCCCCcccCCCCccccccccceeeeccc Confidence 1 1111 11 122344433333322221110 000 0 00000000000000000 0 Q ss_pred ceeEEEeccEE-----Eec----CceeeecCCeeEEEEe-cc----hhhheEEecc-ccchhhccccCcceeeeEEEc-- Q lcl|NC_019400. 243 NNRMFVHKNVT-----YIE----DISNYIPDGEAYILPQ-GI----DDMFQIHYAP-ADDVREANTPAQELYLWYKSS-- 305 (337) Q Consensus 243 ~~~~~~~~~~~-----~~~----~~~~~i~~~~~~~~p~-g~----~~~f~~~~ap-~d~~~~~n~~~~~~y~k~~~~-- 305 (337) .+..+.--++. |+| ..++..|.+|++.+.- .+ ....++.|+- .+.++. .+ .|.....+ T Consensus 312 eilsiaginvhvvdtiykdpvdgvekrvwpknkivavsfrdsdgnveapgrtqycssensids---pg--lwtrtvtdvp 386 (437) T protein:vir:98 312 EILSIAGINVHVVDTIYKDPVDGVEKRVWPKNKIVAVSFRDSDGNVEAPGRTQYCSSENSIDS---PG--LWTRTVTDVP 386 (437) T ss_pred ceeEeecceeeeehhhhhcchhhhhhhcCCccceEEEEEecCCCcccCCccccccccccccCC---Cc--ceeeeeccCC Confidence 00010000111 111 1246788888877652 11 1223555553 333332 22 35555443 Q ss_pred -cCCCEEEEEEeecccccccCcceEEEEEEeeC Q lcl|NC_019400. 
306 -AYLREEKVESETSFLTVNTRPELVVRSTGTFA 337 (337) Q Consensus 306 -~~~~~~~l~~eS~PLpi~~rP~~l~~~t~~aa 337 (337) +--.|+-+.+-.+-||+-+-|--+|..|--.- T Consensus 387 ppaapgiavqmgnaglpyfkypyrvchvtpctv 419 (437) T protein:vir:98 387 PPAAPGIAVQMGNAGLPYFKYPYRVCHVTPCTV 419 (437) T ss_pred CCCCCcceEeecCCCCcccccceeeeeecccch Confidence 33358888888999999999988888776555 No 18 >protein:vir:94711 Length: 347 # NCBI annotation: capsid # Family: family:all:975 # MgeID: mge:1528 # MgeName: K1F # Cross-refs: genbank:acc:YP_338120;genbank:gi:77118198;genbank:GeneID:3707734 Probab=86.50 E-value=0.027 Score=29.23 Aligned_cols=302 Identities=12% Similarity=0.015 Sum_probs=106.5 Q ss_pred CCCCCCCccCHHHHHHHHHhcCCCccchhhcccccCccc-ccceEEEEEEcCceeEeeeccCCCCc--cc--ccCCceeE Q lcl|NC_019400. 1 MAVVRTNDFQIVDLGATLEIVPRQYRLITNMDLFTAYHG-VTTVAQIERVDEVVTDFPARRRQGER--NY--VGTEKAQL 75 (337) Q Consensus 1 ~d~~~~d~Fs~~~Lt~~i~~~p~~~~~l~~l~~F~~~~~-~t~~v~ie~~~~~~~l~p~v~~g~~~--~~--~~~~~~~~ 75 (337) .++|- -.|.-.-+++ +. ..+.+. ++...+.+ ..+++.|... |...+ ....||.+. .. ++..+..+ T Consensus 23 ~al~i-k~f~~eV~~~-f~----~~s~~~--~~~~~r~i~~G~sv~i~~i-G~~tv-~~~t~G~~l~~~~~~~~~~e~~i 92 (347) T protein:vir:94 23 LALFL-KVFAGEVLTA-FT----RRSVTA--DKHIVRTIQNGKSAQFPVM-GRTSG-VYLAPGERLSDKRKGIKHTEKVI 92 (347) T ss_pred HHHHH-HHHhHHHHHH-HH----HHHhhh--cccccccccccceEEEecc-cceee-eeecCCCCcCCCCCCCCcceEEE Confidence 33333 1344333332 21 112232 33444443 3666777665 22222 223333332 11 11111111 Q ss_pred EEEeccccccCccccHHHHhccccCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhc---CCEEecCCCceEeec Q lcl|NC_019400. 76 KNFNIPFFPLDRQITAADVQNFRKYFTADAPKSVEDVVARVVRRIRISHEQLKEKAMLQAIM---GKSWAPQDPTAQYNY 152 (337) Q Consensus 76 ~~f~~p~i~~~~~v~a~dlq~~R~~G~~~~~~~~~~~v~~~l~~~~~~i~~t~E~m~a~AL~---g~i~~~~~g~v~~d~ 152 (337) ..=+.-++. .-.=.-+++|.. -+++.+...++-..+.+...-+++..|. +..... .+. .. 
T Consensus 93 tID~~~~~~-~~VddiD~~q~~------------~D~~~~~~~~~g~aLa~~~D~~i~~~~~~~aa~~~~~-~~~--~~- 155 (347) T protein:vir:94 93 TIDGLLTAD-VMIFDIEDAMNH------------YDVAGEYSNQLGEALAIAADGAVLAEMAILCNLPAAS-NEN--IA- 155 (347) T ss_pred Eecchhhhh-HHhhhHHHHhcC------------cchHHHHHHHHHHHHHHHHHHHHHHHHHHHhcccccc-ccc--cC- Confidence 100111111 000011223321 1122222222333333333333332221 111100 110 00 Q ss_pred hhhcCCCcceEEEecCCCCcchH---HHHHHHHHHHHHHhhccccccccEEEEEChHHHHHHhcCHHHHHHHHhhccccc Q lcl|NC_019400. 153 FTEWGVTQHTANIDFTDVATDPT---DIIEADARAYIIDNAGDNGNNYGIVVLASRKWFSALIAHPLVMNAYQYYSSTQE 229 (337) Q Consensus 153 ~~~fG~~~~~~~~~l~~~~~d~~---~~~~~~~~~~~~~~~~~~~~~~~v~~l~g~~~~~al~~h~~v~~~~~~~~~~~~ 229 (337) -|| ....+.+.....+.++. ..+-+..+...+.+.-...+..+.+++++|++|..|++|+.+.... +..... T Consensus 156 --g~~-~~s~~~~~~~~~~~~~~~~~~~~~~~i~~a~~~Lde~~VP~~~R~~vv~P~~~~~Ll~~~~~~~~~--~~~~~~ 230 (347) T protein:vir:94 156 --GLG-TASVLEVGKKADLDTPAKLGEAIIGQLTIARAKLTSNYVPAGDRYFYTTPDNYSAILAALMPNAAN--YAALID 230 (347) T ss_pred --CCc-ccceeeccccccccchhhhHHHHHHHHHHHHHHHhhcCCCCCCcEEEeCHHHHHHHhccchhhhhh--cccccc Confidence 011 11122221111111111 1111222222333333444666778889999999999988875532 222221 Q ss_pred ccccccccceeccceeEEEeccEEEec------CceeeecCCeeEEEEecchhhhe----EEeccccchhhcccc-Ccce Q lcl|NC_019400. 230 PLRRRLGQGQENANNRMFVHKNVTYIE------DISNYIPDGEAYILPQGIDDMFQ----IHYAPADDVREANTP-AQEL 298 (337) Q Consensus 230 ~~~~~~~~~~~~~~~~~~~~~~~~~~~------~~~~~i~~~~~~~~p~g~~~~f~----~~~ap~d~~~~~n~~-~~~~ 298 (337) ... +......|+.+++..+..... +...-+.+++.+.+|.....-++ .-.+-.=+-+.+++. 
..++ T Consensus 231 ~~~---G~Vg~i~G~~V~~Sn~lp~~~~t~~~~~~~~~~~aG~~~~~~~~~~~~~~~~~~~~~~l~~h~~A~~~v~~~~~ 307 (347) T protein:vir:94 231 PET---GNIRNVMGFVVVEVPHLVQGGAGETRGDDGITIASGQKHAFPATASSDVKVTMDNVVGLFSHRSAVGTVKLRDL 307 (347) T ss_pred ccc---cceEEEeceEEEecCcccccccccccccCcceecCcccccccccchhhhcccccceeEEEeehhhhhhhhcccc Confidence 111 222334555555544432211 11122344444444321111000 000000000111110 0011 Q ss_pred eeeEEEccCCCEEEEEEeecccccccCcceEEEEEEeeC Q lcl|NC_019400. 299 YLWYKSSAYLREEKVESETSFLTVNTRPELVVRSTGTFA 337 (337) Q Consensus 299 y~k~~~~~~~~~~~l~~eS~PLpi~~rP~~l~~~t~~aa 337 (337) =...|.+++-.+..|.+--..=.-..||++++.+++++| T Consensus 308 ~~e~~r~~~~~~d~i~~~~~~G~~~~rP~~a~~~~~~~A 346 (347) T protein:vir:94 308 ALERDRDVDAQGDLIVGKYAMGHGGLRPEAAGALVFSPA 346 (347) T ss_pred cccchhchhhHHHHhhhhhhhcCcccccceeEEEEecCC Confidence 111122222233444444444456789999999999999 No 19 >protein:vir:6324 Length: 335 # NCBI annotation: capsid protein # Family: family:all:2806 # MgeID: mge:132 # MgeName: phiKMV # Cross-refs: genbank:acc:NP_877471;genbank:gi:33300843;uniprot:Q7Y2D3;genbank:GeneID:1482613 Probab=81.23 E-value=0.088 Score=26.38 Aligned_cols=294 Identities=12% Similarity=0.034 Sum_probs=114.7 Q ss_pred CCCCCCCccCHHHHHHHHHhcCCCccchhhcccccCccc-ccceEEEEEEcCceeEeeeccCCCCcc--cccCCceeEEE Q lcl|NC_019400. 1 MAVVRTNDFQIVDLGATLEIVPRQYRLITNMDLFTAYHG-VTTVAQIERVDEVVTDFPARRRQGERN--YVGTEKAQLKN 77 (337) Q Consensus 1 ~d~~~~d~Fs~~~Lt~~i~~~p~~~~~l~~l~~F~~~~~-~t~~v~ie~~~~~~~l~p~v~~g~~~~--~~~~~~~~~~~ 77 (337) .++|- -.|+-.-+++--+ ...+ ++++..+.+ .++++.|.+... .-+....||.+-. ....+ -.+.. 
T Consensus 19 ~al~l-e~f~geV~~af~~-----~s~~--~~~~~~rti~~g~s~~~~~iG~--~~~~~~~pG~~l~~~~~~~~-k~~it 87 (335) T protein:vir:63 19 VDIHL-EEHLGIVDKHFAY-----TSKF--APLMNIRDLRGSNVVRLDRLGN--VEAKGRRAGEELERSRVVND-KWNLT 87 (335) T ss_pred hheeh-hhhhhhHHHHHHh-----hhhh--ccccceeeeccceeEEEeeeee--eeeecccCCcCcCCCCcccc-ceEEE Confidence 44444 3455444444333 2222 234555544 366788877622 2244444444422 11111 11222 Q ss_pred EeccccccCccccHHHHhccccCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCCEEecCCCceEeechhhcC Q lcl|NC_019400. 78 FNIPFFPLDRQITAADVQNFRKYFTADAPKSVEDVVARVVRRIRISHEQLKEKAMLQAIMGKSWAPQDPTAQYNYFTEWG 157 (337) Q Consensus 78 f~~p~i~~~~~v~a~dlq~~R~~G~~~~~~~~~~~v~~~l~~~~~~i~~t~E~m~a~AL~g~i~~~~~g~v~~d~~~~fG 157 (337) +....+...-.=.-+|+|+ .+ +-...+...+...|+++.+.... .|.+.+--..+... +-+-+ .-| T Consensus 88 VD~ll~a~~~I~dlDe~~~--~y---DvRse~s~e~G~aLA~~~D~~~~------~~i~~aa~~~a~~~--~~~~~-~~G 153 (335) T protein:vir:63 88 VDTLLYLRHQFDHQDEWTQ--SF---DMRKEVAELDGQELARKFDQACL------IQVIKAAAMDAPVD--LEDAF-SPG 153 (335) T ss_pred ecceeechhhhhhHHHHhc--Cc---hhHHHHHHHHHHHHHHHHHHHHH------HHHHhhccccCccc--cCCCc-CCC Confidence 2222222222222233332 11 11112333444444433333222 22223221111000 00000 113 Q ss_pred CCcceEEEecCCCCcchHHHHHHHHHHHHHHhhccccc---cccEEEEEChHHHHHHhcCHHHHHH-HHhhccccccccc Q lcl|NC_019400. 158 VTQHTANIDFTDVATDPTDIIEADARAYIIDNAGDNGN---NYGIVVLASRKWFSALIAHPLVMNA-YQYYSSTQEPLRR 233 (337) Q Consensus 158 ~~~~~~~~~l~~~~~d~~~~~~~~~~~~~~~~~~~~~~---~~~v~~l~g~~~~~al~~h~~v~~~-~~~~~~~~~~~~~ 233 (337) +++. ..++.+++.++ ...+....+...+.+.-...+ ..+.+++++|++|.+|+.|+.+... 
|.+......+.+ T Consensus 154 ~~~~-~~~tg~~~~~~-~~~l~~a~~~a~~~L~e~dVP~~~~~dr~~vv~P~~y~~Ll~~~~l~n~~~~~s~~~~~~~~- 230 (335) T protein:vir:63 154 VLEK-LDLTGLTAKQA-ADKIVRMHRRVVETFIDRDLGDAVYSEGLTPMSPRVFSLLLEHDKLMNVEYQATGATNDYVK- 230 (335) T ss_pred ccee-eeeccCccccc-HHHHHHHHHHHHHHHHhccCCCcccCceEEEeChHHHHHHhccccccccccccccccccccC- Confidence 3322 12222222223 333333333222322222222 3457889999999999999765332 111111111111 Q ss_pred ccccceeccceeEEEeccEEEecCceeeecCCeeEEEEecc-----hhhh---eEEeccccchhhccc-cCcceeeeEEE Q lcl|NC_019400. 234 RLGQGQENANNRMFVHKNVTYIEDISNYIPDGEAYILPQGI-----DDMF---QIHYAPADDVREANT-PAQELYLWYKS 304 (337) Q Consensus 234 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~~~p~g~-----~~~f---~~~~ap~d~~~~~n~-~~~~~y~k~~~ 304 (337) +......|+..+...+ +|.+.+.-=|.|. .+.| ...+.+.+ .+++ ...+.=+..|. T Consensus 231 --g~v~~v~Gv~V~~sn~----------lP~~~~t~~~lg~a~n~~~~d~~~~~~~~~~~~---Al~t~~~~~vt~e~~~ 295 (335) T protein:vir:63 231 --SRVAILNGVKVLETPR----------FATKAIAAHPLGRHFNVSAEESERQIALFLPSK---TLITAQVAPVQAKLWE 295 (335) T ss_pred --ceeEEeeceEEEeecc----------CCCCCcccccccccCCccccccceeEEEEEecc---eEEEEEEeecccceee Confidence 1112222222222211 1211100001111 1111 11112222 1111 12234455666 Q ss_pred ccCCCEEEEEEeecccccccCcceEEEEEEeeC Q lcl|NC_019400. 305 SAYLREEKVESETSFLTVNTRPELVVRSTGTFA 337 (337) Q Consensus 305 ~~~~~~~~l~~eS~PLpi~~rP~~l~~~t~~aa 337 (337) +.+..++.|.+--..=..+.||++.+.++.+-. 
T Consensus 296 ~~~~~~~~i~~~~a~G~g~lRPe~a~~i~~tg~ 328 (335) T protein:vir:63 296 DNEKFSWVLDTFQMYNIGARRPDTAGAIELKGI 328 (335) T ss_pred ccchhhHHhHHHHHcCCcccccceEEEEEEcCC Confidence 666666777666666678899999999988776 No 20 >protein:vir:103323 Length: 364 # NCBI annotation: major capsid-like protein # Family: family:all:2806 # MgeID: mge:1609 # MgeName: Era103 # Cross-refs: genbank:acc:YP_001039668;genbank:gi:125999997;genbank:GeneID:4818399 Probab=81.16 E-value=0.088 Score=26.36 Aligned_cols=301 Identities=10% Similarity=0.019 Sum_probs=117.8 Q ss_pred CCCCCCCccCHHHHHHHHHhcCCCccchhhcccccCccc-ccceEEEEEEcCceeEeeeccCCCCcc--cccCCceeEEE Q lcl|NC_019400. 1 MAVVRTNDFQIVDLGATLEIVPRQYRLITNMDLFTAYHG-VTTVAQIERVDEVVTDFPARRRQGERN--YVGTEKAQLKN 77 (337) Q Consensus 1 ~d~~~~d~Fs~~~Lt~~i~~~p~~~~~l~~l~~F~~~~~-~t~~v~ie~~~~~~~l~p~v~~g~~~~--~~~~~~~~~~~ 77 (337) .++|-. .|+-.-+++--+ ...+ ++++..+.+ .++++.|... |..++ ..-.+|.+.. ....++ .+.. T Consensus 19 ~al~le-~f~geV~taf~~-----~s~~--~~~~~~rti~~gkS~q~~~i-G~~~~-~~~~~G~~ld~~~~~~~k-~~it 87 (364) T protein:vir:10 19 DSLLIE-KFNNRVHEQYLK-----GENL--LQWFDVQEVVGTNSVSNKYI-GETEL-QVLSPGKSPDASPTEFDK-NRLV 87 (364) T ss_pred hhhhhh-hhhhhHHHHHHH-----HHhh--cCcceeeeecccceEEeeee-eeeEE-eeeccCcccCCCCcccCc-EEEE Confidence 334332 333333333222 2222 234444444 3677888777 33333 4444444321 122211 1222 Q ss_pred EeccccccCccccHHHHhccccCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCCEEecCCCceEeechhhcC Q lcl|NC_019400. 78 FNIPFFPLDRQITAADVQNFRKYFTADAPKSVEDVVARVVRRIRISHEQLKEKAMLQAIMGKSWAPQDPTAQYNYFTEWG 157 (337) Q Consensus 78 f~~p~i~~~~~v~a~dlq~~R~~G~~~~~~~~~~~v~~~l~~~~~~i~~t~E~m~a~AL~g~i~~~~~g~v~~d~~~~fG 157 (337) +....+...-.-.-+|+|+ .+-. --..+.......|+++.++.-. 
....+ .++...+-....+.+ T Consensus 88 ID~ll~a~~~V~diDe~q~--~~D~--vR~e~s~e~G~ALA~~~Dq~i~-~~v~~-aa~a~~~~~~~~~~~--------- 152 (364) T protein:vir:10 88 VDTTVIARNTVAHFHDVQN--DIDG--LKSKLSVNQAKKLKKMEDSMVI-QQLVL-GGISNTEAIRKNPRV--------- 152 (364) T ss_pred ecceeeechhhhhHHHHhc--Cccc--hhHHHHHHHHHHHHHHHHHHHH-HHHHh-hhhhcccccccCCcc--------- Confidence 3323332222223334442 1210 0112233444444444444321 11111 122111100000000 Q ss_pred CCcceEEEecCC----CCcchHHHHHHHHHHHHHHhhccccccccEEEEEChHHHHHHhcCHHHHHH-HHhhcccccccc Q lcl|NC_019400. 158 VTQHTANIDFTD----VATDPTDIIEADARAYIIDNAGDNGNNYGIVVLASRKWFSALIAHPLVMNA-YQYYSSTQEPLR 232 (337) Q Consensus 158 ~~~~~~~~~l~~----~~~d~~~~~~~~~~~~~~~~~~~~~~~~~v~~l~g~~~~~al~~h~~v~~~-~~~~~~~~~~~~ 232 (337) .+....++.+. ...++.. +.+......+.+--...+..+.++++.|++|.+|++|+++... |... .+..+.. T Consensus 153 -~~~g~~i~~~~~a~~~~~~~~~-l~~ai~~a~~~LdEkdVP~~~R~~vv~P~~y~~Ll~~~~lvn~d~~~~-~~~~~~~ 229 (364) T protein:vir:10 153 -AGHGFSIHIVGLASSFLTSPQY-MMAAIEMAMEQQTEQEVDTSELCGLMPWTAFNCLRDADRIVDKSYTIA-ASDNTVD 229 (364) T ss_pred -cCCcceeeecccCcchhhhHHH-HHHHHHHHHHHHhhcCCCccccEEEeChHHHHHHhcCCcccccccccc-CCCcccc Confidence 01111111111 1112111 1111111222222234455678899999999999998875421 1111 1111211 Q ss_pred cccccceeccceeEEEeccEEEecCce---------eeecCCeeEEEEecch--hhheEEeccccchhhccc-cCcceee Q lcl|NC_019400. 233 RRLGQGQENANNRMFVHKNVTYIEDIS---------NYIPDGEAYILPQGID--DMFQIHYAPADDVREANT-PAQELYL 300 (337) Q Consensus 233 ~~~~~~~~~~~~~~~~~~~~~~~~~~~---------~~i~~~~~~~~p~g~~--~~f~~~~ap~d~~~~~n~-~~~~~y~ 300 (337) + ......|+..+...+.....+.. ++-+++.+.-+..... ..--..|-| + .+++ ...+.=. 
T Consensus 230 G---~v~~v~Gv~Vv~Sn~lP~~~~~~~~t~~~t~h~ls~~~~g~~y~v~~d~~~~~~~~f~~-~---Al~tv~~~~~t~ 302 (364) T protein:vir:10 230 G---FVLKSWNTPIVPSNRFPKLSDNTEGTGNTKHHKLSNAGNGNRYDVTAGQTSAQAVLFTQ-D---ALLVGRTISITG 302 (364) T ss_pred c---eeEEEeceEEEeccccccccccccccccccccccccccCCcccccccccceeEEEEEec-c---eEEEEEEeccee Confidence 1 11222333333322221111100 0101111111110000 000112222 1 2222 2235566 Q ss_pred eEEEccCCCEEEEEEeecccccccCcceEEEEEEeeC Q lcl|NC_019400. 301 WYKSSAYLREEKVESETSFLTVNTRPELVVRSTGTFA 337 (337) Q Consensus 301 k~~~~~~~~~~~l~~eS~PLpi~~rP~~l~~~t~~aa 337 (337) ..|.+++..++.+.+--+.=.-..||++.+.++..+| T Consensus 303 e~~~~~~~~~~~ida~~a~G~g~lRPeaa~~i~~~~~ 339 (364) T protein:vir:10 303 DIFYEKKEKTWYIDTFLAEGAIPDRWEAVAVVTAADT 339 (364) T ss_pred eeeeccceeeeeeeeehcccCcccCccceEEEEecCC Confidence 6777777788888887777778999999999999988 No 21 >protein:vir:78935 Length: 335 # NCBI annotation: capsid protein # Family: family:all:2806 # MgeID: mge:1860 # MgeName: LKD16 # Cross-refs: genbank:acc:YP_001522824;genbank:gi:158345059;genbank:GeneID:5687425 Probab=79.82 E-value=0.1 Score=26.05 Aligned_cols=293 Identities=12% Similarity=0.058 Sum_probs=115.7 Q ss_pred CCCCCCCccCHHHHHHHHHhcCCCccchhhcccccCccc-ccceEEEEEEcCceeEeeeccCCCCcc--cccCCceeEEE Q lcl|NC_019400. 1 MAVVRTNDFQIVDLGATLEIVPRQYRLITNMDLFTAYHG-VTTVAQIERVDEVVTDFPARRRQGERN--YVGTEKAQLKN 77 (337) Q Consensus 1 ~d~~~~d~Fs~~~Lt~~i~~~p~~~~~l~~l~~F~~~~~-~t~~v~ie~~~~~~~l~p~v~~g~~~~--~~~~~~~~~~~ 77 (337) .++|- -.|+-.-+++--+ ...+ ++++..+.+ .++++.|.+. |..+ +....||.+.. ....++ .+.. 
T Consensus 19 ~al~l-e~f~geV~~af~~-----~s~~--~~~~~~rti~~g~s~~~~~i-G~~~-~~~~~pG~~l~~~~~~~~k-~~it 87 (335) T protein:vir:78 19 VDIHL-EEHLGIVDKHFAY-----TSKF--APLMNIRDLRGSNVVRLDRL-GNVE-AKGRRAGEELERSRVVNDK-WNLT 87 (335) T ss_pred hhhhh-hhhhhHHHHHHHH-----hhhh--ccccceeeeccceeEEEeee-eeee-ecccccCcccCCCCcccCC-eEEE Confidence 55555 3555555554444 2222 244555554 3777888876 3332 35555555432 122211 1222 Q ss_pred EeccccccCccccHHHHhccccCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCCEEecCCCceEeechhhcC Q lcl|NC_019400. 78 FNIPFFPLDRQITAADVQNFRKYFTADAPKSVEDVVARVVRRIRISHEQLKEKAMLQAIMGKSWAPQDPTAQYNYFTEWG 157 (337) Q Consensus 78 f~~p~i~~~~~v~a~dlq~~R~~G~~~~~~~~~~~v~~~l~~~~~~i~~t~E~m~a~AL~g~i~~~~~g~v~~d~~~~fG 157 (337) +....+...-.=.-+|+|+ .+ +-...+.....+.|+++.++.- ..+.+.+-...+... .-+- ..-| T Consensus 88 ID~ll~a~~~VddlDe~~~--~y---DvR~e~s~~~G~aLA~~~Dq~~------~~~l~~aa~~~a~~~--~~~~-~~~G 153 (335) T protein:vir:78 88 VDTLLYLRHQFDHQDEWTQ--SF---DMRKEVAELDGQELARKFDQAC------LIQVIKAAAMDAPVD--LEDA-FSPG 153 (335) T ss_pred ecceeechhhHhhHHHhhc--Cc---hhHHHHHHHHHHHHHHHHHHHH------HHHHHhhcccccccc--cCCC-cCCC Confidence 2222222222212223332 11 1111233334444443333322 122233221111000 0000 0113 Q ss_pred CCcceEEEecCCCCcch---HHHHHHHHHHHHHHhhccccccccEEEEEChHHHHHHhcCHHHHHH-HHhhccccccccc Q lcl|NC_019400. 158 VTQHTANIDFTDVATDP---TDIIEADARAYIIDNAGDNGNNYGIVVLASRKWFSALIAHPLVMNA-YQYYSSTQEPLRR 233 (337) Q Consensus 158 ~~~~~~~~~l~~~~~d~---~~~~~~~~~~~~~~~~~~~~~~~~v~~l~g~~~~~al~~h~~v~~~-~~~~~~~~~~~~~ 233 (337) .+.. ..+.-+++.++. ...+.+....+.++.... ....+.+++++|++|.+|+.|+.+... 
|.+......+.++ T Consensus 154 ~~~~-~~~tg~~~~~~~~~l~~a~~~a~~~l~ekdvP~-~~~~~rv~vv~P~~y~~Ll~~~~l~n~~~~~s~~~~~~~~g 231 (335) T protein:vir:78 154 VLEK-LDLTGLTAKEAAEKIVRMHRRVVETFIERDLGD-AVYSEGLTPMSPRVFSLLLEHDKLMSVEYQATGATNDYVKS 231 (335) T ss_pred ccee-eeeccccccccHHHHHHHHHHHHHHHHhccCCC-CCCCccEEEeChHHHHHHhcccccccccccccccccccccc Confidence 3221 111112222232 233333333333322222 123456888999999999999876432 1111111111111 Q ss_pred ccccceeccceeEEEeccEEEecCceeeecCC--eeEEEEecch-hhh-------eEEeccccchhhccccCcceeeeEE Q lcl|NC_019400. 234 RLGQGQENANNRMFVHKNVTYIEDISNYIPDG--EAYILPQGID-DMF-------QIHYAPADDVREANTPAQELYLWYK 303 (337) Q Consensus 234 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~--~~~~~p~g~~-~~f-------~~~~ap~d~~~~~n~~~~~~y~k~~ 303 (337) ......|+..+...+ +|.+ .++ +.|.. +.+ ...+.+.+-+-.+ ...+.=+..| T Consensus 232 ---~v~~v~Gv~V~~Sn~----------lP~~~~t~~--~lg~a~n~~~~d~~~~~~~~~~~~Al~t~--~~~~~~~e~~ 294 (335) T protein:vir:78 232 ---RVAILNGVKVLETPR----------FATKAISAH--PLGRHFNVSAEEAERQIALFLPSKTLITA--QVAPVQAKLW 294 (335) T ss_pred ---eeEEeeceEEEeecc----------CCCCCCccc--cccccCCcccccccceEEEEEecceEEEE--EEEeccccee Confidence 112222322222222 1211 111 11111 000 1111222211111 1223344556 Q ss_pred EccCCCEEEEEEeecccccccCcceEEEEEEeeC Q lcl|NC_019400. 304 SSAYLREEKVESETSFLTVNTRPELVVRSTGTFA 337 (337) Q Consensus 304 ~~~~~~~~~l~~eS~PLpi~~rP~~l~~~t~~aa 337 (337) .+++..++.|.+--..=..+.||++.+.++.+-. 
T Consensus 295 ~~~~~~~~~i~~~~a~G~g~lRPe~a~~i~~tg~ 328 (335) T protein:vir:78 295 EDHDQFSWVLDTFQMYNIGARRPDTAGAIELKGI 328 (335) T ss_pred eccchhhHhhhHHHHcCCcccCcceEEEEEecCC Confidence 6666666677666666678899999999998877 No 22 >protein:vir:10450 Length: 344 # NCBI annotation: major capsid protein # Family: family:all:975 # MgeID: mge:184 # MgeName: phiA1122 # Cross-refs: genbank:acc:NP_848297;genbank:gi:30387487;genbank:GeneID:1733971 Probab=73.26 E-value=0.17 Score=24.77 Aligned_cols=302 Identities=12% Similarity=-0.016 Sum_probs=105.6 Q ss_pred CCCCCCCccCHHHHHHHHHhcCCCccchhhcccccCcccc-cceEEEEEEcCceeEeeeccCCCCcccccC---CceeEE Q lcl|NC_019400. 1 MAVVRTNDFQIVDLGATLEIVPRQYRLITNMDLFTAYHGV-TTVAQIERVDEVVTDFPARRRQGERNYVGT---EKAQLK 76 (337) Q Consensus 1 ~d~~~~d~Fs~~~Lt~~i~~~p~~~~~l~~l~~F~~~~~~-t~~v~ie~~~~~~~l~p~v~~g~~~~~~~~---~~~~~~ 76 (337) .++|- -.|+-.-++.--+ .+.+. ++...+.++ ++++.|... |..++ ....||.+...... ....+. T Consensus 25 ~al~i-e~~~geV~~~f~~-----~s~~~--~~~~~r~i~~g~s~~~~~i-G~~~~-~~~~~G~~l~~t~~~~~~~e~~l 94 (344) T protein:vir:10 25 LALFL-KVFGGEVLTAFAR-----TSVTT--SRHMVRSISSGKSAQFPVL-GRTQA-AYLAPGENLDDIRKDIKHTEKVI 94 (344) T ss_pred hHHHH-HHHHHHHHHHHHH-----Hhhhc--ccceeeeecccceEEEEee-ceeEE-EeeecCCCCCCCCCCcccceEEE Confidence 55555 3555444443333 23332 345555554 778888877 44443 35555555322111 111111 Q ss_pred EEeccccccCccccHHHHhccccCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhc-CC-EEecCCCceEeechh Q lcl|NC_019400. 77 NFNIPFFPLDRQITAADVQNFRKYFTADAPKSVEDVVARVVRRIRISHEQLKEKAMLQAIM-GK-SWAPQDPTAQYNYFT 154 (337) Q Consensus 77 ~f~~p~i~~~~~v~a~dlq~~R~~G~~~~~~~~~~~v~~~l~~~~~~i~~t~E~m~a~AL~-g~-i~~~~~g~v~~d~~~ 154 (337) .+.-..+..--.=.-+|+|. .+ +............|+ +.++. ++++.|. +. ...+.+. +. 
T Consensus 95 ~ID~~~y~~~~VdDiD~~q~--~~---D~r~~~~~~~G~aLA---~~~D~----~i~~~la~~a~~~~~~~~------~~ 156 (344) T protein:vir:10 95 TIDGLLTADVLIYDIEDAMN--HY---DVRSEYTSQLGESLA---MAADG----AVLAEIAGLCNVESQYNE------NI 156 (344) T ss_pred EEcchhhhhhhhhhHHHHhc--Cc---chHHHHHHHHHHHHH---HHHHH----HHHHHHHhhhcccccccc------cc Confidence 12111111111112233332 11 111112222222222 22222 2222221 10 0111011 00 Q ss_pred hcCCCcceEEEe----cCCCCcchHHHHHHHHHHHHHHhhccccccccEEEEEChHHHHHHhcCHHHHHHHHhhcccccc Q lcl|NC_019400. 155 EWGVTQHTANID----FTDVATDPTDIIEADARAYIIDNAGDNGNNYGIVVLASRKWFSALIAHPLVMNAYQYYSSTQEP 230 (337) Q Consensus 155 ~fG~~~~~~~~~----l~~~~~d~~~~~~~~~~~~~~~~~~~~~~~~~v~~l~g~~~~~al~~h~~v~~~~~~~~~~~~~ 230 (337) ..+-+...+... +.++.......+.+..+...+.+-....+..+.+++++|++|..|++|+.+-.. .+...... T Consensus 157 ~g~~~~~~~~~~~~~~~~t~~~~~~~~~~~~i~~a~~~Lde~~VP~~gR~~vv~P~~y~~Ll~~~~~~~~--~~~~~~~~ 234 (344) T protein:vir:10 157 TGLGTATVIETTQDKTTLTDQVALGKEIIAALTKARAALTKNYVPSSDRVFYCDPDSYSAILAALMPNAA--NYAALIDP 234 (344) T ss_pred ccccccceeecccccccccchhhhHHHHHHHHHHHHHHHhhcCCCccCCEEEeChHHHHHHhhccccccc--ccccccce Confidence 111011111111 011111111122222222223333334466777889999999999999876332 11111111 Q ss_pred cccccccceeccceeEEEeccEEEe-cCceeeecCCeeEEEEecchhhh----eEEeccccchhhccc-cCcceeeeEEE Q lcl|NC_019400. 231 LRRRLGQGQENANNRMFVHKNVTYI-EDISNYIPDGEAYILPQGIDDMF----QIHYAPADDVREANT-PAQELYLWYKS 304 (337) Q Consensus 231 ~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~i~~~~~~~~p~g~~~~f----~~~~ap~d~~~~~n~-~~~~~y~k~~~ 304 (337) ..+ ......|+..++..+.... ..+..-...+..+.+|.+..+.+ ....|-.=+-+.+++ ...++=...|. 
T Consensus 235 ~~G---~V~~v~G~~V~~Sn~lp~~~~~~~~~~~tg~~~~~~~~~~~~~~~~~s~~~~l~~h~~A~~~v~~~~~~~e~~r 311 (344) T protein:vir:10 235 EKG---SIRNVMGFEVVEVPHLTAGGAGTSREGTTGQKHAFPATKSGNDKVAKDNVIGLFMHRSAVGTVKLRDLALERAR 311 (344) T ss_pred eee---EEEEEeceEEEeccccccccCCcccccccCccccccCCcccceeeecceeEEEeechhhhhhhhhccceeeccc Confidence 111 1112234433333332211 01112223333344443222111 000000000011111 01111112222 Q ss_pred ccCCCEEEEEEeecccccccCcceE--EEEEEe Q lcl|NC_019400. 305 SAYLREEKVESETSFLTVNTRPELV--VRSTGT 335 (337) Q Consensus 305 ~~~~~~~~l~~eS~PLpi~~rP~~l--~~~t~~ 335 (337) +++-.+..|.+--..=.-..||+++ +++|.+ T Consensus 312 ~~~~~~d~i~g~~~~G~~vlRPe~a~~v~~~~~ 344 (344) T protein:vir:10 312 RANFQADQIIAKYAMGHGGLRPEAAGAVVFKTK 344 (344) T ss_pred chhHHHHHHHHHhhcccceecccceEEEEeecC Confidence 3333333443333333457899977 777777 No 23 >protein:vir:106647 Length: 303 # NCBI annotation: ORF011 # Family: family:all:1178 # MgeID: mge:1557 # MgeName: 187 # Cross-refs: genbank:acc:YP_239493;genbank:gi:66395226;genbank:GeneID:4555801 Probab=71.71 E-value=0.19 Score=24.51 Aligned_cols=275 Identities=8% Similarity=0.044 Sum_probs=112.6 Q ss_pred CCCCCCCccCHHHHHH-----HHHhc-CCCccchhhcccccCcccccc-eEE-----EEEEcCceeEeeeccCCCCcccc Q lcl|NC_019400. 1 MAVVRTNDFQIVDLGA-----TLEIV-PRQYRLITNMDLFTAYHGVTT-VAQ-----IERVDEVVTDFPARRRQGERNYV 68 (337) Q Consensus 1 ~d~~~~d~Fs~~~Lt~-----~i~~~-p~~~~~l~~l~~F~~~~~~t~-~v~-----ie~~~~~~~l~p~v~~g~~~~~~ 68 (337) |-.-++ .=...+|.. -+|+. .....++.-||+++..|...- .+. .....+. +--|..|..=... T Consensus 1 M~~e~n-l~~~~dL~~a~siDF~~~f~~~i~~L~~~LGv~r~~pla~Gt~iktyK~~~~~y~gd---a~dVaEGe~Ipls 76 (303) T protein:vir:10 1 MSAENN-LINVEALGKAKSIDFANKLGVGLNKLFEALAIQNKIPMNVGSALKQYRFKVEDSEKP---NGDVAEGDVIPLT 76 (303) T ss_pred CCCCcC-CcchhhcccceeehhhhhhhhhHHHHHHHhhhhccccccCCceeeeeeeeceeeccc---cccccCCcccchh Confidence 222111 111122211 11111 122344444566655544311 121 1111111 1123333332222 Q ss_pred cCCceeEEEEeccccccCccccHHHHhccccCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCCEEecCCCce Q lcl|NC_019400. 
69 GTEKAQLKNFNIPFFPLDRQITAADVQNFRKYFTADAPKSVEDVVARVVRRIRISHEQLKEKAMLQAIMGKSWAPQDPTA 148 (337) Q Consensus 69 ~~~~~~~~~f~~p~i~~~~~v~a~dlq~~R~~G~~~~~~~~~~~v~~~l~~~~~~i~~t~E~m~a~AL~g~i~~~~~g~v 148 (337) +-.+......+.+.-|....+|++.|| +..+|. +..... .+|.+.|.+-.-.-+..+|.+.....+ + T Consensus 77 kvt~~~~~t~~~~~kK~rK~tTdEAIq-lsGyg~--aVgetd-------~qL~~~Iq~kIdnd~~~~lktaT~t~~-~-- 143 (303) T protein:vir:10 77 KVTREQVDITELQFAKYRKSTSAEAIQ-AHGYDL--AINQTD-------NEMIKYVQKKFRAKFFETLKSAIENGK-R-- 143 (303) T ss_pred hheeeecceEEEEeecccccccHHHHH-hhcCCc--hhHHHH-------HHHHHHHHhhhhHHHHHHHhhcccccc-c-- Confidence 222212233455555656688999997 456663 211111 112222222222234445543222110 0 Q ss_pred EeechhhcCCCcceEEEecCCCCcc-hHHHHHHHHHHHHHHhhccccccccEEEEEChHHHHHHhcCHHHHHHHHhhccc Q lcl|NC_019400. 149 QYNYFTEWGVTQHTANIDFTDVATD-PTDIIEADARAYIIDNAGDNGNNYGIVVLASRKWFSALIAHPLVMNAYQYYSST 227 (337) Q Consensus 149 ~~d~~~~fG~~~~~~~~~l~~~~~d-~~~~~~~~~~~~~~~~~~~~~~~~~v~~l~g~~~~~al~~h~~v~~~~~~~~~~ 227 (337) +...+ +..+ +...+...+ .+...........++||.|.-+.++++++.+.. .+..-+ T Consensus 144 -------------t~~t~---~s~~glq~Al~~~~----~kl~~~~ed~~~~V~FvNP~Daa~yl~~A~i~~--~~t~fG 201 (303) T protein:vir:10 144 -------------TNKTK---LSAENLQGALSKGR----ANLSVLLDDEITPIAFVNPNDTAEYLANGFINS--TGAQFG 201 (303) T ss_pred -------------cccee---ecHHHHHHHHHhhh----hhccccccccccEEEEEchHHHHHHhhcCCcch--hhhhhh Confidence 01011 1111 222222111 111111112234589999999888888877742 112223 Q ss_pred ccccccccccceeccceeEEEeccEEEecCceeeecCCeeEEEEecchhhheEEeccccchhhccccCcceeee------ Q lcl|NC_019400. 228 QEPLRRRLGQGQENANNRMFVHKNVTYIEDISNYIPDGEAYILPQGIDDMFQIHYAPADDVREANTPAQELYLW------ 301 (337) Q Consensus 228 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~~~p~g~~~~f~~~~ap~d~~~~~n~~~~~~y~k------ 301 (337) .+++.+ +.|...++ ...||+|++++.+. .+. ..+|+|..- + ++ .+..|+.. 
T Consensus 202 ~n~L~n-------fLG~~II~----------S~kv~~G~~~~T~~--~Ni-~~ay~~~~g-~-l~-~~f~~t~D~tglIG 258 (303) T protein:vir:10 202 VNLLTP-------YVGVKIVE----------FADVPQGEVWMTVA--ENL-NVAYANPRG-E-LS-RAFAFATDATGFVG 258 (303) T ss_pred hhhhhh-------hhcceEEE----------eccCCCceEEEeec--cce-EEEEecCch-h-hh-hhhhhccccccceE Confidence 333321 22222111 34489999998763 343 667887641 1 21 12222211 Q ss_pred EEEccCCC--EEEEEEeecccccccCcceEEEEEEeeC Q lcl|NC_019400. 302 YKSSAYLR--EEKVESETSFLTVNTRPELVVRSTGTFA 337 (337) Q Consensus 302 ~~~~~~~~--~~~l~~eS~PLpi~~rP~~l~~~t~~aa 337 (337) +-...+.+ .++-..-|.-.-.|-|++.++++|++++ T Consensus 259 v~h~~~~~~~t~eT~~~~~~~lfpE~~dgiv~~ti~~~ 296 (303) T protein:vir:10 259 VLHDIQPQRLTSDTIYASAISMFPENIDAVIKVTIKKD 296 (303) T ss_pred EEeccccceeeehhHhHhHHHhcccccceEEEEEEecc Confidence 11112222 2222233444556899999999999998 No 24 >protein:vir:99675 Length: 324 # NCBI annotation: Major capsid protein # Family: family:all:975 # MgeID: mge:1523 # MgeName: VP4 # Cross-refs: genbank:acc:YP_249589;genbank:gi:68299740;genbank:GeneID:3799990 Probab=68.96 E-value=0.23 Score=24.09 Aligned_cols=272 Identities=12% Similarity=0.019 Sum_probs=88.5 Q ss_pred cccCcccccceEEEEEEcCceeEeeeccCCCCc--cc--ccCCceeEEEEeccccccCccccHHHHhccccCCCCCHHHH Q lcl|NC_019400. 33 LFTAYHGVTTVAQIERVDEVVTDFPARRRQGER--NY--VGTEKAQLKNFNIPFFPLDRQITAADVQNFRKYFTADAPKS 108 (337) Q Consensus 33 ~F~~~~~~t~~v~ie~~~~~~~l~p~v~~g~~~--~~--~~~~~~~~~~f~~p~i~~~~~v~a~dlq~~R~~G~~~~~~~ 108 (337) +-+.. ..++++.|... |..+ +....+|.+- .. ....+..+ .+.-..+..--.=.-+++|. .. +-... 
T Consensus 1 ~vr~i-~~g~s~~~~~i-G~~~-~~~~~~G~~l~~~~~~~~~~e~~i-tID~~l~~~~~VdDiD~~qa---~~--Dlr~e 71 (324) T protein:vir:99 1 MTRTI-TSGKSAQFPVM-GRTK-ARYLKQGQSLDDGREDIKHTEKVI-TIDGLLTTDVLIYDIEDAMN---HY--DVRSE 71 (324) T ss_pred Ceeee-ecCceEEEeee-eeeE-eccccCCCCcCCCcCCcCcccEEE-EecchhhhhhhhhhHHHHhc---Cc--cchhH Confidence 11111 13667777776 2222 3333344432 11 11111111 11111111110101123332 11 11111 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHh-cCCEEecCCCceEeechhhcCCCcc-eEEEecCCCCc---chHHHHHHHHH Q lcl|NC_019400. 109 VEDVVARVVRRIRISHEQLKEKAMLQAI-MGKSWAPQDPTAQYNYFTEWGVTQH-TANIDFTDVAT---DPTDIIEADAR 183 (337) Q Consensus 109 ~~~~v~~~l~~~~~~i~~t~E~m~a~AL-~g~i~~~~~g~v~~d~~~~fG~~~~-~~~~~l~~~~~---d~~~~~~~~~~ 183 (337) ..+.....|++..++.-... ++.+. ...-.. .+... ..|.+.. .+.-...++.. .+...+.+... T Consensus 72 ~s~~~G~aLA~~~Dq~i~~~---~a~~~~~~a~~~--~~~~~-----~~g~~~~~~~~~~~~~~~~~~~~~~dai~~a~~ 141 (324) T protein:vir:99 72 YSTQMGEALAMAADVANYAE---MAKLVNSRKETT--NENIE-----GLGAASLVKITGKKEDPAKYGTQVIQALTYARA 141 (324) T ss_pred HHHHHHHHHHHHHHHHHHHH---HHHhhhcccccc--cCCcc-----cCCccceecccccccccccCHHHHHHHHHHHHH Confidence 22223333332222222111 11111 111000 01000 0111110 00000011112 22333333333 Q ss_pred HHHHHhhccccccccEEEEEChHHHHHHhcCHHHHHHHHhhcccccccccccccceeccceeEEEeccEEEecCc-eeee Q lcl|NC_019400. 184 AYIIDNAGDNGNNYGIVVLASRKWFSALIAHPLVMNAYQYYSSTQEPLRRRLGQGQENANNRMFVHKNVTYIEDI-SNYI 262 (337) Q Consensus 184 ~~~~~~~~~~~~~~~v~~l~g~~~~~al~~h~~v~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~i 262 (337) .+ -....+..+.+++++|++|.+|++|+.+.... +........ +......|+..++..+.....+. .... 
T Consensus 142 ~L----de~~VP~~gR~~vv~P~~y~~Ll~~~~~~~~~--~~~~~~~~~---G~V~~i~Gf~V~~Sn~lp~~~~t~~~~a 212 (324) T protein:vir:99 142 AF----AKKYIPAGDRTFYTDPDTYSAILAALMPNAAN--YAALIDPET---GNIRNVMGFEVVETPHMTAQMVTNPTDA 212 (324) T ss_pred HH----hhcCCCCCCCEEEeChHHHHHHhhcccccccc--cccccceec---ceEEEEeceEEEecCCcccccccccccc Confidence 22 22233556678899999999999887764321 111111111 11222344444444433221111 1122 Q ss_pred cCCeeEEEEecch-hhheEEeccccchhhccccCccee-------------eeEEEccCCCEEEEEEeecccccccCcce Q lcl|NC_019400. 263 PDGEAYILPQGID-DMFQIHYAPADDVREANTPAQELY-------------LWYKSSAYLREEKVESETSFLTVNTRPEL 328 (337) Q Consensus 263 ~~~~~~~~p~g~~-~~f~~~~ap~d~~~~~n~~~~~~y-------------~k~~~~~~~~~~~l~~eS~PLpi~~rP~~ 328 (337) .++..+.++.-.. +....|-+-+ .+..++=|+ ...+.+++-.+..|..--..=....||++ T Consensus 213 ~~~~~~~~~~~~~~~~~~ky~~d~-----~~~~gl~~~~~a~~tv~~~~~~~e~~~~~~~~~d~i~~~~a~G~~~lRPe~ 287 (324) T protein:vir:99 213 FDGTGHIFPATGDSTTTGKMTVGA-----DNVVGLFVHRSAVATLKLKDMALERARRPEYQADQIIAKYAMGHGGLRPEA 287 (324) T ss_pred cccccccccccccccccccccccc-----CceeEEEEehhheEEEeeecceecceechhhHHHhhhhhhhhcCcccccce Confidence 2333333332111 1111111111 111222111 11112223334455554445566789997 Q ss_pred EEEEE--EeeC Q lcl|NC_019400. 329 VVRST--GTFA 337 (337) Q Consensus 329 l~~~t--~~aa 337 (337) +..++ ..|+ T Consensus 288 a~~v~l~~~~~ 298 (324) T protein:vir:99 288 VGAIIFEDGET 298 (324) T ss_pred EEEEEEccCcc Confidence 75444 4443 No 25 >protein:vir:97031 Length: 402 # NCBI annotation: 31 # Family: family:all:2806 # MgeID: mge:1644 # MgeName: K1-5 # Cross-refs: genbank:acc:YP_654132;genbank:gi:108862016;genbank:GeneID:5075980 Probab=67.09 E-value=0.26 Score=23.81 Aligned_cols=300 Identities=10% Similarity=0.039 Sum_probs=115.4 Q ss_pred CCCCCCCccCHHHHHHHHHhcCCCccchhhcccccCccc-ccceEEEEEEcCceeEeeeccCCCCcc--cccCCceeEEE Q lcl|NC_019400. 
1 MAVVRTNDFQIVDLGATLEIVPRQYRLITNMDLFTAYHG-VTTVAQIERVDEVVTDFPARRRQGERN--YVGTEKAQLKN 77 (337) Q Consensus 1 ~d~~~~d~Fs~~~Lt~~i~~~p~~~~~l~~l~~F~~~~~-~t~~v~ie~~~~~~~l~p~v~~g~~~~--~~~~~~~~~~~ 77 (337) .++|-. .|+-.-+++--. ...+ ++++..+.+ .++++.|.+. |..++ ....+|.+.. ....+ -.+.. T Consensus 19 ~al~le-~f~geV~taF~~-----~si~--~~~~~vrti~~GkS~qf~~i-G~~~a-~y~~~G~~ldg~~~~~~-k~~It 87 (402) T protein:vir:97 19 DSLLIE-KFNGKVNEQYLK-----GENI--LSYFDVQTVTGTNTVSNKYL-GETEL-QVLAPGQSPNATPTQAD-KNQLV 87 (402) T ss_pred hhhhhh-hhhhhHHHHHHH-----HHhh--cCcceeeeecccceEEEEEE-eeeEE-eeeccccccCCCCcccc-cEEEE Confidence 333332 233333332222 2222 234554544 4777888887 33333 4444444321 11211 11122 Q ss_pred EeccccccCccccHHHHhccccCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhc-CCEEec--CCCceEeechh Q lcl|NC_019400. 78 FNIPFFPLDRQITAADVQNFRKYFTADAPKSVEDVVARVVRRIRISHEQLKEKAMLQAIM-GKSWAP--QDPTAQYNYFT 154 (337) Q Consensus 78 f~~p~i~~~~~v~a~dlq~~R~~G~~~~~~~~~~~v~~~l~~~~~~i~~t~E~m~a~AL~-g~i~~~--~~g~v~~d~~~ 154 (337) +....+...-.-.-+|+|+ .+-. --..+.......|.++.++ +++|.+. ...... .++.... + T Consensus 88 ID~lL~a~~~V~diDeaq~--~yD~--vRse~s~e~G~ALA~~~Dq-------~ii~~i~~aa~a~t~~~~~~~~~-~-- 153 (402) T protein:vir:97 88 IDTTVIARNTVAHIHDVQG--DIDS--LKPKLAMNQAKQLKRLEDQ-------MAIQQMLLGGIANTKAERNKPRV-K-- 153 (402) T ss_pred eCceeechhhhhhHHHHHh--cccc--hhHHHHHHHHHHHHHHHHH-------HHHHHHHHhhccccccccccCcc-c-- Confidence 3333333322223333332 1210 0112233344444444443 2222221 111110 0000000 0 Q ss_pred hcCCCcceEEEecCCCCcchHHHHHHHHHHHHHHhhccccccccEEEEEChHHHHHHhcCHHHHHHHHhh-ccccccccc Q lcl|NC_019400. 155 EWGVTQHTANIDFTDVATDPTDIIEADARAYIIDNAGDNGNNYGIVVLASRKWFSALIAHPLVMNAYQYY-SSTQEPLRR 233 (337) Q Consensus 155 ~fG~~~~~~~~~l~~~~~d~~~~~~~~~~~~~~~~~~~~~~~~~v~~l~g~~~~~al~~h~~v~~~~~~~-~~~~~~~~~ 233 (337) .-|.+. ....+-..+..++..... ......+.+-....+..+.+++++|++|..|++|+.+... ++. 
..+..+..+ T Consensus 154 ~~g~s~-~~~~t~~~a~~~~~~l~~-ai~~a~~~LdEkdVP~~dRv~vv~P~~y~~Ll~~~rl~n~-d~~~~~~g~~~~G 230 (402) T protein:vir:97 154 GHGFSI-NVNVTESEALANPQYVMA-AVEYALEQQLEQEVDISDVAIMMPWKFFNALRDADRIVDK-TYTISQSGATING 230 (402) T ss_pred cccccc-ccccccchhhcCHHHHHH-HHHHHHHHHHhcCCCccccEEEeChHHHHHHhhcccccch-hhccccCCccccc Confidence 001111 111111122233332222 1222222222334456778899999999999999876432 111 111112111 Q ss_pred ccccceeccceeEEEeccEEEec--Cc-eeeecCCeeEEEEecchhhhe----EEeccccchhhccc-cCcceeeeEEEc Q lcl|NC_019400. 234 RLGQGQENANNRMFVHKNVTYIE--DI-SNYIPDGEAYILPQGIDDMFQ----IHYAPADDVREANT-PAQELYLWYKSS 305 (337) Q Consensus 234 ~~~~~~~~~~~~~~~~~~~~~~~--~~-~~~i~~~~~~~~p~g~~~~f~----~~~ap~d~~~~~n~-~~~~~y~k~~~~ 305 (337) ......|+..+...+..... .. ...-+++.+.-+.. .+-|. ..|-| +.+++ ...++=...|.+ T Consensus 231 ---~v~~v~Gv~Vv~SnnlP~~a~~it~~~ls~a~~G~~y~~--t~d~t~~~~~~f~~----~Av~tvk~~~vT~~~~~d 301 (402) T protein:vir:97 231 ---FVLSSYNCPVIPSNRFPTFAQDQAHHLLSNEDNGYRYDP--IAEMNGAVAVLFTS----DALLVGRTIEVTGDIFYE 301 (402) T ss_pred ---eeEEEeceEEEecCccccccccccccccccCCCCccCCc--CcccceeEEEEEec----ceEEEEEeeccccchhhc Confidence 11223333333333332110 00 11112222222210 01111 12222 12332 234556667778 Q ss_pred cCCCEEEEEEeecccccccCcceEEEEEEe----eC Q lcl|NC_019400. 
306 AYLREEKVESETSFLTVNTRPELVVRSTGT----FA 337 (337) Q Consensus 306 ~~~~~~~l~~eS~PLpi~~rP~~l~~~t~~----aa 337 (337) .+...+.|.+-.+.=..+.||++.--++.+ ++ T Consensus 302 ~r~~~~~id~~~a~G~g~~RPeaa~vv~~~~~~t~~ 337 (402) T protein:vir:97 302 KKEKTYYIDTFMAEGAIPDRWEAVSVVTTKRDATTG 337 (402) T ss_pred hhHHHHHHHHHHHhCCcccCccceEEEEEecccccc Confidence 777778887777777788999865444332 22 No 26 >protein:vir:94576 Length: 347 # NCBI annotation: Major capsid protein # Family: family:all:975 # MgeID: mge:1516 # MgeName: Berlin # Cross-refs: genbank:acc:YP_919012;genbank:gi:119637776;genbank:GeneID:5179336 Probab=63.83 E-value=0.31 Score=23.37 Aligned_cols=307 Identities=11% Similarity=-0.037 Sum_probs=117.6 Q ss_pred CCCCCCCccCHHHHHHHHHhcCCCccchhhcccccCccc-ccceEEEEEEcCceeEeeeccCCCCcc----cccCCceeE Q lcl|NC_019400. 1 MAVVRTNDFQIVDLGATLEIVPRQYRLITNMDLFTAYHG-VTTVAQIERVDEVVTDFPARRRQGERN----YVGTEKAQL 75 (337) Q Consensus 1 ~d~~~~d~Fs~~~Lt~~i~~~p~~~~~l~~l~~F~~~~~-~t~~v~ie~~~~~~~l~p~v~~g~~~~----~~~~~~~~~ 75 (337) ..+|- -.|+-.-++.--+ .+.+. ++...+.+ .++++.|....... +....+|.+.. .+...+..+ T Consensus 24 ~al~i-e~~~geV~~~f~~-----~s~~~--~~~~~rti~~G~sv~~~~iG~~~--~~~~~~G~~l~~~~~~~~~~e~~l 93 (347) T protein:vir:94 24 LALFL-KVFGGEVLTAFTR-----TSVTM--NKHLVRSIQSGKSAQFPVLGRTK--AAYLQPGENLDDKRKDMKHTEKTI 93 (347) T ss_pred HHHHH-HHHhHHHHHHHHH-----HHhhh--hhhhheeccccceEEeeecccee--EeeeecCcCCCCCcCCccccceEE Confidence 22444 2454444333332 13332 23444443 36677777653332 23334444321 122222222 Q ss_pred EEEeccccccCccccHHHHhccccCCCCCHHHHHHHHHHHHHHHHHHHHHH-HHHHHHHHHhc-CCEEecCCCceEeech Q lcl|NC_019400. 76 KNFNIPFFPLDRQITAADVQNFRKYFTADAPKSVEDVVARVVRRIRISHEQ-LKEKAMLQAIM-GKSWAPQDPTAQYNYF 153 (337) Q Consensus 76 ~~f~~p~i~~~~~v~a~dlq~~R~~G~~~~~~~~~~~v~~~l~~~~~~i~~-t~E~m~a~AL~-g~i~~~~~g~v~~d~~ 153 (337) ..=+.-|+. --.=.-+++|..-. ....+.......|.+..++... ..-..+.-+.- .+...+.++...+... 
T Consensus 94 tID~~~y~~-~~VddiD~~q~~~D-----~rs~~~~~~g~ALA~~~D~~i~~~l~~~a~~~~~~~~~~~g~~~~~~v~i~ 167 (347) T protein:vir:94 94 NIDGLLTAD-VLIYDIEDAMNHYD-----VRSEYTAQLGESLAMAADGAVLAEMAKLCNLPTANNENIAGLGKAHVLEVG 167 (347) T ss_pred EEcchhhhh-hhhhhHHHHhcCcc-----hHHHHHHHHHHHHHHHHHHHHHHHHHHhhccccccccccccCCcceeEeee Confidence 111111211 10112244443111 1111222233333322222111 11111111100 1111111111111110 Q ss_pred hhcCCCcceEEEecCCCCcchHHHHHHHHHHHHHHhhccccccccEEEEEChHHHHHHhcCHHHHHHHHhhccccccccc Q lcl|NC_019400. 154 TEWGVTQHTANIDFTDVATDPTDIIEADARAYIIDNAGDNGNNYGIVVLASRKWFSALIAHPLVMNAYQYYSSTQEPLRR 233 (337) Q Consensus 154 ~~fG~~~~~~~~~l~~~~~d~~~~~~~~~~~~~~~~~~~~~~~~~v~~l~g~~~~~al~~h~~v~~~~~~~~~~~~~~~~ 233 (337) . ..+ ..+....+... +.+..+...+.+.....+..+.+++++|++|.+|+++.... +.++..... . T Consensus 168 ~-----~~~---~~~~~~~~~~~-~~d~i~~a~~~Lde~dVP~~~R~~vv~P~~y~~LLk~~~~~--~~~~~~~~~---~ 233 (347) T protein:vir:94 168 D-----QAT---LQGDQVKLGQA-IIAQLTLARAKLTGNYVPSSDRVFYTTPDNYSAILAALMPN--AANYQALID---P 233 (347) T ss_pred c-----ccc---ccccccccHHH-HHHHHHHHHHHhhhcCCCCCCCEEEeChHHHHHHHHhhccc--ccccccccc---c Confidence 0 000 01111112222 22223333334333345666788899999999999542221 111111111 1 Q ss_pred ccccceeccceeEEEeccEEEecCcee-----eecCCeeEEEEecchhhheEE----eccccchhhcccc-CcceeeeEE Q lcl|NC_019400. 234 RLGQGQENANNRMFVHKNVTYIEDISN-----YIPDGEAYILPQGIDDMFQIH----YAPADDVREANTP-AQELYLWYK 303 (337) Q Consensus 234 ~~~~~~~~~~~~~~~~~~~~~~~~~~~-----~i~~~~~~~~p~g~~~~f~~~----~ap~d~~~~~n~~-~~~~y~k~~ 303 (337) ..+......|+..+...+......+.. ....+..+.++.++.+-|+.- .|-.-+-+.+++. 
..++=...| T Consensus 234 ~~G~V~~v~G~~V~~Sn~~p~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~d~~~~~~l~~~~~A~~tv~~~~~~~e~~ 313 (347) T protein:vir:94 234 STGSIRNVMGFEVIEVPHLTAGGAGDNRAEEGVAPTNQKHAFPDTASGDTRVALDNVVGLFNHRSAVGTVKLKDMALERA 313 (347) T ss_pred ccceeEEeeceEEEEcCccccccCcccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcccceeee Confidence 112222334444444444322111111 122333455555443333210 0101111222221 223344556 Q ss_pred EccCCCEEEEEEeecccccccCcceEEEEEEeeC Q lcl|NC_019400. 304 SSAYLREEKVESETSFLTVNTRPELVVRSTGTFA 337 (337) Q Consensus 304 ~~~~~~~~~l~~eS~PLpi~~rP~~l~~~t~~aa 337 (337) .+.+-.++.|.+--..=.-+.||++.+.++.++| T Consensus 314 ~~~~~~~~~i~~~~a~G~g~~rPe~a~~i~~~~a 347 (347) T protein:vir:94 314 RRANFQADQIIAKYAMGHGGLRPEACGALVFKKA 347 (347) T ss_pred echhhhhhhhhhhhhhcCcccccceeEEEEecCC Confidence 6666677777777777778899999999999999 No 27 >protein:vir:2201 Length: 345 # NCBI annotation: major capsid protein # Family: family:all:975 # MgeID: mge:49 # MgeName: T7 # Cross-refs: genbank:acc:NP_041998;swissprot:sw:p19726;genbank:gi:9627469;goa:P19726;uniprot:P19726;genbank:GeneID:1261026 Probab=63.34 E-value=0.32 Score=23.30 Aligned_cols=300 Identities=12% Similarity=0.022 Sum_probs=104.0 Q ss_pred CCCCCCCccCHHHHHHHHHhcCCCccchhhcccccCcccc-cceEEEEEEcCceeEeeeccCCCCcccc----cCCceeE Q lcl|NC_019400. 1 MAVVRTNDFQIVDLGATLEIVPRQYRLITNMDLFTAYHGV-TTVAQIERVDEVVTDFPARRRQGERNYV----GTEKAQL 75 (337) Q Consensus 1 ~d~~~~d~Fs~~~Lt~~i~~~p~~~~~l~~l~~F~~~~~~-t~~v~ie~~~~~~~l~p~v~~g~~~~~~----~~~~~~~ 75 (337) .++|- -.|+-.-++.--. .+.+. +++..+.++ ++++.|... |..++ ....+|.+-... +..+.. 
T Consensus 25 ~al~l-e~f~geV~~~f~~-----~s~~~--~~~~~r~i~~gks~~~~~i-G~~~~-~~~~~G~~l~~~~~~~~~~e~~- 93 (345) T protein:vir:22 25 LALFL-KVFGGEVLTAFAR-----TSVTT--SRHMVRSISSGKSAQFPVL-GRTQA-AYLAPGENLDDKRKDIKHTEKV- 93 (345) T ss_pred hHHHH-HHHhHHHHHHHHH-----Hhhhc--ccceeeeccccceEEEeee-cceEE-EeeecCCCCCCCCCCcccceEE- Confidence 55555 2455444333332 23332 345555554 778888876 44333 333444432111 111111 Q ss_pred EEEeccccccCccccHHHHhccccCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhc-C-CEEecCCCceEeech Q lcl|NC_019400. 76 KNFNIPFFPLDRQITAADVQNFRKYFTADAPKSVEDVVARVVRRIRISHEQLKEKAMLQAIM-G-KSWAPQDPTAQYNYF 153 (337) Q Consensus 76 ~~f~~p~i~~~~~v~a~dlq~~R~~G~~~~~~~~~~~v~~~l~~~~~~i~~t~E~m~a~AL~-g-~i~~~~~g~v~~d~~ 153 (337) ..+.-..+..--.=.-+|+|. -. +-...........|+ +.++. ++++.|. + ....+..+.+- -++ T Consensus 94 ltID~~~y~~~~VddiD~~q~---~~--D~r~~~s~~~G~aLA---~~~D~----~i~~~l~k~a~~~~~~~~~~~-~~~ 160 (345) T protein:vir:22 94 ITIDGLLTADVLIYDIEDAMN---HY--DVRSEYTSQLGESLA---MAADG----AVLAEIAGLCNVESKYNENIE-GLG 160 (345) T ss_pred EEecchhhhhhhHhhHHHHhc---Cc--hhHHHHHHHHHHHHH---HHHHH----HHHHHHHHhhccccccccccc-ccc Confidence 111111111111101122331 10 111111222222222 22222 2222221 1 11111011000 000 Q ss_pred hhcCCCcc-e-EEEecCCCC---cchHHHHHHHHHHHHHHhhccccccccEEEEEChHHHHHHhcCHHHHHHHHhhcccc Q lcl|NC_019400. 154 TEWGVTQH-T-ANIDFTDVA---TDPTDIIEADARAYIIDNAGDNGNNYGIVVLASRKWFSALIAHPLVMNAYQYYSSTQ 228 (337) Q Consensus 154 ~~fG~~~~-~-~~~~l~~~~---~d~~~~~~~~~~~~~~~~~~~~~~~~~v~~l~g~~~~~al~~h~~v~~~~~~~~~~~ 228 (337) . |.... + ..-+++... .++...+.+ ..+.+-....+..+.+++++|++|.+|++|+.+-... +.... 
T Consensus 161 ~--~~~~~~~~~g~~~t~~~~~~~~~~~ai~~----a~~~Lde~~VP~~~R~~vv~P~~y~~Ll~~~~~~~~~--~~~~~ 232 (345) T protein:vir:22 161 T--ATVIETTQNKAALTDQVALGKEIIAALTK----ARAALTKNYVPAADRVFYCDPDSYSAILAALMPNAAN--YAALI 232 (345) T ss_pred c--ccccccccccccccccccCHHHHHHHHHH----HHHHhhhcCCCccCCEEEeChHHHHHHhccccccccc--ccccc Confidence 0 00000 0 000111111 122333332 2333333344566778999999999999998763311 11111 Q ss_pred cccccccccceeccceeEEEeccEEEecCce-eeecCCeeEEEEecchhhheEEeccccc-------hhhccc-cCccee Q lcl|NC_019400. 229 EPLRRRLGQGQENANNRMFVHKNVTYIEDIS-NYIPDGEAYILPQGIDDMFQIHYAPADD-------VREANT-PAQELY 299 (337) Q Consensus 229 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~i~~~~~~~~p~g~~~~f~~~~ap~d~-------~~~~n~-~~~~~y 299 (337) .... +......|+..++..+......+. ..-+.+..+.+|.++.. + .++...+. -+.+++ ...++= T Consensus 233 ~~~~---G~V~~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~g~-~-~~~~~~~~~~~l~~h~~A~~~v~~~~~~ 307 (345) T protein:vir:22 233 DPEK---GSIRNVMGFEVVEVPHLTAGGAGTAREGTTGQKHVFPANKGE-G-NVKVAKDNVIGLFMHRSAVGTVKLRDLA 307 (345) T ss_pred cccc---ceEEEEeceEEEecccccccccCccccCcccccccccccccc-e-eeeeccCceEEEEEehhheeeeeeecce Confidence 1111 111223344444333332111111 11122233444543211 1 11111110 001110 000111 Q ss_pred eeEEEccCCCEEEEEEeecccccccCcceEEEEEEeeC Q lcl|NC_019400. 
300 LWYKSSAYLREEKVESETSFLTVNTRPELVVRSTGTFA 337 (337) Q Consensus 300 ~k~~~~~~~~~~~l~~eS~PLpi~~rP~~l~~~t~~aa 337 (337) ...|.+++..+..|.+--..=.-..||++++.++.+-- T Consensus 308 ~e~~r~~~~~~d~I~~~~a~G~~vlRPeaa~~i~~~~~ 345 (345) T protein:vir:22 308 LERARRANFQADQIIAKYAMGHGGLRPEAAGAVVFKVE 345 (345) T ss_pred eeeeechhHHHHHHHHHHhcCCcccccceeEEEEEeeC Confidence 11122222223333333333346789999888888777 No 28 >protein:vir:7019 Length: 401 # NCBI annotation: major capsid protein # Family: family:all:2806 # MgeID: mge:141 # MgeName: SP6 # Cross-refs: genbank:acc:NP_853592;genbank:gi:31711674;genbank:GeneID:1481800 Probab=59.71 E-value=0.39 Score=22.84 Aligned_cols=298 Identities=9% Similarity=0.006 Sum_probs=117.8 Q ss_pred CCCCCCCccCHHHHHHHHHhcCCCccchhhcccccCcccc-cceEEEEEEcCceeEeeeccCCCCcc--cccCCceeEEE Q lcl|NC_019400. 1 MAVVRTNDFQIVDLGATLEIVPRQYRLITNMDLFTAYHGV-TTVAQIERVDEVVTDFPARRRQGERN--YVGTEKAQLKN 77 (337) Q Consensus 1 ~d~~~~d~Fs~~~Lt~~i~~~p~~~~~l~~l~~F~~~~~~-t~~v~ie~~~~~~~l~p~v~~g~~~~--~~~~~~~~~~~ 77 (337) .++|-. .|+-.-+++--+ ...+ ++++..+.++ +++++|.+. +..-+....+|.+.. ....+ -.+.. T Consensus 19 ~al~Le-~f~GeV~taF~~-----~si~--~~~~~vRti~~gkS~qf~~~--G~s~~~~~~pG~~ld~~~~~~d-K~~It 87 (401) T protein:vir:70 19 DSLLIE-KFNGKVNEQYLK-----GENI--MSYFDVQTVTGTNTVSNKYL--GETELQVLAPGQSPAATSTQAD-KNQLV 87 (401) T ss_pred hHhHHh-HhcchHHHHHHH-----Hhhh--cccceeeeecccceEEEEEe--eeeEeeeecCCCCcCCCCcccc-cEEEE Confidence 333331 233233332222 1222 3456666553 777888887 223345555555432 11111 11122 Q ss_pred EeccccccCccccHHHHhccccCCC-CCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhc-CCEEe--cCCCceEeech Q lcl|NC_019400. 78 FNIPFFPLDRQITAADVQNFRKYFT-ADAPKSVEDVVARVVRRIRISHEQLKEKAMLQAIM-GKSWA--PQDPTAQYNYF 153 (337) Q Consensus 78 f~~p~i~~~~~v~a~dlq~~R~~G~-~~~~~~~~~~v~~~l~~~~~~i~~t~E~m~a~AL~-g~i~~--~~~g~v~~d~~ 153 (337) +....+.+.-.=.-+|+|+ .|.. .+ .+...+.+.|+++.++ +++|.+. .++.. 
+.+..+ T Consensus 88 ID~lL~a~~~V~dlDe~q~--~yD~vRs---e~s~e~G~ALA~~~Dq-------~iiq~i~~aa~ana~~~~~~p----- 150 (401) T protein:vir:70 88 IDATVIARNTVAHLHDVQG--DIDSLKP---KLATNQAKQLKRMEDE-------MLIQQMMLGGIANTQAKRTNP----- 150 (401) T ss_pred eCceeehhhhhhhHHHHHh--cccccch---HHHHHHHHHHHHHHHH-------HHHHHHHHhccccccccccCC----- Confidence 3333333333333344443 2221 11 1223344444443333 3344442 22211 111110 Q ss_pred hhcCCC-cceEEEecCC--CCcc---hHHHHHHHHHHHHHHhhccccccccEEEEEChHHHHHHhcCHHHHHHHHhhccc Q lcl|NC_019400. 154 TEWGVT-QHTANIDFTD--VATD---PTDIIEADARAYIIDNAGDNGNNYGIVVLASRKWFSALIAHPLVMNAYQYYSST 227 (337) Q Consensus 154 ~~fG~~-~~~~~~~l~~--~~~d---~~~~~~~~~~~~~~~~~~~~~~~~~v~~l~g~~~~~al~~h~~v~~~~~~~~~~ 227 (337) +|.. ...+++.-.. +..| +...+.+...++.++ ..+..++++|+.+.+|..|+.|+.+...--+...+ T Consensus 151 --~~~~~G~~i~v~~~~~~~~~~~~~l~~ai~dA~~~LdEk----dVP~~r~vvl~pp~~Ys~Ll~~d~L~nrd~~~s~~ 224 (401) T protein:vir:70 151 --RVKGHGFSINVEVAEGEALVNPQYVMAAVEFALEQQLEQ----EVDISDVAILMPWRYFNVLRDADRIVDKTYTISQS 224 (401) T ss_pred --CcCCCceEEeccccccccccCHHHHHHHHHHHHHHHHhc----CCCccceEEEcCHHHHHHHHhcCcccchhhccccC Confidence 0110 0112222111 1233 333333333332222 22355789999999999999998554311011111 Q ss_pred ccccccccccceeccceeEEEeccEEEecCc---eeeecCCeeEEEEecchhhheEEeccccchhhccc-cCcceeeeEE Q lcl|NC_019400. 228 QEPLRRRLGQGQENANNRMFVHKNVTYIEDI---SNYIPDGEAYILPQGIDDMFQIHYAPADDVREANT-PAQELYLWYK 303 (337) Q Consensus 228 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~---~~~i~~~~~~~~p~g~~~~f~~~~ap~d~~~~~n~-~~~~~y~k~~ 303 (337) ..+.++. .....|+..++..+.....+. ...-+++.+.-+.. 
.+-|..-.+..=+-+.+++ ...++=...| T Consensus 225 g~~~~G~---v~~vaGv~Vv~SnnlP~~a~~it~~~ls~a~~G~~y~~--~~d~s~~~~v~f~~~Av~tvk~~~lt~~~~ 299 (401) T protein:vir:70 225 GATIQGF---TLSSYNCPVIPSNRFPKYSQGQTHHLLSNEDNGYRYDP--LPAMNGAIAVLFTADALLVGRSIDVTGDIF 299 (401) T ss_pred Cccccce---EEEEeceEEEeeccccccccccccccccccCCCccCCC--CccccceeEEEEehhheEEEEeeccccchh Confidence 1122221 122334333333333211100 11111111111110 0011111110000012222 2345566668 Q ss_pred EccCCCEEEEEEeecccccccCcceEEEEEEeeC Q lcl|NC_019400. 304 SSAYLREEKVESETSFLTVNTRPELVVRSTGTFA 337 (337) Q Consensus 304 ~~~~~~~~~l~~eS~PLpi~~rP~~l~~~t~~aa 337 (337) .+.+...+.|.+--..=..+.||++..-+|.+-- T Consensus 300 ~d~r~~~~~id~~~a~g~g~~RPeaa~vv~~k~~ 333 (401) T protein:vir:70 300 YEKKEKTYYIDTFMAEGAIPDRWEAVSVVTTKRN 333 (401) T ss_pred hhhhhhHHHHHHHHHhCCcccchhheEEEeecCc Confidence 8888888888877777788999997744432221 No 29 >protein:vir:105645 Length: 400 # NCBI annotation: putative major capsid protein # Family: family:all:2806 # MgeID: mge:1674 # MgeName: K1E # Cross-refs: genbank:acc:YP_425009;genbank:gi:83571757;uniprot:Q2WC43;genbank:GeneID:3837286 Probab=56.72 E-value=0.45 Score=22.48 Aligned_cols=299 Identities=10% Similarity=0.042 Sum_probs=121.1 Q ss_pred CCCCCCCccCHHHHHHHHHhcCCCccchhhcccccCcccc-cceEEEEEEcCceeEeeeccCCCCcc--cccCCceeEEE Q lcl|NC_019400. 1 MAVVRTNDFQIVDLGATLEIVPRQYRLITNMDLFTAYHGV-TTVAQIERVDEVVTDFPARRRQGERN--YVGTEKAQLKN 77 (337) Q Consensus 1 ~d~~~~d~Fs~~~Lt~~i~~~p~~~~~l~~l~~F~~~~~~-t~~v~ie~~~~~~~l~p~v~~g~~~~--~~~~~~~~~~~ 77 (337) .++|-. .|+-.-+++--+ ...+ ++++..+.++ +++++|.+. | ..-+....+|.+.. ....+ -.+.. 
T Consensus 19 ~aL~Le-~f~GeV~taF~~-----~si~--~~~~~vRtI~~gkS~qf~~l-G-~s~a~y~~pG~~ldg~~~~~d-k~~It 87 (400) T protein:vir:10 19 DSLLIE-KFNGKVNEQYLK-----GENI--MSYFDVQTVTGTNTVSNKYL-G-ETELQVLAPGQSPAATSTQAD-KNQLV 87 (400) T ss_pred hhhHHh-HhcchHHHHHHH-----Hhhh--cccceeeeecccceEEEEEe-e-eeEEeeecCCCCcCCCCcccC-cEEEE Confidence 333321 233333333222 1222 3456666554 777888887 2 22244545555421 12122 12223 Q ss_pred EeccccccCccccHHHHhccccCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHh-cCCEEec--CCCceEeechh Q lcl|NC_019400. 78 FNIPFFPLDRQITAADVQNFRKYFTADAPKSVEDVVARVVRRIRISHEQLKEKAMLQAI-MGKSWAP--QDPTAQYNYFT 154 (337) Q Consensus 78 f~~p~i~~~~~v~a~dlq~~R~~G~~~~~~~~~~~v~~~l~~~~~~i~~t~E~m~a~AL-~g~i~~~--~~g~v~~d~~~ 154 (337) +....+.+.-.=.-+|+|+ .|-+ .-..+...+...|+++.++ +++|.+ .+++... .++. -+. . T Consensus 88 IDtLL~a~~~V~dlDd~q~--~yD~--vRse~s~e~G~ALA~~~Dq-------~iiq~i~~a~~a~t~~~~~~--~~g-~ 153 (400) T protein:vir:10 88 IDATVIARNTVAHLHDVQG--DIDS--LKPKLATNQAKQLKKMEDE-------MLIQQMLLGGIANTQAKRTN--PRV-K 153 (400) T ss_pred eCceeeecchhhhHHHHhh--cccc--ccHHHHHHHHHHHHHHHHH-------HHHHHHHHhccccccccccc--CCc-c Confidence 3333333333333344443 2211 0011233444445444433 334433 2333211 1110 000 0 Q ss_pred hcCCCcceEEEecCC--CCcchHH---HHHHHHHHHHHHhhccccccccEEEEEChHHHHHHhcCHHHHHH-HHhhcccc Q lcl|NC_019400. 155 EWGVTQHTANIDFTD--VATDPTD---IIEADARAYIIDNAGDNGNNYGIVVLASRKWFSALIAHPLVMNA-YQYYSSTQ 228 (337) Q Consensus 155 ~fG~~~~~~~~~l~~--~~~d~~~---~~~~~~~~~~~~~~~~~~~~~~v~~l~g~~~~~al~~h~~v~~~-~~~~~~~~ 228 (337) +-|. ++.+.-.+ +..|+.. .+.+...++.++ ..+..++++++.+.+|+.|..|+.+... |.+ ..+. 
T Consensus 154 ~~g~---s~~v~~~~~~~~~~~~~l~~A~~~A~~~LdEk----dVP~~d~vvl~pp~~Ys~Ll~~dkLvnrdf~~-s~~g 225 (400) T protein:vir:10 154 GHGF---SVNVEVNEGEALVNPQYVMAAVEFALEQQLEQ----EVDISDVAILMPWRYFNVLRDADRIVDKSYTI-SQSG 225 (400) T ss_pred cccc---ceeecccccccccCHHHHHHHHHHHHHHHHhc----CCCccceEEEcCHHHHHHHHhCCcccchhccc-cCCC Confidence 1111 12221111 1234322 233333332222 2235578999999999999998843211 111 1111 Q ss_pred cccccccccceeccceeEEEeccEEEe-cC--ceeeecCCeeEEEEecchhhheEEeccccchhhccc-cCcceeeeEEE Q lcl|NC_019400. 229 EPLRRRLGQGQENANNRMFVHKNVTYI-ED--ISNYIPDGEAYILPQGIDDMFQIHYAPADDVREANT-PAQELYLWYKS 304 (337) Q Consensus 229 ~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~--~~~~i~~~~~~~~p~g~~~~f~~~~ap~d~~~~~n~-~~~~~y~k~~~ 304 (337) .+.++ ......|+..++..+.... .. ....-+++.+.-+.. .+-|..-.+..=+-+.+++ ...++=...|. T Consensus 226 ~~~~g---~v~~v~Gv~Iv~Sn~lP~~a~~~~~~~lS~a~~G~~y~~--t~d~s~~~av~F~~sAv~tvk~~~lt~~~~~ 300 (400) T protein:vir:10 226 ATIQG---FVLSSYNCPVIPSNRFPKYSQGQKHHLLSNEDNGYRYDP--IAEMNGAIAVLFTADALLVGRSIDVIGDIFY 300 (400) T ss_pred ccccc---eEEEEeceEEEeeCcCCcccCcccccccccCCCCccCCc--cccccceeEEEEehhheEEEEeecccccccc Confidence 12111 1122334444333333111 00 011111221221110 0011111110000012222 33456677788 Q ss_pred ccCCCEEEEEEeecccccccCcceEEEEEEeeC Q lcl|NC_019400. 
305 SAYLREEKVESETSFLTVNTRPELVVRSTGTFA 337 (337) Q Consensus 305 ~~~~~~~~l~~eS~PLpi~~rP~~l~~~t~~aa 337 (337) +++...+.|.+-.+.=..+.||++..-+|.+=- T Consensus 301 d~r~~~~~id~~~a~G~g~~RPeaa~vv~~~~~ 333 (400) T protein:vir:10 301 EKKEKTYYIDTFMSEGAIPDRWEAVSVVTTKRQ 333 (400) T ss_pred chhhHHHHHHHHHHhCCcccchhheEEEEecCC Confidence 888888888888888888999998776665422 No 30 >protein:vir:100057 Length: 375 # NCBI annotation: T7-like capsid protein # Family: family:all:975 # MgeID: mge:1604 # MgeName: P-SSP7 # Cross-refs: genbank:acc:YP_214206;genbank:gi:61806429;genbank:GeneID:3294737 Probab=49.38 E-value=0.64 Score=21.64 Aligned_cols=304 Identities=12% Similarity=-0.010 Sum_probs=107.8 Q ss_pred CCCCCCCccCHHHHHHHHHhcCCCccchhhcccccCcccc-cceEEEEEEcCceeEeeeccCCCC--cccccCCce--eE Q lcl|NC_019400. 1 MAVVRTNDFQIVDLGATLEIVPRQYRLITNMDLFTAYHGV-TTVAQIERVDEVVTDFPARRRQGE--RNYVGTEKA--QL 75 (337) Q Consensus 1 ~d~~~~d~Fs~~~Lt~~i~~~p~~~~~l~~l~~F~~~~~~-t~~v~ie~~~~~~~l~p~v~~g~~--~~~~~~~~~--~~ 75 (337) .++|- -.|+-.- ..++.+ .+.+. +++..+.++ ++++.|... |..++-- ..||.+ ++.....+- .+ T Consensus 27 ~al~l-e~f~geV-~~~f~~----~si~~--~~~~~rti~~Gksv~f~~i-G~~t~~~-~t~G~~i~~~~~~d~~~te~~ 96 (375) T protein:vir:10 27 YALYL-KLFSGEM-FKGFQH----ETIAR--DLVTKRTLKNGKSLQFIYT-GRMTSSF-HTPGTPILGNADKAPPVAEKT 96 (375) T ss_pred HHHHH-HHHhHHH-HHHHHH----HHhhh--ccccccccccCceEEEEee-eeeEEee-ecCCcCcCCccccCCCCCceE Confidence 33333 1344222 233332 23332 345555554 778888887 4444333 333433 222222211 11 Q ss_pred EEEeccccccCccccHHHHhccccCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhc-CCEEecC-CCceEeech Q lcl|NC_019400. 76 KNFNIPFFPLDRQITAADVQNFRKYFTADAPKSVEDVVARVVRRIRISHEQLKEKAMLQAIM-GKSWAPQ-DPTAQYNYF 153 (337) Q Consensus 76 ~~f~~p~i~~~~~v~a~dlq~~R~~G~~~~~~~~~~~v~~~l~~~~~~i~~t~E~m~a~AL~-g~i~~~~-~g~v~~d~~ 153 (337) ..+.-..+...-.=.-+++|. --.++.+..+++-..+.+..--++++.|. +....+. ++. +.. 
T Consensus 97 l~ID~~~y~~~~VdDiD~aqa------------~~Dlr~e~s~~~G~aLA~~~D~~i~~~l~kaa~~~~p~~~~---~~~ 161 (375) T protein:vir:10 97 IVMDDLLISSAFVYDLDETLA------------HYELRGEISKKIGYALAEKYDRLIFRSITRGARSASPVSAT---NFV 161 (375) T ss_pred EEecchhhhhhhHhhHHHHhc------------CchhHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhccccccc---ccc Confidence 222211111100001122221 11122222222222233332223333222 2111110 000 000 Q ss_pred hhcCCCcce-EEEecCCCCcc---hHHHHHHHHHHHHHHhhccccccccEEEEEChHHHHHHhcCHHHHHHHHhhccccc Q lcl|NC_019400. 154 TEWGVTQHT-ANIDFTDVATD---PTDIIEADARAYIIDNAGDNGNNYGIVVLASRKWFSALIAHPLVMNAYQYYSSTQE 229 (337) Q Consensus 154 ~~fG~~~~~-~~~~l~~~~~d---~~~~~~~~~~~~~~~~~~~~~~~~~v~~l~g~~~~~al~~h~~v~~~~~~~~~~~~ 229 (337) ..|.++.. ....-+++..+ +...+.+.. +.+--...+..+.+++++|++|.+|+.|.+..+.. +...+.+ T Consensus 162 -~~Gg~~i~~~sg~~~~~~~ta~~~~~ai~~a~----~~Lde~~VP~~~R~~vv~P~~y~~Ll~~~d~~~~~-n~d~~~~ 235 (375) T protein:vir:10 162 -EPGGTQIRVGSGTNESDAFTASALVNAFYDAA----AAMDEKGVSSQGRCAVLNPRQYYALIQDIGSNGLV-NRDVQGS 235 (375) T ss_pred -ccCcceeeeccccccccccCHHHHHHHHHHHH----HHHhhcCCCCCCCEEEeChHHHHHHHhcCCcccee-eeccccc Confidence 11222211 01111111122 233333333 33333334556678889999999999875433322 1111111 Q ss_pred ccccccccceeccceeEEEeccEEEecCceeeec--------CCee-EEEEec-----chhhheEEecccc--------- Q lcl|NC_019400. 230 PLRRRLGQGQENANNRMFVHKNVTYIEDISNYIP--------DGEA-YILPQG-----IDDMFQIHYAPAD--------- 286 (337) Q Consensus 230 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~--------~~~~-~~~p~g-----~~~~f~~~~ap~d--------- 286 (337) ..... +......|+..+...+.-........+. .+.+ +..|.. 
+.+.+..|++-.+ T Consensus 236 ~~~~~-g~v~~i~Gv~V~~Sn~lP~~~~~~~~~g~~~~~~a~~~~~~~~~~~~~~~~~~~g~~~~y~~d~~~~~~~~~~~ 314 (375) T protein:vir:10 236 ALQSG-NGVIEIAGIHIYKSMNIPFLGKYGVKYGGTTGETSPGNLGSHIGPTPENANATGGVNNDYGTNAELGAKSCGLI 314 (375) T ss_pred ceecc-ceEEEEeceEEEEeccccccccccccccccccccchhhhhccccccCCcceeeccccccccccccccCceEEEE Confidence 10000 0111222333333222211111001110 0000 111111 1122222222111 Q ss_pred -chhhccc---cC--cceeeeEEEccCCCEEEEEEeecccccccCcceEEEEEEeeC Q lcl|NC_019400. 287 -DVREANT---PA--QELYLWYKSSAYLREEKVESETSFLTVNTRPELVVRSTGTFA 337 (337) Q Consensus 287 -~~~~~n~---~~--~~~y~k~~~~~~~~~~~l~~eS~PLpi~~rP~~l~~~t~~aa 337 (337) +-+.+++ .+ .+.+ ...-+..-.++.|.+-.+.=..+.||++.+.++..++ T Consensus 315 ~~~~A~g~v~~~~~~~~~~-~~~~~~~~q~~~i~~~~a~G~~~lrp~~av~l~~~~~ 370 (375) T protein:vir:10 315 FQKEAAGVVEAIGPQVQVT-NGDVSVIYQGDVILGRMAMGADYLNPAAAVELYIGAT 370 (375) T ss_pred Echhheeeeeeeccccccc-cchhhheeeeeeeeeeeeeccCccCceeEEEEecCcC Confidence 1122221 00 1110 0111334457777777777788999999999998866 No 31 >protein:vir:8885 Length: 347 # NCBI annotation: major capsid protein A # Family: family:all:975 # MgeID: mge:161 # MgeName: gh-1 # Cross-refs: genbank:acc:NP_813774;genbank:gi:29366729;genbank:GeneID:1258837 Probab=45.84 E-value=0.76 Score=21.25 Aligned_cols=306 Identities=11% Similarity=0.005 Sum_probs=108.6 Q ss_pred CCCCCCCccCHHHHHHHHHhcCCCccchhhcccccCccc-ccceEEEEEEcCceeEeeeccCCCCc----ccccCCceeE Q lcl|NC_019400. 1 MAVVRTNDFQIVDLGATLEIVPRQYRLITNMDLFTAYHG-VTTVAQIERVDEVVTDFPARRRQGER----NYVGTEKAQL 75 (337) Q Consensus 1 ~d~~~~d~Fs~~~Lt~~i~~~p~~~~~l~~l~~F~~~~~-~t~~v~ie~~~~~~~l~p~v~~g~~~----~~~~~~~~~~ 75 (337) .++|- -.|+-.-++ ++.+ .+.+. ++...+.+ ..+++.|....... + ....+|.+. 
+.++..+..+ T Consensus 24 ~al~i-e~~~geV~~-~f~~----~s~~~--~~~~~r~i~~G~sv~~~~iG~~~-~-~~~~~g~~l~~~~~~~~~~~~~i 93 (347) T protein:vir:88 24 LALFL-KVFGGEVLT-AFVR----RSVTM--DKHMVRTIQNGKSASFPVMGRTK-G-YYLAPGENLDDKRKDIKHSEKVI 93 (347) T ss_pred HHHHH-HHHHHHHHH-HHHH----Hhhhh--hccccccccCcceEEEeeeccee-e-eeeccccCCCCCCCCCccceEEE Confidence 55555 345544443 3331 23443 33554443 36677777654332 1 222333321 1112222222 Q ss_pred EEEeccccccCccccHHHHhccccCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHh-c-CCEEecCCCceEeech Q lcl|NC_019400. 76 KNFNIPFFPLDRQITAADVQNFRKYFTADAPKSVEDVVARVVRRIRISHEQLKEKAMLQAI-M-GKSWAPQDPTAQYNYF 153 (337) Q Consensus 76 ~~f~~p~i~~~~~v~a~dlq~~R~~G~~~~~~~~~~~v~~~l~~~~~~i~~t~E~m~a~AL-~-g~i~~~~~g~v~~d~~ 153 (337) ..=+.-++... .=.-+++|. .+ +............|.+..+........-++..- . +.-..+.++....+ T Consensus 94 ~ID~~~y~~~~-Vdd~D~~q~--~~---D~r~~~~~~~g~aLA~~~D~~i~~~l~~~a~~~~~~~~~~~g~~~~~~~~-- 165 (347) T protein:vir:88 94 QIDGLLTSDVL-IYDIEDAMN--HY---DVRAEYSAQLGEALAIAADGAVLAEMAKLCNLPAASNENIAGLGQAVVLN-- 165 (347) T ss_pred EEechhhhhhh-hhhHHHHhh--cC---CchHHHHHHHHHHHHHHHHHHHHHHHHHhhccccccccccCCcccccccc-- Confidence 11111111100 001122331 11 111112222223332222222211111111100 0 00011000110110 Q ss_pred hhcCCCcceEEEecCCCCcchHHHHHHHHHHHHHHhhccccccccEEEEEChHHHHHHhcCHHHHHHHHhhccccccccc Q lcl|NC_019400. 154 TEWGVTQHTANIDFTDVATDPTDIIEADARAYIIDNAGDNGNNYGIVVLASRKWFSALIAHPLVMNAYQYYSSTQEPLRR 233 (337) Q Consensus 154 ~~fG~~~~~~~~~l~~~~~d~~~~~~~~~~~~~~~~~~~~~~~~~v~~l~g~~~~~al~~h~~v~~~~~~~~~~~~~~~~ 233 (337) .+. .. +..++..+..... +.++...+.+.-...+..+.+++++|++|..|++++.+-... +........ 
T Consensus 166 --~~~-~~----~~~~~~~~~~~~~-~~i~~a~~~Lde~~VP~~gR~~vv~P~~y~~Ll~~~~~~~~~--~~~~~~~~~- 234 (347) T protein:vir:88 166 --IGA-AA----DLVDVEARGKAIL-KGLTLARARLTKNYVPAGDRRFYCAPEDYSAILSALMPNAAN--YAALIDPET- 234 (347) T ss_pred --ccc-cc----cccchhhhHHHHH-HHHHHHHHHHhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhh--hccccchhc- Confidence 010 00 0011111111111 222222333333344666788999999999999888754322 122112111 Q ss_pred ccccceeccceeEEEeccEEEecCc-----eeeecCCeeEEEEecchhhheEEec----cccchhhcccc-CcceeeeEE Q lcl|NC_019400. 234 RLGQGQENANNRMFVHKNVTYIEDI-----SNYIPDGEAYILPQGIDDMFQIHYA----PADDVREANTP-AQELYLWYK 303 (337) Q Consensus 234 ~~~~~~~~~~~~~~~~~~~~~~~~~-----~~~i~~~~~~~~p~g~~~~f~~~~a----p~d~~~~~n~~-~~~~y~k~~ 303 (337) +......|+..++..+......+ ..+-.....+.++.++.+-|+.-+. -.-+...+++. ..+.-...+ T Consensus 235 --G~vg~i~G~~V~~s~nlp~~~~~~~~~~~~~~~t~~~~~~~~~~~~~~~~d~~~~~~l~~~~~a~g~v~~~d~~~e~~ 312 (347) T protein:vir:88 235 --GNIRNVMGFEVIEVPHLTVGGAGDNNPADGVAPTNQKHIFPATATGDDRVAQNNVVGLFNHRSAVGTVKLKDMALERA 312 (347) T ss_pred --ceeeeeccceEEEeecccccccccccccccccccccccccccccccccccccCcEEEEEechhhhhheecccceeeee Confidence 11122344444444333211111 0122223334444333222111000 00111222221 111122233 Q ss_pred EccCCCEEEEEEeecccccccCcceEEEEEEeeC Q lcl|NC_019400. 
304 SSAYLREEKVESETSFLTVNTRPELVVRSTGTFA 337 (337) Q Consensus 304 ~~~~~~~~~l~~eS~PLpi~~rP~~l~~~t~~aa 337 (337) .+++..+..|.+--..=.-..||++++.++.++| T Consensus 313 r~~~~~~d~i~~~~~~G~~~~rPe~a~~~~~~~a 346 (347) T protein:vir:88 313 RRPEFQADQIIGKYAMGHGGLRPEAAGALVFTPA 346 (347) T ss_pred echhhHHHHhhhhhhhcCceeccceEEEEEeCCC Confidence 3344344555555555566799999999998888 No 32 >protein:vir:80213 Length: 334 # NCBI annotation: capsid protein # Family: family:all:2806 # MgeID: mge:1879 # MgeName: LKA1 # Cross-refs: genbank:acc:YP_001522884;genbank:gi:158345177;genbank:GeneID:5687476 Probab=42.55 E-value=0.88 Score=20.89 Aligned_cols=296 Identities=14% Similarity=0.071 Sum_probs=104.4 Q ss_pred CCCCCCCccCHHHHHHHHHhcCCCccchhhcccccCcccc-cceEEEEEEcCceeEeeeccCCCCcc--cccCCceeEEE Q lcl|NC_019400. 1 MAVVRTNDFQIVDLGATLEIVPRQYRLITNMDLFTAYHGV-TTVAQIERVDEVVTDFPARRRQGERN--YVGTEKAQLKN 77 (337) Q Consensus 1 ~d~~~~d~Fs~~~Lt~~i~~~p~~~~~l~~l~~F~~~~~~-t~~v~ie~~~~~~~l~p~v~~g~~~~--~~~~~~~~~~~ 77 (337) .++|- -.|+-.-+++--+. ..+. ++...+.++ ++++.|... |..++ ....+|.+-. .++.++..+ . T Consensus 21 ~~l~l-e~~~geV~~af~~~-----s~~~--~~~~~r~i~~G~s~~~~~i-G~~~~-~~~~~g~~l~~~~~~~~~~~l-~ 89 (334) T protein:vir:80 21 VSLHI-EEHLGLVDASFMYS-----SKFA--SWMNVRSLRGTNQLRVDRV-GASTI-AGRKAGEELVVQKNVSDKLNL-T 89 (334) T ss_pred heehh-hhhhhHHHHHHHHh-----hhhh--ccceeeeccccceEEEeee-cceee-eeecCCCCCCCCCcccCceEE-E Confidence 56665 35665555544442 3332 345555554 778888876 33333 4545555432 222222211 1 Q ss_pred EeccccccCccccHHHHhccccCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCCEEecCCCceEeechhhcC Q lcl|NC_019400. 78 FNIPFFPLDRQITAADVQNFRKYFTADAPKSVEDVVARVVRRIRISHEQLKEKAMLQAIMGKSWAPQDPTAQYNYFTEWG 157 (337) Q Consensus 78 f~~p~i~~~~~v~a~dlq~~R~~G~~~~~~~~~~~v~~~l~~~~~~i~~t~E~m~a~AL~g~i~~~~~g~v~~d~~~~fG 157 (337) +.-..+...-.=.-+|+|. -. +-...+.......|++.-+ . ....+.+.+....+.... .+-+. 
-| T Consensus 90 ID~~l~~~~~VddiD~~q~---~~--D~rse~~~~~G~aLA~~~D---~---~~~~~l~kaa~~~~~~~~--~~~~~-~G 155 (334) T protein:vir:80 90 VDTVLYARHFFDKFDEWTS---NL--DVRKETAREDGIALARQYD---Q---ACIIQLQKCGDFLAPAHL--KPAFH-DG 155 (334) T ss_pred EeeeeehhhhHhhHHHHhc---Cc--chHHHHHHHHHHHHHHHHH---H---HHHHHHHHhhhhcccccc--ccccc-CC Confidence 2222222211112233332 11 1111222233333333222 2 122222222221110000 00000 01 Q ss_pred CC-cceEEEecCCCCcch---HHHHHHHHHHHHHHhhccccccccEEEEEChHHHHHHhcCHHHHHH-HHhhcccccccc Q lcl|NC_019400. 158 VT-QHTANIDFTDVATDP---TDIIEADARAYIIDNAGDNGNNYGIVVLASRKWFSALIAHPLVMNA-YQYYSSTQEPLR 232 (337) Q Consensus 158 ~~-~~~~~~~l~~~~~d~---~~~~~~~~~~~~~~~~~~~~~~~~v~~l~g~~~~~al~~h~~v~~~-~~~~~~~~~~~~ 232 (337) .. ....+-...+..+++ ...+.+....+.+.... -....+.+++++|++|.+|+.|+.+... |.+........+ T Consensus 156 ~~~~~~~~g~~~~~~~~~~~l~~a~~~a~~~L~e~dvp-~~~~~~R~~vv~P~~y~~Ll~~~r~~n~d~~~s~~~~~~~~ 234 (334) T protein:vir:80 156 ILLPSTISGLAADAAADADVLVAAHRQGVEAMVFRDLG-DQLMSEGVTLLDPVIFSFLLEHDRLMNVEFGAKEGGNSFVG 234 (334) T ss_pred cceeecccccccchhhhHHHHHHHHHHHHHHHHhcCCC-CCcCCceEEEeChHHHHHHhcccccccceeccccccccccc Confidence 00 000000001112222 12222222222222111 1124677899999999999999876443 111111111111 Q ss_pred cccccceeccceeEEEeccEEEecCceeeecCCeeEEEEecc-----hhhheE---EeccccchhhccccCcceeeeEEE Q lcl|NC_019400. 233 RRLGQGQENANNRMFVHKNVTYIEDISNYIPDGEAYILPQGI-----DDMFQI---HYAPADDVREANTPAQELYLWYKS 304 (337) Q Consensus 233 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~~~p~g~-----~~~f~~---~~ap~d~~~~~n~~~~~~y~k~~~ 304 (337) +......|+..+...+. |...+.--+.|. .+-|.. .+.+.+-+-.+ ...+.=...|. T Consensus 235 ---g~i~~v~G~~V~~Sn~~----------P~~~~t~~~~g~~~~~~agd~t~~~~~~~~~~Al~t~--~~~~~~~e~~~ 299 (334) T protein:vir:80 235 ---GRIAMLNGVRVVETPRF----------PQSAITANALGADFNVTDAEVRRKMITFIPSMALISA--QVHPVSAQFWE 299 (334) T ss_pred ---eeEEEEeceEEEeecCC----------CCccccccccccccccccccccceEEEEEeCceEEEE--EEeecceeeee Confidence 11112233333332221 111000000000 000000 11111111111 11112223344 Q ss_pred ccCCCEEEEEEeecccccccCcceEEEEEEeeC Q lcl|NC_019400. 
305 SAYLREEKVESETSFLTVNTRPELVVRSTGTFA 337 (337) Q Consensus 305 ~~~~~~~~l~~eS~PLpi~~rP~~l~~~t~~aa 337 (337) +++..++.|.+--..=.-..||++++-++++-- T Consensus 300 ~~~~~~d~i~~~~a~G~g~lRPeaa~vv~~~~~ 332 (334) T protein:vir:80 300 EKKDFGHYLDTFQSYNIGQRRPDAVAVHDITVT 332 (334) T ss_pred chhhHHHHHHHHHHcCCceeccceEEEEEEeee Confidence 444444444333333456789988877777777 No 33 >protein:vir:1886 Length: 385 # NCBI annotation: major capsid subunit precursor # Family: family:all:585 # MgeID: mge:41 # MgeName: HK022 # Cross-refs: genbank:acc:NP_037666;genbank:gi:9634124;genbank:GeneID:1262513 Probab=37.77 E-value=1.1 Score=20.35 Aligned_cols=273 Identities=9% Similarity=0.002 Sum_probs=110.7 Q ss_pred CCCCCC--CccCHHHHHHHHHhcCCCccchhhcccccCcccccceEEEEEEcCceeEeeeccCCCCcccccCCceeEEEE Q lcl|NC_019400. 1 MAVVRT--NDFQIVDLGATLEIVPRQYRLITNMDLFTAYHGVTTVAQIERVDEVVTDFPARRRQGERNYVGTEKAQLKNF 78 (337) Q Consensus 1 ~d~~~~--d~Fs~~~Lt~~i~~~p~~~~~l~~l~~F~~~~~~t~~v~ie~~~~~~~l~p~v~~g~~~~~~~~~~~~~~~f 78 (337) |..-.+ ..+-...+...|-......+.|.+ +.+..++.+..+.+-...+.-.-+-++..|+.-...+ .+.....+ T Consensus 105 ~~~~~~~~g~~i~~~~~~~ii~~~~~~~~l~~--~~~~~~~~~~~~~~~~~~~~~~~a~~v~E~~~~~~~~-~~~~~~~~ 181 (385) T protein:vir:18 105 LGSDADSAGSLIQPMQIPGIIMPGLRRLTIRD--LLAQGRTSSNALEYVREEVFTNNADVVAEKALKPESD-ITFSKQTA 181 (385) T ss_pred hccccccCCceecchhhhHHHHHhhhccchhh--hcceecccCcceEEEEEecCCcceeeeccCccccccc-cceeEEEE Confidence 111111 111122222222233233444544 3455555555555544443333344555554433322 34455556 Q ss_pred eccccccCccccHHHHhccccCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCCEEecCCC--ceEeechhhc Q lcl|NC_019400. 79 NIPFFPLDRQITAADVQNFRKYFTADAPKSVEDVVARVVRRIRISHEQLKEKAMLQAIMGKSWAPQDP--TAQYNYFTEW 156 (337) Q Consensus 79 ~~p~i~~~~~v~a~dlq~~R~~G~~~~~~~~~~~v~~~l~~~~~~i~~t~E~m~a~AL~g~i~~~~~g--~v~~d~~~~f 156 (337) .+-.+...-.|+- ++. + +.. .++.++.+.+ .+.+.+..|.+++ .|- +.++ .+++.. 
- T Consensus 182 ~~~k~~~~~~is~-ell--~-----d~~-~l~~~i~~~l---a~a~~~~~d~~~l---~G~---g~~~~~~Gi~~~---~ 240 (385) T protein:vir:18 182 NVKTIAHWVQASR-QVM--D-----DAP-MLQSYINNRL---MYGLALKEEGQLL---NGD---GTGDNLEGLNKV---A 240 (385) T ss_pred eeeeEEEeehhhH-HHH--h-----hHH-HHHHHHHHHH---HHHHHHHHHHHHH---hcc---CCCCcccccccc---c Confidence 6655555555553 332 1 111 2444444444 3556666665543 331 1111 111111 0 Q ss_pred CCCcceEEEecCCCCcchHHHHHHHHHHHHHHhhccccccccEEEEEChHHHHHHhcCHHHHHHHHhhcccccccccccc Q lcl|NC_019400. 157 GVTQHTANIDFTDVATDPTDIIEADARAYIIDNAGDNGNNYGIVVLASRKWFSALIAHPLVMNAYQYYSSTQEPLRRRLG 236 (337) Q Consensus 157 G~~~~~~~~~l~~~~~d~~~~~~~~~~~~~~~~~~~~~~~~~v~~l~g~~~~~al~~h~~v~~~~~~~~~~~~~~~~~~~ 236 (337) + ......+.++......+.+....+ ... ....-.++|++..|.+|.. +++ ....++..... T Consensus 241 ~----~~~~~~~~~~~~~~d~i~~~~~~l-~~~-----~~~~~~~~~~~~~~~~l~~---lkd------~~G~~l~~~~~ 301 (385) T protein:vir:18 241 T----AYDTSLNATGDTRADIIAHAIYQV-TES-----EFSASGIVLNPRDWHNIAL---LKD------NEGRYIFGGPQ 301 (385) T ss_pred c----cccccccccccchHHHHHHHHHhh-ccc-----cCCCCEEEEcHHHHHHHHH---hhc------CCCceeccCcc Confidence 1 111122222223333343333221 111 1122257889999998862 221 11112221111 Q ss_pred cceeccceeEEEeccEEEecCceeeecCCeeEEEEecchhhheEEeccccchhhccccCcceee---eEEEccCCCEEEE Q lcl|NC_019400. 237 QGQENANNRMFVHKNVTYIEDISNYIPDGEAYILPQGIDDMFQIHYAPADDVREANTPAQELYL---WYKSSAYLREEKV 313 (337) Q Consensus 237 ~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~~~p~g~~~~f~~~~ap~d~~~~~n~~~~~~y~---k~~~~~~~~~~~l 313 (337) .+. . -.+.|+.... ...+|++++++. .|..+|-..+..+ + ...... ..|. .+ .+.+ T Consensus 302 ~~~-----~-~~l~G~pV~~--~~~~p~~~~~~g------d~~~~~~~~~~~~-~---~v~~~~~~~~~~~--~~-~~~~ 360 (385) T protein:vir:18 302 AFT-----S-NIMWGLPVVP--TKAQAAGTFTVG------GFDMASQVWDRMD-A---TVEVSREDRDNFV--KN-MLTI 360 (385) T ss_pred cCC-----C-ceecceeeEE--cCcCCCCcEEEe------ecccEEEEEEecc-e---EEEEeccccchhh--cC-cEEE Confidence 100 0 1122332221 234677776652 1333333333211 0 000000 0011 11 3456 Q ss_pred EEeecccccccCcceEEEEEEeeC Q lcl|NC_019400. 
314 ESETSFLTVNTRPELVVRSTGTFA 337 (337) Q Consensus 314 ~~eS~PLpi~~rP~~l~~~t~~aa 337 (337) .++.+.=..+.+|++++++|.+|| T Consensus 361 ~~~~r~~~~v~~~~a~~~~~~~aa 384 (385) T protein:vir:18 361 LCEERLALAHYRPTAIIKGTFSSG 384 (385) T ss_pred EEEEeeccEEecccceEEEEeccC Confidence 666666677799999999999999 No 34 >protein:vir:191 Length: 385 # NCBI annotation: major head subunit precursor # Family: family:all:585 # MgeID: mge:6 # MgeName: HK97 # Cross-refs: genbank:acc:NP_037701;genbank:gi:9634158;genbank:GeneID:1262530 Probab=37.77 E-value=1.1 Score=20.35 Aligned_cols=273 Identities=9% Similarity=0.002 Sum_probs=110.7 Q ss_pred CCCCCC--CccCHHHHHHHHHhcCCCccchhhcccccCcccccceEEEEEEcCceeEeeeccCCCCcccccCCceeEEEE Q lcl|NC_019400. 1 MAVVRT--NDFQIVDLGATLEIVPRQYRLITNMDLFTAYHGVTTVAQIERVDEVVTDFPARRRQGERNYVGTEKAQLKNF 78 (337) Q Consensus 1 ~d~~~~--d~Fs~~~Lt~~i~~~p~~~~~l~~l~~F~~~~~~t~~v~ie~~~~~~~l~p~v~~g~~~~~~~~~~~~~~~f 78 (337) |..-.+ ..+-...+...|-......+.|.+ +.+..++.+..+.+-...+.-.-+-++..|+.-...+ .+.....+ T Consensus 105 ~~~~~~~~g~~i~~~~~~~ii~~~~~~~~l~~--~~~~~~~~~~~~~~~~~~~~~~~a~~v~E~~~~~~~~-~~~~~~~~ 181 (385) T protein:vir:19 105 LGSDADSAGSLIQPMQIPGIIMPGLRRLTIRD--LLAQGRTSSNALEYVREEVFTNNADVVAEKALKPESD-ITFSKQTA 181 (385) T ss_pred hccccccCCceecchhhhHHHHHhhhccchhh--hcceecccCcceEEEEEecCCcceeeeccCccccccc-cceeEEEE Confidence 111111 111122222222233233444544 3455555555555544443333344555554433322 34455556 Q ss_pred eccccccCccccHHHHhccccCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCCEEecCCC--ceEeechhhc Q lcl|NC_019400. 79 NIPFFPLDRQITAADVQNFRKYFTADAPKSVEDVVARVVRRIRISHEQLKEKAMLQAIMGKSWAPQDP--TAQYNYFTEW 156 (337) Q Consensus 79 ~~p~i~~~~~v~a~dlq~~R~~G~~~~~~~~~~~v~~~l~~~~~~i~~t~E~m~a~AL~g~i~~~~~g--~v~~d~~~~f 156 (337) .+-.+...-.|+- ++. + +.. .++.++.+.+ .+.+.+..|.+++ .|- +.++ .+++.. 
- T Consensus 182 ~~~k~~~~~~is~-ell--~-----d~~-~l~~~i~~~l---a~a~~~~~d~~~l---~G~---g~~~~~~Gi~~~---~ 240 (385) T protein:vir:19 182 NVKTIAHWVQASR-QVM--D-----DAP-MLQSYINNRL---MYGLALKEEGQLL---NGD---GTGDNLEGLNKV---A 240 (385) T ss_pred eeeeEEEeehhhH-HHH--h-----hHH-HHHHHHHHHH---HHHHHHHHHHHHH---hcc---CCCCcccccccc---c Confidence 6655555555553 332 1 111 2444444444 3556666665543 331 1111 111111 0 Q ss_pred CCCcceEEEecCCCCcchHHHHHHHHHHHHHHhhccccccccEEEEEChHHHHHHhcCHHHHHHHHhhcccccccccccc Q lcl|NC_019400. 157 GVTQHTANIDFTDVATDPTDIIEADARAYIIDNAGDNGNNYGIVVLASRKWFSALIAHPLVMNAYQYYSSTQEPLRRRLG 236 (337) Q Consensus 157 G~~~~~~~~~l~~~~~d~~~~~~~~~~~~~~~~~~~~~~~~~v~~l~g~~~~~al~~h~~v~~~~~~~~~~~~~~~~~~~ 236 (337) + ......+.++......+.+....+ ... ....-.++|++..|.+|.. +++ ....++..... T Consensus 241 ~----~~~~~~~~~~~~~~d~i~~~~~~l-~~~-----~~~~~~~~~~~~~~~~l~~---lkd------~~G~~l~~~~~ 301 (385) T protein:vir:19 241 T----AYDTSLNATGDTRADIIAHAIYQV-TES-----EFSASGIVLNPRDWHNIAL---LKD------NEGRYIFGGPQ 301 (385) T ss_pred c----cccccccccccchHHHHHHHHHhh-ccc-----cCCCCEEEEcHHHHHHHHH---hhc------CCCceeccCcc Confidence 1 111122222223333343333221 111 1122257889999998862 221 11112221111 Q ss_pred cceeccceeEEEeccEEEecCceeeecCCeeEEEEecchhhheEEeccccchhhccccCcceee---eEEEccCCCEEEE Q lcl|NC_019400. 237 QGQENANNRMFVHKNVTYIEDISNYIPDGEAYILPQGIDDMFQIHYAPADDVREANTPAQELYL---WYKSSAYLREEKV 313 (337) Q Consensus 237 ~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~~~p~g~~~~f~~~~ap~d~~~~~n~~~~~~y~---k~~~~~~~~~~~l 313 (337) .+. . -.+.|+.... ...+|++++++. .|..+|-..+..+ + ...... ..|. .+ .+.+ T Consensus 302 ~~~-----~-~~l~G~pV~~--~~~~p~~~~~~g------d~~~~~~~~~~~~-~---~v~~~~~~~~~~~--~~-~~~~ 360 (385) T protein:vir:19 302 AFT-----S-NIMWGLPVVP--TKAQAAGTFTVG------GFDMASQVWDRMD-A---TVEVSREDRDNFV--KN-MLTI 360 (385) T ss_pred cCC-----C-ceecceeeEE--cCcCCCCcEEEe------ecccEEEEEEecc-e---EEEEeccccchhh--cC-cEEE Confidence 100 0 1122332221 234677776652 1333333333211 0 000000 0011 11 3456 Q ss_pred EEeecccccccCcceEEEEEEeeC Q lcl|NC_019400. 
314 ESETSFLTVNTRPELVVRSTGTFA 337 (337) Q Consensus 314 ~~eS~PLpi~~rP~~l~~~t~~aa 337 (337) .++.+.=..+.+|++++++|.+|| T Consensus 361 ~~~~r~~~~v~~~~a~~~~~~~aa 384 (385) T protein:vir:19 361 LCEERLALAHYRPTAIIKGTFSSG 384 (385) T ss_pred EEEEeeccEEecccceEEEEeccC Confidence 666666677799999999999999 No 35 >protein:vir:78148 Length: 123 # NCBI annotation: hypothetical protein # Family: family:all:4955 # MgeID: mge:1847 # MgeName: Min1 # Cross-refs: genbank:acc:YP_001294802;genbank:gi:149882823;genbank:GeneID:5309176 Probab=30.45 E-value=1.6 Score=19.50 Aligned_cols=117 Identities=15% Similarity=0.047 Sum_probs=61.9 Q ss_pred EEChHHHHHHhcCHHHHHHHHhhcccccccccccccceeccceeEEEeccEEEecCceeeecCCeeEEEEecchhhheEE Q lcl|NC_019400. 202 LASRKWFSALIAHPLVMNAYQYYSSTQEPLRRRLGQGQENANNRMFVHKNVTYIEDISNYIPDGEAYILPQGIDDMFQIH 281 (337) Q Consensus 202 l~g~~~~~al~~h~~v~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~~~p~g~~~~f~~~ 281 (337) +++.-.|.++++.+-+-+++-+.++ ++.... .-.....|..|....+ ||.+.+.++-. ..+ T Consensus 1 vvsdlqfA~~~g~~v~~~aLpRE~a--Np~ltG-~lpV~~~GltWl~tpn----------lpg~~a~vlDs------t~l 61 (123) T protein:vir:78 1 MLSGAQFAKLIGILVDDKALPREQA--NIVLTG-SLPVSAYGLTWVTSRH----------ITGTDPWLFDV------EQL 61 (123) T ss_pred CcchhhHHHHhcchhcccccccccC--CceEec-CcceeeeceeeeecCC----------CCCCccceeeh------hhh Confidence 3355558888887776665554333 221111 1112234555655544 34333333221 112 Q ss_pred eccccc-h---hhccccCcceeeeEEEccC--CCEEEEEEeecccccccCcceEEEEEEeeC Q lcl|NC_019400. 282 YAPADD-V---REANTPAQELYLWYKSSAY--LREEKVESETSFLTVNTRPELVVRSTGTFA 337 (337) Q Consensus 282 ~ap~d~-~---~~~n~~~~~~y~k~~~~~~--~~~~~l~~eS~PLpi~~rP~~l~~~t~~aa 337 (337) +|-+|. + +.+...+.-.=.|....+. 
.-++.+..--+-.||...|.|.++++-.-- T Consensus 62 GgmaDE~l~~Pgya~~~~~Gvevkt~Red~~~nD~yriRaRRvTvpiv~EP~Agv~ltg~g~ 123 (123) T protein:vir:78 62 GGMADEKLLSPEFAPAGNTGVEASTERAHQGVKDGYLVRGRRNTVAVVTEPMAGVRLTGTGL 123 (123) T ss_pred ccccccccCCCcccCCCCcceeEEeeccccCCCCceEEeeeecceeEEecCccceEEeeecC Confidence 222220 0 0111111113345555544 557899999999999999999999998777 No 36 >protein:vir:9875 Length: 296 # NCBI annotation: hypothetical protein # Family: family:all:1178 # MgeID: mge:177 # MgeName: 315.5 # Cross-refs: genbank:acc:NP_795637;genbank:gi:28876404;genbank:GeneID:1257935 Probab=24.36 E-value=2.2 Score=18.72 Aligned_cols=274 Identities=12% Similarity=0.026 Sum_probs=106.1 Q ss_pred CCCCCCCccCHHHHHHH-----HHhc-CCCccchhhcccccCccccc-ceEEEEEEcCceeEeeeccCCCCcccccCCce Q lcl|NC_019400. 1 MAVVRTNDFQIVDLGAT-----LEIV-PRQYRLITNMDLFTAYHGVT-TVAQIERVDEVVTDFPARRRQGERNYVGTEKA 73 (337) Q Consensus 1 ~d~~~~d~Fs~~~Lt~~-----i~~~-p~~~~~l~~l~~F~~~~~~t-~~v~ie~~~~~~~l~p~v~~g~~~~~~~~~~~ 73 (337) -..-..+.=...+|..+ +|+. .....++.-||+++..|... .+|..=..-....-+.-|..|.+=...+-.+. T Consensus 5 ~~~~e~nlt~~~dl~~~~siDf~~~f~~~i~~L~~~LGv~r~~pla~GstIkt~k~~~y~gda~dVaEGe~Iplskvt~~ 84 (296) T protein:vir:98 5 RTYPEENLIKSTDLKYPITIDVTNKFQENISKLLEMLGVTRKISVSEGMTLKTYAGYDVTLAEGNVPEGEVIPLSKVERK 84 (296) T ss_pred cccCcCCCcchhhhhhhhhhhhHHHHhhhHHHHHHHhhhcccccccCCCEEeeccceeeeeccccccCCcccchhhheee Confidence 00000111111222111 2222 23345566678877766543 22321100011111223444443322222222 Q ss_pred eEEEEeccccccCccccHHHHhccccCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCCEEecCCCceEeech Q lcl|NC_019400. 74 QLKNFNIPFFPLDRQITAADVQNFRKYFTADAPKSVEDVVARVVRRIRISHEQLKEKAMLQAIMGKSWAPQDPTAQYNYF 153 (337) Q Consensus 74 ~~~~f~~p~i~~~~~v~a~dlq~~R~~G~~~~~~~~~~~v~~~l~~~~~~i~~t~E~m~a~AL~g~i~~~~~g~v~~d~~ 153 (337) .....+...-|....+|++.|| +..+|. +..... .+|...|.+-.-.-+..+|.+.. 
T Consensus 85 ~~~t~t~~ikK~rK~tTdEAIq-lsGyg~--aVgetd-------~qL~~~iq~kId~d~~t~LktaT------------- 141 (296) T protein:vir:98 85 IHSEKKIELKKYRKATTGEDIQ-MYGSNE--AVTNTD-------NALVRQLQKKIRTDFVTALKTGT------------- 141 (296) T ss_pred ecceEEEEeeccccccCHHHHH-hhcCCc--hhHHHH-------HHHHHHHHHhhhHHHHHHHhccc------------- Confidence 2223444444555567988886 355653 211111 11222222222222344443210 Q ss_pred hhcCCCcceEEEecCCCCcchHHHHHHHHHHHHHHhhccccccccEEEEEChHHHHHHhcCHHHHHHHHhhccccccccc Q lcl|NC_019400. 154 TEWGVTQHTANIDFTDVATDPTDIIEADARAYIIDNAGDNGNNYGIVVLASRKWFSALIAHPLVMNAYQYYSSTQEPLRR 233 (337) Q Consensus 154 ~~fG~~~~~~~~~l~~~~~d~~~~~~~~~~~~~~~~~~~~~~~~~v~~l~g~~~~~al~~h~~v~~~~~~~~~~~~~~~~ 233 (337) .+. +. ...++...+...|-........ ....+.++|+.|.=..++++++.|-. ...-+..++. T Consensus 142 -------~t~--~~--t~~~lQ~Ala~~~~~l~~~fed--ed~~~~V~FVnP~D~a~ylg~a~it~---qt~fG~tyl~- 204 (296) T protein:vir:98 142 -------GTQ--DA--LGAGLQGALASAWGKLQVLFED--YGSERAIVFANSLDVAEYIAKAGITT---QTAFGLTYLV- 204 (296) T ss_pred -------cee--ee--chhhHHHHHHHHhhhhhhhccc--cCCCceEEEEehHHHHHHhcCCccch---hheechhhhh- Confidence 011 10 0112222222211111111111 11234566777766666665554310 0000111110 Q ss_pred ccccceeccceeEEEeccEEEecCceeeecCCeeEEEEecchhhheEEeccccchhhccccCcce--ee------eEEEc Q lcl|NC_019400. 234 RLGQGQENANNRMFVHKNVTYIEDISNYIPDGEAYILPQGIDDMFQIHYAPADDVREANTPAQEL--YL------WYKSS 305 (337) Q Consensus 234 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~~~p~g~~~~f~~~~ap~d~~~~~n~~~~~~--y~------k~~~~ 305 (337) .+.| .+.. -...||+|++++.|. .+. ..+|+|...-+ .+..| |. -+-.. T Consensus 205 ------nfLG--------~~II--~S~kV~~G~~~~T~~--~Ni-~~ay~~~~~~~----l~~~f~~~~d~tglIGv~h~ 261 (296) T protein:vir:98 205 ------DFTG--------TVII--STNDVTKGEIWATVP--ENI-IFAYINPNNSE----LAKEFNLYGDPTGYIGMNHF 261 (296) T ss_pred ------hccc--------cEEE--EcCcCCCceEEEeee--cce-EEEeecccccc----hhhhhccccccccceEEEec Confidence 1111 1111 134589999999874 343 67888754111 11111 11 11111 Q ss_pred cCCC--EEEEEEeecccccccCcceEEEEEEeeC Q lcl|NC_019400. 
306 AYLR--EEKVESETSFLTVNTRPELVVRSTGTFA 337 (337) Q Consensus 306 ~~~~--~~~l~~eS~PLpi~~rP~~l~~~t~~aa 337 (337) ...+ .++-..-|.-.-.|-|++.++++|+++| T Consensus 262 ~~~~~~t~eT~~~~~~~lfpE~~dgiv~~tI~~~ 295 (296) T protein:vir:98 262 QENTTLTIQTLLVSGMLMYPERIDGIVKVTLTPG 295 (296) T ss_pred cccceeeehhHhHhHHHhcccccceEEEEEecCC Confidence 2221 2222233444556899999999999999 No 37 >protein:vir:9927 Length: 295 # NCBI annotation: hypothetical protein # Family: family:all:1178 # MgeID: mge:178 # MgeName: 315.6 # Cross-refs: genbank:acc:NP_795689;genbank:gi:28876459;genbank:GeneID:1258000 Probab=23.43 E-value=2.3 Score=18.59 Aligned_cols=272 Identities=13% Similarity=0.038 Sum_probs=112.8 Q ss_pred CCCCCccCHHHHH-----HHHHhc-CCCccchhhcccccCcccc-cceEEEEEEcCceeEeeeccCCCCcccccCCceeE Q lcl|NC_019400. 3 VVRTNDFQIVDLG-----ATLEIV-PRQYRLITNMDLFTAYHGV-TTVAQIERVDEVVTDFPARRRQGERNYVGTEKAQL 75 (337) Q Consensus 3 ~~~~d~Fs~~~Lt-----~~i~~~-p~~~~~l~~l~~F~~~~~~-t~~v~ie~~~~~~~l~p~v~~g~~~~~~~~~~~~~ 75 (337) +...+.=...+|. .-+++. .....++.-||+++..|.. -.+|.+=.. ....-+.-|..|.+=...+-.+... T Consensus 1 mAe~nlt~~~dL~~~~sidfv~~f~~~i~~L~~~Lgi~r~~p~a~G~tIt~pK~-~~tgda~dVaEGe~Iplskvt~~~~ 79 (295) T protein:vir:99 1 MAEKNLNTMADLGDIKSIDFVNKFSKNINDLLKLLGVTRRETLTNDLKIQTYKW-EVTLDQTDPGEGETIPLSKVTRTKD 79 (295) T ss_pred CCCcccccHhhccCceeehhhHHhhhhHHHHHHHhccccccccccCCeEEeeee-eeecccccccCCcccchhhheeeee Confidence 2223222223332 112222 2233556667887776654 333443111 1111123344444432222222222 Q ss_pred EEEeccccccCccccHHHHhccccCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCCEEecCCCceEeechhh Q lcl|NC_019400. 76 KNFNIPFFPLDRQITAADVQNFRKYFTADAPKSVEDVVARVVRRIRISHEQLKEKAMLQAIMGKSWAPQDPTAQYNYFTE 155 (337) Q Consensus 76 ~~f~~p~i~~~~~v~a~dlq~~R~~G~~~~~~~~~~~v~~~l~~~~~~i~~t~E~m~a~AL~g~i~~~~~g~v~~d~~~~ 155 (337) ...+...-|....+|++.|| +..||. +...... +|...|.+-.-.-+..+|.+...... 
T Consensus 80 ~t~t~kikK~rK~tTdEAIq-lsGygd--pvgead~-------qL~~~ia~kId~D~~~~lktat~t~t----------- 138 (295) T protein:vir:99 80 KDYTVKWFKKRRATTAEAIA-RHGAAR--AITEADK-------RIMRELQNGIKDAFFTFLKTKPTKVK----------- 138 (295) T ss_pred eeeEEEeeeecccccHHHHH-hcCCCc--hhHHHHH-------HHHHHHHHhhhHHHHHHhccCceeee----------- Confidence 33444445555678999987 345653 2211111 11122222222223344432111110 Q ss_pred cCCCcceEEEecCCCCcchHHHHHHHHHHHHHHhhccccccccEEEEEChHHHHHHhcCHHHHHHHHhhccccccccccc Q lcl|NC_019400. 156 WGVTQHTANIDFTDVATDPTDIIEADARAYIIDNAGDNGNNYGIVVLASRKWFSALIAHPLVMNAYQYYSSTQEPLRRRL 235 (337) Q Consensus 156 fG~~~~~~~~~l~~~~~d~~~~~~~~~~~~~~~~~~~~~~~~~v~~l~g~~~~~al~~h~~v~~~~~~~~~~~~~~~~~~ 235 (337) ..+....+...+.++-.. .-....+.++|+.|.=+.++++++.+ +|..+.++ +... T Consensus 139 ---------------g~~lq~a~a~~~~al~~f---~Ee~~~~~V~FVnP~D~a~yl~~A~~-----~~~~a~~f-G~~~ 194 (295) T protein:vir:99 139 ---------------GVGLQKALSASWAKLATF---NEFEGSPLVSFVSPLDVANYLGDTKV-----GADASNVF-GMTL 194 (295) T ss_pred ---------------hhhHHHHHHHhhhhhhhc---ccccCCceEEEEehHHHHHHHhcccc-----ccchhhhh-hhhh Confidence 011111222221111110 01122345788888888888876665 23222111 0000 Q ss_pred ccceecccee-EEEeccEEEecCceeeecCCeeEEEEecchhhheEEeccccchhhccccCcceee------eEEEccCC Q lcl|NC_019400. 236 GQGQENANNR-MFVHKNVTYIEDISNYIPDGEAYILPQGIDDMFQIHYAPADDVREANTPAQELYL------WYKSSAYL 308 (337) Q Consensus 236 ~~~~~~~~~~-~~~~~~~~~~~~~~~~i~~~~~~~~p~g~~~~f~~~~ap~d~~~~~n~~~~~~y~------k~~~~~~~ 308 (337) .. .+.|.. .++ ...||+|++++.+. .+. ..+|+|.+.-+.++.- -++. -+-..... T Consensus 195 L~--nfLG~q~II~----------S~kv~~G~~~aT~~--~Ni-~~ay~~~~~g~l~~~f--~~~~D~tglIg~~h~~~~ 257 (295) T protein:vir:99 195 LK--NFLGMQNVIV----------MPSVPEGKIYSTAV--ENL-VFASLNVKGGDLGGLF--ADFTDETGLIAAARNRQL 257 (295) T ss_pred hh--hhhccceEEE----------cccCCCceEEEeec--cce-EEEEecCCchhhhhhh--hhccCcccceEEEecccc Confidence 00 122221 111 34589999998763 443 6778876632211100 0111 11111111 Q ss_pred C--EEEEEEeecccccccCcceEEEEEEeeC Q lcl|NC_019400. 
309 R--EEKVESETSFLTVNTRPELVVRSTGTFA 337 (337) Q Consensus 309 ~--~~~l~~eS~PLpi~~rP~~l~~~t~~aa 337 (337) + .++-..-|.-.-.|-|++.++++|++++ T Consensus 258 ~~~t~et~~~~~~~lfpE~~dgiv~~tI~~~ 288 (295) T protein:vir:99 258 SNLTYESVFFGANVLFAEIPEGVVEATIEAA 288 (295) T ss_pred ceeeehhhhHhHHHhcccccceEEEEEEecC Confidence 1 2222233444556899999999999998 Done!