Query lcl|NC_015280.1_cdsid_YP_004322545.1 [gene=gp23] [protein=precursor of major head subunit] [protein_id=YP_004322545.1] [location=111312..112679]
Match_columns 455
No_of_seqs 169 out of 420
Neff 4.7
Searched_HMMs 1612
Date Thu Nov 7 15:00:42 2013
Command /home/guerois/workspace/virfam/python/lib/hhsearch//hhsearch2 -i .//seq/seq_120 -d /home/guerois/workspace/virfam/python/profile_database/capsid_neck_tail.hhm -glob -cpu 7 -o .//seq/HHR/seq_120_vs_rec_db.hhr

No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 protein:vir:106998 Length: 468 100.0 2E-237 1E-240 1318.9 35.2 453 1-455 1-468 (468)
2 protein:vir:104915 Length: 470 100.0 4E-236 3E-239 1310.9 34.5 444 1-455 3-470 (470)
3 protein:vir:104549 Length: 462 100.0 1E-234 7E-238 1303.3 33.9 444 1-455 1-462 (462)
4 protein:vir:103181 Length: 457 100.0 2E-231 1E-234 1285.1 35.5 444 1-455 1-457 (457)
5 protein:vir:106286 Length: 534 100.0 2E-223 1E-226 1241.5 34.4 447 1-454 1-534 (534)
6 protein:vir:6901 Length: 522 # 100.0 2E-222 1E-225 1235.7 33.8 447 1-454 4-522 (522)
7 protein:vir:101811 Length: 529 100.0 7E-222 5E-225 1232.9 33.5 446 1-454 1-529 (529)
8 protein:vir:101039 Length: 529 100.0 9E-222 6E-225 1232.3 32.9 445 1-454 3-529 (529)
9 protein:vir:103463 Length: 521 100.0 2E-220 2E-223 1224.5 33.9 447 1-454 3-521 (521)
10 protein:vir:6601 Length: 528 # 100.0 5E-220 3E-223 1223.0 34.6 447 1-454 1-528 (528)
11 protein:vir:98143 Length: 524 100.0 5E-220 3E-223 1223.1 33.2 445 1-454 1-524 (524)
12 protein:vir:7214 Length: 521 # 100.0 6E-220 4E-223 1222.5 33.8 447 1-454 3-521 (521)
13 protein:vir:80986 Length: 528 100.0 1E-219 9E-223 1220.3 34.3 447 1-454 1-528 (528)
14 protein:vir:5670 Length: 514 # 100.0 3E-219 2E-222 1218.6 31.6 444 5-454 1-514 (514)
15 protein:vir:100603 Length: 529 100.0 3E-217 2E-220 1207.7 33.1 447 1-454 1-529 (529)
16 protein:vir:107947 Length: 519 100.0 4E-217 2E-220 1207.0 33.8 446 1-454 1-519 (519)
17 protein:vir:5942 Length: 523 # 100.0 3E-196 2E-199 1092.8 33.6 402 1-434 1-523 (523)
18 protein:vir:81100 Length: 415 97.4 8E-05 5E-08 43.0 20.7 340 1-439 8-415 (415)
19 protein:vir:98339 Length: 415 97.4 8E-05 5E-08 43.0 20.7 340 1-439 8-415 (415)
20 protein:vir:79987 Length: 415 97.4 8E-05 5E-08 43.0 20.7 340 1-439 8-415 (415)
21 protein:vir:9410 Length: 415 # 97.3 0.0001 6.4E-08 42.5 19.8 341 1-439 29-415 (415)
22 protein:vir:1886 Length: 385 # 97.1 0.00016 1E-07 41.3 20.9 322 1-433 1-385 (385)
23 protein:vir:191 Length: 385 # 97.1 0.00016 1E-07 41.3 20.9 322 1-433 1-385 (385)
24 protein:vir:4600 Length: 415 # 96.1 0.001 6.3E-07 37.0 19.9 338 1-439 41-415 (415)
25 protein:vir:4700 Length: 415 # 96.1 0.001 6.3E-07 37.0 19.9 338 1-439 41-415 (415)
26 protein:vir:104256 Length: 458 95.5 0.0019 1.2E-06 35.5 19.6 335 1-443 73-458 (458)
27 protein:vir:4830 Length: 397 # 95.4 0.0021 1.3E-06 35.2 20.7 327 1-451 29-397 (397)
28 protein:vir:78523 Length: 338 95.4 0.0021 1.3E-06 35.2 19.2 310 47-437 1-338 (338)
29 protein:vir:4953 Length: 397 # 94.8 0.0034 2.1E-06 34.1 19.6 323 1-440 34-397 (397)
30 protein:vir:100884 Length: 389 94.3 0.0048 3E-06 33.3 18.3 322 1-442 33-389 (389)
31 protein:vir:4511 Length: 409 # 94.0 0.0056 3.5E-06 32.9 20.0 330 1-442 41-409 (409)
32 protein:vir:4092 Length: 390 # 93.9 0.006 3.7E-06 32.8 21.5 331 1-444 1-390 (390)
33 protein:vir:81227 Length: 413 93.8 0.0062 3.8E-06 32.7 21.4 337 1-434 31-413 (413)
34 protein:vir:100247 Length: 425 93.8 0.0062 3.8E-06 32.7 21.0 325 1-433 50-425 (425)
35 protein:vir:41 Length: 299 # N 92.1 0.013 8.1E-06 30.9 19.7 275 57-441 1-299 (299)
36 protein:vir:97053 Length: 390 91.7 0.015 9.2E-06 30.6 20.8 319 1-430 32-390 (390)
37 protein:vir:100172 Length: 394 91.5 0.016 9.7E-06 30.5 18.6 334 1-442 12-394 (394)
38 protein:vir:3033 Length: 272 # 91.2 0.017 1.1E-05 30.3 16.8 260 99-443 1-272 (272)
39 protein:vir:9820 Length: 272 # 91.2 0.017 1.1E-05 30.3 16.8 260 99-443 1-272 (272)
40 protein:vir:8420 Length: 477 # 91.0 0.018 1.1E-05 30.1 20.9 352 1-441 67-477 (477)
41 protein:vir:7771 Length: 330 # 90.5 0.021 1.3E-05 29.8 18.4 297 49-437 1-330 (330)
42 protein:vir:81070 Length: 390 90.3 0.022 1.4E-05 29.7 20.5 320 1-430 50-390 (390)
43 protein:vir:81160 Length: 371 90.2 0.022 1.4E-05 29.6 22.1 315 1-432 22-371 (371)
44 protein:vir:3991 Length: 404 # 90.1 0.023 1.4E-05 29.6 22.1 328 1-442 32-404 (404)
45 protein:vir:3845 Length: 395 # 89.9 0.024 1.5E-05 29.5 19.9 332 1-442 14-395 (395)
46 protein:vir:7409 Length: 408 # 89.3 0.027 1.7E-05 29.2 22.0 334 1-441 39-408 (408)
47 protein:vir:9574 Length: 300 # 89.3 0.027 1.7E-05 29.2 18.8 283 61-453 1-300 (300)
48 protein:vir:78223 Length: 333 88.8 0.03 1.9E-05 28.9 16.7 309 47-434 1-333 (333)
49 protein:vir:107593 Length: 392 88.6 0.032 2E-05 28.8 20.4 320 1-436 35-392 (392)
50 protein:vir:102873 Length: 392 88.6 0.032 2E-05 28.8 20.4 320 1-436 35-392 (392)
51 protein:vir:102082 Length: 392 88.6 0.032 2E-05 28.8 20.4 320 1-436 35-392 (392)
52 protein:vir:105004 Length: 392 88.6 0.032 2E-05 28.8 20.4 320 1-436 35-392 (392)
53 protein:vir:3870 Length: 400 # 88.3 0.033 2.1E-05 28.7 19.7 312 1-433 48-400 (400)
54 protein:vir:4339 Length: 395 # 86.8 0.043 2.7E-05 28.1 22.6 321 1-432 22-395 (395)
55 protein:vir:102119 Length: 404 86.6 0.045 2.8E-05 28.0 20.0 329 1-435 37-404 (404)
56 protein:vir:9309 Length: 324 # 86.2 0.047 2.9E-05 27.9 20.5 301 35-443 1-324 (324)
57 protein:vir:2504 Length: 305 # 85.1 0.055 3.4E-05 27.5 17.6 286 61-442 1-305 (305)
58 protein:vir:94142 Length: 304 84.3 0.061 3.8E-05 27.2 17.9 279 49-433 1-304 (304)
59 protein:vir:105905 Length: 304 84.3 0.061 3.8E-05 27.2 17.9 279 49-433 1-304 (304)
60 protein:vir:6242 Length: 390 # 83.6 0.067 4.2E-05 27.0 17.7 320 1-433 32-390 (390)
61 protein:vir:7855 Length: 497 # 83.6 0.067 4.2E-05 27.0 21.9 349 1-443 53-497 (497)
62 protein:vir:101650 Length: 497 83.6 0.067 4.2E-05 27.0 21.9 349 1-443 53-497 (497)
63 protein:vir:8102 Length: 543 # 83.6 0.067 4.2E-05 27.0 20.8 324 1-433 173-543 (543)
64 protein:vir:1433 Length: 435 # 83.6 0.067 4.2E-05 27.0 21.0 335 1-437 41-435 (435)
65 protein:vir:485 Length: 407 # 83.2 0.07 4.4E-05 26.9 22.1 335 1-440 15-407 (407)
66 protein:vir:4456 Length: 401 # 82.4 0.077 4.8E-05 26.7 21.3 328 1-432 16-401 (401)
67 protein:vir:96762 Length: 632 81.7 0.083 5.1E-05 26.5 19.8 317 1-440 260-632 (632)
68 protein:vir:96123 Length: 274 81.6 0.084 5.2E-05 26.5 13.9 257 113-455 1-274 (274)
69 protein:vir:4226 Length: 326 # 80.5 0.094 5.9E-05 26.2 17.0 303 35-443 1-326 (326)
70 protein:vir:10364 Length: 390 79.0 0.11 6.7E-05 25.9 20.8 323 1-430 30-390 (390)
71 protein:vir:99749 Length: 324 79.0 0.11 6.8E-05 25.9 20.0 302 32-443 1-324 (324)
72 protein:vir:105334 Length: 276 78.4 0.11 7.1E-05 25.7 13.6 259 99-437 1-276 (276)
73 protein:vir:98635 Length: 377 78.4 0.11 7.1E-05 25.7 13.7 329 1-432 1-377 (377)
74 protein:vir:93742 Length: 274 77.7 0.12 7.6E-05 25.6 16.9 264 109-438 1-274 (274)
75 protein:vir:6212 Length: 434 # 77.1 0.13 7.9E-05 25.5 19.0 343 1-441 56-434 (434)
76 protein:vir:80376 Length: 435 76.9 0.13 8.1E-05 25.4 21.0 333 1-437 42-435 (435)
77 protein:vir:100135 Length: 418 76.7 0.13 8.2E-05 25.4 23.8 329 1-441 55-418 (418)
78 protein:vir:9704 Length: 394 # 76.5 0.13 8.4E-05 25.4 21.5 320 1-435 31-394 (394)
79 protein:vir:104085 Length: 320 75.8 0.14 8.9E-05 25.2 16.8 293 35-436 1-320 (320)
80 protein:vir:4997 Length: 397 # 74.8 0.15 9.6E-05 25.0 21.8 329 1-442 18-397 (397)
81 protein:vir:1084 Length: 437 # 74.4 0.16 9.9E-05 25.0 16.9 320 1-440 67-437 (437)
82 protein:vir:96223 Length: 324 72.6 0.18 0.00011 24.6 19.4 301 23-443 1-324 (324)
83 protein:vir:962 Length: 397 # 72.2 0.19 0.00012 24.6 18.4 322 1-432 60-397 (397)
84 protein:vir:105038 Length: 428 72.1 0.19 0.00012 24.6 17.4 332 1-435 31-428 (428)
85 protein:vir:2344 Length: 397 # 70.6 0.21 0.00013 24.3 17.0 312 35-455 1-351 (397)
86 protein:vir:4856 Length: 293 # 68.0 0.24 0.00015 23.9 17.9 274 49-440 1-293 (293)
87 protein:vir:96262 Length: 274 67.7 0.25 0.00015 23.9 15.9 263 99-438 1-274 (274)
88 protein:vir:95898 Length: 274 67.7 0.25 0.00015 23.9 15.9 263 99-438 1-274 (274)
89 protein:vir:103955 Length: 324 66.6 0.26 0.00016 23.7 20.4 299 23-443 1-324 (324)
90 protein:vir:3158 Length: 321 # 65.3 0.29 0.00018 23.6 16.4 296 31-455 1-310 (321)
91 protein:vir:78830 Length: 324 64.6 0.3 0.00018 23.5 19.8 299 35-443 1-324 (324)
92 protein:vir:96392 Length: 324 64.6 0.3 0.00018 23.5 19.8 299 35-443 1-324 (324)
93 protein:vir:80930 Length: 278 62.9 0.33 0.0002 23.3 15.1 271 99-441 1-278 (278)
94 protein:vir:1638 Length: 298 # 62.1 0.34 0.00021 23.1 19.1 278 61-433 1-298 (298)
95 protein:vir:1268 Length: 397 # 62.0 0.34 0.00021 23.1 19.1 313 1-433 39-397 (397)
96 protein:vir:101607 Length: 379 58.5 0.41 0.00026 22.7 20.2 323 1-454 17-379 (379)
97 protein:vir:9759 Length: 303 # 58.4 0.41 0.00026 22.7 17.6 280 61-434 1-303 (303)
98 protein:vir:3613 Length: 272 # 56.8 0.45 0.00028 22.5 15.3 263 99-454 1-272 (272)
99 protein:vir:1239 Length: 274 # 51.0 0.59 0.00037 21.8 15.5 265 99-438 1-274 (274)
100 protein:vir:94673 Length: 419 50.7 0.6 0.00037 21.8 21.3 340 1-435 32-419 (419)
101 protein:vir:97148 Length: 324 48.5 0.67 0.00041 21.5 20.0 299 32-443 1-324 (324)
102 protein:vir:94494 Length: 274 47.5 0.7 0.00043 21.4 16.6 266 101-438 1-274 (274)
103 protein:vir:97433 Length: 274 47.5 0.7 0.00043 21.4 16.6 266 101-438 1-274 (274)
104 protein:vir:1328 Length: 392 # 44.0 0.82 0.00051 21.0 17.7 330 1-433 4-392 (392)
105 protein:vir:1025 Length: 408 # 39.1 1 0.00064 20.5 22.1 319 1-445 39-408 (408)
106 protein:vir:739 Length: 231 # 38.5 1.1 0.00066 20.4 11.3 217 150-433 1-231 (231)
107 protein:vir:2430 Length: 318 # 36.9 1.1 0.00071 20.3 17.3 290 32-441 1-318 (318)
108 protein:vir:94711 Length: 347 36.5 1.2 0.00072 20.2 13.6 300 99-433 1-347 (347)
109 protein:vir:3364 Length: 347 # 35.5 1.2 0.00076 20.1 13.5 306 99-433 1-347 (347)
110 protein:vir:8187 Length: 311 # 33.9 1.3 0.00082 19.9 17.4 280 51-432 1-311 (311)
111 protein:vir:5739 Length: 366 # 25.0 2.1
0.0013 18.8 20.1 325 1-435 1-366 (366) 112 protein:vir:9361 Length: 402 # 21.1 2.7 0.0016 18.3 18.1 317 1-435 49-402 (402) No 1 >protein:vir:106998 Length: 468 # NCBI annotation: major capsid protein gp23 # Family: family:all:364 # MgeID: mge:1459 # MgeName: S-PM2 # Cross-refs: genbank:acc:YP_195142;genbank:gi:58532919;uniprot:Q5GQN0;genbank:GeneID:3260495 Probab=100.00 E-value=1.5e-237 Score=1318.86 Aligned_cols=453 Identities=63% Similarity=0.992 Sum_probs=413.9 Q ss_pred CcchHHHHHHhhHhhcCCCCccccchhhHHHHHHHhhhHHHHHHHHHHhhhhhhhc--------hhhhcccccccccccc Q lcl|NC_015280. 1 MYNAENLQEKWAPVLNHEGLNDIKDPYRKSVTAILLENQERALAEERAVLTEAPTN--------VGPINTPTTSSGAVAG 72 (455) Q Consensus 1 m~~~~~~~~kw~~~l~~~~~~~i~~~~~~~v~~~~~enq~~~~~e~~~~l~ea~~~--------~~~~~~~st~tg~i~~ 72 (455) |+|+|+|+|||+||||||++|||++.|||+|+++|||||||||+|++.+|.|+..+ ..+++++|++|++|++ T Consensus 1 ~~~~e~l~~kW~plLe~~~~~~i~~~~k~~i~a~llENQe~~~~~~~~~~~~~~~~~~~~~~~~~~n~~~~~~~t~~v~~ 80 (468) T protein:vir:10 1 MFNAEHLQEKWSPVLNHGEAPAIGDRYKRAVTSVLLENQERFLREERGMLNEVAVNSLGAGTIAPAGSALGSANTGGLAG 80 (468) T ss_pred CcchHHHHHhhhHhhcCCccchhccchhhhhhhhhhhhHHHHHhccccccchhhHhhcCCcccchhhhhhhhcccccccc Confidence 99999999999999999999999999999999999999999999999999996653 4478899999999999 Q ss_pred ccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCcccccccccccccccccccccccccccC-----c Q lcl|NC_015280. 73 FDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPDAQFSGTDGATPPTATTEKN-----P 147 (455) Q Consensus 73 ~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~t~fSg~~~~~~~~~~~~~~-----~ 147 (455) +||+||+||||++|||||+|||||||||||||||||||+||.+|+|+||||+|||++|||..+..........+ . 
T Consensus 81 ~~P~Li~l~RRa~p~LIa~DIwGVQPMTgPTGLIFAmRsrY~n~~g~EAf~nEadt~fSg~~~~~~~~~~~~~~~~~~~~ 160 (468) T protein:vir:10 81 FDPVLISLVRRAMPNLMAYDVCGVQPMSGPTGLIFAMRSRYENQAGEEALFNEPDTGFTGGYDASQGDYAVRTGAGVGGD 160 (468) T ss_pred cCchhhhhHHHHHhhhhhhhceeeecCCccceeeeEEEEEecCCCCccceeccccccccccccccccccccccccccccC Confidence 99999999999999999999999999999999999999999999999999999999999875544332211111 1 Q ss_pred ccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeeccccccceeHHHHHhHHHhhCCChhH Q lcl|NC_015280. 148 ALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKAIHGLDAES 227 (455) Q Consensus 148 ~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ 227 (455) ....+.........+.++++.||+|+++|.||+++ ++|+||+|+||||+|||||||||||||||||||||||||||||+ T Consensus 161 ~~g~~~~~~~~a~~~~~~~g~gMsTa~aE~lG~~~-~~f~EMaFsIeK~tVtAKSRaLKAeYTiELAQDLKAiHGLDAEt 239 (468) T protein:vir:10 161 SEGNNPALLNDAAPGTYEVGSKMPREDLERMGEAN-RLFREMSFSIEKTSVTAQSRALKAEYTLELAQDLKAIHGLDAEQ 239 (468) T ss_pred CCCCcccccccccccccccccccchHHHhhcCCCC-cccceeeeEEEEEEEeeeccceeccccHHHHHHHHHhcCCChhH Confidence 11111222222344567889999999999999864 67999999999999999999999999999999999999999999 Q ss_pred HHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeeccccchhhHHHHHHHHHHHHHHHHHHHHHhcCCCccE Q lcl|NC_015280. 
228 ELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDVDSNGRWSVEKFKGLLFQIERDANAIAQETRRGKGNI 307 (455) Q Consensus 228 ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~~~~gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~ 307 (455) ||+||||||||+||||||||+||+||+|||++|++++|+|||++++||||++|+||+|+|||+||||+|+|||+||+||| T Consensus 240 ELaNILStEImlEINReii~~l~~va~~~k~~g~~~~Gv~d~~~~~~~rw~~e~~k~L~~~i~~ean~i~~~T~rg~gn~ 319 (468) T protein:vir:10 240 ELANILSSEVLAEINREVVRRVYTVAKKGAQNNVANAGIFDLDVDSNGRWSVEKFKGLLFQVERDANAIAQETRRGKGNF 319 (468) T ss_pred HHHHHHHHHHHHHhcHHHHHhHhhhhhheecccccccccccccccccchhHHHHHHHHHHHHHHHHHHHHHhhccccccE Confidence 99999999999999999999999999999999999999999999999999999999999999999999999999999999 Q ss_pred EEEchhHHHHHHhhcccccccccccccccc-cccccCCceeEEEecCceEEEEeccccccCCcceEEEEEecCcccccee Q lcl|NC_015280. 308 IITSADVASALAMSGVLDYDSGISGAVGGI-GEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQYYVVGYKGTNAYDAGL 386 (455) Q Consensus 308 ~v~S~~va~~L~~sG~l~~~~~~~~~~~~~-~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~KG~~~~dagl 386 (455) ||||++||++|+++|||++.|+++++.+++ +++|+|+++|+|+|+|||+||||||+++|+|+|||+|||||++|+|+|| T Consensus 320 ii~S~~Va~~L~~sG~l~~~~~~~~~~~~~~~~~D~tg~~~~G~l~~r~~vy~D~Ya~~~s~~dY~~vG~KG~~~~d~gl 399 (468) T protein:vir:10 320 LICSADVASALAMAGVLDYSSGLNGAGGPSIGEVDDTGNLAVGTINGRIKVFVDPYAANLSDKHYYVIGYKGTSPYDAGL 399 (468) T ss_pred EEechhHHHHHhhcCcceecccccccccccccccccCcceEEEEecCceEEEEccccccCCccceEEEEEecCcceecee Confidence 999999999999999999999999999864 7999999999999999999999999999999999999999999999999 Q ss_pred EEcccccccceeecCCccccceeeeeeecceeecccccccccccccCchh-hhhccchhhhhhhhhhcCC Q lcl|NC_015280. 387 FYCPYVPLQMYRAIGQDTFQPRIGFKTRYGMVLNPFAKGLTALSDSDPQA-AGNLNANAYYRRVRVANLM 455 (455) Q Consensus 387 fyaPYv~l~~~~~~Dp~s~qP~~g~~tRY~l~~nP~~~~~~~~~~~~~~~-~~~~~~n~y~r~~~v~~~~ 455 (455) ||||||||+|+|++||+||||++||||||||++|||+... 
++.+|++.+ ....++|+|||||+||||| T Consensus 400 fyaPYv~l~~~~~~dp~sfqP~~g~~tRY~l~~NP~~~~~-~~~~g~~~~~~~~~~~N~y~r~~~v~~l~ 468 (468) T protein:vir:10 400 FYCPYVPLQMVRSIDPNTFQPKIGFKTRYGMVSNPFVTTN-GLYNGTPDGEALTPNANMYYRRVQVTNLM 468 (468) T ss_pred eeccccccccccccCCCcccceeeeeeeeceeecccceec-cccCCCcccccccccccceeeeEEEeccC Confidence 9999999999999999999999999999999999999733 355555443 2346999999999999999 No 2 >protein:vir:104915 Length: 470 # NCBI annotation: T4-like major capsid protein # Family: family:all:364 # MgeID: mge:1630 # MgeName: P-SSM2 # Cross-refs: genbank:acc:YP_214367;genbank:gi:61806007;genbank:GeneID:3294435 Probab=100.00 E-value=4.4e-236 Score=1310.89 Aligned_cols=444 Identities=65% Similarity=1.036 Sum_probs=411.2 Q ss_pred CcchHHHHHHhhHhhcCCCCccccchhhHHHHHHHhhhHHHHHHHHHHhhhhhh-hc--h---hhhcccccccccccccc Q lcl|NC_015280. 1 MYNAENLQEKWAPVLNHEGLNDIKDPYRKSVTAILLENQERALAEERAVLTEAP-TN--V---GPINTPTTSSGAVAGFD 74 (455) Q Consensus 1 m~~~~~~~~kw~~~l~~~~~~~i~~~~~~~v~~~~~enq~~~~~e~~~~l~ea~-~~--~---~~~~~~st~tg~i~~~~ 74 (455) |+++|+|+|||+||||||++|||++.+||+|+++|||||||+|+|++++|+|+. .| . ++||+|||+|++|++|| T Consensus 3 ~~~~e~l~~kw~p~l~~~~~~~i~~~~~~~v~a~l~enq~~~~~~~~~~l~e~~~~~~~~~~~~~~i~~st~t~~v~~~~ 82 (470) T protein:vir:10 3 MFNSEYLQEKWAPILDYDGLDPIKDSHRRSVTAVLLENQEKELREERNFLSEAPNVNTNSGATAGFSADATAAGPVAGFD 82 (470) T ss_pred cchhHHHHHhhhhhhcCCccchhcchhhhhhhhhhhhhhHHHHhhccchhhhhhhccccccccccccccccccccccccC Confidence 999999999999999999999999999999999999999999999999999985 22 2 26999999999999999 Q ss_pred chhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCcccccccccccccccccccccccc------------ Q lcl|NC_015280. 75 PILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPDAQFSGTDGATPPTAT------------ 142 (455) Q Consensus 75 P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~t~fSg~~~~~~~~~~------------ 142 (455) |+||+||||++|||||+|||||||||||||||||||+||.+|+|+|+||+||++.|||..++...... 
T Consensus 83 P~Li~lvRra~p~LIa~DIwGVQPMTgPTGLIFAmRsrY~n~sG~EaffnEA~T~fSG~~~~~~~~~~~~~~~a~~~g~~ 162 (470) T protein:vir:10 83 PVLISLIRRSMPNLVAYDLAGVQPMNGPTGLIFAMRSRYKTQSGTEALFNEADTAFSGQPDGLDDTSGFTATGANNVGLG 162 (470) T ss_pred chhhhhHHHHHhhhhhhhhheeecCCccceeeeEEEEEecCCCccceeeecCCcccCccccccccccccccccccccccc Confidence 99999999999999999999999999999999999999999999999999999999997655432211 Q ss_pred ----cccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeeccccccceeHHHHHhHH Q lcl|NC_015280. 143 ----TEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLK 218 (455) Q Consensus 143 ----~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLk 218 (455) .+.++...++. +.......++.+.||+|+++|.||++++++|+||+|+||||+||||||||||||||||||||| T Consensus 163 ~~~~~gt~~~~~~~~--~~~a~~~~y~~~~GMsTa~aE~lg~s~~~~f~EMaFsIeK~tVtAKSRaLKAeYTiELAQDLK 240 (470) T protein:vir:10 163 TTAQQGSNPGLLNST--AAQTNATDYNVGQGMRTDSAEDLGDGTGDQFNQMAFSIEKVTVTAKSRALKAEYSLELAQDLK 240 (470) T ss_pred ccccccccccccccc--cccccccccccccccchHHhhhcCCCCCcccceeeeEEEEEEEEeeccceeccccHHHHHHHH Confidence 11122222211 222334567789999999999999999999999999999999999999999999999999999 Q ss_pred HhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeeccccchhhHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015280. 
219 AIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDVDSNGRWSVEKFKGLLFQIERDANAIAQ 298 (455) Q Consensus 219 AiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~~~~gr~~ve~~k~l~~qi~~ean~i~~ 298 (455) ||||||||+||+||||+|||+||||||||+||+||+|||+.|++++|+|||+++++|||++|+||+|+|||+||||+|+| T Consensus 241 AiHGLDAEtELaNILStEImlEINReii~~l~~~a~~~k~~~~~~~Gv~Dl~~~~~gr~~~e~~~~l~~~i~~ean~i~~ 320 (470) T protein:vir:10 241 AIHGLNAEAELANILSTEILAEINREVIRTIYNVAEPGAQANVAAAGTFDLDTDSNGRWSVEKFKGLIFQIERDANAIAQ 320 (470) T ss_pred HhcCCChhHHHHHHHHHHHHHHhcHHHHHHHhhhhhhceeccccccceEEeecccchhHHHHHHHHHHHHHHHHHHHHHH Confidence 99999999999999999999999999999999999999999999999999999999999999999999999999999999 Q ss_pred HhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEEecccc--ccCCcceEEEEE Q lcl|NC_015280. 299 ETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSA--NVSDNQYYVVGY 376 (455) Q Consensus 299 ~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~--~~s~~dY~~vG~ 376 (455) ||+||+|||||||++||++|+|+|||++.|++++. +++|+|+++|+|+|+|||+||||||++ ||+++|||+||| T Consensus 321 ~t~r~~~n~~i~S~~Va~~La~sG~l~~~~~~~~~----~~~D~t~~~~~G~l~~~~~vy~d~y~~~~~~a~~dy~~vG~ 396 (470) T protein:vir:10 321 RTRRGKGNMILCSADVASALTMAGVLDYTPALNAN----LNVDDTGNTFAGILQGKYRVYIDPFSASGGAAATQYYVVGY 396 (470) T ss_pred hhccccceEEEEchhHHhHhhhccccccccccccc----cccCCCCceEEEEecCceEEEeeccccccCcccccEEEEEE Confidence 99999999999999999999999999999999875 789999999999999999999999987 689999999999 Q ss_pred ecCccccceeEEcccccccceeecCCccccceeeeeeecceeecccccccccccccCchhhhhccchhhhhhhhhhcCC Q lcl|NC_015280. 377 KGTNAYDAGLFYCPYVPLQMYRAIGQDTFQPRIGFKTRYGMVLNPFAKGLTALSDSDPQAAGNLNANAYYRRVRVANLM 455 (455) Q Consensus 377 KG~~~~daglfyaPYv~l~~~~~~Dp~s~qP~~g~~tRY~l~~nP~~~~~~~~~~~~~~~~~~~~~n~y~r~~~v~~~~ 455 (455) ||++|+|+||||||||||++++.+||+||||++||||||||++|||+.+.+ +++. 
....|+|.|||||+||||| T Consensus 397 KG~~~~~~glfy~PYv~l~~~~~~dp~sfqP~~g~~tRY~l~~NP~~~~~~---~~~~--~i~~~~n~y~r~~~v~~l~ 470 (470) T protein:vir:10 397 KGSSPYDAGLFYCPYVPLQMVRAVGQDTFQPKIGFKTRYGLVENPFSQGTT---QGLG--TLTRNSNRYYRRVKVANLM 470 (470) T ss_pred ecCcceecceeeccccccccCCCCCCccccceeeeeeeeceeecCcccCCC---cccc--cccCCCCceeeEEEeeccC Confidence 999999999999999999999999999999999999999999999999877 3333 3577999999999999999 No 3 >protein:vir:104549 Length: 462 # NCBI annotation: gp23 # Family: family:all:364 # MgeID: mge:1548 # MgeName: P-SSM4 # Cross-refs: genbank:acc:YP_214669;genbank:gi:61806310;genbank:GeneID:3294604 Probab=100.00 E-value=1.1e-234 Score=1303.33 Aligned_cols=444 Identities=66% Similarity=1.007 Sum_probs=407.6 Q ss_pred CcchHHHHHHhhHhhcCCCCccccchhhHHHHHHHhhhHHHHHHHHHHhhhhhhhchhhhccccccccccccccchhhhH Q lcl|NC_015280. 1 MYNAENLQEKWAPVLNHEGLNDIKDPYRKSVTAILLENQERALAEERAVLTEAPTNVGPINTPTTSSGAVAGFDPILISL 80 (455) Q Consensus 1 m~~~~~~~~kw~~~l~~~~~~~i~~~~~~~v~~~~~enq~~~~~e~~~~l~ea~~~~~~~~~~st~tg~i~~~~P~Lv~l 80 (455) ||+ |+|+|||+|||+||++|||++.+||+|+++|||||+|+|+||+.+|+||..+.+ ...+|++|+++++|||+||+| T Consensus 1 ms~-~~l~~~w~~~l~~~~~~~i~~~~~~~~~~~~~enq~~~~~~~~~~l~ea~~~~g-~~~~~~~t~~~~~~~P~Li~l 78 (462) T protein:vir:10 1 MSI-QQLQEKWAPVLNHESVPEIKDSYKKGVVAQLLENQENAIREEGQVLNETLQTTG-YTTGDTATGPVAGFDPVLISL 78 (462) T ss_pred Cch-HHHHHHhhhhhcccccchhhhhhHHHHHHHHhhhHHHHHHhcccchhccccccC-CCcCcccccccccccchhhhH Confidence 998 699999999999999999999999999999999999999999999999986644 778899999999999999999 Q ss_pred HHHHHhhhhhhheeeeccCCCcceeeeEEEeeecC------CCCcccccccccccccccccccccccccc---------- Q lcl|NC_015280. 81 IRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTN------QSGNEAFFDEPDAQFSGTDGATPPTATTE---------- 144 (455) Q Consensus 81 ~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~------qsG~EAlfnEa~t~fSg~~~~~~~~~~~~---------- 144 (455) ||||+|||||+|||||||||||||||||||+||.+ |+|+||||+|||+.||+..+.....+... 
T Consensus 79 ~Rra~p~LIa~DIwGVQPMTgPTGLIFAmRsrY~~~~~~~nq~gtEAlfnEadt~fSg~~~~~~~~~~~~~~~~~~~~~~ 158 (462) T protein:vir:10 79 IRRSMPQLIAYDVAGVQPMTGPTGLIFAMRSFYGSERRPANSDFREALFNEPNAGFSGGAGTGLSNYDPTASSSAVNDAE 158 (462) T ss_pred HHHHHhhhhhhcceeeecCCcchhhhheeeeeccCCccccccccchhhhccCCcCccccccccccccccccccccccccc Confidence 99999999999999999999999999999999975 57899999999999999765543332211 Q ss_pred -cCcccCCCCCCCCcccccccccccccchhhhhhcCCC-CCCccccceeEEEEEEEEeeccccccceeHHHHHhHHHhhC Q lcl|NC_015280. 145 -KNPALINDATGGGTTATNYDLASSKFSTSEQEALGDG-ASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKAIHG 222 (455) Q Consensus 145 -~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s-~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkAiHG 222 (455) .++...++.. +...+.+..+.||+|+++|.||++ +++.|+||+|+||||+||||||||||||||||||||||||| T Consensus 159 g~~~~~~~~~~---~g~~~~~~~~~GM~Ta~aE~lg~~s~n~~f~EMaFsIeK~tVtAKSRaLKAEYTiELAQDLKAIHG 235 (462) T protein:vir:10 159 GANPGLLNDSP---AGTYEVTGDATGMATATAEALDDSSASTAFREMGFSIEKVTVTAKSRALKAEYSIEMAQDLKAIHG 235 (462) T ss_pred cccceeecCCC---ccceecccccccccchhccccCCccCCcchhhceeEEEEEEEeeeccceeccccHHHHHHHHHhcC Confidence 1111111111 112233445789999999999964 56789999999999999999999999999999999999999 Q ss_pred CChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeeccccchhhHHHHHHHHHHHHHHHHHHHHHhcC Q lcl|NC_015280. 
223 LDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDVDSNGRWSVEKFKGLLFQIERDANAIAQETRR 302 (455) Q Consensus 223 LDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~~~~gr~~ve~~k~l~~qi~~ean~i~~~T~~ 302 (455) ||||+||+||||||||+||||||||+||+||+|||+.|++++|+|||+++++|||++|+||+|+|||+||||+|+|||+| T Consensus 236 LDAEtELaNILSTEImlEINReii~~l~~~a~~~k~~~~~~~Gv~dl~~~~~gr~~~e~~k~l~~qi~~ean~i~~~t~r 315 (462) T protein:vir:10 236 LDAESELANILSTEILAEINREVVRTIYVNAVKGAIANTATDGIFDLDVDSNGRWSVEKFKGLLFQIERDSNAIGQETRR 315 (462) T ss_pred CChhHHHHHHHHHHHHHHhhHHHHhhhhhhheeeecccccccceeeeccccchHHHHHHHHHHHHHHHHHHHHHHHHhcc Confidence 99999999999999999999999999999999999999999999999999999999999999999999999999999999 Q ss_pred CCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEEeccccccCCcceEEEEEecCccc Q lcl|NC_015280. 303 GKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQYYVVGYKGTNAY 382 (455) Q Consensus 303 ~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~KG~~~~ 382 (455) |+|||||||+|||++|+|+|||++.|++++..+. .++|+++.+|+|+|+|||+||||||++||+|+|||+|||||++++ T Consensus 316 ~~~n~~i~S~~Va~~La~sG~l~~~p~~~~~~~~-~~~d~~~~~~~G~l~~r~~vy~D~Y~~~ns~~dy~~vG~KG~~~~ 394 (462) T protein:vir:10 316 GKGNILICSADVASALGMAGVLDYAPGLQGNSAL-TGVDDTSSTLVGTLNGRIKVYVDPYSSNVADKHFYVAGYKGTSPY 394 (462) T ss_pred ccceEEEEchhHHHHhhhccchhccccccccccc-cccccccceeEEEecCceEEEEecccCCCcccceEEEEEeCCccc Confidence 9999999999999999999999999999999876 489999999999999999999999999999999999999999999 Q ss_pred cceeEEcccccccceeecCCccccceeeeeeecceeecccccccccccccCchhhhhccchhhhhhhhhhcCC Q lcl|NC_015280. 383 DAGLFYCPYVPLQMYRAIGQDTFQPRIGFKTRYGMVLNPFAKGLTALSDSDPQAAGNLNANAYYRRVRVANLM 455 (455) Q Consensus 383 daglfyaPYv~l~~~~~~Dp~s~qP~~g~~tRY~l~~nP~~~~~~~~~~~~~~~~~~~~~n~y~r~~~v~~~~ 455 (455) |+||||||||||+++|++||+||||++||||||||++|||+.+.++. 
......|+|.|||||+||||| T Consensus 395 ~~glfy~PYv~l~~~~~~dp~sfqP~~g~~tRY~l~~NP~t~~~~~~-----~~~~~~~~n~y~r~~~v~~l~ 462 (462) T protein:vir:10 395 DAGLFYCPYVPLQQVRAINPNTFQPKIGFKTRYGMVSNPFSGGLTQG-----SGALTANANKYYRRVQVANLM 462 (462) T ss_pred ccceeeccccccccccccCCccccceeeeeeeeeeeecCCCCCcCCc-----cccccccCcceeeeEEeeccC Confidence 99999999999999999999999999999999999999999988733 245688999999999999999 No 4 >protein:vir:103181 Length: 457 # NCBI annotation: gp135 # Family: family:all:364 # MgeID: mge:1583 # MgeName: Syn9 # Cross-refs: genbank:acc:YP_717802;genbank:gi:113200639;genbank:GeneID:4239190 Probab=100.00 E-value=2.2e-231 Score=1285.10 Aligned_cols=444 Identities=68% Similarity=1.035 Sum_probs=409.6 Q ss_pred CcchHHHHHHhhHhhcCCCCccccchhhHHHHHHHhhhHHHHHHHHHHhhhhhhhchhhhccccccccccccccchhhhH Q lcl|NC_015280. 1 MYNAENLQEKWAPVLNHEGLNDIKDPYRKSVTAILLENQERALAEERAVLTEAPTNVGPINTPTTSSGAVAGFDPILISL 80 (455) Q Consensus 1 m~~~~~~~~kw~~~l~~~~~~~i~~~~~~~v~~~~~enq~~~~~e~~~~l~ea~~~~~~~~~~st~tg~i~~~~P~Lv~l 80 (455) ||+ |+|+|||+||||||++|||++.|||+|+++|||||+|+|+||+++|+||. .-++.+.+|++|++|+++||+||+| T Consensus 1 m~~-~~l~~~w~~~l~~~~~~~i~~~~~~~~~~~~lenq~~~~~~~~~~l~ea~-~~~g~~~~s~~t~~v~~~~P~Li~l 78 (457) T protein:vir:10 1 MSF-QNLQEKWAPVLEHDSLPEIGDSYKKGVVAQLLENQEKAIAEEGKILTETL-QTTGYTGGDTVTGPVAGFDPVLISL 78 (457) T ss_pred Cch-HHHHHHhhHhhccCccchhhhhHHHHHHHHHhhhHHHHHHhccccccccc-cccCCCcccccccccccccchhhhh Confidence 988 68999999999999999999999999999999999999999999999986 3446778889999999999999999 Q ss_pred HHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCC------ccccccccccccccccccccccccc------ccCcc Q lcl|NC_015280. 81 IRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSG------NEAFFDEPDAQFSGTDGATPPTATT------EKNPA 148 (455) Q Consensus 81 ~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG------~EAlfnEa~t~fSg~~~~~~~~~~~------~~~~~ 148 (455) |||++|||||+|||||||||||||||||||+||.+|.+ +||||+|||+.||+..++....... +.++. 
T Consensus 79 ~Rra~p~LIa~DIwGVQPmTgPTGLIFAmRsrY~~q~~~~~a~~~EAl~nEadt~fSg~~~~~~~~~~~~~~~~~gt~~~ 158 (457) T protein:vir:10 79 IRRSMPQLIAYDIAGVQPMTGPTGLIFAMRTNYGAERNPAAAGYDEAFFNEPNAGFSGGPGAYDPGATGVTNDAEGTNPA 158 (457) T ss_pred hHHHHhhhhhhhcceeecCCCcceeeeeeeeeecCccccccccccceeeeccCcccCccccccccccccccccccccccc Confidence 99999999999999999999999999999999999876 7999999999999976655443322 22222 Q ss_pred cCCCCCCCCcccccccccccccchhhhhhcCCC-CCCccccceeEEEEEEEEeeccccccceeHHHHHhHHHhhCCChhH Q lcl|NC_015280. 149 LINDATGGGTTATNYDLASSKFSTSEQEALGDG-ASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKAIHGLDAES 227 (455) Q Consensus 149 ~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s-~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ 227 (455) ..++...+ ....++.+.||+|+++|.||++ +++.|+||+|+||||+|||||||||||||||||||||||||||||+ T Consensus 159 ~~~~~~~~---~~~~~~~~~gmsTA~aE~lgd~~~n~~f~EMaFsIeK~tVtAKSRaLKAEYTiELAQDLKAiHGLDAEt 235 (457) T protein:vir:10 159 LLNDSPAG---TYEQADDATGMSTATVEALDDSTANTAFREMGFSIEKVTVTARARALKAEYSIEMAQDLKAIHGLDAEQ 235 (457) T ss_pred ccCccccc---cccccccccchhhhhhhccCCCCCccchhhheeEEEEEEEeeeccceeccccHHHHHHHHHhcCCChhH Confidence 22222222 2335667899999999999965 4567999999999999999999999999999999999999999999 Q ss_pred HHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeeccccchhhHHHHHHHHHHHHHHHHHHHHHhcCCCccE Q lcl|NC_015280. 
228 ELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDVDSNGRWSVEKFKGLLFQIERDANAIAQETRRGKGNI 307 (455) Q Consensus 228 ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~~~~gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~ 307 (455) ||+||||||||+||||||||+||+||+|||+.|++++|+|||+++++|||++|+||+|+|||+||||+|+|||+||+||| T Consensus 236 ELaNILStEImlEINReii~~l~~~a~~~~~~~~~~~gv~dl~~~~~g~~~~e~~k~L~~~i~~ean~i~~~T~rg~gn~ 315 (457) T protein:vir:10 236 ELANILSTEILAEINREVVRTIYTNAVAGAQNNTATAGVFDLDVDSNGRWSVEKFKGLLFQIERDANAIGHQTRRGKGNI 315 (457) T ss_pred HHHHHHHHHHHHHhhHHHHHhHhhhheeeeccccccceeeeeeccccchhhHHHHHHHHHHHHHHHHHHHHhhccccceE Confidence 99999999999999999999999999999999999999999999999999999999999999999999999999999999 Q ss_pred EEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEEeccccccCCcceEEEEEecCccccceeE Q lcl|NC_015280. 308 IITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQYYVVGYKGTNAYDAGLF 387 (455) Q Consensus 308 ~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~KG~~~~daglf 387 (455) ||||++||++|+++|||+++|++++..+. +++|+++.+|+|+|+|||+||||||+++|||+|||+|||||++|+|+||| T Consensus 316 ~i~S~~Va~~L~~sg~l~~~p~~~~~~~~-~~~d~~~~~~~G~l~~r~~vy~D~Ya~~ns~~dy~~vG~KG~~~~~~glf 394 (457) T protein:vir:10 316 LICSADVVSALGMAGVLDYTPALNGNNGL-AGVDDTSSTLVGTLNGRIKVYVDPYSANVADKHFYVAGYKGTSPYDAGLF 394 (457) T ss_pred EEEchhHHHHHhhcccccccchhhccccc-cccccccceeEEEecCCeEEEEecccccCCccceEEEEEeCCcceeccee Confidence 99999999999999999999999999776 69999999999999999999999999999999999999999999999999 Q ss_pred EcccccccceeecCCccccceeeeeeecceeecccccccccccccCchhhhhccchhhhhhhhhhcCC Q lcl|NC_015280. 388 YCPYVPLQMYRAIGQDTFQPRIGFKTRYGMVLNPFAKGLTALSDSDPQAAGNLNANAYYRRVRVANLM 455 (455) Q Consensus 388 yaPYv~l~~~~~~Dp~s~qP~~g~~tRY~l~~nP~~~~~~~~~~~~~~~~~~~~~n~y~r~~~v~~~~ 455 (455) |||||||++++++||+||||++||||||||++|||+.+.++... 
....|+|.||||++|+||| T Consensus 395 y~PYv~l~~~~~~dp~sfqP~~g~~tRY~l~~NP~~~~~~~~~~-----~~~~~~n~~~~rs~vs~ll 457 (457) T protein:vir:10 395 YCPYVPLQQVRAINPDTFQPKIGFKTRYGMVSNPFAGGLTQGSG-----ALTVNANKYYRRVQVANLM 457 (457) T ss_pred ecccccccccCccCCccccceeeeeeeeeeeecccccccccccc-----cccccchhhcceeeeeecC Confidence 99999999999999999999999999999999999998874432 3456899999999999999 No 5 >protein:vir:106286 Length: 534 # NCBI annotation: gp23 major head protein # Family: family:all:364 # MgeID: mge:1474 # MgeName: Aeh1 # Cross-refs: genbank:acc:NP_944113;genbank:gi:38640157;genbank:GeneID:2658034 Probab=100.00 E-value=2e-223 Score=1241.48 Aligned_cols=447 Identities=40% Similarity=0.662 Sum_probs=400.5 Q ss_pred CcchHHHHHHhhHhhcCCCCccccchhhHHHHHHHhhhHHHHHHHHH----------------------Hhhhhhhhchh Q lcl|NC_015280. 1 MYNAENLQEKWAPVLNHEGLNDIKDPYRKSVTAILLENQERALAEER----------------------AVLTEAPTNVG 58 (455) Q Consensus 1 m~~~~~~~~kw~~~l~~~~~~~i~~~~~~~v~~~~~enq~~~~~e~~----------------------~~l~ea~~~~~ 58 (455) |++ |+|+|||+||||||++|||++.+||+|+++|||||||+|+||| +.|+||..+.+ T Consensus 1 ~~~-~~l~~kw~p~l~~~~~~~i~~~~~~~~~a~l~enq~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~l~ea~~~~~ 79 (534) T protein:vir:10 1 MSK-KSLLKKWQPLVESEGMPAIASMKRKDIVARIFENQDEDIAHNEGGVYTDQVVVNSMVDVKGRIEEARLAEANIGGD 79 (534) T ss_pred Cch-hHHHHHhHHhhcCCccccccchhhhhhhhhhhhhHHHHHhhhcccccchhhhhhhhhccccchhhccccccccccc Confidence 665 5999999999999999999999999999999999999999985 55999865443 Q ss_pred -----hhccccccccccccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCC----Cccccccc--cc Q lcl|NC_015280. 59 -----PINTPTTSSGAVAGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQS----GNEAFFDE--PD 127 (455) Q Consensus 59 -----~~~~~st~tg~i~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qs----G~EAlfnE--a~ 127 (455) ..|+||++|++|+++||+||+|||||+|||||+|||||||||||||||||||+||.+|. 
++||||+| +| T Consensus 80 ~g~~~~~ia~s~~s~~v~~~~P~Li~lvRra~p~LIa~DIwGVQPMTgPTGLIFAMRsrY~n~~~~~s~~EAf~ne~~ad 159 (534) T protein:vir:10 80 HGYDATKIASGETSGSITNVGPAVMGLVRRAIPQLIAFDICGVQPMTSSTGQVFTLRAIYGGNSQDANAREAFHPTYGPD 159 (534) T ss_pred cccccccccccccccccccccchhhhHHHHHHHhhhhhhhheeccCCchhhhheeeeeeecCCCCCcccccccccccccc Confidence 37899999999999999999999999999999999999999999999999999999875 67999999 99 Q ss_pred cccccccccccccccc-----------------------ccCccc------CCCCC----------CCCccccccccccc Q lcl|NC_015280. 128 AQFSGTDGATPPTATT-----------------------EKNPAL------INDAT----------GGGTTATNYDLASS 168 (455) Q Consensus 128 t~fSg~~~~~~~~~~~-----------------------~~~~~~------~~~~~----------~g~t~~~~~~~~~~ 168 (455) +.|||+.++....... +..+.. .+... .........++++. T Consensus 160 t~fSG~~~a~~~~~~~~~~a~~~g~~~~~~~~~~t~~~~Gt~~~~~~~~~~v~~~~~~~~~ag~~~~~~~~~~~~y~~~~ 239 (534) T protein:vir:10 160 ADFSGRGAAQDIAVFVRGTAVASGAFAKLHIEAATGVQAGTKTVQFIKDYAVDALPADQTEAGLAYKWLLANGYAVETSS 239 (534) T ss_pred ccccccccccccccccccccccccccccccccccccccccccccccccccccccccCCccccccccccccccccceeccc Confidence 9999976543211000 000000 00000 00111223567889 Q ss_pred ccchhhhhhc---CCCCCCccccceeEEEEEEEEeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHH Q lcl|NC_015280. 
169 KFSTSEQEAL---GDGASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREV 245 (455) Q Consensus 169 gm~Ta~aE~L---G~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReI 245 (455) ||+|+.+|.| |++++++|+||+|+||||+|||||||||||||||||||||||||||||+||+||||||||+|||||| T Consensus 240 gm~Ta~AE~lg~~ggs~~~~f~EMsFsIdKvtVtAKSRaLKAEYTiELAQDLKAIHGLDAEtELsNILSTEImlEINRei 319 (534) T protein:vir:10 240 AMATAFAELQQGFNGSADNEWNEMSFRIDKQVVEAKSRQLKAQYSIEMAQDLRAVHGLDADSELSSILANEIMHEINREM 319 (534) T ss_pred ccchhhHhhhccCCCCcccchhhcceEEEEEEEeeeccceeccccHHHHHHHHHhcCCChHHHHHHHHHHHHHHHhhHHH Confidence 9999999988 5677889999999999999999999999999999999999999999999999999999999999999 Q ss_pred HHHHhhhheeeeeec----cccceeeeeecccc---chhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHH Q lcl|NC_015280. 246 VRTVYRGAKPGAQAN----VANAGVFDLDVDSN---GRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASAL 318 (455) Q Consensus 246 I~~l~~vA~~~k~~~----v~~~gv~Dl~~~~~---gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L 318 (455) ||+||+||+|||+.+ ++++|+|||+++.| +||.+|+||+|++||++|+|+|+|+|+||+|||||||+|||++| T Consensus 320 i~~l~~~a~~~k~~~~~~~~~~~G~~d~~~~~~~~~~~~~~e~~~~L~~~i~~~an~i~~~T~rg~~n~~v~S~~Va~~L 399 (534) T protein:vir:10 320 VLWINATAKVGKTGWTNMHGGKAGVFDFQDTKDIRGARWAGESYKALVVQIDKEANEIARQTGRGQGNFIICSRNVAAAL 399 (534) T ss_pred HHHHhhhhheeecccccccccccceeeeeccccccchhHHHHHHHHHHHHHHHHHHHHHHhhccccccEEEEchhHHHHH Confidence 999999999999986 56899999999999 99999999999999999999999999999999999999999999 Q ss_pred HhhcccccccccccccccccccccCCceeEEEecCceEEEEeccccccCCcceEEEEEecCccccceeEEccccccccee Q lcl|NC_015280. 
319 AMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQYYVVGYKGTNAYDAGLFYCPYVPLQMYR 398 (455) Q Consensus 319 ~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~KG~~~~daglfyaPYv~l~~~~ 398 (455) +|+|||++.|.+....+ .++|+|+.+|+|+|+|||+|||||| +++|||+|||||++|+|+||||||||||+|+| T Consensus 400 ~~~g~l~~~~~~~~~~~--~~~d~~~~~~~G~l~~~~~vy~D~y----~~~dy~~vG~KG~~~~~~glfyaPYv~l~~~~ 473 (534) T protein:vir:10 400 GHTDMLMTPAVMGANTT--MNTDTTSSLFAGVLAGKYRVYIDQY----AVEDYFTVGYKGASEMDAGLYYCPYVALTPLR 473 (534) T ss_pred hhccchhcccccccccc--ccccCCCceEEEEecCceEEEecCC----CCcceEEEEEeCCcccccceeecccccccccc Confidence 99999999999988876 7999999999999999999999999 68999999999999999999999999999999 Q ss_pred ecCCccccceeeeeeecceeecccccccc-----cccccCchhhhhccchhhhhhhhhhcC Q lcl|NC_015280. 399 AIGQDTFQPRIGFKTRYGMVLNPFAKGLT-----ALSDSDPQAAGNLNANAYYRRVRVANL 454 (455) Q Consensus 399 ~~Dp~s~qP~~g~~tRY~l~~nP~~~~~~-----~~~~~~~~~~~~~~~n~y~r~~~v~~~ 454 (455) ++||+||||++||||||||++|||+++.+ ++.+|+|...+.+|+|.|||||+|||| T Consensus 474 ~~dp~sfqP~~g~~tRY~l~~NP~~~~~~~~~~~~i~~g~~~~~~~ag~n~~~~~~~Vk~l 534 (534) T protein:vir:10 474 GTDPKNFQPVLGFKTRYGVKLHPMADATQNKGFAKISNGMPQHTNMFGKNAFFRRVLVAGV 534 (534) T ss_pred ccCCccccceeeeeeeeceeecCcccccCCccccccccCCcchhhhcccccceeeeeeecC Confidence 99999999999999999999999998654 456888867778899999999999999 No 6 >protein:vir:6901 Length: 522 # NCBI annotation: gp23 major head protein # Family: family:all:364 # MgeID: mge:140 # MgeName: RB69 # Cross-refs: genbank:acc:NP_861877;genbank:gi:32453668;genbank:GeneID:1494303 Probab=100.00 E-value=2.3e-222 Score=1235.69 Aligned_cols=447 Identities=40% Similarity=0.661 Sum_probs=400.6 Q ss_pred CcchHHHHHHhhHhhcCCCCccccchhhHHHHHHHhhhHHHHHHHHH------------Hhhhhhhhchh-----hhccc Q lcl|NC_015280. 1 MYNAENLQEKWAPVLNHEGLNDIKDPYRKSVTAILLENQERALAEER------------AVLTEAPTNVG-----PINTP 63 (455) Q Consensus 1 m~~~~~~~~kw~~~l~~~~~~~i~~~~~~~v~~~~~enq~~~~~e~~------------~~l~ea~~~~~-----~~~~~ 63 (455) |.++|+|+|||+||||||++|+|.+. 
||+|+++|||||||+|+||+ ++|+||..+.+ ..|+| T Consensus 4 ~~~~e~l~~kw~p~l~~~~~~~~~~~-~~~~~a~l~enq~~~~~~~~~~~~~~~~~~~~~~l~ea~~~~~~~~~~~~i~e 82 (522) T protein:vir:69 4 IKTKAQLVDKWKELLEGEGLPEIANS-KQAIIAKIFENQEKDFEVSPEYKDEKIAQAFGSFLTEAEIGGDHGYNAQNIAA 82 (522) T ss_pred cchHHHHHHhhHHHhcCCCCCccccc-hhhhhhhhhhhhhHHhhcccccchhHHHHhhhhhhhhhccccccCCCcccccc Confidence 77889999999999999999999985 99999999999999999987 88999976543 68999 Q ss_pred cccccccccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCC----Cccccc--cccccccccccccc Q lcl|NC_015280. 64 TTSSGAVAGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQS----GNEAFF--DEPDAQFSGTDGAT 137 (455) Q Consensus 64 st~tg~i~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qs----G~EAlf--nEa~t~fSg~~~~~ 137 (455) |++|++|++|||+||+|+|||+|||||+|||||||||||||||||||+||.+|. ++|+|+ +|+|+.|||..+.. T Consensus 83 s~~t~~v~~~~P~li~lvrRa~p~LIa~DIwGVQPMTgPTGLIFAMRsrY~~q~~~~~~~eaf~~~neadt~fSG~~~~t 162 (522) T protein:vir:69 83 GQTSGAVTQIGPAVMGMVRRAIPNLIAFDICGVQPMNSPTGQVFALRAVYGKDPIAAGAKEAFHPMYAPDAMFSGQGAAK 162 (522) T ss_pred cccccccccccchHHHHHHHHHhhhhhhhceeeccCCchhhhheeeeeeccCCcccCccccccccccccccccccccccc Confidence 999999999999999999999999999999999999999999999999999875 667774 99999999976543 Q ss_pred ccccccccC----------------------------------cccCCCCCCCCcccccccccccccchhhhhh---cCC Q lcl|NC_015280. 138 PPTATTEKN----------------------------------PALINDATGGGTTATNYDLASSKFSTSEQEA---LGD 180 (455) Q Consensus 138 ~~~~~~~~~----------------------------------~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~---LG~ 180 (455) ......... +...+............++++.||+|+.+|. 
||+ T Consensus 163 ~~~~~~~~~~t~~G~~~~~~~~~~gt~~~~~~a~~t~~~t~~~~~~~~~ai~s~~~~~~~y~~g~GmsTa~aEal~~lgg 242 (522) T protein:vir:69 163 KFPALAASTQTKVGDIYTHFFQETGTVYLQASAQVTISSSADDAAKLDAEIIKQMEAGALVEIAEGMATSIAELQEGFNG 242 (522) T ss_pred cccccccccccccccccccccccccceeeecccCCcCCCCCcccccccchhccccccccceeeccccchhhhhhcccCCC Confidence 321110000 0000000011122334677899999999997 577 Q ss_pred CCCCccccceeEEEEEEEEeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeec Q lcl|NC_015280. 181 GASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQAN 260 (455) Q Consensus 181 s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~ 260 (455) +++++|+||+|+||||+|||||||||||||||||||||||||||||+||+||||||||+|||||||++|+.+|+.+++.+ T Consensus 243 ss~~~f~EMaFsIeKvTVtAKSRaLKAEYTiELAQDLKAIHGLDAEtELaNILSTEImlEINReii~~i~~sa~~~~~g~ 322 (522) T protein:vir:69 243 STDNPWNEMGFRIDKQVIEAKSRQLKAAYSIELAQDLRAVHGMDADAELSGILATEIMLEINREVVDWINYSAQVGKSGM 322 (522) T ss_pred CcccchhhhcceEeeEEEeeecccccccccHHHHHHHHHhcCCChHHHHHHHHHHHHHHHhhHHHHhhhhhhheeecccc Confidence 88899999999999999999999999999999999999999999999999999999999999999999988888877744 Q ss_pred c----ccceeeeeecccc---chhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccc Q lcl|NC_015280. 261 V----ANAGVFDLDVDSN---GRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGA 333 (455) Q Consensus 261 v----~~~gv~Dl~~~~~---gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~ 333 (455) + +++|+|||+++.| |||++|+||+|+|||+||||+|+|+|+||+|||||||+|||++|+|+|++++.++.... 
T Consensus 323 t~~~~~~~Gv~Dl~~~~~~~~~rw~~e~~k~L~~~i~~~an~i~~~T~rg~~n~~i~S~~Va~~L~~~~~~~~~~~~~~~ 402 (522) T protein:vir:69 323 TNIVGSKAGVFDFQDPIDIRGARWAGESFKALLFQIDKEAVEIARQTGRGEGNFIIASRNVVNVLASVDTGISYAAQGLA 402 (522) T ss_pred ccccccccceeecccccccccchhHHHHHHHHHHHHHHHHHHHHHhcccccccEEEEchhHHHHHhhccccccccccccc Confidence 3 7899999999999 99999999999999999999999999999999999999999999999999999988877 Q ss_pred cccccccccCCceeEEEecCceEEEEeccccccCCcceEEEEEecCccccceeEEcccccccceeecCCccccceeeeee Q lcl|NC_015280. 334 VGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQYYVVGYKGTNAYDAGLFYCPYVPLQMYRAIGQDTFQPRIGFKT 413 (455) Q Consensus 334 ~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~KG~~~~daglfyaPYv~l~~~~~~Dp~s~qP~~g~~t 413 (455) .+ .++|+|+++|+|+|+|||+|||||| +++|||+|||||++|+|+||||||||||+|+|++||+||||++|||| T Consensus 403 ~g--~~~d~~~~~~~G~l~~~~~vy~D~y----~~~dy~~vG~KG~~~~~~glfyaPYv~l~~~~~~dp~sfqP~~g~~t 476 (522) T protein:vir:69 403 SG--FNTDTTKSVFAGVLGGKYRVYIDQY----AKQDYFTVGYKGANEMDAGIYYAPYVALTPLRGSDPKNFQPVMGFKT 476 (522) T ss_pred cc--ccccCCCceEEEEecCceEEEecCC----CCcceEEEEEeCCcccccceeeccccccccccccCCccccceeeeee Confidence 77 4789999999999999999999999 68999999999999999999999999999999999999999999999 Q ss_pred ecceeecccccccc-----cccccCchhhhhccchhhhhhhhhhcC Q lcl|NC_015280. 
414 RYGMVLNPFAKGLT-----ALSDSDPQAAGNLNANAYYRRVRVANL 454 (455) Q Consensus 414 RY~l~~nP~~~~~~-----~~~~~~~~~~~~~~~n~y~r~~~v~~~ 454 (455) ||||++|||+++.+ +++||+|.+++.+|||.|||||+|||| T Consensus 477 RY~l~vNP~~~~~~~~~~~ri~~g~p~~~~~~~~n~y~r~v~v~~~ 522 (522) T protein:vir:69 477 RYGIGVNPFAESSLQAPGARIQSGMPSILNSLGKNAYFRRVYVKGI 522 (522) T ss_pred eeceeecCcccccCCcccceeecccchhhcccCCcceeeEEEeecC Confidence 99999999998543 578999988889999999999999999 No 7 >protein:vir:101811 Length: 529 # NCBI annotation: gp23 # Family: family:all:364 # MgeID: mge:1580 # MgeName: 31 # Cross-refs: genbank:acc:YP_238888;genbank:gi:66391963;genbank:GeneID:3416638 Probab=100.00 E-value=7.4e-222 Score=1232.91 Aligned_cols=446 Identities=41% Similarity=0.686 Sum_probs=393.7 Q ss_pred Ccc-hHHHHHHhhHhhcCCCCccccchhhHHHHHHHhhhHHHHHHHHHHh------------hhhhhhchh-----hhcc Q lcl|NC_015280. 1 MYN-AENLQEKWAPVLNHEGLNDIKDPYRKSVTAILLENQERALAEERAV------------LTEAPTNVG-----PINT 62 (455) Q Consensus 1 m~~-~~~~~~kw~~~l~~~~~~~i~~~~~~~v~~~~~enq~~~~~e~~~~------------l~ea~~~~~-----~~~~ 62 (455) |+- +|+|+|||+||||||++|||++.+||+|+++|||||||+++|++.+ |+|+..+.+ .+|+ T Consensus 1 ~~~~~~~l~~kw~p~l~~~~~~~i~~~~~~~~~a~l~enq~~~~~~~~~~~~~~~~e~~~~~l~e~~~~~~~~~~~~~i~ 80 (529) T protein:vir:10 1 MSLKNKEILNKWTPLLEGEGLPEIAGKNKQALVAQILEAQEKDSKSDPVYRDDKLIEAFGQSLMEAEVAGDHGYDPTNIA 80 (529) T ss_pred CccchHHHHHHhhHhhcCCccchhccchhhhhhhhhhhhhHHHHhcccccchhhhhhhhhccchhhcccccccccccccc Confidence 543 3689999999999999999999999999999999999999998744 999886654 6999 Q ss_pred ccccccccccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCC----Cccccccc--ccccccccccc Q lcl|NC_015280. 63 PTTSSGAVAGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQS----GNEAFFDE--PDAQFSGTDGA 136 (455) Q Consensus 63 ~st~tg~i~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qs----G~EAlfnE--a~t~fSg~~~~ 136 (455) |||+|++|++|||+||+|||||+|||||+|||||||||||||||||||+||.+|. +.|+||++ +++.||+.... 
T Consensus 81 ~st~t~~v~~~~P~Li~lvRra~p~LIa~DIwGVQPMTgPTGLIFAMRsrY~~~~~~~~~~eaf~~~~~pda~~sga~~~ 160 (529) T protein:vir:10 81 AGQSSGAITNIGPAVIGMVRRAIPSLIAFDIAGVQPMTGPTGQVFALRSVYGKDPLAAGAKEAFHPMYAPDAWHSSLATK 160 (529) T ss_pred cccccccccccCchhhhhHHHHHHhhhhhhhheeccCCchhhhhheeeeeecCCcccccccccccccccccccccccccc Confidence 9999999999999999999999999999999999999999999999999998874 45666655 44555553321 Q ss_pred -------------------------------------------cccc-cccccCcccCCCCCCCCcccccccccccccch Q lcl|NC_015280. 137 -------------------------------------------TPPT-ATTEKNPALINDATGGGTTATNYDLASSKFST 172 (455) Q Consensus 137 -------------------------------------------~~~~-~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~T 172 (455) .... +....+. ...............+++++||+| T Consensus 161 ga~t~~~~t~~~~~ta~~~~a~g~g~ea~f~ea~t~fs~~~~g~~~~~g~~~t~~-~~~~~~~~~~a~~~~~~~~~GmsT 239 (529) T protein:vir:10 161 GATTTTDGTPFAKLTAGQAIAEGDIVGHFFYESGTAFLQNVSGASVTVGTNETGE-ALDKLINAAIGEGKLAEIAEGMAT 239 (529) T ss_pred cccccccccccccccccccccccccceeeecccCceeeccccccccccCccccCc-ccccccccccccccccccccchhh Confidence 1110 0000000 011111222234556778999999 Q ss_pred hhhhhc---CCCCCCccccceeEEEEEEEEeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHH Q lcl|NC_015280. 173 SEQEAL---GDGASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTV 249 (455) Q Consensus 173 a~aE~L---G~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l 249 (455) +.+|.| |++++++|+||+|+||||+|||||||||||||||||||||||||||||+||+||||+|||+||||||||+| T Consensus 240 a~aEaL~~~ggss~~~f~EMaFsIeK~tVtAKSRaLKAEYTiELAQDLKAVHGLDAEtELsNILStEImlEINReii~~l 319 (529) T protein:vir:10 240 SIAELRQGFNGSNDNPWNEMSFRIDKQTVEAKSRQLKAQYSIELAQDLRAVHGMDADSELNGILANEVMLEINREVIDWI 319 (529) T ss_pred hhhhccccCCCcccccccceeeEEEEEEEeeeccceeccccHHHHHHHHHhcCCChHHHHHHHHHHHHHHHhhHHHHHHH Confidence 999999 66788999999999999999999999999999999999999999999999999999999999999999999 Q ss_pred hhhheeeeeecc----ccceeeeeecccc---chhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhc Q lcl|NC_015280. 
250 YRGAKPGAQANV----ANAGVFDLDVDSN---GRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSG 322 (455) Q Consensus 250 ~~vA~~~k~~~v----~~~gv~Dl~~~~~---gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG 322 (455) |++|+|||+.|+ +.+|+|||+++.| +||++|+||+|++||++|+|+|+|+|+||+|||||||++||++|+|+| T Consensus 320 ~~~a~~~~~~~~~~~~~~~Gv~d~~~~~~~~~~~~~~e~~~~L~~~i~~~an~I~~~T~rg~~n~vi~S~~Va~~L~~~~ 399 (529) T protein:vir:10 320 NYTAQVGKSGWTKTDGSASGVFDFQDPIDVRGARWAGESYKALLIQIDKEANEIARQTGRGAGNFIIASRNVVSALALID 399 (529) T ss_pred hhhhhhhccccccccccccceeecccCccccccchHHHHHHHHHHHHHHHHHHHHHhhccccceEEEEchHHHHHHHhhc Confidence 999999998877 5669999998876 999999999999999999999999999999999999999999999999 Q ss_pred ccccccccccccccccccccCCceeEEEecCceEEEEeccccccCCcceEEEEEecCccccceeEEcccccccceeecCC Q lcl|NC_015280. 323 VLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQYYVVGYKGTNAYDAGLFYCPYVPLQMYRAIGQ 402 (455) Q Consensus 323 ~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~KG~~~~daglfyaPYv~l~~~~~~Dp 402 (455) + .+.|+.++..++ .++|+++.+|+|+|+|||+|||||| +++|||+|||||++++|+||||||||||+|+|++|| T Consensus 400 ~-~~~~~~~~~~sg-~~~d~~~~~~~G~l~~~~~vy~D~y----~~~dy~~vG~KG~~~~~~glfy~PYv~l~~~~~~dp 473 (529) T protein:vir:10 400 T-NISPAAQGMASG-LNADTTKGVFAGILGGRYKVYIDQY----ARQDYFTMGYRGANNLDAGIYYCPYVALTPLRGFDP 473 (529) T ss_pred c-cccccccccccc-cccccCCceEEEEecCceEEEecCC----CCcceEEEEEeCCcccccceeeccccccccccccCC Confidence 6 689999999887 4789999999999999999999999 689999999999999999999999999999999999 Q ss_pred ccccceeeeeeecceeecccccccc-----cccccCchhhhhccchhhhhhhhhhcC Q lcl|NC_015280. 403 DTFQPRIGFKTRYGMVLNPFAKGLT-----ALSDSDPQAAGNLNANAYYRRVRVANL 454 (455) Q Consensus 403 ~s~qP~~g~~tRY~l~~nP~~~~~~-----~~~~~~~~~~~~~~~n~y~r~~~v~~~ 454 (455) +||||++||||||||++|||+++.+ ++++|+||. 
..+|+|.|||||+|||| T Consensus 474 ~sfqP~~g~~tRY~l~~NP~~~~~~~~~~~r~~~g~~~~-~~ag~n~~~r~~~Vk~l 529 (529) T protein:vir:10 474 KNFQPVMGFKTRYAIGVNPFAESRTQAPQGRITSGMPGV-NSVGKNAYFRRVWVKGL 529 (529) T ss_pred CcccceeeeeeeeceeecCccccccccccccccCCcchh-hhcCccceeEEeeeccC Confidence 9999999999999999999998644 457999996 48899999999999999 No 8 >protein:vir:101039 Length: 529 # NCBI annotation: major capsid protein # Family: family:all:364 # MgeID: mge:1582 # MgeName: 44RR2.8t # Cross-refs: genbank:acc:NP_932516;genbank:gi:37651642;genbank:GeneID:2610532 Probab=100.00 E-value=9.4e-222 Score=1232.32 Aligned_cols=445 Identities=40% Similarity=0.672 Sum_probs=394.5 Q ss_pred CcchHHHHHHhhHhhcCCCCccccchhhHHHHHHHhhhHHHHHHHHHH------------hhhhhhhchh-----hhccc Q lcl|NC_015280. 1 MYNAENLQEKWAPVLNHEGLNDIKDPYRKSVTAILLENQERALAEERA------------VLTEAPTNVG-----PINTP 63 (455) Q Consensus 1 m~~~~~~~~kw~~~l~~~~~~~i~~~~~~~v~~~~~enq~~~~~e~~~------------~l~ea~~~~~-----~~~~~ 63 (455) |++ |+|+|||+||||||++|||++.+||+|+++|||||||+++|++. +|+|+..+.+ .+|+| T Consensus 3 ~~~-~~l~~kw~p~l~~~~~~~i~~~~~~~~~a~l~enq~~~~~~~~~~~~~~~~e~~~~~l~~~~~~~~~~~~~~~i~e 81 (529) T protein:vir:10 3 LKN-KEILNKWTPLLEGEGLPEIAGKNKQALVAQILEAQEKDSKSDPVYRDDKLIEAFGQSLMEAEVAGDHGYDPTNIAA 81 (529) T ss_pred ccH-HHHHHHhHHHhcCCccchhccchhhhhhhhhhhhhHHHHhhccccchhhhhhhhhcccchhhcccccccccccccc Confidence 555 47999999999999999999999999999999999999999873 4888886654 69999 Q ss_pred cccccccccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCC-------------------------- Q lcl|NC_015280. 64 TTSSGAVAGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQS-------------------------- 117 (455) Q Consensus 64 st~tg~i~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qs-------------------------- 117 (455) ||+|++|++|||+||+|||||+|||||+|||||||||||||||||||+||.++. 
T Consensus 82 st~t~~v~~~~P~Li~lvRra~p~LIa~DIwGVQPMTgPTGLIFAMRsrY~~~~~~~~~~eaf~~~y~Pda~~sga~~~g 161 (529) T protein:vir:10 82 GQSSGAITNIGPAVIGMVRRAIPSLIAFDIAGVQPMTGPTGQVFALRSVYGKDPLAAGAKEAFHPMYAPDAWHSSLATKG 161 (529) T ss_pred ccccccccccCchhhhhHHHHHhhhhhheeeeeecCCchhhhhhhhheeecCCccccccccccccccccccccccccccc Confidence 999999999999999999999999999999999999999999999999998763 Q ss_pred -----------------------Ccccccccccccccccccccccc-cccccCcccCCCCCCCCcccccccccccccchh Q lcl|NC_015280. 118 -----------------------GNEAFFDEPDAQFSGTDGATPPT-ATTEKNPALINDATGGGTTATNYDLASSKFSTS 173 (455) Q Consensus 118 -----------------------G~EAlfnEa~t~fSg~~~~~~~~-~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta 173 (455) |.|+||+|+++.||+...+.... +....+. ...............++++.||+|+ T Consensus 162 a~~~~~~~~~~~~t~~~~~a~~~g~ea~f~ea~t~fs~~~~g~~~~~g~~~~~~-~~~~~~~~~~a~~~~~~~~~Gm~Ta 240 (529) T protein:vir:10 162 ATTTTDGTPFAKLTAGQAIAEGDIVGHFFYESGTAFLQNVSGASVTVGTNETGE-ALDKLINAAIGEGKLAEIAEGMATS 240 (529) T ss_pred cccccCccccccccccccccccCcceeeeecccceecccccccccccCccccCc-ccccccccccccccccccccccchh Confidence 33667777777777644322221 1111111 1112222333455677889999999 Q ss_pred hhhhc---CCCCCCccccceeEEEEEEEEeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHh Q lcl|NC_015280. 
174 EQEAL---GDGASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVY 250 (455) Q Consensus 174 ~aE~L---G~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~ 250 (455) ++|.| |++++++|+||+|+||||+|||||||||||||||||||||||||||||+||+||||+|||+||||||||+|| T Consensus 241 ~aEaL~~~g~ss~~~f~EMaFsIeK~tVtAKSRaLKAEYTiELAQDLKAVHGLDAEtELsNILStEImlEINReii~~l~ 320 (529) T protein:vir:10 241 IAELRQGFNGSNDNPWNEMSFRIDKQTVEAKSRQLKAQYSIELAQDLRAVHGMDADSELNGILANEVMLEINREVIDWIN 320 (529) T ss_pred hhhccccCCCcccccccceeeEEEEEEEeeeccceeccccHHHHHHHHHhcCCChHHHHHHHHHHHHHHHhhHHHHHhHh Confidence 99999 567788999999999999999999999999999999999999999999999999999999999999999999 Q ss_pred hhheeeeeecc----ccceeeeeecccc---chhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcc Q lcl|NC_015280. 251 RGAKPGAQANV----ANAGVFDLDVDSN---GRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGV 323 (455) Q Consensus 251 ~vA~~~k~~~v----~~~gv~Dl~~~~~---gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~ 323 (455) +||+|||+.|+ +++|+|||+++.| +||++|+||+|++||++|+|+|+|+|+||+|||||||++||++|+|+|+ T Consensus 321 ~~a~~~k~~g~~~~~~~~Gv~d~~~~~~~~~~~~~~e~~k~L~~~i~~~an~I~~~T~rg~~n~vi~S~~Va~~L~~~~~ 400 (529) T protein:vir:10 321 YTAQVGKSGWTKTDGSASGVFDFQDPIDVRGARWAGESYKALLIQIDKEANEIARQTGRGAGNFIIASRNVVSALALIDT 400 (529) T ss_pred hhhhhhhcccccccccccceeecccCccccccchHHHHHHHHHHHHHHHHHHHHHhhccccceEEEEchHHHHHHHhhhh Confidence 99999998877 8889999998876 9999999999999999999999999999999999999999999999999 Q ss_pred cccccccccccccccccccCCceeEEEecCceEEEEeccccccCCcceEEEEEecCccccceeEEcccccccceeecCCc Q lcl|NC_015280. 
324 LDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQYYVVGYKGTNAYDAGLFYCPYVPLQMYRAIGQD 403 (455) Q Consensus 324 l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~KG~~~~daglfyaPYv~l~~~~~~Dp~ 403 (455) +++ |+..+..++ .++|+++.+|+|+|+|||+|||||| +++|||+|||||++++|+||||||||||+|+|++||+ T Consensus 401 ~~~-~~~~~~~sg-~~~d~~~~~~~G~l~~~~~vy~D~y----~~~dy~~vG~KG~~~~~~glfy~PYv~l~~~~~~dp~ 474 (529) T protein:vir:10 401 NIS-PAAQGMASG-LNADTTKGVFAGILGGRYKVYIDQY----ARQDYFTMGYRGANNLDAGIYYCPYVALTPLRGSDPK 474 (529) T ss_pred hcc-ccccccccc-cccccCCceEEEEecCceEEEecCC----CCcceEEEEEeCCcccccceeeccccccccccccCCC Confidence 855 445555555 4789999999999999999999999 6899999999999999999999999999999999999 Q ss_pred cccceeeeeeecceeecccccccc-----cccccCchhhhhccchhhhhhhhhhcC Q lcl|NC_015280. 404 TFQPRIGFKTRYGMVLNPFAKGLT-----ALSDSDPQAAGNLNANAYYRRVRVANL 454 (455) Q Consensus 404 s~qP~~g~~tRY~l~~nP~~~~~~-----~~~~~~~~~~~~~~~n~y~r~~~v~~~ 454 (455) ||||++||||||||++|||+++.+ ++++|+||. ..+|+|.|||||+|||| T Consensus 475 sfqP~~g~~tRY~l~~NP~~~~~~~~~~~r~~~g~~~~-~~ag~n~~~r~~~Vk~l 529 (529) T protein:vir:10 475 NFQPVMGFKTRYAIGVNPFAESRTQAPQGRITSGMPGV-NSVGKNAYFRRVWVKGL 529 (529) T ss_pred cccceeeeeeeeceeecCccccccccccccccCCcchh-hhcCccceeEEeeeccC Confidence 999999999999999999997654 457999996 48899999999999999 No 9 >protein:vir:103463 Length: 521 # NCBI annotation: major head subunit precursor # Family: family:all:364 # MgeID: mge:1542 # MgeName: RB32 # Cross-refs: genbank:acc:YP_803115;genbank:gi:116326395;genbank:GeneID:4405492 Probab=100.00 E-value=2.5e-220 Score=1224.54 Aligned_cols=447 Identities=40% Similarity=0.656 Sum_probs=397.2 Q ss_pred CcchHHHHHHhhHhhcCCCCccccchhhHHHHHHHhhhHHHHHHHHH------------Hhhhhhhhchh-----hhccc Q lcl|NC_015280. 1 MYNAENLQEKWAPVLNHEGLNDIKDPYRKSVTAILLENQERALAEER------------AVLTEAPTNVG-----PINTP 63 (455) Q Consensus 1 m~~~~~~~~kw~~~l~~~~~~~i~~~~~~~v~~~~~enq~~~~~e~~------------~~l~ea~~~~~-----~~~~~ 63 (455) |+++|+|+|||+||||||++|||.+. 
||+||++|||||||+++|++ ++|+||..+.+ ..|+| T Consensus 3 ~~~~~~l~~kw~p~l~~~~~~~i~~~-~~~~~a~~~enq~~~~~~~~~~~~~~~~~~~~~~l~e~~~~~~~~~~~~~i~e 81 (521) T protein:vir:10 3 IKTKAELLNKWKPLLEGEGLPEIANS-KQAIIAKIFENQEKDFQTAPEYKDEKIAQAFGSFLTEAEIGGDHGYNATNIAA 81 (521) T ss_pred cchhHHHHHhhhhhhccCCCCccccc-hhhhhhhhhhhhhhhhhhccccchhHHHHHHhhhhhhhcccCccccccccccc Confidence 99999999999999999999999985 99999999999999998886 89999876543 57899 Q ss_pred cccccccccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCC----Cccccccc--cccccccccccc Q lcl|NC_015280. 64 TTSSGAVAGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQS----GNEAFFDE--PDAQFSGTDGAT 137 (455) Q Consensus 64 st~tg~i~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qs----G~EAlfnE--a~t~fSg~~~~~ 137 (455) |++|++|+++||+||+|||||+|||||+|||||||||||||||||||+||.+|. ++|+||++ +|+.|||..++. T Consensus 82 s~~t~~v~~~~P~Li~lvRra~p~LIa~DIwGVQPMTgPTGLIFAMRsrY~~q~~~~~g~eaf~~~~~ada~fSG~~~at 161 (521) T protein:vir:10 82 GQTSGAVTQIGPAVMGMVRRAIPNLIAFDICGVQPMNSPTGQVFALRAVYGKDPIAAGAKEAFHPMYGPDAMFSGQGAAK 161 (521) T ss_pred cccccccccCCchhhhHHHHHHhhhhhhhceeeccCCchhhhheeeeeeccCCccccccccccchhcccccccccccccc Confidence 999999999999999999999999999999999999999999999999999985 67888875 999999986553 Q ss_pred ccccccccCc-------------------ccCCCC---------------CCCCcccccccccccccchhhhhhc---CC Q lcl|NC_015280. 138 PPTATTEKNP-------------------ALINDA---------------TGGGTTATNYDLASSKFSTSEQEAL---GD 180 (455) Q Consensus 138 ~~~~~~~~~~-------------------~~~~~~---------------~~g~t~~~~~~~~~~gm~Ta~aE~L---G~ 180 (455) .......... ....+. 
..........++++.||+|+++|.| |+ T Consensus 162 ~~s~~~~~~~~~~Gd~~~~~~~~~g~~~~~~~~~~t~~~t~~d~~~~~~~~~~~~~~~~~y~~~~GmsTa~aEal~~~g~ 241 (521) T protein:vir:10 162 KFAALAASTQTTVGDIYTHFFQDTGTVYLQASAQVTISSTADDAAKLDAEIKKQMEAGALVEIAEGMATSIAELQESFNG 241 (521) T ss_pred ccccccccccccccccccccccccccceecccccccCCCcccccccccccccccccccceeecccccchhhHhhhccCCC Confidence 2211100000 000000 1111223456778999999999977 67 Q ss_pred CCCCccccceeEEEEEEEEeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeec Q lcl|NC_015280. 181 GASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQAN 260 (455) Q Consensus 181 s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~ 260 (455) ++++.|+||+|+||||+|||||||||||||||||||||||||||||+||+||||||||+|||||||++|+.+|+++++.. T Consensus 242 ss~~~f~EMaFsIeKvtVtAKSRaLKAEYTiELAQDLKAVHGLDAEtELaNILSTEImlEINReii~~i~~sa~~~~~g~ 321 (521) T protein:vir:10 242 STDNPWNEMGFRIDKQVIEAKSRQLKAAYSIELAQDLRAVHGMDADAELSGILATEIMLEINREVVDWINYSAQVGKSGM 321 (521) T ss_pred CccccccceeeEEEEEEEeeeccceeccccHHHHHHHHHhcCCChHHHHHHHHHHHHHHHhhHHHhhhhhheeeeeeeee Confidence 88899999999999999999999999999999999999999999999999999999999999999999888888877655 Q ss_pred c----ccceeeeeecccc---chhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccc Q lcl|NC_015280. 261 V----ANAGVFDLDVDSN---GRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGA 333 (455) Q Consensus 261 v----~~~gv~Dl~~~~~---gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~ 333 (455) . +++|+|||+++.| +||++|+||+|+|||+||||+|+|+|+||+|||||||+|||++|+|+|++++.++.... 
T Consensus 322 t~~~~~~~G~~d~~~~~d~~~~~~~~e~~k~L~~~i~~~an~i~~~T~r~~~n~~i~S~~Va~~L~~~~~~~~~~~~~~~ 401 (521) T protein:vir:10 322 TLTPGSKAGVFDFQDPIDIRGARWAGESFKALLFQIDKEAVEIARQTGRGEGNFIIASRNVVNVLASVDTGISYAAQGLA 401 (521) T ss_pred eeccCccccceecccccccccchHHHHHHHHHHHHHHHHHHHHHHhcccccceEEEEchHHHHHHhhccccccccccccc Confidence 4 6899999999998 99999999999999999999999999999999999999999999999999999988777 Q ss_pred cccccccccCCceeEEEecCceEEEEeccccccCCcceEEEEEecCccccceeEEcccccccceeecCCccccceeeeee Q lcl|NC_015280. 334 VGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQYYVVGYKGTNAYDAGLFYCPYVPLQMYRAIGQDTFQPRIGFKT 413 (455) Q Consensus 334 ~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~KG~~~~daglfyaPYv~l~~~~~~Dp~s~qP~~g~~t 413 (455) .+ .++|+|+++|+|+|+|||+|||||| +++|||+|||||++|+|+||||||||||+|+|++||+||||++|||| T Consensus 402 ~g--~~~d~~~~~~~G~l~~~~~vy~D~y----~~~dy~~vG~KG~~~~~~glfyaPYv~l~~~~~~dp~sfqP~~g~~t 475 (521) T protein:vir:10 402 TG--FNTDTTKSVFAGVLGGKYRVYIDQY----AKQDYFTVGYKGPNEMDAGIYYAPYVALTPLRGSDPKNFQPVMGFKT 475 (521) T ss_pred cc--ccccCCCceEEEEecCceEEEecCC----CCcceEEEEEeCCcccccceeeccccccccccccCCccccceeeeee Confidence 66 4789999999999999999999999 68999999999999999999999999999999999999999999999 Q ss_pred ecceeeccccccccc----ccccCch-hhhhccchhhhhhhhhhcC Q lcl|NC_015280. 
414 RYGMVLNPFAKGLTA----LSDSDPQ-AAGNLNANAYYRRVRVANL 454 (455) Q Consensus 414 RY~l~~nP~~~~~~~----~~~~~~~-~~~~~~~n~y~r~~~v~~~ 454 (455) ||||++|||+++.++ ..++++| .....++|.|||||+|||| T Consensus 476 RY~l~~NP~~~~~~~~~~~~i~~~~~~~~a~~~~~sy~r~v~v~~l 521 (521) T protein:vir:10 476 RYGIGINPFAESAAQAPASRIQSGMPSILNSLGKNAYFRRVYVKGI 521 (521) T ss_pred eeceeecCcccccCCccceeecccchhhhccccccceeeeeeecCC Confidence 999999999997653 2345555 4568899999999999999 No 10 >protein:vir:6601 Length: 528 # NCBI annotation: major capsid protein # Family: family:all:364 # MgeID: mge:139 # MgeName: RB49 # Cross-refs: genbank:acc:NP_891732;genbank:gi:33620668;genbank:GeneID:1725275 Probab=100.00 E-value=4.7e-220 Score=1223.02 Aligned_cols=447 Identities=37% Similarity=0.614 Sum_probs=396.5 Q ss_pred CcchHHHHHHhhHhhcCCCCccccchhhHHHHHHHhhhHHHHHHHHH------------Hhhhhhhhchh-----hhccc Q lcl|NC_015280. 1 MYNAENLQEKWAPVLNHEGLNDIKDPYRKSVTAILLENQERALAEER------------AVLTEAPTNVG-----PINTP 63 (455) Q Consensus 1 m~~~~~~~~kw~~~l~~~~~~~i~~~~~~~v~~~~~enq~~~~~e~~------------~~l~ea~~~~~-----~~~~~ 63 (455) |+++|+|+|||+||||||++|||++.+||+|+++|||||||+|+||+ .+|.||..|.+ ++|+| T Consensus 1 ~~~~~~l~~kw~p~l~~~~~~~i~~~~~~~~~a~l~enq~~~~~~~~~~~~~~~~~~~~~~l~ea~~~~~~~~~~~~i~e 80 (528) T protein:vir:66 1 MKTTKELMEKWSPLLENEKLPEIATASKQKLVAKILESQEADFAVDPIYKDEKVVEAFGGFIAEAEVAGDHGYDASQIAA 80 (528) T ss_pred CcchHHHHHHhHHhhcCCCcchhcchhhhhhhhhhhhhhHHHhhcccchhhHHHHHhhhhhhhhhcccccccccchhccc Confidence 99999999999999999999999999999999999999999999886 78999987754 79999 Q ss_pred cccccccccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCC-------------Ccccccccccccc Q lcl|NC_015280. 64 TTSSGAVAGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQS-------------GNEAFFDEPDAQF 130 (455) Q Consensus 64 st~tg~i~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qs-------------G~EAlfnEa~t~f 130 (455) |++|++|++|||+||+|||||+|||||+|||||||||||||||||||++|.++. 
+.|++|+|+++.| T Consensus 81 s~~t~~v~~~~P~Li~lvRRa~p~LIa~DIwGVQPMTgPTGlIFAmRs~Y~~~~~~~~~~eAfh~~~g~ea~fsea~t~~ 160 (528) T protein:vir:66 81 GQTTGAITNVGPAVIGMVRRAIPNLIAFDICGVQPMSTPTSQIFAIRSVYGGDPLKSGAREAFHPMYAPDAFHSSLAAKE 160 (528) T ss_pred cccccccccCchhHHHHHHHHHHhhhhhhhheeecCCchhhhheeeeeeecCCccccccccccccccccccccccccccc Confidence 999999999999999999999999999999999999999999999999998764 4466666666666 Q ss_pred cccccccc-----------------------------------ccc-ccccCcccCCCCCCCCcccccccccccccchhh Q lcl|NC_015280. 131 SGTDGATP-----------------------------------PTA-TTEKNPALINDATGGGTTATNYDLASSKFSTSE 174 (455) Q Consensus 131 Sg~~~~~~-----------------------------------~~~-~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~ 174 (455) +..++.+. ... ....+.................++++.||+|++ T Consensus 161 a~~gGpTGliFAm~s~y~s~~~g~ea~~nea~t~fs~~~~~~~~~~~~~~~g~~~g~~~~~~~~a~~~~~~~~~Gm~Ta~ 240 (528) T protein:vir:66 161 ATVGSPTGTAFAKLTLSQAITAGDIVYHTFAETGIAYLQNVTGDSVTPQKVGSESEDEVVMKLIEEGKLAEIAFGMATSI 240 (528) T ss_pred ccccCCccceeecccccccccccceeeecccccceeeeccccccccccCcccccccccccccccccccceecccccchhh Confidence 54322110 000 000000011111222223345677889999999 Q ss_pred hhh---cCCCCCCccccceeEEEEEEEEeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhh Q lcl|NC_015280. 175 QEA---LGDGASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYR 251 (455) Q Consensus 175 aE~---LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~ 251 (455) +|. +|++++++|+||+|+||||+|||||||||||||||||||||||||||||+||+||||+|||+|||||||++|+. 
T Consensus 241 aEale~lg~~s~~~f~EMaFsIeK~tVtAKSRaLKAEYTiELAQDLKAIHGLDAEtELsNILStEImlEINREii~~i~~ 320 (528) T protein:vir:66 241 AEIQEGFNGSSNNPWAEMSMRIDKQVVEAKSRQLKARYSIEVAQDLRAVHGMDADAELNAILANEVLLEINREIVDVINF 320 (528) T ss_pred hhhhcccCCCcccchhhcceEEEeEEEEeeccceeccccHHHHHHHHHhcCCChHHHHHHHHHHHHHHHhhHHHHhhhhh Confidence 996 67788999999999999999999999999999999999999999999999999999999999999999999999 Q ss_pred hheeeeeecc----ccceeeeeecccc---chhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhccc Q lcl|NC_015280. 252 GAKPGAQANV----ANAGVFDLDVDSN---GRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVL 324 (455) Q Consensus 252 vA~~~k~~~v----~~~gv~Dl~~~~~---gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l 324 (455) +|+++++.++ +++|+|||+++.| +||++|+||+|+|||+||+|+|+|+|+||+|||||||++||++|+|+|++ T Consensus 321 ~a~~~~~~~t~~~~~~aG~~dl~~~~d~~g~rw~~e~~k~L~~~i~~~an~I~~~T~r~~gn~vi~S~~Va~~L~~~g~~ 400 (528) T protein:vir:66 321 TAQVGKTGMTQTVGSKAGVFDLQDPIDTRGARWAGESFKSLIYQIDKEAAEIARQTGRGAGNFVIASRNVVNILASADQG 400 (528) T ss_pred eeeeeeeeeeeccccccceeecccccccccchhHHHHHHHHHHHHHHHHHHHHHhhccccccEEEEchHHHHHHhhcccc Confidence 9999987655 6789999997776 69999999999999999999999999999999999999999999999999 Q ss_pred ccccccccccccccccccCCceeEEEecCceEEEEeccccccCCcceEEEEEecCccccceeEEcccccccceeecCCcc Q lcl|NC_015280. 325 DYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQYYVVGYKGTNAYDAGLFYCPYVPLQMYRAIGQDT 404 (455) Q Consensus 325 ~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~KG~~~~daglfyaPYv~l~~~~~~Dp~s 404 (455) ++.+....... 
.++|+|+.+|+|+|+|||+|||||| +++|||+|||||++|+|+||||||||||+|++++||+| T Consensus 401 ~~~~~~~~~~~--~~~d~~~~~~~G~l~~~~~vy~D~y----~~~dy~~vG~KG~~~~~~glfyaPYv~l~~~~~~dp~s 474 (528) T protein:vir:66 401 ISLAMQGAAKG--LNTDTTKAVFAGVLAGKYKVFIDQY----ARQDYFTVGYKGDNEMDAGIYYAPYVALTPLRATDPQS 474 (528) T ss_pred ccccccccccc--cccCCCCceeEEEecCceEEEecCC----CCcceEEEEEeCCcccccceeecccccceeeEeeCCcc Confidence 88888777765 6899999999999999999999999 68999999999999999999999999999999999999 Q ss_pred ccceeeeeeecceeecccccccc-----cccccCchhhhhccchhhhhhhhhhcC Q lcl|NC_015280. 405 FQPRIGFKTRYGMVLNPFAKGLT-----ALSDSDPQAAGNLNANAYYRRVRVANL 454 (455) Q Consensus 405 ~qP~~g~~tRY~l~~nP~~~~~~-----~~~~~~~~~~~~~~~n~y~r~~~v~~~ 454 (455) |||++||||||||++|||+++.+ ++.+|+||. ..+|+|.|||||+|||| T Consensus 475 fqP~~g~~tRY~l~vNP~~~~~~~~~~~ri~~g~~~~-~~ag~n~~~r~~~Vk~~ 528 (528) T protein:vir:66 475 FHPVLGFKTRYGIGINPFADSKSQEPSARITSGMLSK-DSVGKNAYFRRVWVKGC 528 (528) T ss_pred ccceeeeeeeeceeecCcccccCccccccccccchhh-hhcCccceeEEeeeccC Confidence 99999999999999999998764 457999997 68999999999999999 No 11 >protein:vir:98143 Length: 524 # NCBI annotation: gp23 precursor of major head subunit # Family: family:all:364 # MgeID: mge:1667 # MgeName: RB43 # Cross-refs: genbank:acc:YP_239203;genbank:gi:66391678;genbank:GeneID:3416245 Probab=100.00 E-value=4.5e-220 Score=1223.12 Aligned_cols=445 Identities=42% Similarity=0.676 Sum_probs=396.2 Q ss_pred CcchHHHHHHhhHhhcC-CCCccccchhhHHHHHHHhhhHHHHHHHHH------------Hhhhhhhhchh-----hhcc Q lcl|NC_015280. 
1 MYNAENLQEKWAPVLNH-EGLNDIKDPYRKSVTAILLENQERALAEER------------AVLTEAPTNVG-----PINT 62 (455) Q Consensus 1 m~~~~~~~~kw~~~l~~-~~~~~i~~~~~~~v~~~~~enq~~~~~e~~------------~~l~ea~~~~~-----~~~~ 62 (455) ||+||+|+|||+||||+ |++|||++.+||+|+++||||||||+++++ ++|.||....+ ..|+ T Consensus 1 ~~~~~~l~~kw~p~l~~~~~~~~i~~~~~~~~~a~llenq~~~~~~~~~~~~~~~~~~~~~~l~ea~~~~~~~~~~~~i~ 80 (524) T protein:vir:98 1 MSKKNELMEKWNDLLESQEGLPDIATKSKKQLVAAILEAQEKDAETDPVYRDEKIVESFGGFLAEAEIAGDHNYDQTNIA 80 (524) T ss_pred CcchHHHHHHhHHHhcCCcCcchhcchhhHHHHHHHHhhHHHHHhcCccccchHHHHhhhcccccccccccccccccccc Confidence 99999999999999986 899999999999999999999999999997 78999985433 4789 Q ss_pred ccccccccccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCC---CCcccccccc-------cccccc Q lcl|NC_015280. 63 PTTSSGAVAGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQ---SGNEAFFDEP-------DAQFSG 132 (455) Q Consensus 63 ~st~tg~i~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~q---sG~EAlfnEa-------~t~fSg 132 (455) ||++|++|+++||+||+|||||+|||||+|||||||||||||||||||+||.+| .|+|++|||| |+.||| T Consensus 81 ~s~~t~~v~~~~P~Li~lvRra~p~LIa~DIwGVQPMTgPTGLIFAmRsrY~n~~~~~gteA~~nEAf~~~ye~dt~fSG 160 (524) T protein:vir:98 81 SGKSSGAITNIGPAVIGMVRRAIPNLIAFDICGVQPMTGPTGQVFALRAVYGKDPLAGGTPADVREAFHPMFAPDTMYSG 160 (524) T ss_pred ccccccccccccchhhhHHHHHHHhhhhhhhheeccCCchhhhhhhhheeecCCCCCcccccccccccccccccccccCC Confidence 999999999999999999999999999999999999999999999999999998 4779999986 899998 Q ss_pred ccccccccccc----------------------------------ccCcccCCCCCCCCcccccccccccccchhhhhhc Q lcl|NC_015280. 133 TDGATPPTATT----------------------------------EKNPALINDATGGGTTATNYDLASSKFSTSEQEAL 178 (455) Q Consensus 133 ~~~~~~~~~~~----------------------------------~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~L 178 (455) .++........ 
+.++...+....+.......++++.||+|+.+|.| T Consensus 161 ~g~~t~~s~~~~g~~~~~g~~~~~~~~~~g~~~~~~~~~g~~~~tgt~p~~~~~a~~~~~~~g~~~~~~~GmsTA~aEaL 240 (524) T protein:vir:98 161 EGAHTAFAKITTGTAIATGAIVYHIFQETGIAYFQNVTSGNVTVTGADPAALDAAVIAENEKGTLAEISVGMATSVAELQ 240 (524) T ss_pred ccccccccccccccccccccccccccccccceeccccccCcccccccccccccccccccccccceeecccccchhhhhhh Confidence 75433211110 01111111111112223345678899999999987 Q ss_pred ---CCCCCCccccceeEEEEEEEEeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhhee Q lcl|NC_015280. 179 ---GDGASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKP 255 (455) Q Consensus 179 ---G~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~ 255 (455) |++++++|+||+|+||||+|||||||||||||||||||||||||||||+||+||||||||+|||||||++|+.+|+. T Consensus 241 ~~~g~ss~~~f~EMaFsIeKvtVtAKSRaLKAEYTiELAQDLKAVHGLDAEtELsNILSTEImlEINReii~~i~~~a~~ 320 (524) T protein:vir:98 241 ENFNGSSANPWNEMAFRIDKQVIEARSRQLKAQYSVELAQDLRAVHGMDADAELSAILATEIMLEINREIVDLINYTAQV 320 (524) T ss_pred ccCCCCccccccceeeEEEEEEEeeecccccccccHHHHHHHHHhcCCChHHHHHHHHHHHHHHHhhHHHHHHHhhhhee Confidence 66788999999999999999999999999999999999999999999999999999999999999999988777776 Q ss_pred eeee----ccccceeeeeecccc---chhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHh--hccccc Q lcl|NC_015280. 256 GAQA----NVANAGVFDLDVDSN---GRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAM--SGVLDY 326 (455) Q Consensus 256 ~k~~----~v~~~gv~Dl~~~~~---gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~--sG~l~~ 326 (455) +++. 
.++++|+|||+++.| +||++|+||+|++||++|+|+|+|+|+||+|||||||+|||++|+| +||+++ T Consensus 321 ~~~g~t~~~~~~~G~~dl~~~~d~~~~r~~~e~~~~L~~~i~~~an~I~~~T~rg~~n~~i~S~~Va~~L~~~~~g~~~~ 400 (524) T protein:vir:98 321 GKSGFTQTVGSKAGSFDFQDPVDIRGARWAGESYKALLIQIDKEANEIARQTGRGAGNFIIASRNVVSALARIDSGITPA 400 (524) T ss_pred ceeecccccccccceeeccccccccccchhHHHHHHHHHHHHHHHHHHHHhhccccccEEEEchHHHHHHhhhhcccccc Confidence 6653 356789999998865 9999999999999999999999999999999999999999999999 899988 Q ss_pred ccccccccccccccccCCceeEEEecCceEEEEeccccccCCcceEEEEEecCccccceeEEcccccccceeecCCcccc Q lcl|NC_015280. 327 DSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQYYVVGYKGTNAYDAGLFYCPYVPLQMYRAIGQDTFQ 406 (455) Q Consensus 327 ~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~KG~~~~daglfyaPYv~l~~~~~~Dp~s~q 406 (455) .+.++.. ++.|+|+.+|+|+|+|||+|||||| +++|||+|||||++|+|+||||||||||+|+|++||+||| T Consensus 401 s~~~~~~----~~~d~~~~~~~G~l~~~~~vy~D~y----~~~dy~~vG~KG~~~~~~glfyaPYv~l~~~~~~dp~sfq 472 (524) T protein:vir:98 401 SQGLQKT----LNVDTTKAVFAGVLGGTYKVYIDQY----ARQDYFTVGFKGDNEMDAGIYYAPYVALTPLRGSDPKNFQ 472 (524) T ss_pred cchhhcc----cccCCccceEEEEecCceEEEecCC----CCcceEEEEeeCCcccccceeeccccccccccccCCcccc Confidence 8887654 7899999999999999999999999 6899999999999999999999999999999999999999 Q ss_pred ceeeeeeecceeecccccccc-----cccccCchhhhhccchhhhhhhhhhcC Q lcl|NC_015280. 407 PRIGFKTRYGMVLNPFAKGLT-----ALSDSDPQAAGNLNANAYYRRVRVANL 454 (455) Q Consensus 407 P~~g~~tRY~l~~nP~~~~~~-----~~~~~~~~~~~~~~~n~y~r~~~v~~~ 454 (455) |++||||||||++|||+++.+ ++++|+||. 
..+|+|.|||||+|||| T Consensus 473 P~~g~~tRY~l~~NP~~~~~~~~~~~ri~~g~~~~-~~ag~n~~~r~~~Vk~l 524 (524) T protein:vir:98 473 PVMGFKTRYGIGINPFANSRSQAPADRITSGMISK-EMCGKNAYFRKVWVKGL 524 (524) T ss_pred ceeeeeeeeceeecCcccccCCccccccccCcchH-hhcCccceeeEeeeccC Confidence 999999999999999998654 567999997 58899999999999999 No 12 >protein:vir:7214 Length: 521 # NCBI annotation: gp23 major head protein # Family: family:all:364 # MgeID: mge:142 # MgeName: T4 # Cross-refs: genbank:acc:NP_049787;genbank:gi:9632597;genbank:GeneID:1258751 Probab=100.00 E-value=5.9e-220 Score=1222.49 Aligned_cols=447 Identities=41% Similarity=0.671 Sum_probs=397.4 Q ss_pred CcchHHHHHHhhHhhcCCCCccccchhhHHHHHHHhhhHHHHHHHHH------------Hhhhhhhhchh-----hhccc Q lcl|NC_015280. 1 MYNAENLQEKWAPVLNHEGLNDIKDPYRKSVTAILLENQERALAEER------------AVLTEAPTNVG-----PINTP 63 (455) Q Consensus 1 m~~~~~~~~kw~~~l~~~~~~~i~~~~~~~v~~~~~enq~~~~~e~~------------~~l~ea~~~~~-----~~~~~ 63 (455) |+++|+|+|||+||||||++|||.+. ||+||++|||||||+++|++ ++|+||..+.+ ..|+| T Consensus 3 ~~~~~~l~~kw~p~l~~~~~~~i~~~-~~~~~a~~~enq~~~~~~~~~~~~~~~~~~~~~~l~e~~~~~~~~~~~~~iae 81 (521) T protein:vir:72 3 IKTKAELLNKWKPLLEGEGLPEIANS-KQAIIAKIFENQEKDFQTAPEYKDEKIAQAFGSFLTEAEIGGDHGYNATNIAA 81 (521) T ss_pred cchhHHHHHhhhhhhccCCCCccccc-hhhhhhhhhhhhhhhhhhcccccchHHHHHHhhhhhhhcccCccccCcccccc Confidence 99999999999999999999999985 99999999999999998886 88999875543 57899 Q ss_pred cccccccccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCC----Cccccccc--cccccccccccc Q lcl|NC_015280. 64 TTSSGAVAGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQS----GNEAFFDE--PDAQFSGTDGAT 137 (455) Q Consensus 64 st~tg~i~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qs----G~EAlfnE--a~t~fSg~~~~~ 137 (455) |++|++|+++||+||+|||||+|||||+|||||||||||||||||||+||.+|. |+|+||+| +|+.|||..+.. 
T Consensus 82 s~~t~~v~~~~P~Li~lvRra~p~LIa~DIwGVQPMTgPTGLIFAMRsrY~~q~~~~~g~ea~~~e~~~da~fSG~~~~~ 161 (521) T protein:vir:72 82 GQTSGAVTQIGPAVMGMVRRAIPNLIAFDICGVQPMNSPTGQVFALRAVYGKDPVAAGAKEAFHPMYGPDAMFSGQGAAK 161 (521) T ss_pred cccccccccCCchhhhHHHHHHhhhhhhhceeeccCCchhhhheeeeeeecCCCCCcccccccchhcccccccccccccc Confidence 999999999999999999999999999999999999999999999999999885 78999987 788999986554 Q ss_pred ccccccccCc---------------------ccCCCCCCCCc-------------ccccccccccccchhhhhh---cCC Q lcl|NC_015280. 138 PPTATTEKNP---------------------ALINDATGGGT-------------TATNYDLASSKFSTSEQEA---LGD 180 (455) Q Consensus 138 ~~~~~~~~~~---------------------~~~~~~~~g~t-------------~~~~~~~~~~gm~Ta~aE~---LG~ 180 (455) .......... ......+++.+ .....++++.||+|+.+|. +|+ T Consensus 162 ~~~~~~~~~~~a~Gd~~~~~~~~~gt~~~~~~~~~~~~~g~t~~~~t~~~v~~~~~a~~~y~~g~gm~Ta~aEal~~~g~ 241 (521) T protein:vir:72 162 KFPALAASTQTTVGDIYTHFFQETGTVYLQASVQVTIDAGATDAAKLDAEIKKQMEAGALVEIAEGMATSIAELQEGFNG 241 (521) T ss_pred cccccccccccccccccccccccccccccccccccccCCCCCCccccccccccccccCceeeeecccchhhhhhhcccCC Confidence 3221111110 00111111111 2234567889999999997 567 Q ss_pred CCCCccccceeEEEEEEEEeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeec Q lcl|NC_015280. 181 GASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQAN 260 (455) Q Consensus 181 s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~ 260 (455) +++++|+||+|+||||+|||||||||||||||||||||||||||||+||+||||+|||+|||||||++|+.+|+++++.. 
T Consensus 242 ss~~~f~EMaFsIeK~tVtAKSRaLKAEYTiELAQDLKAVHGLDAEtELaNILSTEImlEINReii~~i~~sa~~g~~g~ 321 (521) T protein:vir:72 242 STDNPWNEMGFRIDKQVIEAKSRQLKAAYSIELAQDLRAVHGMDADAELSGILATEIMLEINREVVDWINYSAQVGKSGM 321 (521) T ss_pred cccccccceeeEEEEEEEeeeccceeccccHHHHHHHHHhcCCChHHHHHHHHHHHHHHHhhHHHhhhhhheeeeeeeee Confidence 78889999999999999999999999999999999999999999999999999999999999999999888888777654 Q ss_pred c----ccceeeeeecccc---chhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccc Q lcl|NC_015280. 261 V----ANAGVFDLDVDSN---GRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGA 333 (455) Q Consensus 261 v----~~~gv~Dl~~~~~---gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~ 333 (455) . +++|+|||+++.| +||++|+||+|+|||+||||+|+|+|+||+|||||||+|||++|+|+|.+++.++.... T Consensus 322 t~~~~~~~G~~d~~~~~d~~~~~~~~e~~k~L~~~i~~~an~i~~~T~r~~~n~~i~S~~Va~~L~~~~~~~~~~~~~~~ 401 (521) T protein:vir:72 322 TLTPGSKAGVFDFQDPIDIRGARWAGESFKALLFQIDKEAVEIARQTGRGEGNFIIASRNVVNVLASVDTGISYAAQGLA 401 (521) T ss_pred eeccCccccceecccccccccchHHHHHHHHHHHHHHHHHHHHHHhcccccceEEEEchHHHHHHhhccccccccccccc Confidence 4 6899999999998 99999999999999999999999999999999999999999999999999988888777 Q ss_pred cccccccccCCceeEEEecCceEEEEeccccccCCcceEEEEEecCccccceeEEcccccccceeecCCccccceeeeee Q lcl|NC_015280. 
334 VGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQYYVVGYKGTNAYDAGLFYCPYVPLQMYRAIGQDTFQPRIGFKT 413 (455) Q Consensus 334 ~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~KG~~~~daglfyaPYv~l~~~~~~Dp~s~qP~~g~~t 413 (455) .+ .++|+|+++|+|+|+|||+|||||| +++|||+|||||++|+|+||||||||||+|+|++||+||||++|||| T Consensus 402 ~g--~~~d~~~~~~~G~l~~~~~vy~D~y----~~~dy~~vG~KG~~~~~~glfyaPYv~l~~~~~~dp~sfqP~~g~~t 475 (521) T protein:vir:72 402 TG--FSTDTTKSVFAGVLGGKYRVYIDQY----AKQDYFTVGYKGPNEMDAGIYYAPYVALTPLRGSDPKNFQPVMGFKT 475 (521) T ss_pred cc--ccccCCCceEEEEccCceEEEecCC----CCcceEEEEEeCCcccccceeeccccccccccccCCccccceeeeee Confidence 66 4789999999999999999999999 68999999999999999999999999999999999999999999999 Q ss_pred ecceeecccccccc----cccccCchh-hhhccchhhhhhhhhhcC Q lcl|NC_015280. 414 RYGMVLNPFAKGLT----ALSDSDPQA-AGNLNANAYYRRVRVANL 454 (455) Q Consensus 414 RY~l~~nP~~~~~~----~~~~~~~~~-~~~~~~n~y~r~~~v~~~ 454 (455) ||||++|||+++.+ +..++++|. ....++|.|||||+|||| T Consensus 476 RY~l~~NP~~~~~~~~~a~~i~~~~~~~~a~~~~~sy~r~v~v~~l 521 (521) T protein:vir:72 476 RYGIGINPFAESAAQAPASRIQSGMPSILNSLGKNAYFRRVYVKGI 521 (521) T ss_pred eeceeecCcccccCcccceeecCcChhhhcCccccceeeeeeecCC Confidence 99999999999765 335556664 568899999999999999 No 13 >protein:vir:80986 Length: 528 # NCBI annotation: gp23 major head protein # Family: family:all:364 # MgeID: mge:1888 # MgeName: Phi1 # Cross-refs: genbank:acc:YP_001469506;genbank:gi:157311463;genbank:GeneID:5602119 Probab=100.00 E-value=1.5e-219 Score=1220.28 Aligned_cols=447 Identities=38% Similarity=0.638 Sum_probs=396.5 Q ss_pred CcchHHHHHHhhHhhcCCCCccccchhhHHHHHHHhhhHHHHHHHHH------------Hhhhhhhhchh-----hhccc Q lcl|NC_015280. 
1 MYNAENLQEKWAPVLNHEGLNDIKDPYRKSVTAILLENQERALAEER------------AVLTEAPTNVG-----PINTP 63 (455) Q Consensus 1 m~~~~~~~~kw~~~l~~~~~~~i~~~~~~~v~~~~~enq~~~~~e~~------------~~l~ea~~~~~-----~~~~~ 63 (455) |+++|+|+|||+||||||++|||++.+||+|+++|||||||+++||+ ++|.||..+.+ ..|+| T Consensus 1 ~~~~~~l~~kw~p~l~~~~~~~i~~~~~~~~~a~llenq~~~~~~~~~~~~~~~~~~~~~~l~ea~~~~~~~~~~~~i~e 80 (528) T protein:vir:80 1 MKTTKELMEKWSPLLENEKLPEIATASKQKLVAKILESQEADFAVDPIYKDEKVVEAFGGFIAEAEVAGDHGYDASQIAA 80 (528) T ss_pred CcchHHHHHhhhHhhcCCccchhcchhhhhhhhhhhhhhhHHhhccccccchHHHHhhhhhccccccccccCCccccccc Confidence 99999999999999999999999999999999999999999999887 78999976654 68999 Q ss_pred cccccccccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCC----Cccccc--cccccccccccccc Q lcl|NC_015280. 64 TTSSGAVAGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQS----GNEAFF--DEPDAQFSGTDGAT 137 (455) Q Consensus 64 st~tg~i~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qs----G~EAlf--nEa~t~fSg~~~~~ 137 (455) |++|++|++|||+||+|||||+|||||+|||||||||||||||||||+||.+|. ++|+|| +++++.||+..+.. T Consensus 81 s~~t~~v~~~~P~Li~lvRra~p~LIa~DIwGVQPMTgPTGLIFAMRsrY~~~~~~~~~~ea~~~~~~~da~fS~~~t~~ 160 (528) T protein:vir:80 81 GQTTGAITNVGPAVIGMVRRAIPNLIAFDICGVQPMSTPTSQIFAIRSVYGPNPLASQAKEAFHPMYAPDAFHSSLAAKG 160 (528) T ss_pred cccccccccCCchhhhHHHHHHhhhhhhhhheeccCCchhhhheeeeeeecCCccccccccccccccccccccccccccc Confidence 999999999999999999999999999999999999999999999999998874 456664 46777777654322 Q ss_pred cccccc-c------------------------cC--------cccCCCCCCCC----------cccccccccccccchhh Q lcl|NC_015280. 138 PPTATT-E------------------------KN--------PALINDATGGG----------TTATNYDLASSKFSTSE 174 (455) Q Consensus 138 ~~~~~~-~------------------------~~--------~~~~~~~~~g~----------t~~~~~~~~~~gm~Ta~ 174 (455) ...+.. + .+ .....+...++ ......++++.||+|+. 
T Consensus 161 ~a~~~ea~t~fs~~~~~~~~~~G~~~~~t~~~tg~~~~~~~~~~~~~~~~~gt~~~~~~~~~~~~~~~~~~~~~Gm~Ta~ 240 (528) T protein:vir:80 161 AAVGSPTGTPFAKLAIGTQIEAGDIVHHTFAETGIAYLQNVTAEQVTPTKAGSESEDEVVMKLMEEGKLAEIAFGMATSI 240 (528) T ss_pred cccccccccccccccccccccccceeccccccccccccccccccccCccccCCcccccccccccccccccccccccchhh Confidence 111000 0 00 00000111111 12233467889999999 Q ss_pred hh---hcCCCCCCccccceeEEEEEEEEeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhh Q lcl|NC_015280. 175 QE---ALGDGASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYR 251 (455) Q Consensus 175 aE---~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~ 251 (455) +| .||++++++|+||+|+||||+|||||||||||||||||||||||||||||+||+||||+|||+|||||||++|+. T Consensus 241 AE~le~lg~ss~~~f~EMaFsIEKvTVtAKSRaLKAEYTiELAQDLKAIHGLDAEtELaNILStEImlEINReii~~i~~ 320 (528) T protein:vir:80 241 AEIQEGFNGSSNNPWAEMSMRIDKQVVEAKSRQLKARYSIEVAQDLRAVHGMDADAELNAILANEVLLEINREIVDVINF 320 (528) T ss_pred hhhhcccCCCccccccceeeEEEEEEEeeeccceeccccHHHHHHHHHhcCCChHHHHHHHHHHHHHHHhhHHHHhhhhh Confidence 99 667888999999999999999999999999999999999999999999999999999999999999999999999 Q ss_pred hheeeeeecc----ccceeeeeecccc---chhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhccc Q lcl|NC_015280. 
252 GAKPGAQANV----ANAGVFDLDVDSN---GRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVL 324 (455) Q Consensus 252 vA~~~k~~~v----~~~gv~Dl~~~~~---gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l 324 (455) +|+++++.++ +++|+|||+++.| +||++|+||+|+|||+||+|+|+|+|+||+|||||||++||++|+|+|++ T Consensus 321 ~a~~~~~~~t~~~~~~~G~~dl~~~~d~~g~r~~~e~~k~L~~~i~~~an~I~~~T~~~~gn~vi~S~~Va~~L~~~g~~ 400 (528) T protein:vir:80 321 TAQVGKTGMTQTVGSKAGVFDLQDPIDTRGARWAGESFKSLIYQIDKEAAEIARQTGRGAGNFVIASRNVVNILASADQG 400 (528) T ss_pred eeeeeeeeeeeccccccceeeccccccccccchhHHHHHHHHHHHHHHHHHHHHhhccccccEEEEchHHHHHHhhcccc Confidence 9999987654 6789999998876 89999999999999999999999999999999999999999999999999 Q ss_pred ccccccccccccccccccCCceeEEEecCceEEEEeccccccCCcceEEEEEecCccccceeEEcccccccceeecCCcc Q lcl|NC_015280. 325 DYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQYYVVGYKGTNAYDAGLFYCPYVPLQMYRAIGQDT 404 (455) Q Consensus 325 ~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~KG~~~~daglfyaPYv~l~~~~~~Dp~s 404 (455) ++.+....... .++|+|+.+|+|+|+|||+|||||| +++|||+|||||++|+|+||||||||||+|++++||+| T Consensus 401 ~~~~~~~~~~~--~~~d~~~~~~~G~l~~~~~vy~D~y----~~~dy~~vG~KG~~~~~~glfy~PYv~l~~~~~~dp~s 474 (528) T protein:vir:80 401 ISLAMQGAAKG--LNTDTTKAVFAGVLAGKYKVFIDQY----ARQDYFTVGYKGDNEMDAGIYYAPYVALTPLRATDPQS 474 (528) T ss_pred ccccccccccc--cccCCCCceEEEEecCceEEEecCC----CCcceEEEEEeCCcccccceeecccccceeeEeeCCcc Confidence 88888776655 6899999999999999999999999 68999999999999999999999999999999999999 Q ss_pred ccceeeeeeecceeecccccccc-----cccccCchhhhhccchhhhhhhhhhcC Q lcl|NC_015280. 405 FQPRIGFKTRYGMVLNPFAKGLT-----ALSDSDPQAAGNLNANAYYRRVRVANL 454 (455) Q Consensus 405 ~qP~~g~~tRY~l~~nP~~~~~~-----~~~~~~~~~~~~~~~n~y~r~~~v~~~ 454 (455) |||++||||||||++|||+++.+ ++++|+||. 
..+|+|.|||||+|||| T Consensus 475 fqP~~g~~tRY~l~~NP~~~~~~~~~~~r~~~g~~~~-~~ag~n~~~r~~~Vk~~ 528 (528) T protein:vir:80 475 FHPVLGFKTRYGIGINPFADSKSQAPSARITSGMLSK-DSVGKNAYFRRVWVKGC 528 (528) T ss_pred ccceeeeeeeeceeecCcccccCCcccccccccchhh-hhcCccceeEEeeeccC Confidence 99999999999999999998654 467999996 68999999999999999 No 14 >protein:vir:5670 Length: 514 # NCBI annotation: gp23 # Family: family:all:364 # MgeID: mge:119 # MgeName: KVP40 # Cross-refs: genbank:acc:NP_899609;genbank:gi:34419596;genbank:GeneID:2546039 Probab=100.00 E-value=3e-219 Score=1218.59 Aligned_cols=444 Identities=42% Similarity=0.702 Sum_probs=390.8 Q ss_pred HHHHHHhhHhhcCCC--CccccchhhHHHHHHHhhhHHHHHHHHH------------Hhhhhhhhchh-----hhccccc Q lcl|NC_015280. 5 ENLQEKWAPVLNHEG--LNDIKDPYRKSVTAILLENQERALAEER------------AVLTEAPTNVG-----PINTPTT 65 (455) Q Consensus 5 ~~~~~kw~~~l~~~~--~~~i~~~~~~~v~~~~~enq~~~~~e~~------------~~l~ea~~~~~-----~~~~~st 65 (455) -+|+|||+||||||| +|||++.+||+|+++|||||||+++|++ ++|+||..|.+ .+|+||+ T Consensus 1 ~~l~~kw~p~l~~~~~~~~~i~~~~~~~~~~~l~enq~~~~~~~~~~~~~~~~~~~~~~l~e~~~~~~~~~~~~~ia~s~ 80 (514) T protein:vir:56 1 MNLTEKWKDLLEAEGADMPEIATATKQKIMSKIFENQDRDINNDPMYRDPQLVEAFNAGLNEAVVNGDHGYDPANIAQGV 80 (514) T ss_pred CchhhhhhHHhcccccccccccchhhhhhhhhhhhhHHHHHhcCCcccchhhhhhhhccccccccccccccccccccccc Confidence 589999999999998 8999999999999999999999999987 56999987743 6999999 Q ss_pred cccccccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCC--CCccccc--cccccccccccccccccc Q lcl|NC_015280. 66 SSGAVAGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQ--SGNEAFF--DEPDAQFSGTDGATPPTA 141 (455) Q Consensus 66 ~tg~i~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~q--sG~EAlf--nEa~t~fSg~~~~~~~~~ 141 (455) +|++|+++||+||+|||||+|||||+|||||||||||||||||||+||.+| +++|||| +|+|+.|||..++..... 
T Consensus 81 ~t~~v~~~~P~ll~lvRRa~~~LIa~DIwGVQPMTgPTGLIFAMRsrY~~~~~tg~EAf~~~nEadt~fSG~~~~~~~~~ 160 (514) T protein:vir:56 81 TTGAVTNIGPTVMGMVRRAIPQLIAFDIAGVQPMTGPTSQVFTLRSVYGKDPLTGAEAFHPTRQADASFSGQAAASTIAD 160 (514) T ss_pred ccccccccchhHHHHHHHHHHhhhhhhhheeccCCchhhhheeeeeeecCCCcccccccccccccCcCcccccccccccc Confidence 999999999999999999999999999999999999999999999999988 6889999 999999999765432111 Q ss_pred cc------ccCccc------CC--------------C---------CCCCCcccccccccccccchhhhhh---cCCCCC Q lcl|NC_015280. 142 TT------EKNPAL------IN--------------D---------ATGGGTTATNYDLASSKFSTSEQEA---LGDGAS 183 (455) Q Consensus 142 ~~------~~~~~~------~~--------------~---------~~~g~t~~~~~~~~~~gm~Ta~aE~---LG~s~~ 183 (455) .. ..++.. .. . ...........+.++.||+|+.+|. ||++++ T Consensus 161 ~~~~~~~~~G~~~~~~~t~~~gd~~~~~~~~~~~~~~~~~~~~~~t~~~~~~a~~~~y~~~~Gm~Ta~aEal~~lggs~~ 240 (514) T protein:vir:56 161 FPTTGAATDGTPYKAEVTTSGGDVSMRYFLALGAVTLAVAGQMTATEYTDGVAGGLLVEIDAGMATSQAELQENFNGSSN 240 (514) T ss_pred ccccccccccccccccccccccccccccccccccccccccccccccccccccccchhhhhhhhhhhhhhhhcccCCCCcc Confidence 00 000000 00 0 0000112233466789999999997 678889 Q ss_pred CccccceeEEEEEEEEeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhh---hheeeeeec Q lcl|NC_015280. 184 TAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYR---GAKPGAQAN 260 (455) Q Consensus 184 ~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~---vA~~~k~~~ 260 (455) +.|+||+|+||||+|||||||||||||||||||||||||||||+||+||||||||+|||||||++|+. 
|+++||+++ T Consensus 241 ~~f~EMaFsIdK~tVtAKSRaLKAEYTiELAQDLKAVHGLDAEtELsNILSTEImlEINReii~~l~~~atv~~~~~~~~ 320 (514) T protein:vir:56 241 NEWNEMSFRIDKQVVEAKSRQLKAQYSIELAQDLRAVHGLDADAELSGILANEVMVELNREIVNLVNSQAQIGKSGWTQG 320 (514) T ss_pred cccceeeeEEEEEEEeeeccceeccccHHHHHHHHHhcCCChHHHHHHHHHHHHHHHhhHHHHHHHHhheeehhcccccc Confidence 99999999999999999999999999999999999999999999999999999999999999988874 557788899 Q ss_pred cccceeeeeecccc---chhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccc Q lcl|NC_015280. 261 VANAGVFDLDVDSN---GRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGI 337 (455) Q Consensus 261 v~~~gv~Dl~~~~~---gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~ 337 (455) ++++|+|||+++.| +||++|+||.|+|||++|+|+|+|+|+||+|||||||++||++|+|+||+++ ++..+....+ T Consensus 321 ~~~~G~~d~~~~~d~~~~~~~~e~~~~l~~~i~~~an~i~~~T~rg~gn~~i~S~~Va~~L~~sg~l~~-~~~~g~~~~~ 399 (514) T protein:vir:56 321 AGAAGVFDFSDAVDVKGARWAGEAYKALLIQIEKEANEIGRQTGRGNGNFIIASRNVVSALSMTDTLVG-PAAQGMQDGS 399 (514) T ss_pred cccccccccccccccccchHHHHHHHHHHHHHHHHHHHHHhhcccccccEEEEchhHHHHHHhhhhhcc-ccccCccccc Confidence 99999999998776 7999999999999999999999999999999999999999999999999976 4555555555 Q ss_pred cccccCCceeEEEecCceEEEEeccccccCCcceEEEEEecCccccceeEEcccccccceeecCCccccceeeeeeecce Q lcl|NC_015280. 
338 GEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQYYVVGYKGTNAYDAGLFYCPYVPLQMYRAIGQDTFQPRIGFKTRYGM 417 (455) Q Consensus 338 ~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~KG~~~~daglfyaPYv~l~~~~~~Dp~s~qP~~g~~tRY~l 417 (455) .++|+++.+|+|+|+|||+|||||| +++|||+|||||++|+|+||||||||||++++++||+||||++|||||||| T Consensus 400 ~~~d~~~~~~aG~l~~~~~vy~D~y----~~~dy~~vG~KG~~~~~~glfyaPYv~l~~~~~~dp~sfqP~~g~~tRY~l 475 (514) T protein:vir:56 400 MNTDTNQTVFAGVLGGRFKVYIDQY----AVNDYFTVGFKGSTEMDAGVFYSPYVPLTPLRGSDSKNFQPVIGFKTRYGV 475 (514) T ss_pred cccccCcceEEEEecCceEEEecCC----CCcceEEEEEecCcceecceeeccccccccccccCCccccceeeeeeeece Confidence 8999999999999999999999999 689999999999999999999999999999999999999999999999999 Q ss_pred eecccccccccc---cccCchhhhhccchhhhhhhhhhcC Q lcl|NC_015280. 418 VLNPFAKGLTAL---SDSDPQAAGNLNANAYYRRVRVANL 454 (455) Q Consensus 418 ~~nP~~~~~~~~---~~~~~~~~~~~~~n~y~r~~~v~~~ 454 (455) ++|||++..++. .|++|.. ...++|.|||||+|||| T Consensus 476 ~~NPy~~~~~~~~~~~~~~~~~-a~~~~n~y~r~v~v~~l 514 (514) T protein:vir:56 476 QVNPFADPTASATKVGNGAPVA-ASMGKNAYFRRVFVKGL 514 (514) T ss_pred eeCCCCCccccccccCCcchhh-hcccccceeeeEEEecC Confidence 999999855433 3444433 36689999999999999 No 15 >protein:vir:100603 Length: 529 # NCBI annotation: gp23 precursor of major head subunit # Family: family:all:364 # MgeID: mge:1488 # MgeName: 25 # Cross-refs: genbank:acc:YP_656387;genbank:gi:109290138;genbank:GeneID:4156581 Probab=100.00 E-value=3e-217 Score=1207.67 Aligned_cols=447 Identities=42% Similarity=0.672 Sum_probs=390.9 Q ss_pred Cc-chHHHHHHhhHhhcCCCCccccchhhHHHHHHHhhhHHHHHHHHH------------Hhhhhhhhchh-----hhcc Q lcl|NC_015280. 
1 MY-NAENLQEKWAPVLNHEGLNDIKDPYRKSVTAILLENQERALAEER------------AVLTEAPTNVG-----PINT 62 (455) Q Consensus 1 m~-~~~~~~~kw~~~l~~~~~~~i~~~~~~~v~~~~~enq~~~~~e~~------------~~l~ea~~~~~-----~~~~ 62 (455) |+ +.|+|+|||+||||||++|||++.+||+|+++|||||||+|+||+ .+|+|+..+.+ .+|+ T Consensus 1 ~~~~~~~l~~kw~p~l~~~~~~~i~~~~~~~~~a~l~enq~~~~~~~~~~~~~~~~e~~~~~l~e~~~~~~~~~~~~~ia 80 (529) T protein:vir:10 1 MSLKTKEILNKWTPLLEGEGLPEIAGKNKQALVAQILEAQEKDSKTDPVYRDDKLIEAFGQSLMEAEVAGDHGYDPTNIA 80 (529) T ss_pred CccchHHHHHHhhHhhcCCccchhcchhhhhhhhhhhhhHHHHhhcccccchhhhhhhhhhccchhhccccccccccccc Confidence 54 347999999999999999999999999999999999999999987 44899887643 5889 Q ss_pred ccccccccccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCC----Ccccc--cccccccccccccc Q lcl|NC_015280. 63 PTTSSGAVAGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQS----GNEAF--FDEPDAQFSGTDGA 136 (455) Q Consensus 63 ~st~tg~i~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qs----G~EAl--fnEa~t~fSg~~~~ 136 (455) ||++|++|+++||+||+||||++|||||+|||||||||||||||||||+||.+|. ++|+| ++|+|+.|||...+ T Consensus 81 ~s~~t~~v~~~~P~Li~lvRra~p~LIa~DIwGVQPMTgPTGLIFAMRsrY~~~~~~~~g~eaf~~~~e~dt~~SG~~~~ 160 (529) T protein:vir:10 81 AGQSSGAITNIGPAVIGMVRRAIPSLIAFDIAGVQPMTGPTGQVFALRSVYGKDPLAAGAKEAFHPMYAPDAWHSGLAAK 160 (529) T ss_pred ccccccccccccchhhhhHHHHHHhHHhhhhheeccCCchhhhhhhheeeecCCcCCCcccccccccccccccccccccc Confidence 9999999999999999999999999999999999999999999999999998874 55665 67999999986543 Q ss_pred ccccccc-----------------------------ccCcccCCCCC--------------CCCcccccccccccccchh Q lcl|NC_015280. 137 TPPTATT-----------------------------EKNPALINDAT--------------GGGTTATNYDLASSKFSTS 173 (455) Q Consensus 137 ~~~~~~~-----------------------------~~~~~~~~~~~--------------~g~t~~~~~~~~~~gm~Ta 173 (455) ....+.. ........+.. 
.........++++.||+|+ T Consensus 161 ~~~~~~~~~~~~~~t~~~a~~~~~~~~~~~nea~t~~s~~~tg~~~~~g~~~tg~~~~~~~~~~~a~~~~~~~~~gmsTa 240 (529) T protein:vir:10 161 GATTSSDGTPFAALTAGQAVATGDIVYHFFYESGSAYLQNVTGGNVTVGTNETGAALDALVSAKIAAGELAEIAEGMATS 240 (529) T ss_pred cccccccccccccccccceeeccccceeeecccccccccccccccccccccccCCccccccccccccccccccccccchh Confidence 2211100 00000000000 0111223457788999999 Q ss_pred hhhhc---CCCCCCccccceeEEEEEEEEeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHh Q lcl|NC_015280. 174 EQEAL---GDGASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVY 250 (455) Q Consensus 174 ~aE~L---G~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~ 250 (455) .+|.| |+++++.|+||+|+||||+|||||||||||||||||||||||||||||+||+||||+|||+||||||||+|+ T Consensus 241 ~aEal~~~g~ss~~~f~EMaFsIeK~tVtAKSRaLKAEYTiELAQDLKAvHGLDAEtELsNILStEImlEINReii~~i~ 320 (529) T protein:vir:10 241 IAELRQGFNGTTDNPWNEMSFRIDKQTVEAKSRQLKAQYSIELAQDLRAVHGMDADSELNGILANEVMLEINREVIDWIN 320 (529) T ss_pred hhhccccCCCCccccccceeeEEEEEEEeeeccceeccccHHHHHHHHHhcCCChHHHHHHHHHHHHHHHhhHHHHHHhh Confidence 99987 678889999999999999999999999999999999999999999999999999999999999999999888 Q ss_pred hhheeeeee----ccccceeeeeecccc---chhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcc Q lcl|NC_015280. 251 RGAKPGAQA----NVANAGVFDLDVDSN---GRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGV 323 (455) Q Consensus 251 ~vA~~~k~~----~v~~~gv~Dl~~~~~---gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~ 323 (455) .+|+.++.. 
..+.+|+|||+++.| +||++|+||+|++||++|+|+|+|+|+||+|||||||++||++|+|.|+ T Consensus 321 ~~a~~~~~g~~~~~~~~~gv~d~~~~~d~~~~~~~~e~~~~L~~~i~~~an~I~~~T~rg~~n~vi~S~~Va~~L~~~~~ 400 (529) T protein:vir:10 321 YTAQVGKSGWTQTVGSAAGVFDFQDPIDVRGARWAGESYKALLIQIDKEANEIARQTGRGAGNFIIASRNVVSALALVDA 400 (529) T ss_pred hhceeeeeeeeccccccccceeccccccccccchhHHHHHHHHHHHHHHHHHHHHhhccccceEEEEchHHHHHHhhhcc Confidence 777666632 236899999998876 8999999999999999999999999999999999999999999999999 Q ss_pred cccccccccccccccccccCCceeEEEecCceEEEEeccccccCCcceEEEEEecCccccceeEEcccccccceeecCCc Q lcl|NC_015280. 324 LDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQYYVVGYKGTNAYDAGLFYCPYVPLQMYRAIGQD 403 (455) Q Consensus 324 l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~KG~~~~daglfyaPYv~l~~~~~~Dp~ 403 (455) +++.+......+ .++|+|+.+|+|+|+|||+|||||| +++|||+|||||++++|+||||||||||+|+|++||+ T Consensus 401 ~~~~~~~~~~sg--~~~d~~~~~~~G~l~~~~~vy~D~y----~~~dy~~vG~KG~~~~~~glfy~PYv~l~~~~~~dp~ 474 (529) T protein:vir:10 401 GITPAAQGMASG--LNADTTKGVFAGVLGGRYKVYIDQY----ARQDYFTMGYRGANNLDAGIYYCPYVALTPLRGSDPK 474 (529) T ss_pred cccccccccccc--ceeecCCceEEEEecCceEEEecCC----CCcceEEEEEeCCcccccceeeccccccccccccCCC Confidence 876666555544 4689999999999999999999999 6899999999999999999999999999999999999 Q ss_pred cccceeeeeeecceeeccccccccc-----ccccCchhhhhccchhhhhhhhhhcC Q lcl|NC_015280. 404 TFQPRIGFKTRYGMVLNPFAKGLTA-----LSDSDPQAAGNLNANAYYRRVRVANL 454 (455) Q Consensus 404 s~qP~~g~~tRY~l~~nP~~~~~~~-----~~~~~~~~~~~~~~n~y~r~~~v~~~ 454 (455) ||||++||||||||++|||+++.++ +.||+||. 
..+|+|.|||||+|||| T Consensus 475 sfqP~~g~~tRY~l~~NP~~~~~~~~~~~r~~~g~~~~-~~ag~n~~~r~~~Vk~l 529 (529) T protein:vir:10 475 NFQPVMGFKTRYAIGVNPFAESRTQAPTSRISNGMPGA-HSVGKNAYFRRVWVKGL 529 (529) T ss_pred cccceeeeeeeeceeecCccccccccccccccCCcchh-hhcCccceeeEeeeccC Confidence 9999999999999999999986654 58999995 58999999999999999 No 16 >protein:vir:107947 Length: 519 # NCBI annotation: gp23 major head protein # Family: family:all:364 # MgeID: mge:2002 # MgeName: JS98 # Cross-refs: genbank:acc:YP_001595301;genbank:gi:161622607;genbank:GeneID:5783666 Probab=100.00 E-value=4e-217 Score=1206.95 Aligned_cols=446 Identities=39% Similarity=0.674 Sum_probs=394.8 Q ss_pred CcchHHHHHHhhHhhcCCCCccccchhhHHHHHHHhhhHHHHHHHHH------------Hhhhhhhhchh-----hhccc Q lcl|NC_015280. 1 MYNAENLQEKWAPVLNHEGLNDIKDPYRKSVTAILLENQERALAEER------------AVLTEAPTNVG-----PINTP 63 (455) Q Consensus 1 m~~~~~~~~kw~~~l~~~~~~~i~~~~~~~v~~~~~enq~~~~~e~~------------~~l~ea~~~~~-----~~~~~ 63 (455) |+| |+|+|||+||||||++|||++.|||+|+++||||||++|+|++ .+|+||..+.+ ..|++ T Consensus 1 ~~~-~~l~~kw~p~l~~~~~~~i~~~~~~~i~~~~~en~~~~~~~~~~~~~~~~~~~~~~~l~e~~~~~~~~~~~t~i~~ 79 (519) T protein:vir:10 1 MKK-NALVQKWSALLENEALPEIVGASKQAIIAKIFENQEQDILTAPEYRDEKISEAFGSFLTEAEIGGDHGYDATNIAA 79 (519) T ss_pred Cch-hHHHHHhHHhhcccccchhhhhhhHHHHHHHHHHHHHHhhhcccccchHHHHHHhhhcchhccCCccccCcccccc Confidence 766 5999999999999999999999999999999999999999986 88999976544 58899 Q ss_pred cccccccccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCC----Ccccc--ccccccccccccccc Q lcl|NC_015280. 64 TTSSGAVAGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQS----GNEAF--FDEPDAQFSGTDGAT 137 (455) Q Consensus 64 st~tg~i~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qs----G~EAl--fnEa~t~fSg~~~~~ 137 (455) |++|+++++++|+|++|+||++|||||+|||||||||||||||||||+||.+|. ++|+| |+|||+.|||++++. 
T Consensus 80 ~~~t~~v~~~~P~l~~l~rRa~p~LIa~DIwGVQPMTgPTGLIFAMRsrY~n~~~~~~g~ea~~~~nEadt~fSG~~~~~ 159 (519) T protein:vir:10 80 GQTSGAVTQIGPAVMGMVRRAIPHLIAFDICGVQPLNNPTGQVFALRAVYGKDPIAAGAKEAFHPMYAPNAMFSGQGAAE 159 (519) T ss_pred ccccccccccchhHHHHHHHHHHhhhhhhhheeecCCchhhhhheeeeeecCCccccccccccccccccccccCcccccc Confidence 999999999999999999999999999999999999999999999999999875 45555 699999999987654 Q ss_pred ccccccccCcc--------------------c-CCCCCCCC-------------cccccccccccccchhhhhh---cCC Q lcl|NC_015280. 138 PPTATTEKNPA--------------------L-INDATGGG-------------TTATNYDLASSKFSTSEQEA---LGD 180 (455) Q Consensus 138 ~~~~~~~~~~~--------------------~-~~~~~~g~-------------t~~~~~~~~~~gm~Ta~aE~---LG~ 180 (455) ........... . .....+++ ....+.++.++||+|+.+|. +|+ T Consensus 160 ~~~~~~~~~~~~~g~~~~~~~~~s~~~~~~~~~~~t~~ag~t~~~~~~~a~~~~~~~~~~~~~~~gmsTa~aEal~~lgg 239 (519) T protein:vir:10 160 TFEALAASKVLEVGKIYSHFFEATGSAHFQAVEAVTVDAGATDAAKLDAAVTALVEAGQLAEIAEGMATSIAELQEGFNG 239 (519) T ss_pred ccccccccccccccccccccccccccceeccccccccCCCCcCccccccccccccccccccccccccccchhhccccCCC Confidence 32211100000 0 00011111 11234567899999999996 677 Q ss_pred CCCCccccceeEEEEEEEEeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeec Q lcl|NC_015280. 
181 GASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQAN 260 (455) Q Consensus 181 s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~ 260 (455) +++++|+||+|+||||+|||||||||||||||||||||||||||||+||+||||||||+|||||||++|+.+|+.+++ + T Consensus 240 ss~~~f~EMaFsIeKvTVtAKSRaLKAEYTiELAQDLKAVHGLDAEtELaNILSTEImlEINReii~~i~~sa~~~~~-g 318 (519) T protein:vir:10 240 STDNPWNEMGFRIDKQVIEAKSRQLKASYSIELAQDLRAVHGMDADAELSGILATEIMLEINREVIDWINYSAQVGKS-G 318 (519) T ss_pred ccccchhhhceeEEEEEEeeecccccccccHHHHHHHHHhcCCChHHHHHHHHHHHHHHHhhHHHHhhhhhhhhccee-e Confidence 888999999999999999999999999999999999999999999999999999999999999999988777776654 4 Q ss_pred ccc-----ceeeeeecccc---chhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhccccccccccc Q lcl|NC_015280. 261 VAN-----AGVFDLDVDSN---GRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISG 332 (455) Q Consensus 261 v~~-----~gv~Dl~~~~~---gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~ 332 (455) +++ +|+|||+++.| +||++|+||+|+|||+||+|+|+|+|+||+|||||||+|||++|+++|++++.++... T Consensus 319 ~t~~~~~~aGv~d~~~~~d~~~~rw~~e~~k~L~~~i~~~an~I~~~T~r~~gn~ii~S~~Va~~L~~~g~~~~~~~~~~ 398 (519) T protein:vir:10 319 MTNTVGAKAGVFDFQDPIDIRGARWAGESFKALLFQIDKEAAEIARQTGRGAGNFIIASRNVVNVLAAVDTSVSYAAQGL 398 (519) T ss_pred cccCcccccceeecccccccccchHHHHHHHHHHHHHHHHHHHHHHhhccccccEEEEchHHHHHHhhccchhccccccc Confidence 443 69999999876 9999999999999999999999999999999999999999999999999988887766 Q ss_pred ccccccccccCCceeEEEecCceEEEEeccccccCCcceEEEEEecCccccceeEEcccccccceeecCCccccceeeee Q lcl|NC_015280. 333 AVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQYYVVGYKGTNAYDAGLFYCPYVPLQMYRAIGQDTFQPRIGFK 412 (455) Q Consensus 333 ~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~KG~~~~daglfyaPYv~l~~~~~~Dp~s~qP~~g~~ 412 (455) ... 
.++|+|+.+|+|+|+|||+|||||| +++|||+|||||++|+|+||||||||||+|+|++||+||||++||| T Consensus 399 ~~~--~~~d~~~~~~~G~l~~~~~vy~D~y----~~~dy~~vG~KG~~~~~~glfyaPYv~l~~~~~~dp~sfqP~~g~~ 472 (519) T protein:vir:10 399 GQG--FNVDTTKAVFAGVLGGKYRVYIDQY----ARSDYFTIGYKGSNEMDAGIYYAPYVALTPLRGSDPKNFQPVMGFK 472 (519) T ss_pred ccc--ccccCCCceEEEEecCceEEEecCC----CCcceEEEEEecCcccccceeeccccccccccccCCccccceeeee Confidence 655 6999999999999999999999999 6899999999999999999999999999999999999999999999 Q ss_pred eecceeecccccccc-----cccccCchhhhhccchhhhhhhhhhcC Q lcl|NC_015280. 413 TRYGMVLNPFAKGLT-----ALSDSDPQAAGNLNANAYYRRVRVANL 454 (455) Q Consensus 413 tRY~l~~nP~~~~~~-----~~~~~~~~~~~~~~~n~y~r~~~v~~~ 454 (455) |||||++|||+++.+ ++.||+|..++..++|.|||||+|||| T Consensus 473 tRY~l~~NP~~~~~~~~~~~~i~~g~~~~a~~~~~n~y~r~v~v~~~ 519 (519) T protein:vir:10 473 TRYGIGINPFADPAAQAPTKRIQNGMPDIVNSLGLNGYFRRVYVKGI 519 (519) T ss_pred eeeceeecCcccccccCccceeccCchhhhccccCceeeeeeeeecC Confidence 999999999997544 467999988899999999999999999 No 17 >protein:vir:5942 Length: 523 # NCBI annotation: similar to major head protein # Family: family:all:364 # MgeID: mge:123 # MgeName: RM 378 # Cross-refs: genbank:acc:NP_835728;genbank:gi:30044131 Probab=100.00 E-value=2.6e-196 Score=1092.82 Aligned_cols=402 Identities=28% Similarity=0.412 Sum_probs=342.1 Q ss_pred Ccch---HHHHHHhhHhhcCCCCccccchhhHHHHHHHhhhHHHHHHHHHHhhhhhhhchhhhccccccccccccccchh Q lcl|NC_015280. 
1 MYNA---ENLQEKWAPVLNHEGLNDIKDPYRKSVTAILLENQERALAEERAVLTEAPTNVGPINTPTTSSGAVAGFDPIL 77 (455) Q Consensus 1 m~~~---~~~~~kw~~~l~~~~~~~i~~~~~~~v~~~~~enq~~~~~e~~~~l~ea~~~~~~~~~~st~tg~i~~~~P~L 77 (455) ||.+ |+|+|||+||||.++ +.|||+||++|||||+| |++++|+|+ ..|+.|++|.| | T Consensus 1 ~~~~~~~e~l~~kw~p~l~~~~-----~~~~~~~~a~llenq~~---~~~~~l~e~-----------~~~~~~~~~~~-~ 60 (523) T protein:vir:59 1 MSQPKINEQLIEKWQPLLEGCR-----NDWERHTLATLLENQYR---EAKKHLMET-----------TQTTEVDGWNL-A 60 (523) T ss_pred CCcchhhHHHHHhhhhhhcccC-----ChhHHHHHHHHhhhhhH---HHHHhhhhh-----------hhccccccccc-h Confidence 9987 899999999999766 45899999999999997 788899884 55888999996 9 Q ss_pred hhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCcccccccccc--------------cccccccccccccc- Q lcl|NC_015280. 78 ISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPDA--------------QFSGTDGATPPTAT- 142 (455) Q Consensus 78 v~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~t--------------~fSg~~~~~~~~~~- 142 (455) |+|+||++|||||+||||||||||||||||||||||.+|.|+||+|+++.+ .|++..+.+..... T Consensus 61 ~~~v~r~~p~l~a~DIWGVQPMTGPTGLIFAMRSRY~~q~gteA~yg~~~~~~~~a~~~~~ean~~~s~~~~~~~~~~d~ 140 (523) T protein:vir:59 61 LPIVRRVFANLRATDLVSVQPLSLPTGLVFYLDFKSPELPGNGSVYGGTGLTTDTATGGLYDENARLSRREYETTITVDL 140 (523) T ss_pred hhhhhhHhhhhhhhhccccccCCCCcceeEEEEeeccCCCCcccccCccccCcccccccccccccccccccccCccCCCc Confidence 999999999999999999999999999999999999999999999876443 34432211100000 Q ss_pred --------------c------ccC--------------ccc--------------------------CC----------- Q lcl|NC_015280. 143 --------------T------EKN--------------PAL--------------------------IN----------- 151 (455) Q Consensus 143 --------------~------~~~--------------~~~--------------------------~~----------- 151 (455) . ..+ .+. 
.+ T Consensus 141 ~~sg~~~~~~~a~stg~A~a~~s~si~k~~vTa~s~agta~~~li~A~~~q~itg~tga~fa~s~~~an~astAss~Al~ 220 (523) T protein:vir:59 141 ATAQQATMRDVGFDTGIASLVSSGAVYYVDVPVASLPGVADVNTVRFWQYDDASGDPENTVAYPLPRYNRIVGAVGSALY 220 (523) T ss_pred ccccccccccccccccchhhccccceeeeeccccccccccccccccccccccccccccccccchhhcccccccccccccc Confidence 0 000 000 00 Q ss_pred ---------C------CCCCCcccccccccccccchhhhhhcCC-----CCCCccccceeEEEEEEEEeeccccccceeH Q lcl|NC_015280. 152 ---------D------ATGGGTTATNYDLASSKFSTSEQEALGD-----GASTAFMEMAFSIDKIAVEAKGRALRADYSV 211 (455) Q Consensus 152 ---------~------~~~g~t~~~~~~~~~~gm~Ta~aE~LG~-----s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTi 211 (455) + ...+.......+..+.||+|+.+|.+++ +.++.|+||+|+||||+|||||||||||||| T Consensus 221 gEA~t~~sTd~at~~~Gtt~t~~~~~lyt~~~g~~t~~~~~~~~~~~~~~~~~~~~eM~FsIeK~tVtAkSRaLKAeYT~ 300 (523) T protein:vir:59 221 ARLFFVTGSDFATVAGGTPSTQDLDLVYYIDARNDFEDQSTDPDYPDPGFQSLDIPEINLELRSRPVATKTRKLRAAWTP 300 (523) T ss_pred ccccccccccccccCCCcccccccccccccccccchhhccccccccccccccccccceeeEEEeEEEeeecccccccccH Confidence 0 0000000111245578999999998875 5577899999999999999999999999999 Q ss_pred HHHHhHHHhh-CCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeeccccchhh--------HHHH Q lcl|NC_015280. 212 ELAQDLKAIH-GLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDVDSNGRWS--------VEKF 282 (455) Q Consensus 212 ELAQDLkAiH-GLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~~~~gr~~--------ve~~ 282 (455) |||||||||| |||||+||+||||+|||+||||||||+||+||+|||+.|++++|+|||+++.+|||. +|++ T Consensus 301 ELAQDLKAiH~GLDAE~ELanILStEImlEINR~ii~~~~~~a~~~~~~~~~~~g~~~~~~~~~~~~~~~~~~~~~~e~~ 380 (523) T protein:vir:59 301 EAMQDLAAYHKGVDLENEIVTLMSQYIAREIDLEILSTIMAHARRTDNYGFWSEVVGEYYDETSGNFVAGNFYGSKQEWL 380 (523) T ss_pred HHHHHHHHHhcCCChhHHHHHHHHHHHHHHhhHHHHHhHhhhheeeeeccccccceeeecccccchhhhhhhhhhhHHHH Confidence 9999999999 999999999999999999999999999999999999999999999999999999997 8999 Q ss_pred HHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEEecc Q lcl|NC_015280. 
283 KGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPY 362 (455) Q Consensus 283 k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y 362 (455) |.|+|||++|+|+|+|+|+||+|||||||+|||++|++||||+..+. ..+|+|+++|+|+|+|||+||||+| T Consensus 381 ~~l~~~~~~~~n~i~~~t~~~~~~~~~~s~~v~~~l~~~~~~~~~~~--------~~~~~~~~~~~g~l~~~~~vy~d~~ 452 (523) T protein:vir:59 381 ATLMIELNKVSNRIQQKTAVAGANFLVTSPQVAALLESMPGFTPGND--------NRDGGTGIFYVGMVQGRYRLYKNIY 452 (523) T ss_pred HHHHHHHHHHHHHHHHhcccccccEEEEchhHHHHHHhccccccCCc--------cccccccceeEEEecCceEEEecCC Confidence 99999999999999999999999999999999999999999964333 4677899999999999999999999 Q ss_pred ccccCCcceEEEEEecC-ccccceeEEcccccccceeec-CCccccceeeeeeecceee-cccccccccccccCc Q lcl|NC_015280. 363 SANVSDNQYYVVGYKGT-NAYDAGLFYCPYVPLQMYRAI-GQDTFQPRIGFKTRYGMVL-NPFAKGLTALSDSDP 434 (455) Q Consensus 363 ~~~~s~~dY~~vG~KG~-~~~daglfyaPYv~l~~~~~~-Dp~s~qP~~g~~tRY~l~~-nP~~~~~~~~~~~~~ 434 (455) +++|||+|||||. +++|+||||||||||.++|++ ||+||||++||||||||++ |||+.++--..-=+| T Consensus 453 ----~~~dy~~~g~k~~~~~~~~~~~y~Py~~l~~~~~~~dp~s~qp~~~~~tRY~l~v~nP~~~~~~~~~~~~~ 523 (523) T protein:vir:59 453 ----QNQPVIIMGNQDLNTPWQTGAVYAPYVPLLFTPTIVDPVNFSYRRGLMTRYALEVVRPEFYGLLYVKLLQP 523 (523) T ss_pred ----CCcceEEEEecccCCcccccceecccchhhcccccccCCcccceeeeeeehhheecchhHhhhhhhhhcCC Confidence 6899999999995 599999999999999999988 9999999999999999975 999998632211122 No 18 >protein:vir:81100 Length: 415 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:1891 # MgeName: tp310-1 # Cross-refs: genbank:acc:YP_001429874;genbank:gi:156603927;genbank:GeneID:5525320 Probab=97.38 E-value=8e-05 Score=43.04 Aligned_cols=340 Identities=16% Similarity=0.124 Sum_probs=136.4 Q ss_pred CcchHH----HHHHhhH---hhcCCCCccccchhh---HHHHHHHhhhHH--HH-------------------------- Q lcl|NC_015280. 
1 MYNAEN----LQEKWAP---VLNHEGLNDIKDPYR---KSVTAILLENQE--RA-------------------------- 42 (455) Q Consensus 1 m~~~~~----~~~kw~~---~l~~~~~~~i~~~~~---~~v~~~~~enq~--~~-------------------------- 42 (455) +....+ +.+++.- .|..+..-+. +.-. +.+-++|-+.++ .. T Consensus 8 ~~~l~el~~~~~~~~~e~~~~l~~~~~~~~-~~~~~e~~~l~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 86 (415) T protein:vir:81 8 QSEISDIKRQIDLKVKYATRALNNDELEKA-EKLEQEITDLRSQIQEKQEELDKLKEKDGTSENNQQSVEVNEARTYRNQ 86 (415) T ss_pred HHHHHHHHHHHHHHHHHHHHHhchHHHHHH-HHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhcccccccchhhhHHHH Confidence 111111 1111100 0000000000 0000 000001100000 00 Q ss_pred ---------------HHHHHHhhhhhhhchhhhccccccccccccccchhh--hHHHHHHhhhhhhheeeeccCCCccee Q lcl|NC_015280. 43 ---------------LAEERAVLTEAPTNVGPINTPTTSSGAVAGFDPILI--SLIRRAMPKLIAYDIAGVQPMTGPTGL 105 (455) Q Consensus 43 ---------------~~e~~~~l~ea~~~~~~~~~~st~tg~i~~~~P~Lv--~l~RRa~p~LIa~DI~GVQPmTGPTGL 105 (455) ..++++.+.+.........+.++++.+-...-|.-+ .+++++.+...-.+++.|.||+++.+- T Consensus 87 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~gg~~iP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~ 166 (415) T protein:vir:81 87 ANINDLGISIQNTKVTSQEVRDFTEYLETRNDIQGGSLKTDSGFVVIPEEIVTDILKLKEVEFNLDKYVTVKRVTNGSGK 166 (415) T ss_pred HHHHHHhhhhhhhhhHHHHHHHHHHHHhhhhhhhhccccccccccccchHHHHHHHHHHHhhhhhhhheeeeeccCCcee Confidence 011111111111111111111222222222233322 345555666778899999999998876 Q ss_pred eeEEEeeecCCCCcccccccccccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCc Q lcl|NC_015280. 106 IFAMRSRYTNQSGNEAFFDEPDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTA 185 (455) Q Consensus 106 IFAMRsrY~~qsG~EAlfnEa~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~ 185 (455) +--.+ ..+. . .. ....+....++..... 
T Consensus 167 ~~~~~--~~~~--~---------------~~---------------------------------~~v~E~~~~~~~~~~~ 194 (415) T protein:vir:81 167 YPVVR--QSEV--A---------------AL---------------------------------EKVEELEENPELAVKP 194 (415) T ss_pred EEEEe--ecCC--c---------------cc---------------------------------eeeccccccCcccccc Confidence 55444 1100 0 00 0000001111111123 Q ss_pred cccceeEEEEEEEEeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccce Q lcl|NC_015280. 186 FMEMAFSIDKIAVEAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAG 265 (455) Q Consensus 186 f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~g 265 (455) |.+..|++.|. +-...+|-||.+|- ..|.+++|.+-|+..|..-+|+.||.-.-+-...+-..+....+ T Consensus 195 ~~~v~~~~~k~-------~~~~~iS~ell~ds----~~~l~~~i~~~l~~~~~~~~~~~il~g~g~g~~~~~~~~~~~~~ 263 (415) T protein:vir:81 195 FFQLAYDINTH-------RGYFRISREAIEDA----KVNVLQELKLWMARTIAATRNKAIIDVITKGSTGSTSSGFEKEG 263 (415) T ss_pred eeeEEeeeeee-------EeeehhhHHHHhhc----hHHHHHHHHHHHHHHHHHHHHHHHhhccccCccccccccccccc Confidence 44555555554 44566999999984 35789999999999999999999987653321111000000000 Q ss_pred eeeeeccccchhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCc Q lcl|NC_015280. 266 VFDLDVDSNGRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGN 345 (455) Q Consensus 266 v~Dl~~~~~gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~ 345 (455) +- ...++--..+....++..+... -.+.+.+||++.....|.. +... +|.. ....+.++ T Consensus 264 ~~---~~~~~~~~~~~i~~~~~~~~~~---------~~~~~~~v~n~~~~~~l~~---lkd~---~G~~--l~~~~~~~- 322 (415) T protein:vir:81 264 KK---LEVKKAKSLDDIKDAINLNVKP---------NYEHNVAIVSQTMFAKLDK---MKDK---LGNY--LIQPDVKE- 322 (415) T ss_pred cc---cccccccchhHHHHHHHhhhhh---------ccCCCEEEEcHHHHHHHHH---hhcc---CCce--eeccCcCC- Confidence 00 0011111123344444444322 1345568899999888875 2211 1111 01111111 Q ss_pred eeEEEecCceEEEEeccccccCCcceEEEEEecCccccceeEEcc----ccc---ccce-eecCCccccceeeeeeecce Q lcl|NC_015280. 
346 TFVGTLNGRFKVYIDPYSANVSDNQYYVVGYKGTNAYDAGLFYCP----YVP---LQMY-RAIGQDTFQPRIGFKTRYGM 417 (455) Q Consensus 346 ~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~KG~~~~daglfyaP----Yv~---l~~~-~~~Dp~s~qP~~g~~tRY~l 417 (455) ...++| .+++|++.++.. .|-.|+. .++|+- |+- ..+. ...|-.+++..+....|++. T Consensus 323 ~~~~~l-~G~pV~~~~~~~---------~~~~~~~----~~~~Gd~~~~~~~~~~~~~~v~~~~~~~~~~~~~~~~r~d~ 388 (415) T protein:vir:81 323 KTQQRL-LGAKIEILPDEV---------LGQKGNN----TLIIGNLKDAIVLFDRSQYQASWTDYMHFGECLMIAVRQDC 388 (415) T ss_pred CCCcee-cceeeEEecccc---------cCCCCcc----EEEEEehhccEEEEeecceEEEEeccccCceEEEEEEEecc Confidence 122355 356777764321 1111111 123322 211 0111 22355667888888899998 Q ss_pred -eeccccccc---cccccc-Cchhhhh Q lcl|NC_015280. 418 -VLNPFAKGL---TALSDS-DPQAAGN 439 (455) Q Consensus 418 -~~nP~~~~~---~~~~~~-~~~~~~~ 439 (455) +.+|-.--. +....| -+.+.-+ T Consensus 389 ~v~~~~a~~~~~~~~~~~~~~~~~~~~ 415 (415) T protein:vir:81 389 RILDYKSAIVIEYDDSERGEGDLGLEA 415 (415) T ss_pred EEeccccEEEEEEeccCCCCCccccCC Confidence 677765521 111111 2222111 No 19 >protein:vir:98339 Length: 415 # NCBI annotation: putative capsid protein # Family: family:all:21 # MgeID: mge:1581 # MgeName: phiPVL(108) # Cross-refs: genbank:acc:YP_918931;genbank:gi:119443693;genbank:GeneID:4594501 Probab=97.38 E-value=8e-05 Score=43.04 Aligned_cols=340 Identities=16% Similarity=0.124 Sum_probs=136.4 Q ss_pred CcchHH----HHHHhhH---hhcCCCCccccchhh---HHHHHHHhhhHH--HH-------------------------- Q lcl|NC_015280. 1 MYNAEN----LQEKWAP---VLNHEGLNDIKDPYR---KSVTAILLENQE--RA-------------------------- 42 (455) Q Consensus 1 m~~~~~----~~~kw~~---~l~~~~~~~i~~~~~---~~v~~~~~enq~--~~-------------------------- 42 (455) +....+ +.+++.- .|..+..-+. +.-. +.+-++|-+.++ .. 
T Consensus 8 ~~~l~el~~~~~~~~~e~~~~l~~~~~~~~-~~~~~e~~~l~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 86 (415) T protein:vir:98 8 QSEISDIKRQIDLKVKYATRALNNDELEKA-EKLEQEITDLRSQIQEKQEELDKLKEKDGTSENNQQSVEVNEARTYRNQ 86 (415) T ss_pred HHHHHHHHHHHHHHHHHHHHHhchHHHHHH-HHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhcccccccchhhhHHHH Confidence 111111 1111100 0000000000 0000 000001100000 00 Q ss_pred ---------------HHHHHHhhhhhhhchhhhccccccccccccccchhh--hHHHHHHhhhhhhheeeeccCCCccee Q lcl|NC_015280. 43 ---------------LAEERAVLTEAPTNVGPINTPTTSSGAVAGFDPILI--SLIRRAMPKLIAYDIAGVQPMTGPTGL 105 (455) Q Consensus 43 ---------------~~e~~~~l~ea~~~~~~~~~~st~tg~i~~~~P~Lv--~l~RRa~p~LIa~DI~GVQPmTGPTGL 105 (455) ..++++.+.+.........+.++++.+-...-|.-+ .+++++.+...-.+++.|.||+++.+- T Consensus 87 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~gg~~iP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~ 166 (415) T protein:vir:98 87 ANINDLGISIQNTKVTSQEVRDFTEYLETRNDIQGGSLKTDSGFVVIPEEIVTDILKLKEVEFNLDKYVTVKRVTNGSGK 166 (415) T ss_pred HHHHHHhhhhhhhhhHHHHHHHHHHHHhhhhhhhhccccccccccccchHHHHHHHHHHHhhhhhhhheeeeeccCCcee Confidence 011111111111111111111222222222233322 345555666778899999999998876 Q ss_pred eeEEEeeecCCCCcccccccccccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCc Q lcl|NC_015280. 106 IFAMRSRYTNQSGNEAFFDEPDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTA 185 (455) Q Consensus 106 IFAMRsrY~~qsG~EAlfnEa~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~ 185 (455) +--.+ ..+. . .. ....+....++..... T Consensus 167 ~~~~~--~~~~--~---------------~~---------------------------------~~v~E~~~~~~~~~~~ 194 (415) T protein:vir:98 167 YPVVR--QSEV--A---------------AL---------------------------------EKVEELEENPELAVKP 194 (415) T ss_pred EEEEe--ecCC--c---------------cc---------------------------------eeeccccccCcccccc Confidence 55444 1100 0 00 0000001111111123 Q ss_pred cccceeEEEEEEEEeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccce Q lcl|NC_015280. 
186 FMEMAFSIDKIAVEAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAG 265 (455) Q Consensus 186 f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~g 265 (455) |.+..|++.|. +-...+|-||.+|- ..|.+++|.+-|+..|..-+|+.||.-.-+-...+-..+....+ T Consensus 195 ~~~v~~~~~k~-------~~~~~iS~ell~ds----~~~l~~~i~~~l~~~~~~~~~~~il~g~g~g~~~~~~~~~~~~~ 263 (415) T protein:vir:98 195 FFQLAYDINTH-------RGYFRISREAIEDA----KVNVLQELKLWMARTIAATRNKAIIDVITKGSTGSTSSGFEKEG 263 (415) T ss_pred eeeEEeeeeee-------EeeehhhHHHHhhc----hHHHHHHHHHHHHHHHHHHHHHHHhhccccCccccccccccccc Confidence 44555555554 44566999999984 35789999999999999999999987653321111000000000 Q ss_pred eeeeeccccchhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCc Q lcl|NC_015280. 266 VFDLDVDSNGRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGN 345 (455) Q Consensus 266 v~Dl~~~~~gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~ 345 (455) +- ...++--..+....++..+... -.+.+.+||++.....|.. +... +|.. ....+.++ T Consensus 264 ~~---~~~~~~~~~~~i~~~~~~~~~~---------~~~~~~~v~n~~~~~~l~~---lkd~---~G~~--l~~~~~~~- 322 (415) T protein:vir:98 264 KK---LEVKKAKSLDDIKDAINLNVKP---------NYEHNVAIVSQTMFAKLDK---MKDK---LGNY--LIQPDVKE- 322 (415) T ss_pred cc---cccccccchhHHHHHHHhhhhh---------ccCCCEEEEcHHHHHHHHH---hhcc---CCce--eeccCcCC- Confidence 00 0011111123344444444322 1345568899999888875 2211 1111 01111111 Q ss_pred eeEEEecCceEEEEeccccccCCcceEEEEEecCccccceeEEcc----ccc---ccce-eecCCccccceeeeeeecce Q lcl|NC_015280. 346 TFVGTLNGRFKVYIDPYSANVSDNQYYVVGYKGTNAYDAGLFYCP----YVP---LQMY-RAIGQDTFQPRIGFKTRYGM 417 (455) Q Consensus 346 ~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~KG~~~~daglfyaP----Yv~---l~~~-~~~Dp~s~qP~~g~~tRY~l 417 (455) ...++| .+++|++.++.. .|-.|+. .++|+- |+- ..+. ...|-.+++..+....|++. 
T Consensus 323 ~~~~~l-~G~pV~~~~~~~---------~~~~~~~----~~~~Gd~~~~~~~~~~~~~~v~~~~~~~~~~~~~~~~r~d~ 388 (415) T protein:vir:98 323 KTQQRL-LGAKIEILPDEV---------LGQKGNN----TLIIGNLKDAIVLFDRSQYQASWTDYMHFGECLMIAVRQDC 388 (415) T ss_pred CCCcee-cceeeEEecccc---------cCCCCcc----EEEEEehhccEEEEeecceEEEEeccccCceEEEEEEEecc Confidence 122355 356777764321 1111111 123322 211 0111 22355667888888899998 Q ss_pred -eeccccccc---cccccc-Cchhhhh Q lcl|NC_015280. 418 -VLNPFAKGL---TALSDS-DPQAAGN 439 (455) Q Consensus 418 -~~nP~~~~~---~~~~~~-~~~~~~~ 439 (455) +.+|-.--. +....| -+.+.-+ T Consensus 389 ~v~~~~a~~~~~~~~~~~~~~~~~~~~ 415 (415) T protein:vir:98 389 RILDYKSAIVIEYDDSERGEGDLGLEA 415 (415) T ss_pred EEeccccEEEEEEeccCCCCCccccCC Confidence 677765521 111111 2222111 No 20 >protein:vir:79987 Length: 415 # NCBI annotation: head protein # Family: family:all:21 # MgeID: mge:1875 # MgeName: tp310-3 # Cross-refs: genbank:acc:YP_001430002;genbank:gi:156604057;genbank:GeneID:5525447 Probab=97.38 E-value=8e-05 Score=43.04 Aligned_cols=340 Identities=16% Similarity=0.124 Sum_probs=136.4 Q ss_pred CcchHH----HHHHhhH---hhcCCCCccccchhh---HHHHHHHhhhHH--HH-------------------------- Q lcl|NC_015280. 1 MYNAEN----LQEKWAP---VLNHEGLNDIKDPYR---KSVTAILLENQE--RA-------------------------- 42 (455) Q Consensus 1 m~~~~~----~~~kw~~---~l~~~~~~~i~~~~~---~~v~~~~~enq~--~~-------------------------- 42 (455) +....+ +.+++.- .|..+..-+. +.-. +.+-++|-+.++ .. T Consensus 8 ~~~l~el~~~~~~~~~e~~~~l~~~~~~~~-~~~~~e~~~l~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 86 (415) T protein:vir:79 8 QSEISDIKRQIDLKVKYATRALNNDELEKA-EKLEQEITDLRSQIQEKQEELDKLKEKDGTSENNQQSVEVNEARTYRNQ 86 (415) T ss_pred HHHHHHHHHHHHHHHHHHHHHhchHHHHHH-HHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhcccccccchhhhHHHH Confidence 111111 1111100 0000000000 0000 000001100000 00 Q ss_pred ---------------HHHHHHhhhhhhhchhhhccccccccccccccchhh--hHHHHHHhhhhhhheeeeccCCCccee Q lcl|NC_015280. 
43 ---------------LAEERAVLTEAPTNVGPINTPTTSSGAVAGFDPILI--SLIRRAMPKLIAYDIAGVQPMTGPTGL 105 (455) Q Consensus 43 ---------------~~e~~~~l~ea~~~~~~~~~~st~tg~i~~~~P~Lv--~l~RRa~p~LIa~DI~GVQPmTGPTGL 105 (455) ..++++.+.+.........+.++++.+-...-|.-+ .+++++.+...-.+++.|.||+++.+- T Consensus 87 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~gg~~iP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~ 166 (415) T protein:vir:79 87 ANINDLGISIQNTKVTSQEVRDFTEYLETRNDIQGGSLKTDSGFVVIPEEIVTDILKLKEVEFNLDKYVTVKRVTNGSGK 166 (415) T ss_pred HHHHHHhhhhhhhhhHHHHHHHHHHHHhhhhhhhhccccccccccccchHHHHHHHHHHHhhhhhhhheeeeeccCCcee Confidence 011111111111111111111222222222233322 345555666778899999999998876 Q ss_pred eeEEEeeecCCCCcccccccccccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCc Q lcl|NC_015280. 106 IFAMRSRYTNQSGNEAFFDEPDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTA 185 (455) Q Consensus 106 IFAMRsrY~~qsG~EAlfnEa~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~ 185 (455) +--.+ ..+. . .. ....+....++..... T Consensus 167 ~~~~~--~~~~--~---------------~~---------------------------------~~v~E~~~~~~~~~~~ 194 (415) T protein:vir:79 167 YPVVR--QSEV--A---------------AL---------------------------------EKVEELEENPELAVKP 194 (415) T ss_pred EEEEe--ecCC--c---------------cc---------------------------------eeeccccccCcccccc Confidence 55444 1100 0 00 0000001111111123 Q ss_pred cccceeEEEEEEEEeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccce Q lcl|NC_015280. 186 FMEMAFSIDKIAVEAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAG 265 (455) Q Consensus 186 f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~g 265 (455) |.+..|++.|. 
+-...+|-||.+|- ..|.+++|.+-|+..|..-+|+.||.-.-+-...+-..+....+ T Consensus 195 ~~~v~~~~~k~-------~~~~~iS~ell~ds----~~~l~~~i~~~l~~~~~~~~~~~il~g~g~g~~~~~~~~~~~~~ 263 (415) T protein:vir:79 195 FFQLAYDINTH-------RGYFRISREAIEDA----KVNVLQELKLWMARTIAATRNKAIIDVITKGSTGSTSSGFEKEG 263 (415) T ss_pred eeeEEeeeeee-------EeeehhhHHHHhhc----hHHHHHHHHHHHHHHHHHHHHHHHhhccccCccccccccccccc Confidence 44555555554 44566999999984 35789999999999999999999987653321111000000000 Q ss_pred eeeeeccccchhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCc Q lcl|NC_015280. 266 VFDLDVDSNGRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGN 345 (455) Q Consensus 266 v~Dl~~~~~gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~ 345 (455) +- ...++--..+....++..+... -.+.+.+||++.....|.. +... +|.. ....+.++ T Consensus 264 ~~---~~~~~~~~~~~i~~~~~~~~~~---------~~~~~~~v~n~~~~~~l~~---lkd~---~G~~--l~~~~~~~- 322 (415) T protein:vir:79 264 KK---LEVKKAKSLDDIKDAINLNVKP---------NYEHNVAIVSQTMFAKLDK---MKDK---LGNY--LIQPDVKE- 322 (415) T ss_pred cc---cccccccchhHHHHHHHhhhhh---------ccCCCEEEEcHHHHHHHHH---hhcc---CCce--eeccCcCC- Confidence 00 0011111123344444444322 1345568899999888875 2211 1111 01111111 Q ss_pred eeEEEecCceEEEEeccccccCCcceEEEEEecCccccceeEEcc----ccc---ccce-eecCCccccceeeeeeecce Q lcl|NC_015280. 346 TFVGTLNGRFKVYIDPYSANVSDNQYYVVGYKGTNAYDAGLFYCP----YVP---LQMY-RAIGQDTFQPRIGFKTRYGM 417 (455) Q Consensus 346 ~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~KG~~~~daglfyaP----Yv~---l~~~-~~~Dp~s~qP~~g~~tRY~l 417 (455) ...++| .+++|++.++.. .|-.|+. .++|+- |+- ..+. ...|-.+++..+....|++. 
T Consensus 323 ~~~~~l-~G~pV~~~~~~~---------~~~~~~~----~~~~Gd~~~~~~~~~~~~~~v~~~~~~~~~~~~~~~~r~d~ 388 (415) T protein:vir:79 323 KTQQRL-LGAKIEILPDEV---------LGQKGNN----TLIIGNLKDAIVLFDRSQYQASWTDYMHFGECLMIAVRQDC 388 (415) T ss_pred CCCcee-cceeeEEecccc---------cCCCCcc----EEEEEehhccEEEEeecceEEEEeccccCceEEEEEEEecc Confidence 122355 356777764321 1111111 123322 211 0111 22355667888888899998 Q ss_pred -eeccccccc---cccccc-Cchhhhh Q lcl|NC_015280. 418 -VLNPFAKGL---TALSDS-DPQAAGN 439 (455) Q Consensus 418 -~~nP~~~~~---~~~~~~-~~~~~~~ 439 (455) +.+|-.--. +....| -+.+.-+ T Consensus 389 ~v~~~~a~~~~~~~~~~~~~~~~~~~~ 415 (415) T protein:vir:79 389 RILDYKSAIVIEYDDSERGEGDLGLEA 415 (415) T ss_pred EEeccccEEEEEEeccCCCCCccccCC Confidence 677765521 111111 2222111 No 21 >protein:vir:9410 Length: 415 # NCBI annotation: head protein # Family: family:all:21 # MgeID: mge:167 # MgeName: phi 13 # Cross-refs: genbank:acc:NP_803388;genbank:gi:29028700;genbank:GeneID:1258136 Probab=97.30 E-value=0.0001 Score=42.46 Aligned_cols=341 Identities=14% Similarity=0.088 Sum_probs=137.4 Q ss_pred Cc------------chHHHHHHhhHhhc--------------CCCCcc---ccchhhHHHHHHHhhhHHH--HHHHHHHh Q lcl|NC_015280. 1 MY------------NAENLQEKWAPVLN--------------HEGLND---IKDPYRKSVTAILLENQER--ALAEERAV 49 (455) Q Consensus 1 m~------------~~~~~~~kw~~~l~--------------~~~~~~---i~~~~~~~v~~~~~enq~~--~~~e~~~~ 49 (455) ++ .-+.|.++..-+.+ .....+ .....++.-......++.. ...+|++. T Consensus 29 ~~~~~~e~~~~~~~ei~~l~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~e~~~ 108 (415) T protein:vir:94 29 LNNDELEKAEKLEQEITDLRSQIQEKQEELDKLKEKDGTSENNQQSVEVNEASTYRNQANINDLGISIQNTKVTSQEVRD 108 (415) T ss_pred hchhHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhccccccccchhhHHHHHHHHHHHhhhhhhhhhHHHHHH Confidence 11 11112222211100 000000 0000000000011111100 01223322 Q ss_pred hhhhhhchhhhccccccccccccccc--hhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCccccccccc Q lcl|NC_015280. 
50 LTEAPTNVGPINTPTTSSGAVAGFDP--ILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPD 127 (455) Q Consensus 50 l~ea~~~~~~~~~~st~tg~i~~~~P--~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~ 127 (455) +...........+.++++++-...-| ..-.+++.+-+..+-.+++.|+||++..+-+--.+ +.+. .+ T Consensus 109 ~~~~~~~~~~~~~~~~~~~~g~~~iP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~--~~~~--~~------- 177 (415) T protein:vir:94 109 FTEYLETRNDIQGGSLKTDSGFVVIPEEIVTDILKLKEVEFNLDKYVTVKRVTNGSGKYPVVR--QSEV--AA------- 177 (415) T ss_pred HHHHhhhhhhhhhhccccccccccCcHHHHHHHHHHHHhhhhhhhhcceeeccCCceeEEEEe--ecCC--cc------- Confidence 22211111112222222222222223 23345555567778899999999998876554333 1100 00 Q ss_pred ccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeecccccc Q lcl|NC_015280. 128 AQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRA 207 (455) Q Consensus 128 t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKA 207 (455) .....++...++.....|.+..|++.|.. -.- T Consensus 178 -----------------------------------------~~~v~Eg~~~~~~~~~~~~~i~~~~~k~~-------~~~ 209 (415) T protein:vir:94 178 -----------------------------------------LEKVEELEENPELAVKPFFQLAYDINTHR-------GYF 209 (415) T ss_pred -----------------------------------------ceeccccccccccccccceeeEeeheeee-------eec Confidence 00000111111111123555555555554 445 Q ss_pred ceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeeccccchhhHHHHHHHHH Q lcl|NC_015280. 208 DYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDVDSNGRWSVEKFKGLLF 287 (455) Q Consensus 208 EYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~~~~gr~~ve~~k~l~~ 287 (455) .+|-||.+|-- +|.+++|.+-|...|..-+|+.||.-.-+-.-.+-..+....+.- ...++--..+....++. 
T Consensus 210 ~is~ell~ds~----~~~~~~i~~~l~~~~~~~~~~~il~g~g~g~~~~~~~~~~~~~~~---~~~~~~~~~~~i~~~~~ 282 (415) T protein:vir:94 210 RISREAIEDAK----VNVLQELKLWMARTIAATRNKAIIDVITKGSTGSTSSGFEKEGKK---LEVKKAKSLDDIKDAIN 282 (415) T ss_pred hhhHHHHhhch----HHHHHHHHHHHHHHHHHHHHHHHhhccccCccccccccccccccc---cccccccchHHHHHHHH Confidence 68999999864 478999999999999999999998765332111100000000000 00001111223333433 Q ss_pred HHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEEeccccccC Q lcl|NC_015280. 288 QIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVS 367 (455) Q Consensus 288 qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s 367 (455) .+.. .-.+.+.+|+++.....|... .. .+|.. ....+.++ ...++|. +++|++.+... T Consensus 283 ~~~~---------~~~~~~~~vmn~~~~~~l~~l---kd---~~G~~--l~~~~~~~-~~~~~l~-G~pV~~~~~~~--- 340 (415) T protein:vir:94 283 LNVK---------PNYEHNVAIVSQTMFAKLDKM---KD---KLGNY--LIQPDVKE-KTQQRLL-GAKIEILPDEV--- 340 (415) T ss_pred hhhh---------hccCCCEEEEcHHHHHHHHHh---hc---cCCCe--eeccCcCC-CCCceec-ceeeEEecccc--- Confidence 3331 123466788999998888752 21 11110 01111111 1223453 55677664321 Q ss_pred CcceEEEEEecCccccceeEEccccc-------cc-ceeecCCccccceeeeeeecce-eecccccccccc---ccc-Cc Q lcl|NC_015280. 368 DNQYYVVGYKGTNAYDAGLFYCPYVP-------LQ-MYRAIGQDTFQPRIGFKTRYGM-VLNPFAKGLTAL---SDS-DP 434 (455) Q Consensus 368 ~~dY~~vG~KG~~~~daglfyaPYv~-------l~-~~~~~Dp~s~qP~~g~~tRY~l-~~nP~~~~~~~~---~~~-~~ 434 (455) .|-+|+ ..++|+.+-. .. -....|-.++|-.+-...|++. +.+|-.--.-.. ..| -+ T Consensus 341 ------~~~~~~----~~i~~gd~~~~~~~~~~~~~~v~~~~~~~~~~~~r~~~r~d~~~~~~~a~~~~~~~~~~~~~~~ 410 (415) T protein:vir:94 341 ------LGQKGN----NTLIIGNLKDAIVLFDRSQYQASWTDYMHFGECLMIAVRQDCRILDYKSAIVIEYDDSERGEGD 410 (415) T ss_pred ------cCCCCc----cEEEEEehhccEEEEeecceEEEEeccccCceEEEEEEEeccEEeccccEEEEEEeccCCCCCc Confidence 111111 1122222111 11 1122355567777778889998 677755521111 111 22 Q ss_pred hhhhh Q lcl|NC_015280. 
435 QAAGN 439 (455) Q Consensus 435 ~~~~~ 439 (455) .+.-+ T Consensus 411 ~~~~~ 415 (415) T protein:vir:94 411 LGLEA 415 (415) T ss_pred cccCC Confidence 22111 No 22 >protein:vir:1886 Length: 385 # NCBI annotation: major capsid subunit precursor # Family: family:all:585 # MgeID: mge:41 # MgeName: HK022 # Cross-refs: genbank:acc:NP_037666;genbank:gi:9634124;genbank:GeneID:1262513 Probab=97.12 E-value=0.00016 Score=41.32 Aligned_cols=322 Identities=13% Similarity=0.087 Sum_probs=133.5 Q ss_pred CcchHHHHHHhhHhhcCCCCccccchhhHHHHH-----HHhhhHHHHHHHH----HHhhhhhh----------------- Q lcl|NC_015280. 1 MYNAENLQEKWAPVLNHEGLNDIKDPYRKSVTA-----ILLENQERALAEE----RAVLTEAP----------------- 54 (455) Q Consensus 1 m~~~~~~~~kw~~~l~~~~~~~i~~~~~~~v~~-----~~~enq~~~~~e~----~~~l~ea~----------------- 54 (455) |++.++|+++..-..+ .+-++.+..+..+-. .=|+++.+.+.++ .+.+.+.. T Consensus 1 M~~l~el~~~~~~~~~--e~~~l~~~~~~e~~~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 78 (385) T protein:vir:18 1 MSELALIQKAIEESQQ--KMTQLFDAQKAEIESTGQVSKQLQSDLMKVQEELTKSGTRLFDLEQKLASGAENPGEKKSFS 78 (385) T ss_pred ChHHHHHHHHHHHHHH--HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhccccccchhhhhH Confidence 9999888888877543 222222222221110 1111221111100 00011100 Q ss_pred ---------------hchh------hhcccccccccc--ccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEe Q lcl|NC_015280. 55 ---------------TNVG------PINTPTTSSGAV--AGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRS 111 (455) Q Consensus 55 ---------------~~~~------~~~~~st~tg~i--~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRs 111 (455) .+.. .+...++..|.. ....+. +++++.....-.+++-++||+++..-+.-.. 
T Consensus 79 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~g~~i~~~~~~~---ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~- 154 (385) T protein:vir:18 79 ERAAEELIKSWDGKQGTFGAKTFNKSLGSDADSAGSLIQPMQIPG---IIMPGLRRLTIRDLLAQGRTSSNALEYVREE- 154 (385) T ss_pred HHHHHHHHHHHHHhhccchhhHHHhhhccccccCCceecchhhhH---HHHHhhhccchhhhcceecccCcceEEEEEe- Confidence 0000 000111111111 111233 3344444556677888888887653221110 Q ss_pred eecCCCCcccccccccccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCcccccee Q lcl|NC_015280. 112 RYTNQSGNEAFFDEPDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAF 191 (455) Q Consensus 112 rY~~qsG~EAlfnEa~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaF 191 (455) ...+. +-..+ | +..+++-.. T Consensus 155 ---~~~~~-------------------------------------------------a~~v~--E------~~~~~~~~~ 174 (385) T protein:vir:18 155 ---VFTNN-------------------------------------------------ADVVA--E------KALKPESDI 174 (385) T ss_pred ---cCCcc-------------------------------------------------eeeec--c------Ccccccccc Confidence 00000 00000 0 122344555 Q ss_pred EEEEEEEEeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeec Q lcl|NC_015280. 192 SIDKIAVEAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDV 271 (455) Q Consensus 192 sIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~ 271 (455) ++++++.+.+.-+-...+|-||.||-- +.++.|.+-|+..|..-+|+.||.- .- .+-...|++.... T Consensus 175 ~~~~~~~~~~k~~~~~~is~ell~d~~-----~l~~~i~~~la~a~~~~~d~~~l~G----~g----~~~~~~Gi~~~~~ 241 (385) T protein:vir:18 175 TFSKQTANVKTIAHWVQASRQVMDDAP-----MLQSYINNRLMYGLALKEEGQLLNG----DG----TGDNLEGLNKVAT 241 (385) T ss_pred ceeEEEEeeeeEEEeehhhHHHHhhHH-----HHHHHHHHHHHHHHHHHHHHHHHhc----cC----CCCcccccccccc Confidence 666777777777777889999999842 2566777777777777777766632 11 1111222221111 Q ss_pred cc------cchhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCc Q lcl|NC_015280. 
272 DS------NGRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGN 345 (455) Q Consensus 272 ~~------~gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~ 345 (455) .. .+--.......+++++. ..-+..+-+|+||+....|... .. .+|..-+....+.+ T Consensus 242 ~~~~~~~~~~~~~~d~i~~~~~~l~---------~~~~~~~~~~~~~~~~~~l~~l---kd---~~G~~l~~~~~~~~-- 304 (385) T protein:vir:18 242 AYDTSLNATGDTRADIIAHAIYQVT---------ESEFSASGIVLNPRDWHNIALL---KD---NEGRYIFGGPQAFT-- 304 (385) T ss_pred cccccccccccchHHHHHHHHHhhc---------cccCCCCEEEEcHHHHHHHHHh---hc---CCCceeccCcccCC-- Confidence 00 00001112222223222 2334566789999999888752 21 11111110001111 Q ss_pred eeEEEecCceEEEEeccccccCCcceEEEE-EecCccccceeEEcccccccceeecCC---ccc-cceee--eeeecce- Q lcl|NC_015280. 346 TFVGTLNGRFKVYIDPYSANVSDNQYYVVG-YKGTNAYDAGLFYCPYVPLQMYRAIGQ---DTF-QPRIG--FKTRYGM- 417 (455) Q Consensus 346 ~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG-~KG~~~~daglfyaPYv~l~~~~~~Dp---~s~-qP~~g--~~tRY~l- 417 (455) .++|. |++|+++.+. |..=+++| +|. +|--+.-..+...++. +-| +..++ ...||+. T Consensus 305 --~~~l~-G~pV~~~~~~----p~~~~~~gd~~~--------~~~~~~~~~~~v~~~~~~~~~~~~~~~~~~~~~r~~~~ 369 (385) T protein:vir:18 305 --SNIMW-GLPVVPTKAQ----AAGTFTVGGFDM--------ASQVWDRMDATVEVSREDRDNFVKNMLTILCEERLALA 369 (385) T ss_pred --Cceec-ceeeEEcCcC----CCCcEEEeeccc--------EEEEEEecceEEEEeccccchhhcCcEEEEEEEeeccE Confidence 24564 4899999875 33334443 110 1111111111111111 112 22333 4458887 Q ss_pred eecccccccccccccC Q lcl|NC_015280. 418 VLNPFAKGLTALSDSD 433 (455) Q Consensus 418 ~~nP~~~~~~~~~~~~ 433 (455) +.+|-.--.-+...+. 
T Consensus 370 v~~~~a~~~~~~~aa~ 385 (385) T protein:vir:18 370 HYRPTAIIKGTFSSGS 385 (385) T ss_pred EecccceEEEEeccCC Confidence 6666444322222222 No 23 >protein:vir:191 Length: 385 # NCBI annotation: major head subunit precursor # Family: family:all:585 # MgeID: mge:6 # MgeName: HK97 # Cross-refs: genbank:acc:NP_037701;genbank:gi:9634158;genbank:GeneID:1262530 Probab=97.12 E-value=0.00016 Score=41.32 Aligned_cols=322 Identities=13% Similarity=0.087 Sum_probs=133.5 Q ss_pred CcchHHHHHHhhHhhcCCCCccccchhhHHHHH-----HHhhhHHHHHHHH----HHhhhhhh----------------- Q lcl|NC_015280. 1 MYNAENLQEKWAPVLNHEGLNDIKDPYRKSVTA-----ILLENQERALAEE----RAVLTEAP----------------- 54 (455) Q Consensus 1 m~~~~~~~~kw~~~l~~~~~~~i~~~~~~~v~~-----~~~enq~~~~~e~----~~~l~ea~----------------- 54 (455) |++.++|+++..-..+ .+-++.+..+..+-. .=|+++.+.+.++ .+.+.+.. T Consensus 1 M~~l~el~~~~~~~~~--e~~~l~~~~~~e~~~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 78 (385) T protein:vir:19 1 MSELALIQKAIEESQQ--KMTQLFDAQKAEIESTGQVSKQLQSDLMKVQEELTKSGTRLFDLEQKLASGAENPGEKKSFS 78 (385) T ss_pred ChHHHHHHHHHHHHHH--HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhccccccchhhhhH Confidence 9999888888877543 222222222221110 1111221111100 00011100 Q ss_pred ---------------hchh------hhcccccccccc--ccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEe Q lcl|NC_015280. 55 ---------------TNVG------PINTPTTSSGAV--AGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRS 111 (455) Q Consensus 55 ---------------~~~~------~~~~~st~tg~i--~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRs 111 (455) .+.. .+...++..|.. ....+. +++++.....-.+++-++||+++..-+.-.. 
T Consensus 79 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~g~~i~~~~~~~---ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~- 154 (385) T protein:vir:19 79 ERAAEELIKSWDGKQGTFGAKTFNKSLGSDADSAGSLIQPMQIPG---IIMPGLRRLTIRDLLAQGRTSSNALEYVREE- 154 (385) T ss_pred HHHHHHHHHHHHHhhccchhhHHHhhhccccccCCceecchhhhH---HHHHhhhccchhhhcceecccCcceEEEEEe- Confidence 0000 000111111111 111233 3344444556677888888887653221110 Q ss_pred eecCCCCcccccccccccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCcccccee Q lcl|NC_015280. 112 RYTNQSGNEAFFDEPDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAF 191 (455) Q Consensus 112 rY~~qsG~EAlfnEa~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaF 191 (455) ...+. +-..+ | +..+++-.. T Consensus 155 ---~~~~~-------------------------------------------------a~~v~--E------~~~~~~~~~ 174 (385) T protein:vir:19 155 ---VFTNN-------------------------------------------------ADVVA--E------KALKPESDI 174 (385) T ss_pred ---cCCcc-------------------------------------------------eeeec--c------Ccccccccc Confidence 00000 00000 0 122344555 Q ss_pred EEEEEEEEeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeec Q lcl|NC_015280. 192 SIDKIAVEAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDV 271 (455) Q Consensus 192 sIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~ 271 (455) ++++++.+.+.-+-...+|-||.||-- +.++.|.+-|+..|..-+|+.||.- .- .+-...|++.... T Consensus 175 ~~~~~~~~~~k~~~~~~is~ell~d~~-----~l~~~i~~~la~a~~~~~d~~~l~G----~g----~~~~~~Gi~~~~~ 241 (385) T protein:vir:19 175 TFSKQTANVKTIAHWVQASRQVMDDAP-----MLQSYINNRLMYGLALKEEGQLLNG----DG----TGDNLEGLNKVAT 241 (385) T ss_pred ceeEEEEeeeeEEEeehhhHHHHhhHH-----HHHHHHHHHHHHHHHHHHHHHHHhc----cC----CCCcccccccccc Confidence 666777777777777889999999842 2566777777777777777766632 11 1111222221111 Q ss_pred cc------cchhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCc Q lcl|NC_015280. 
272 DS------NGRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGN 345 (455) Q Consensus 272 ~~------~gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~ 345 (455) .. .+--.......+++++. ..-+..+-+|+||+....|... .. .+|..-+....+.+ T Consensus 242 ~~~~~~~~~~~~~~d~i~~~~~~l~---------~~~~~~~~~~~~~~~~~~l~~l---kd---~~G~~l~~~~~~~~-- 304 (385) T protein:vir:19 242 AYDTSLNATGDTRADIIAHAIYQVT---------ESEFSASGIVLNPRDWHNIALL---KD---NEGRYIFGGPQAFT-- 304 (385) T ss_pred cccccccccccchHHHHHHHHHhhc---------cccCCCCEEEEcHHHHHHHHHh---hc---CCCceeccCcccCC-- Confidence 00 00001112222223222 2334566789999999888752 21 11111110001111 Q ss_pred eeEEEecCceEEEEeccccccCCcceEEEE-EecCccccceeEEcccccccceeecCC---ccc-cceee--eeeecce- Q lcl|NC_015280. 346 TFVGTLNGRFKVYIDPYSANVSDNQYYVVG-YKGTNAYDAGLFYCPYVPLQMYRAIGQ---DTF-QPRIG--FKTRYGM- 417 (455) Q Consensus 346 ~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG-~KG~~~~daglfyaPYv~l~~~~~~Dp---~s~-qP~~g--~~tRY~l- 417 (455) .++|. |++|+++.+. |..=+++| +|. +|--+.-..+...++. +-| +..++ ...||+. T Consensus 305 --~~~l~-G~pV~~~~~~----p~~~~~~gd~~~--------~~~~~~~~~~~v~~~~~~~~~~~~~~~~~~~~~r~~~~ 369 (385) T protein:vir:19 305 --SNIMW-GLPVVPTKAQ----AAGTFTVGGFDM--------ASQVWDRMDATVEVSREDRDNFVKNMLTILCEERLALA 369 (385) T ss_pred --Cceec-ceeeEEcCcC----CCCcEEEeeccc--------EEEEEEecceEEEEeccccchhhcCcEEEEEEEeeccE Confidence 24564 4899999875 33334443 110 1111111111111111 112 22333 4458887 Q ss_pred eecccccccccccccC Q lcl|NC_015280. 418 VLNPFAKGLTALSDSD 433 (455) Q Consensus 418 ~~nP~~~~~~~~~~~~ 433 (455) +.+|-.--.-+...+. 
T Consensus 370 v~~~~a~~~~~~~aa~ 385 (385) T protein:vir:19 370 HYRPTAIIKGTFSSGS 385 (385) T ss_pred EecccceEEEEeccCC Confidence 6666444322222222 No 24 >protein:vir:4600 Length: 415 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:101 # MgeName: PVL # Cross-refs: genbank:acc:NP_058445;genbank:gi:9635171;genbank:GeneID:1262708 Probab=96.10 E-value=0.001 Score=37.01 Aligned_cols=338 Identities=14% Similarity=0.071 Sum_probs=139.8 Q ss_pred CcchHHHHHHhhHhhc----------CCC----Cccccc------hhhHHHHHHHhhhHHHHHHHHHHhhhhhhhchhhh Q lcl|NC_015280. 1 MYNAENLQEKWAPVLN----------HEG----LNDIKD------PYRKSVTAILLENQERALAEERAVLTEAPTNVGPI 60 (455) Q Consensus 1 m~~~~~~~~kw~~~l~----------~~~----~~~i~~------~~~~~v~~~~~enq~~~~~e~~~~l~ea~~~~~~~ 60 (455) +-.-+.|.++..-+.. ... ..+... ...+......+.+. +...+|++.+.+........ T Consensus 41 ~~ev~~l~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~ 119 (415) T protein:vir:46 41 EQEITDLRSQIQEKQEELDKLKEKDRTSENNQQSVEVNEARTYRNQANINDLGISIQNT-KVTSQEVRDFTEYLETRNDI 119 (415) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccccchhhhhHHHHHHHHHHHhhhhh-hhhHHHHHHHHHHHhhhhhh Confidence 1111122222211000 000 000000 00000000000000 00112222222211111112 Q ss_pred ccccccccccccccchh--hhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCcccccccccccccccccccc Q lcl|NC_015280. 61 NTPTTSSGAVAGFDPIL--ISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPDAQFSGTDGATP 138 (455) Q Consensus 61 ~~~st~tg~i~~~~P~L--v~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~t~fSg~~~~~~ 138 (455) .+.++++.+-...=|.. -.+++.+.+...-.+++.+.||+++++-+.-.+ .. 
.+.++ T Consensus 120 ~~~~~~t~~g~~~iP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~--~~--~~~~~----------------- 178 (415) T protein:vir:46 120 QGGSLKTDSGFVVIPEEIVTDILKLKEVEFNLDKYVTVKRVTNGSGKYPVVR--QS--EVAAL----------------- 178 (415) T ss_pred hhccccccCCcccccHHHHHHHHHHHHhhhhhhhhcceeeccCCceeEEEEE--ec--CCcce----------------- Confidence 22222222222222322 235565667778889999999999987654443 00 00000 Q ss_pred cccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccce-eEEEEEEEEeeccccccceeHHHHHhH Q lcl|NC_015280. 139 PTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMA-FSIDKIAVEAKGRALRADYSVELAQDL 217 (455) Q Consensus 139 ~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMa-FsIEK~tVtAKSRaLKAEYTiELAQDL 217 (455) ...++ +...++.+ -++++++..++..+-...+|-||.+|- T Consensus 179 -------------------------------~~v~E--------g~~~~~~~~~~~~~v~~~~~k~~~~~~iS~ell~ds 219 (415) T protein:vir:46 179 -------------------------------EKVEE--------LEENPELAVKPFFQLAYDINTHRGYFRISREAIEDA 219 (415) T ss_pred -------------------------------eeccc--------ccccccccccceeeEEeeeeeeEeeehhhHHHHhhc Confidence 00000 11122332 245555566665565668999999984 Q ss_pred HHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeec-cccceeeeeeccccchhhHHHHHHHHHHHHHHHHHH Q lcl|NC_015280. 218 KAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQAN-VANAGVFDLDVDSNGRWSVEKFKGLLFQIERDANAI 296 (455) Q Consensus 218 kAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~-v~~~gv~Dl~~~~~gr~~ve~~k~l~~qi~~ean~i 296 (455) . .|.+++|.+-|+..|..-+|+.||.-.-+-...+--.. ......+. .++--..+....++..+.. T Consensus 220 ~----~~l~~~i~~~l~~~i~~~~d~~il~g~g~g~~~~~~~~~~~~~~~~~----~~~~~~~~~i~~~~~~~~~----- 286 (415) T protein:vir:46 220 K----VNVLQELKLWMARTIAATRNKAIIDVITKGSTGSTSSGFEKEGKKLE----VKKAKSLDDIKDAINLNVK----- 286 (415) T ss_pred h----HHHHHHHHHHHHHHHHHHHHHHHhhccccCCccccccccccccceec----cccccchHHHHHHHHhhhh----- Confidence 3 57889999999999999999999876533211111000 00011110 0111112334444444432 Q ss_pred HHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEEeccccccCCcceEEEEE Q lcl|NC_015280. 
297 AQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQYYVVGY 376 (455) Q Consensus 297 ~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~ 376 (455) --++.+.+|+++.....|.. +.. .+|. .....+.++. ..++|. +++|++..++. .|- T Consensus 287 ----~~~~~~~~v~n~~~~~~L~~---lkd---~~G~--~i~~~~~~~~-~~~~l~-G~pV~~~~~~~---------~~~ 343 (415) T protein:vir:46 287 ----PNYEHNVAIVSQTMFAKLDK---MKD---KLGN--YLIQPDVKEK-TQQRLL-GAKIEILPDEV---------LGQ 343 (415) T ss_pred ----hccCCCEEEEcHHHHHHHHH---hhc---cCCC--eeeccCcCCC-CCcccc-ceeeEEecccc---------ccC Confidence 12356678899999888875 221 1111 1111121111 124553 45666654321 111 Q ss_pred ecCccccceeEEccccc-------cc-ceeecCCccccceeeeeeecce-eeccccccc---ccccccC-chhhhh Q lcl|NC_015280. 377 KGTNAYDAGLFYCPYVP-------LQ-MYRAIGQDTFQPRIGFKTRYGM-VLNPFAKGL---TALSDSD-PQAAGN 439 (455) Q Consensus 377 KG~~~~daglfyaPYv~-------l~-~~~~~Dp~s~qP~~g~~tRY~l-~~nP~~~~~---~~~~~~~-~~~~~~ 439 (455) .| +..++|+.|-. .. .....|-.++|-.+-...|++. +.+|-+--. +....|. +.+.-+ T Consensus 344 ~~----~~~~~~gd~~~~~~~~~~~~~~v~~~~~~~~~~~~~~~~r~d~~v~~~~a~~~~~~~~~~~~~~~~~~~~ 415 (415) T protein:vir:46 344 KG----NNTLIIGNLKDAIVLFDRSQYQASWTDYMHFGECLMIAVRQDCRILDYKSAIVIEYDDSERGEGDLGLEA 415 (415) T ss_pred CC----ccEEEEEehhccEEEEeecceEEEeeccccCceEEEEEEEeccEEeccccEEEEEeeccCCCCCCccCCC Confidence 11 11123322211 00 1122355667777788889998 777755421 1111222 222111 No 25 >protein:vir:4700 Length: 415 # NCBI annotation: phi PVL ORF 7 homologue # Family: family:all:21 # MgeID: mge:102 # MgeName: phiPV83 # Cross-refs: genbank:acc:NP_061632;genbank:gi:9635719;genbank:GeneID:1262976 Probab=96.10 E-value=0.001 Score=37.01 Aligned_cols=338 Identities=14% Similarity=0.071 Sum_probs=139.8 Q ss_pred CcchHHHHHHhhHhhc----------CCC----Cccccc------hhhHHHHHHHhhhHHHHHHHHHHhhhhhhhchhhh Q lcl|NC_015280. 
1 MYNAENLQEKWAPVLN----------HEG----LNDIKD------PYRKSVTAILLENQERALAEERAVLTEAPTNVGPI 60 (455) Q Consensus 1 m~~~~~~~~kw~~~l~----------~~~----~~~i~~------~~~~~v~~~~~enq~~~~~e~~~~l~ea~~~~~~~ 60 (455) +-.-+.|.++..-+.. ... ..+... ...+......+.+. +...+|++.+.+........ T Consensus 41 ~~ev~~l~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~ 119 (415) T protein:vir:47 41 EQEITDLRSQIQEKQEELDKLKEKDRTSENNQQSVEVNEARTYRNQANINDLGISIQNT-KVTSQEVRDFTEYLETRNDI 119 (415) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccccchhhhhHHHHHHHHHHHhhhhh-hhhHHHHHHHHHHHhhhhhh Confidence 1111122222211000 000 000000 00000000000000 00112222222211111112 Q ss_pred ccccccccccccccchh--hhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCcccccccccccccccccccc Q lcl|NC_015280. 61 NTPTTSSGAVAGFDPIL--ISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPDAQFSGTDGATP 138 (455) Q Consensus 61 ~~~st~tg~i~~~~P~L--v~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~t~fSg~~~~~~ 138 (455) .+.++++.+-...=|.. -.+++.+.+...-.+++.+.||+++++-+.-.+ .. .+.++ T Consensus 120 ~~~~~~t~~g~~~iP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~--~~--~~~~~----------------- 178 (415) T protein:vir:47 120 QGGSLKTDSGFVVIPEEIVTDILKLKEVEFNLDKYVTVKRVTNGSGKYPVVR--QS--EVAAL----------------- 178 (415) T ss_pred hhccccccCCcccccHHHHHHHHHHHHhhhhhhhhcceeeccCCceeEEEEE--ec--CCcce----------------- Confidence 22222222222222322 235565667778889999999999987654443 00 00000 Q ss_pred cccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccce-eEEEEEEEEeeccccccceeHHHHHhH Q lcl|NC_015280. 
139 PTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMA-FSIDKIAVEAKGRALRADYSVELAQDL 217 (455) Q Consensus 139 ~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMa-FsIEK~tVtAKSRaLKAEYTiELAQDL 217 (455) ...++ +...++.+ -++++++..++..+-...+|-||.+|- T Consensus 179 -------------------------------~~v~E--------g~~~~~~~~~~~~~v~~~~~k~~~~~~iS~ell~ds 219 (415) T protein:vir:47 179 -------------------------------EKVEE--------LEENPELAVKPFFQLAYDINTHRGYFRISREAIEDA 219 (415) T ss_pred -------------------------------eeccc--------ccccccccccceeeEEeeeeeeEeeehhhHHHHhhc Confidence 00000 11122332 245555566665565668999999984 Q ss_pred HHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeec-cccceeeeeeccccchhhHHHHHHHHHHHHHHHHHH Q lcl|NC_015280. 218 KAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQAN-VANAGVFDLDVDSNGRWSVEKFKGLLFQIERDANAI 296 (455) Q Consensus 218 kAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~-v~~~gv~Dl~~~~~gr~~ve~~k~l~~qi~~ean~i 296 (455) . .|.+++|.+-|+..|..-+|+.||.-.-+-...+--.. ......+. .++--..+....++..+.. T Consensus 220 ~----~~l~~~i~~~l~~~i~~~~d~~il~g~g~g~~~~~~~~~~~~~~~~~----~~~~~~~~~i~~~~~~~~~----- 286 (415) T protein:vir:47 220 K----VNVLQELKLWMARTIAATRNKAIIDVITKGSTGSTSSGFEKEGKKLE----VKKAKSLDDIKDAINLNVK----- 286 (415) T ss_pred h----HHHHHHHHHHHHHHHHHHHHHHHhhccccCCccccccccccccceec----cccccchHHHHHHHHhhhh----- Confidence 3 57889999999999999999999876533211111000 00011110 0111112334444444432 Q ss_pred HHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEEeccccccCCcceEEEEE Q lcl|NC_015280. 297 AQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQYYVVGY 376 (455) Q Consensus 297 ~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~ 376 (455) --++.+.+|+++.....|.. +.. .+|. .....+.++. ..++|. +++|++..++. 
.|- T Consensus 287 ----~~~~~~~~v~n~~~~~~L~~---lkd---~~G~--~i~~~~~~~~-~~~~l~-G~pV~~~~~~~---------~~~ 343 (415) T protein:vir:47 287 ----PNYEHNVAIVSQTMFAKLDK---MKD---KLGN--YLIQPDVKEK-TQQRLL-GAKIEILPDEV---------LGQ 343 (415) T ss_pred ----hccCCCEEEEcHHHHHHHHH---hhc---cCCC--eeeccCcCCC-CCcccc-ceeeEEecccc---------ccC Confidence 12356678899999888875 221 1111 1111121111 124553 45666654321 111 Q ss_pred ecCccccceeEEccccc-------cc-ceeecCCccccceeeeeeecce-eeccccccc---ccccccC-chhhhh Q lcl|NC_015280. 377 KGTNAYDAGLFYCPYVP-------LQ-MYRAIGQDTFQPRIGFKTRYGM-VLNPFAKGL---TALSDSD-PQAAGN 439 (455) Q Consensus 377 KG~~~~daglfyaPYv~-------l~-~~~~~Dp~s~qP~~g~~tRY~l-~~nP~~~~~---~~~~~~~-~~~~~~ 439 (455) .| +..++|+.|-. .. .....|-.++|-.+-...|++. +.+|-+--. +....|. +.+.-+ T Consensus 344 ~~----~~~~~~gd~~~~~~~~~~~~~~v~~~~~~~~~~~~~~~~r~d~~v~~~~a~~~~~~~~~~~~~~~~~~~~ 415 (415) T protein:vir:47 344 KG----NNTLIIGNLKDAIVLFDRSQYQASWTDYMHFGECLMIAVRQDCRILDYKSAIVIEYDDSERGEGDLGLEA 415 (415) T ss_pred CC----ccEEEEEehhccEEEEeecceEEEeeccccCceEEEEEEEeccEEeccccEEEEEeeccCCCCCCccCCC Confidence 11 11123322211 00 1122355667777788889998 777755421 1111222 222111 No 26 >protein:vir:104256 Length: 458 # NCBI annotation: major head protein precursor # Family: family:all:27070 # MgeID: mge:1504 # MgeName: T5 # Cross-refs: genbank:acc:YP_006977;genbank:gi:46401878;genbank:GeneID:2777673 Probab=95.53 E-value=0.0019 Score=35.52 Aligned_cols=335 Identities=14% Similarity=0.143 Sum_probs=121.2 Q ss_pred Ccc--------hHHHHHHhhH-----------hh-cCCCCccccchhhHHHHHHHhhhHHHHHHHHHHhh---hhh---- Q lcl|NC_015280. 1 MYN--------AENLQEKWAP-----------VL-NHEGLNDIKDPYRKSVTAILLENQERALAEERAVL---TEA---- 53 (455) Q Consensus 1 m~~--------~~~~~~kw~~-----------~l-~~~~~~~i~~~~~~~v~~~~~enq~~~~~e~~~~l---~ea---- 53 (455) +.+ .....++... .+ ..+..+.+....++. ...-.+|.+.++ +.+.+. .+. 
T Consensus 73 l~ee~~~~~~~~a~~~e~~~~~~~~~~~~~~~~~~~~e~~~~~~~~~~~~-~~~~~~~~~~~~-e~~~~~~~~~~~~~~~ 150 (458) T protein:vir:10 73 LDEKSKKSNELFAQTVEKQQETIVGLQDEIKSLLTAREGRSFVGDSVAKA-LYGTQENFEDEV-EKLVLLSYVMEKGVFE 150 (458) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhcc-chhhhhhHHHHH-HHHHHHHHHHhhccch Confidence 000 0000011100 00 000111110000000 000000100100 011110 000 Q ss_pred -hhc---hhhhccccccccccccccchh-hhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCcccccccccc Q lcl|NC_015280. 54 -PTN---VGPINTPTTSSGAVAGFDPIL-ISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPDA 128 (455) Q Consensus 54 -~~~---~~~~~~~st~tg~i~~~~P~L-v~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~t 128 (455) ... +..+....+++.....+-|.+ -.++.++.+..+..+++-++||+++..-++ ..+ .+ +.. T Consensus 151 ~~~~~~~~~a~~~~~~~~~g~~~ip~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~-~~~-----~~-------~~a 217 (458) T protein:vir:10 151 TEHGQRHLKAVNQSSSVEVSSESYETIFSQRIIRDLQKELVVGALFEELPMSSKILTML-VEP-----DA-------GKA 217 (458) T ss_pred hhhhhhhhhhhhhcccCccccceehhhHhHHHHHHHHhhhhHHhhcceeecCCcceEEE-Eec-----CC-------cce Confidence 000 001111111111111111111 123344456667889999999988653222 110 00 000 Q ss_pred cccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeeccccccc Q lcl|NC_015280. 129 QFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRAD 208 (455) Q Consensus 129 ~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAE 208 (455) .| ..++...++... ...-..+++++++.++.-+-... T Consensus 218 ~~-----------------------------------------v~e~~~~~~~~~--~~~~~~~~~~i~~~~~k~~~~v~ 254 (458) T protein:vir:10 218 TW-----------------------------------------VAASTYGTDTTT--GEEVKGALKEIHFSTYKLAAKSF 254 (458) T ss_pred ee-----------------------------------------cccccccccccc--cccccccceeeEeeeeeEEeeeh Confidence 00 000000000000 00111223455555555555678 Q ss_pred eeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeee------------ccccch Q lcl|NC_015280. 
209 YSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLD------------VDSNGR 276 (455) Q Consensus 209 YTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~------------~~~~gr 276 (455) +|-||.+|-- .|.+++|.+-|+..|..-||+.||.- .-.++ ..|++... .....- T Consensus 255 is~ell~ds~----~~~~~~i~~~l~~~i~~~~d~~~l~G----~G~~~-----p~Gi~~~~~~~~~~~~~~~~~~~~~~ 321 (458) T protein:vir:10 255 ITDETEEDAI----FSLLPLLRKRLIEAHAVSIEEAFMTG----DGSGK-----PKGLLTLASEDSAKVVTEAKADGSVL 321 (458) T ss_pred hhHHHHhcch----HHHHHHHHHHHHHHHHHHHHHHhhcC----CCCCc-----cceeeecccccccceeeccccccccc Confidence 8999988833 46788888889999988888888742 00111 12222111 111111 Q ss_pred hhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCcee---EEEecC Q lcl|NC_015280. 277 WSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTF---VGTLNG 353 (455) Q Consensus 277 ~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~---~G~l~~ 353 (455) ...+....|.+.+.. . -+ ....+|+++.....|.. +... +|..- ...+.+.... .++|. T Consensus 322 ~~~~~i~~~~~~l~~-------~-~~-~~~~~v~~~~~~~~l~~---lkd~---~G~~i--~~~~~~~~~~~~~~~~l~- 383 (458) T protein:vir:10 322 VTAKTISKLRRKLGR-------H-GL-KLSKLVLIVSMDAYYDL---LEDE---EWQDV--AQVGNDSVKLQGQVGRIY- 383 (458) T ss_pred ccHHHHHHHHHhhhh-------h-hc-CCCEEEEcHHHHHHHHh---hccc---CCcee--eccccccccccCcCceec- Confidence 112223334333321 1 12 34567889988887764 2211 11100 0011111111 23565 Q ss_pred ceEEEEecccccc-CCcceEEEEEecCccccceeEEcccccccceeecCCccccceeeee--eecce-eecccccccccc Q lcl|NC_015280. 354 RFKVYIDPYSANV-SDNQYYVVGYKGTNAYDAGLFYCPYVPLQMYRAIGQDTFQPRIGFK--TRYGM-VLNPFAKGLTAL 429 (455) Q Consensus 354 ~~~vy~D~y~~~~-s~~dY~~vG~KG~~~~daglfyaPYv~l~~~~~~Dp~s~qP~~g~~--tRY~l-~~nP~~~~~~~~ 429 (455) +++|+++.+.... ...+.++..++ + +.++ ..-..+....||-+-...++|. 
.|.|+ +.+|-.-- T Consensus 384 G~pv~~~~~~p~~~~~~~~~~~~f~-~-----~~~~--~~~~~~~v~~d~~~~~~~~~~~~~~r~~~~v~~~~a~v---- 451 (458) T protein:vir:10 384 GLPVVVSEYFPAKANSAEFAVIVYK-D-----NFVM--PRQRAVTVERERQAGKQRDAYYVTQRVNLQRYFANGVV---- 451 (458) T ss_pred ceeeEEccccccccCCcceEEEEec-c-----cEEE--EEeeceEEEeecccCCCceEEEEEEEecceEecccceE---- Confidence 6899998664211 11233322222 1 0111 1111222234665545556665 47766 55553321 Q ss_pred cccCchhhhhccch Q lcl|NC_015280. 430 SDSDPQAAGNLNAN 443 (455) Q Consensus 430 ~~~~~~~~~~~~~n 443 (455) .+. .-++ T Consensus 452 -~~~------~aa~ 458 (458) T protein:vir:10 452 -SGT------YAAS 458 (458) T ss_pred -EEe------eccC Confidence 111 1111 No 27 >protein:vir:4830 Length: 397 # NCBI annotation: MPL-7201 # Family: family:all:21 # MgeID: mge:105 # MgeName: 7201 # Cross-refs: genbank:acc:NP_038327;genbank:gi:9634653;genbank:GeneID:1262632 Probab=95.41 E-value=0.0021 Score=35.23 Aligned_cols=327 Identities=12% Similarity=0.064 Sum_probs=123.2 Q ss_pred CcchHHHHHHhhH--------------hhcCC------CCccccchhhHHHHHHHhhhHHHHHHHHHHhhhhhh----hc Q lcl|NC_015280. 1 MYNAENLQEKWAP--------------VLNHE------GLNDIKDPYRKSVTAILLENQERALAEERAVLTEAP----TN 56 (455) Q Consensus 1 m~~~~~~~~kw~~--------------~l~~~------~~~~i~~~~~~~v~~~~~enq~~~~~e~~~~l~ea~----~~ 56 (455) ....+...++-.. +.+.. .........+ .....-.++ ...+++..+.+.. .. T Consensus 29 ~~~~~~~~ee~~~l~~ei~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~---~~~~~~~~~~~~~~~~~~~ 104 (397) T protein:vir:48 29 MLDDSVTAEELQAIKNERDTAKMKRDMFKEQYTEARANEVVNMSEEEK-KPLTKSEEE---VKAGFVKDFKNLVRGRYQN 104 (397) T ss_pred hcchhhhHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhcc-ccccchhhH---HHHHHHHHHHHHHhhhhhH Confidence 1110111111000 00000 0000000000 000000000 0111222221111 11 Q ss_pred hhhhcccccc-ccccccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCccccccccccccccccc Q lcl|NC_015280. 
57 VGPINTPTTS-SGAVAGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPDAQFSGTDG 135 (455) Q Consensus 57 ~~~~~~~st~-tg~i~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~t~fSg~~~ 135 (455) ..+..+.+++ .|.+.--....-.+++.+.+...-.+++.++||++++|-+--.+ ..+..+. T Consensus 105 ~~~~~~~~t~~~gg~~iP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~--~~~~~~~---------------- 166 (397) T protein:vir:48 105 LLDSKTDASGSDAGLTIPQDIQTAIHTLVRQYDSLQEYVNVENVTTLTGSRVYEK--WADITGL---------------- 166 (397) T ss_pred HHHHhhccCCccccccccHHHHHHHHHHHHHHHHHHhhhceeeccCCcceEEEEe--ecCCCcc---------------- Confidence 1111112222 22221111112233343445556688899999999998765444 1100000 Q ss_pred ccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeeccccccceeHHHHH Q lcl|NC_015280. 136 ATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQ 215 (455) Q Consensus 136 ~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQ 215 (455) +-...+.+..++.....|.++.|++.|..+ ...+|-||.+ T Consensus 167 ---------------------------------a~~v~E~~~~~~~~~~~~~~v~~~~~k~~~-------~~~iS~ell~ 206 (397) T protein:vir:48 167 ---------------------------------AKLDDEAGSIGTNDDPKLYPIRYAIKRYAG-------ISTVTNSLLA 206 (397) T ss_pred ---------------------------------eeeeccccccccccccceeeEEeeheeeee-------ehhhHHHHHh Confidence 000111111122222235556666555544 4679999999 Q ss_pred hHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeeccccchhhHHHHHHHHHHHHHHHHH Q lcl|NC_015280. 216 DLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDVDSNGRWSVEKFKGLLFQIERDANA 295 (455) Q Consensus 216 DLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~~~~gr~~ve~~k~l~~qi~~ean~ 295 (455) |-. +|.+++|.+-|+..|..-+|+.||.-.-+ +....++.++ +....+...+... 
T Consensus 207 ds~----~~l~~~v~~~l~~~~~~~~d~~il~G~g~--------~~~~~~~~~~----------d~i~~~~~~l~~~--- 261 (397) T protein:vir:48 207 DSA----ENILAWLSGWIAKKVVVTRNKAILEAIAT--------LPTKPTLTKW----------DDIIDLQAKVDPA--- 261 (397) T ss_pred hch----HHHHHHHHHHHHHHHHHHHHHHHhhcccc--------cccccccccH----------HHHHHHHHHhhhh--- Confidence 853 57899999999999999999998853211 1122222222 1233343443321 Q ss_pred HHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEE-e-ccccccC-Cc--- Q lcl|NC_015280. 296 IAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYI-D-PYSANVS-DN--- 369 (455) Q Consensus 296 i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~-D-~y~~~~s-~~--- 369 (455) - ..+..++|++.....|.. +... +|. .+...|.++. ..++|. +++|++ | ....+.. +. T Consensus 262 -----~-~~~a~~v~n~~~~~~L~~---lkd~---~G~--~i~~~~~~~~-~~~~l~-G~PV~~~~~~~~~~~~~~~~~~ 325 (397) T protein:vir:48 262 -----I-KQTSFFLTNTSGFTALKK---VKNA---FGD--YLMERDVKSP-TGYSID-GFAVKEVADRWLANASSGAMPL 325 (397) T ss_pred -----h-cCCCEEEECHHHHHHHHH---hhcC---CCc--eeeccCcCCC-CCceec-cceeEEecccccCCcCCCceEE Confidence 1 234567899999998875 2211 111 1111121111 123554 455543 2 2211111 11 Q ss_pred ------ceEEEEEecCccccceeEEcccccccceeecCCccccceeeeeeecce-eeccccccccccc---c-cCchhhh Q lcl|NC_015280. 370 ------QYYVVGYKGTNAYDAGLFYCPYVPLQMYRAIGQDTFQPRIGFKTRYGM-VLNPFAKGLTALS---D-SDPQAAG 438 (455) Q Consensus 370 ------dY~~vG~KG~~~~daglfyaPYv~l~~~~~~Dp~s~qP~~g~~tRY~l-~~nP~~~~~~~~~---~-~~~~~~~ 438 (455) +|++++..+..+.. ..++.. -+-...+-.+-...|++. +.||-.--.-... . ..+.++ T Consensus 326 ~~gd~~~~~~~~~~~~~~i~----~~~~~~------~~~~~~~~~~r~~~r~d~~~~~~~a~~~~~~~~~~~~~~~~~~- 394 (397) T protein:vir:48 326 YFGDLKQAVTLFDRQQMSLL----STNIGG------GAFETDTTKIRVIDRFDVVATDTESFVPASFKAIADQKGNLGS- 394 (397) T ss_pred EEEeccceEEEEeecceEEE----Eeccch------hhhhcCceeEEEEeeeccEEecccceEEEEecccccCCCCccc- Confidence 12333333222211 111100 012233344555566665 4555322111110 0 011111 Q ss_pred hccchhhhhhhhh Q lcl|NC_015280. 
439 NLNANAYYRRVRV 451 (455) Q Consensus 439 ~~~~n~y~r~~~v 451 (455) +.| T Consensus 395 ----------~~~ 397 (397) T protein:vir:48 395 ----------TAV 397 (397) T ss_pred ----------cCC Confidence 111 No 28 >protein:vir:78523 Length: 338 # NCBI annotation: Putative head structural protein # Family: family:all:507 # MgeID: mge:1853 # MgeName: U2 # Cross-refs: genbank:acc:YP_001491585;genbank:gi:157786408;genbank:GeneID:5625675 Probab=95.40 E-value=0.0021 Score=35.23 Aligned_cols=310 Identities=14% Similarity=0.040 Sum_probs=127.2 Q ss_pred HHhhhhhhhch--hhhccccccccccccccchh-hhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCccccc Q lcl|NC_015280. 47 RAVLTEAPTNV--GPINTPTTSSGAVAGFDPIL-ISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFF 123 (455) Q Consensus 47 ~~~l~ea~~~~--~~~~~~st~tg~i~~~~P~L-v~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlf 123 (455) .+-|+|...+. .+..+..+++++- -.-+.+ -.+++.+-+..+-..+|.+.||+++..-|.-... T Consensus 1 ~~~~~e~~~~~~~~~~~~~~~~~~~~-liP~~~~~~ii~~~~~~s~l~~l~~~~~~~~~~~~ip~~~~------------ 67 (338) T protein:vir:78 1 MATLNELAPNTAGSNHQGRLAHVPSD-LLPKEIVGPIFDKAQESSLVLRLGENIPISYGETIIPTTVK------------ 67 (338) T ss_pred CcchHHhhhhhcccccccceeccccc-ccchHHHHHHHHHHHhhchhhhhcceeeccCCceEEEEEec------------ Confidence 12233322111 1122222222221 222222 2455556667778889999999987665554320 Q ss_pred ccccccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeecc Q lcl|NC_015280. 124 DEPDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGR 203 (455) Q Consensus 124 nEa~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSR 203 (455) .+...+-+.. .+-..+++ ..+++-.-+++.++...+.. 
T Consensus 68 -~~~a~~v~~~---------------------------------~~~~~~Eg--------~~~~~~~~~f~~v~l~~~k~ 105 (338) T protein:vir:78 68 -RPEVGQVGVG---------------------------------TSNEQREG--------GTKPLSGTAWDTRSVAPIKL 105 (338) T ss_pred -Cccceeeccc---------------------------------cccccccc--------ccccccccceeEEEEEEEEE Confidence 0000000000 00001111 11223333445555555555 Q ss_pred ccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhhe---eeeeeccccceeeeeeccccchhhHH Q lcl|NC_015280. 204 ALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAK---PGAQANVANAGVFDLDVDSNGRWSVE 280 (455) Q Consensus 204 aLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~---~~k~~~v~~~gv~Dl~~~~~gr~~ve 280 (455) +-...+|-||.+|-. .|.|++|.+-|+..|...||..||.---...- .+-.......+....+....+ T Consensus 106 ~~~~~is~ell~ds~----~~~~~~i~~~la~a~~~~~d~~~l~G~g~~~~~~~~gi~~~~~~~~~~~~~~~~~~----- 176 (338) T protein:vir:78 106 ATIVTVSEEFARMNP----SGLYTKLQADLAYAIGRGIDLAVFHGKSPLTGSALQGIDTNNVIVNTTNVDYLQTG----- 176 (338) T ss_pred EEeehhhHHHHhcCH----HHHHHHHHHHHHHHHHHHHHHHhhcccCCCcccccccccccccccccccccccccc----- Confidence 556678889999833 57889999999999999998888853221100 000000000000101100000 Q ss_pred HHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEEe Q lcl|NC_015280. 281 KFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYID 360 (455) Q Consensus 281 ~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D 360 (455) ....|..-.++-.-......+..+-+++|++....|...--+... +|.. ....+.++ .-.++|. 
+++|+++ T Consensus 177 --~~~~~~~~~~~~~~~~~~~~~~~~~~~m~~~~~~~L~~~~~l~d~---~g~~--l~~~~~~~-~~~~~l~-G~PV~~~ 247 (338) T protein:vir:78 177 --TTPLLDRFLDGYDLVSANTDVDFNGWAADPRYRARLLRSQAYRDA---NGNV--DPTRINLA-ASAGDLL-GLPVQFG 247 (338) T ss_pred --chhhHHHHHHHHHHhhhhccccceEEEEchHHHHHHHHHhhhccC---CCce--eecccccC-CCCceee-eeeEEEc Confidence 001122222222222333456677899999988877543222111 0000 00001111 1134554 5588887 Q ss_pred cccccc-----CCcceEEEE--------EecCccccceeEEcccccccceeecCCcc-----cc---ceeeeeeecce-e Q lcl|NC_015280. 361 PYSANV-----SDNQYYVVG--------YKGTNAYDAGLFYCPYVPLQMYRAIGQDT-----FQ---PRIGFKTRYGM-V 418 (455) Q Consensus 361 ~y~~~~-----s~~dY~~vG--------~KG~~~~daglfyaPYv~l~~~~~~Dp~s-----~q---P~~g~~tRY~l-~ 418 (455) .+...+ ....-+++| ..+.-+.+ ..+| .......||.. || =.+=...|+|. + T Consensus 248 ~~ip~~~~~~~~~~~~~~~gdfs~~~~~~~~~~~i~----~~~~--~~~~~~~~~~~~~~~~~~~~~~~~r~~~r~d~~v 321 (338) T protein:vir:78 248 KAVGGDLGAATDSKVRVVGGDFSQLKYGFADEIRVK----MSDT--ATLTDNTSPTPQTVSMWQTNQIAILIEVTFGWLL 321 (338) T ss_pred cccCccccccCCcccEEEEEecceEEEEeecccEEE----Eeec--ccccccccccccchhhhhcCcEEEEEEEEeccEe Confidence 543211 111122223 22211100 0111 11112223332 11 12233568886 7 Q ss_pred ecccccccccccccCchhh Q lcl|NC_015280. 419 LNPFAKGLTALSDSDPQAA 437 (455) Q Consensus 419 ~nP~~~~~~~~~~~~~~~~ 437 (455) .||-.- ..+.+++..-| T Consensus 322 ~~~~a~--~~l~~~~~~~~ 338 (338) T protein:vir:78 322 GDKQAF--VKFVDDEDPDA 338 (338) T ss_pred ecccce--EEEecccCCCC Confidence 777443 23333333222 No 29 >protein:vir:4953 Length: 397 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:108 # MgeName: Sfi19 # Cross-refs: genbank:acc:NP_049929;genbank:gi:9632900;genbank:GeneID:1262076 Probab=94.81 E-value=0.0034 Score=34.09 Aligned_cols=323 Identities=13% Similarity=0.056 Sum_probs=129.5 Q ss_pred CcchHHHHHHhhHhhcCCCCccccchhhHHHHHHHhhhHH-------------------HHHHHHHHhhhhhhhchhh-- Q lcl|NC_015280. 
1 MYNAENLQEKWAPVLNHEGLNDIKDPYRKSVTAILLENQE-------------------RALAEERAVLTEAPTNVGP-- 59 (455) Q Consensus 1 m~~~~~~~~kw~~~l~~~~~~~i~~~~~~~v~~~~~enq~-------------------~~~~e~~~~l~ea~~~~~~-- 59 (455) ++ .|++.+...-+-..+. .++..-..+-+... +...++++.+.+...+... T Consensus 34 ~~-~ee~~~~~~~i~~~~~-------~~e~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~l~~~~~~~ 105 (397) T protein:vir:49 34 VS-AEELQAIKNERDTAKM-------KRDMFKEQYTEARANEVANMSEEEKKPLTKSEEEVKAGFVKDFKNLVRGRYQNL 105 (397) T ss_pred cC-HHHHHHHHHHHHHHHH-------HHHHHHHHHHHHHHHhhhccccccccccccchhHHHHHHHHHHHHHHhcchhHH Confidence 11 1122222211110000 00000000000000 0011222222222211111 Q ss_pred --hcccccc-ccccccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCcccccccccccccccccc Q lcl|NC_015280. 60 --INTPTTS-SGAVAGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPDAQFSGTDGA 136 (455) Q Consensus 60 --~~~~st~-tg~i~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~t~fSg~~~~ 136 (455) ....+++ .|.+.--....-.+++.+.+..+..+++.++||++++|-+.-++ ..+..+. T Consensus 106 ~~~~~~~t~~~gg~~vP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~--~~~~~~~----------------- 166 (397) T protein:vir:49 106 LDSKTDASGSDAGLTIPQDIQTAIHTLVSQYDSLQEYVNVENVTTLTGSRVYEK--WTDITGL----------------- 166 (397) T ss_pred HHHhhccccccCcccccHhHHHHHHHHHHhhhhHHhhhceeecccCccceEEEe--eccCCcc----------------- Confidence 1111222 22221111122234444456667788899999999998654333 1110000 Q ss_pred cccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeeccccccceeHHHHHh Q lcl|NC_015280. 137 TPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQD 216 (455) Q Consensus 137 ~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQD 216 (455) +...++.+...+.....|.++.|++.|.. 
-...+|-||.+| T Consensus 167 --------------------------------a~~v~E~~~~~~~~~~~~~~i~~~~~k~~-------~~~~iS~ell~d 207 (397) T protein:vir:49 167 --------------------------------ANIDDEAGKIADVDDPKLSLIKYTIKRYA-------GISTVTNSLLAD 207 (397) T ss_pred --------------------------------eeeecCccccccccccceeeEEeeeeeEE-------eeehhHHHHHhh Confidence 00011111111111223555555555544 455689999998 Q ss_pred HHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeeccccchhhHHHHHHHHHHHHHHHHHH Q lcl|NC_015280. 217 LKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDVDSNGRWSVEKFKGLLFQIERDANAI 296 (455) Q Consensus 217 LkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~~~~gr~~ve~~k~l~~qi~~ean~i 296 (455) -. .|.+++|.+-|+..|..-+|+.||.-.-+.. ...++++++ ....+.+.+... T Consensus 208 s~----~~l~~~i~~~l~~~~~~~~d~ai~~G~g~~~--------~~~~~~~~d----------~i~~~~~~l~~~---- 261 (397) T protein:vir:49 208 SA----ENILAWLSGWIAKKVVVTRNKAILEAIAALP--------TKPTLTKWD----------DIIDLEAKVDPA---- 261 (397) T ss_pred hH----HHHHHHHHHHHHHHHHHHHHHHHHhhccccc--------cccccccHH----------HHHHHHHhhhhh---- Confidence 53 5789999999999999999999886533222 223333221 234454555422 Q ss_pred HHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEE--eccccccC-Cc---- Q lcl|NC_015280. 297 AQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYI--DPYSANVS-DN---- 369 (455) Q Consensus 297 ~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~--D~y~~~~s-~~---- 369 (455) - .....+|+++.....|... ... +|.. ....|.++ ...++|. |++|++ +.+..+.. +. T Consensus 262 ----~-~~~a~~vmn~~~~~~l~~l---kd~---~G~~--l~~~~~~~-~~~~~l~-G~PV~~~~~~~~~~~~~~~~~i~ 326 (397) T protein:vir:49 262 ----I-KQTSFFLTNTSGFTALKKV---KNA---LGDY--LMERDVKS-PTGYSID-GFAVKEVADRWLANGTGGAMPLY 326 (397) T ss_pred ----h-cCCCEEEEcHHHHHHHHHh---hcC---CCce--eeccCcCC-CCCceec-ceeeEEecccccccccCCceeEE Confidence 1 2335678899999888752 211 1111 11112111 1224554 456664 32222111 11 Q ss_pred -----ceEEEEEecCccccceeEEcccccccceeecCCccccceeeeeeecce-eecccccccccc---c-ccCchhhhh Q lcl|NC_015280. 
370 -----QYYVVGYKGTNAYDAGLFYCPYVPLQMYRAIGQDTFQPRIGFKTRYGM-VLNPFAKGLTAL---S-DSDPQAAGN 439 (455) Q Consensus 370 -----dY~~vG~KG~~~~daglfyaPYv~l~~~~~~Dp~s~qP~~g~~tRY~l-~~nP~~~~~~~~---~-~~~~~~~~~ 439 (455) +|++++.++..+ +=+.||.. -+-...+-.+-...|++. +.||-.--.-+. . ..-+.+..+ T Consensus 327 ~gd~~~~~~~~~~~~~~----i~~~~~~~------~~~~~~~~~~r~~~r~d~~~~~~~a~~~~~~~~~~~~~~~~~~~~ 396 (397) T protein:vir:49 327 FGDLKQAVTLFDRQHMS----LLSTNIGG------GAFETDTTKVRVIDRFDVVATDTEAFVPASFKAIADQKGNLGSTA 396 (397) T ss_pred EeeccceEEEEeecceE----EEEecccc------chhhcCceeEEEEeeeCcEEecccceEEEEeecccCCCCCccccc Confidence 122233222222 11222211 112234445556667766 555532211111 0 111111112 Q ss_pred c Q lcl|NC_015280. 440 L 440 (455) Q Consensus 440 ~ 440 (455) . T Consensus 397 ~ 397 (397) T protein:vir:49 397 V 397 (397) T ss_pred C Confidence 1 No 30 >protein:vir:100884 Length: 389 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:1473 # MgeName: Lc-Nu # Cross-refs: genbank:acc:YP_358764;genbank:gi:78000028;genbank:GeneID:3726155 Probab=94.30 E-value=0.0048 Score=33.30 Aligned_cols=322 Identities=11% Similarity=0.088 Sum_probs=126.4 Q ss_pred CcchHHHHHHhhHhhcC-------------------CCCccccchhhHHHHHHHhhhHHHHHHHHHHhhhhhhhchh--- Q lcl|NC_015280. 1 MYNAENLQEKWAPVLNH-------------------EGLNDIKDPYRKSVTAILLENQERALAEERAVLTEAPTNVG--- 58 (455) Q Consensus 1 m~~~~~~~~kw~~~l~~-------------------~~~~~i~~~~~~~v~~~~~enq~~~~~e~~~~l~ea~~~~~--- 58 (455) +...+.+.+++...... ...+......+ ..-.++.. ..+++..+.+...+.+ T Consensus 33 ~e~~~~l~~ei~~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~-----~~~~~~~~-~~~~~~~~~~~lr~~~~~~ 106 (389) T protein:vir:10 33 VDDFQKIKDDLTAAKARRDAINDQIKALEAEKPAEPKTEPKDDGSKK-----GTDLSKKP-IDAKKKAINDFIHSHGKVI 106 (389) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccccccc-----ccccchhH-HHHHHHHHHHHhhcchhhh Confidence 11112222222221100 00000000000 00001000 0111222222111111 Q ss_pred hhcccc-ccccccccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCccccccccccccccccccc Q lcl|NC_015280. 
59 PINTPT-TSSGAVAGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPDAQFSGTDGAT 137 (455) Q Consensus 59 ~~~~~s-t~tg~i~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~t~fSg~~~~~ 137 (455) ...+++ ++.|.+.--....-.++++..+..+..+++.|.||+++++-+--++ .. ++. .+. T Consensus 107 ~~~~~~t~~~gg~~vP~~~~~~i~~~~~~~~~l~~~~~~~~~~~~~~~~~~~~--~~--~~~---------------~~~ 167 (389) T protein:vir:10 107 DATSKVTSTEAGVLIPEEIIYDPTAEVNSVVDLSTLVTKTPVTTPKGTYPILK--RA--TDR---------------FSS 167 (389) T ss_pred hhhcccccCCcceeehHHHHHHHHHHHHhhhhHHhhcceeeccCCeeEEEEEe--cC--CCc---------------ccc Confidence 111222 2223222212222235555556667789999999999876544443 00 000 000 Q ss_pred ccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeeccccccceeHHHHHhH Q lcl|NC_015280. 138 PPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDL 217 (455) Q Consensus 138 ~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDL 217 (455) ..+....++.....|.+..+++.|.. --..+|-||.+|- T Consensus 168 ----------------------------------~~E~~~~~~~~~~~~~~i~~~~~k~~-------~~~~iS~ell~ds 206 (389) T protein:vir:10 168 ----------------------------------VAELAENPKLAEPEFNKVDWSVATYR-------GAIPLSEEAIADS 206 (389) T ss_pred ----------------------------------ccccccccccccccceeeeeeheeeE-------eeehhhHHHHhhh Confidence 00000001111233555666665554 4456899999984 Q ss_pred HHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeeccccchhhHHHHHHHHHHHHHHHHHHH Q lcl|NC_015280. 218 KAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDVDSNGRWSVEKFKGLLFQIERDANAIA 297 (455) Q Consensus 218 kAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~~~~gr~~ve~~k~l~~qi~~ean~i~ 297 (455) ..|.+++|.+-|...+..-+|..|+.-+-. +. ..++... .+ ......+ +...... 
T Consensus 207 ----~~~l~~~i~~~la~~~~~~~~~~i~~g~~~----~~-----~~~~~~~----~~---~d~l~~~-~~~~~~~---- 261 (389) T protein:vir:10 207 ----AVDLTALVGQSIKEKSVNTYNAMIAPVLQS----FT-----AKKTTTD----TL---VDSLKHI-LNVDLDP---- 261 (389) T ss_pred ----hHHHHHHHHHHHHHHHHHHHHHHHhhhhcc----cc-----ccccccc----cc---HHHHHHH-HHhhhhh---- Confidence 246788888888888888888888754321 11 1111100 00 1122222 2211111 Q ss_pred HHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccc-cccCCceeEEEecCceEEEE-ec-cccccCCcceEEE Q lcl|NC_015280. 298 QETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGE-IDDTGNTFVGTLNGRFKVYI-DP-YSANVSDNQYYVV 374 (455) Q Consensus 298 ~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~-~d~t~~~~~G~l~~~~~vy~-D~-y~~~~s~~dY~~v 374 (455) .+ ..-+|+++.....|... .. .+|..-+... .+.+.....++|. +++||+ |. +.... ..+ T Consensus 262 ----~~-~a~~~~n~~~~~~L~~l---kd---~~G~~i~~~~~~~~~~~~~~~~l~-G~pV~~~~~~~~~~~-~~~---- 324 (389) T protein:vir:10 262 ----AY-SRALVVTQSLFNTLDTL---KD---KNGRYLLHDASDSITDGTAKGTIL-GVPVYVVGDTLLGSL-AGD---- 324 (389) T ss_pred ----hh-CcEEEecHHHHHHHHHh---hc---cCCCeeeecCcccccccccccccc-cceeEEecccccCCC-CCc---- Confidence 12 23478899888888752 21 1111110000 0112222334564 456554 32 11110 011 Q ss_pred EEecCccccceeEEccccc-------cc-ceeecCCccccceeeeeeecce-eecccccccccccccCchhhhhccc Q lcl|NC_015280. 375 GYKGTNAYDAGLFYCPYVP-------LQ-MYRAIGQDTFQPRIGFKTRYGM-VLNPFAKGLTALSDSDPQAAGNLNA 442 (455) Q Consensus 375 G~KG~~~~daglfyaPYv~-------l~-~~~~~Dp~s~qP~~g~~tRY~l-~~nP~~~~~~~~~~~~~~~~~~~~~ 442 (455) ..++|+.+-. .. -....|-..|.-.+...-|++. +.||-+--.-.+.. 
-.+..++| T Consensus 325 ---------~~~~~gd~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~r~d~~~~~~~a~~~~~~~~---~~~~~~~~ 389 (389) T protein:vir:10 325 ---------QKAFVGDLKRGVLFTDRQQVTLAWEDSKIYGKYLGAAFRFGVQKADSKAGYFVTNTD---VPGSALGK 389 (389) T ss_pred ---------eEEEEeeccccEEEEeecceEEEeeccccccceEEEEEEeccEEecccceEEEEeec---cCCCCCCC Confidence 1223222211 01 1122344556666777789998 66665432111111 11123344 No 31 >protein:vir:4511 Length: 409 # NCBI annotation: capsid # Family: family:all:21 # MgeID: mge:97 # MgeName: V # Cross-refs: genbank:acc:NP_599037;genbank:gi:19548995;genbank:GeneID:935211 Probab=94.02 E-value=0.0056 Score=32.92 Aligned_cols=330 Identities=15% Similarity=0.093 Sum_probs=127.0 Q ss_pred CcchHHHHHHhhH---------------------hhcCCCCccccchhhHHHHHHHhhhHHHH-HHHHHHhhhhhhhchh Q lcl|NC_015280. 1 MYNAENLQEKWAP---------------------VLNHEGLNDIKDPYRKSVTAILLENQERA-LAEERAVLTEAPTNVG 58 (455) Q Consensus 1 m~~~~~~~~kw~~---------------------~l~~~~~~~i~~~~~~~v~~~~~enq~~~-~~e~~~~l~ea~~~~~ 58 (455) +...+.|.++..- .+..+.- .-.+..+++.....|.+=... ..+|++.+.|.. T Consensus 41 ~~e~~~l~~~i~~~e~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~a~~~~l~~~~~~~~~~e~~~~~~~~---- 115 (409) T protein:vir:45 41 KSELEALDERIAREEELRRQDQAYIESNEEEQRQNLDPENN-SQQDEKRAQVFDKWMRHGASELTSEERKALRELR---- 115 (409) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhcccCCCCCc-chhhHHHHHHHHHHHHhhhhhccHHHHHHHHHHh---- Confidence 2222233333221 1111111 111112222323333221111 245555555543 Q ss_pred hhcccccccc---cc---ccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCcccccccccccccc Q lcl|NC_015280. 59 PINTPTTSSG---AV---AGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPDAQFSG 132 (455) Q Consensus 59 ~~~~~st~tg---~i---~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~t~fSg 132 (455) +.+++++ .. ..+.+.++.+.| +..+..+++-|-|+++.....+-..+.- +. 
T Consensus 116 ---a~~~~~~~~gg~liP~~~~~~ii~~~~---~~~~l~~~~~~~~~~~~~~~~~~~~~~~-~~---------------- 172 (409) T protein:vir:45 116 ---AQGVAQDEKGGYTVPETFLAKVVEKMK---SYGGIASVAQILTTSDGRTMEWATADGT-SE---------------- 172 (409) T ss_pred ---hccCccCcCCceeccHhHHHHHHHHHH---hhhhhhhhceeeecCCCceEEEEeeccC-cc---------------- Confidence 1122221 11 222333444444 3334467788888877655444332000 00 Q ss_pred cccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeecc-ccccceeH Q lcl|NC_015280. 133 TDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGR-ALRADYSV 211 (455) Q Consensus 133 ~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSR-aLKAEYTi 211 (455) .+-.+.+.+. .++-...++.++.+++.. +-=..+|- T Consensus 173 -----------------------------------~~~~v~E~~~--------~~~~~~~f~~~~l~~~k~~~~~i~is~ 209 (409) T protein:vir:45 173 -----------------------------------VGVLLGENEE--------AGEEDTDFGMGSLGALKMTSKIIRVSN 209 (409) T ss_pred -----------------------------------cccccccccc--------ccccccccceeeeeeeeeeeeehhhhH Confidence 0000111111 122222223333333222 11235799 Q ss_pred HHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeec-----cccchhhHHHHHHHH Q lcl|NC_015280. 212 ELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDV-----DSNGRWSVEKFKGLL 286 (455) Q Consensus 212 ELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~-----~~~gr~~ve~~k~l~ 286 (455) ||.+|- .+|.+++|.+-|+..|..-+|+.||.-=-+-. .-...|++.... ...+--..+....|+ T Consensus 210 ell~ds----~~~l~~~i~~~la~a~~~~~~~a~l~G~G~~~------~~~p~Gil~~~~~~~~~~~~~~~~~d~i~~l~ 279 (409) T protein:vir:45 210 ELLQDS----AIDMEAYLARRIAERIGRGEARYLIQGTGAGT------PKQPKGLAASVTGTTQTAAANAVKWQEILALK 279 (409) T ss_pred HHHhcc----HHHHHHHHHHHHHHHHHHHHHHHhhccCCCCC------ccccceeeeccccccccccccccchHHHHHHH Confidence 999994 25789999999999999999998874110000 000112221100 000000112333444 Q ss_pred HHHHHHHHHHHHHhcCCCccE-EEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEEeccccc Q lcl|NC_015280. 
287 FQIERDANAIAQETRRGKGNI-IITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSAN 365 (455) Q Consensus 287 ~qi~~ean~i~~~T~~~~gn~-~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~ 365 (455) +.+.- --+..+.| +++++.....|.. |... +|. .+...|.+.. -.++|.| ++|+++.++.. T Consensus 280 ~~l~~--------~~~~~a~~~~~~n~~~~~~l~~---lkd~---~G~--~i~~~~~~~~-~~~~l~G-~PV~~~~~~p~ 341 (409) T protein:vir:45 280 HSIDP--------AYRRGPKFRLAFNDNTLKLISE---MEDG---QGR--PLWLPDIVGV-APASVLN-VPYVIDQEIDD 341 (409) T ss_pred Hhhhh--------hhccCCeEEEEECHHHHHHHHH---hhcC---CCc--eeeccCcCCC-CCceecc-eeeEEecCcCC Confidence 44321 23445666 5789888777764 2211 111 1111122111 1135654 68888865421 Q ss_pred cCCcce-EEEEEecCccccceeEEcccccccceeecCCccccceeeee--eecce-eecccccccccccccCchhhhhcc Q lcl|NC_015280. 366 VSDNQY-YVVGYKGTNAYDAGLFYCPYVPLQMYRAIGQDTFQPRIGFK--TRYGM-VLNPFAKGLTALSDSDPQAAGNLN 441 (455) Q Consensus 366 ~s~~dY-~~vG~KG~~~~daglfyaPYv~l~~~~~~Dp~s~qP~~g~~--tRY~l-~~nP~~~~~~~~~~~~~~~~~~~~ 441 (455) ....++ +++| +-. ..+...--........||-.-...++|. .||+. +.||-+.-.-.. .. +++ T Consensus 342 ~~~~~~~i~~G---d~~---~~~i~~~~~~~~~~~~d~~~~~~~~~~~~~~r~d~~~~~~~A~~~l~~---k~----s~~ 408 (409) T protein:vir:45 342 IGAGKKFMFCG---DFD---RFIIRRVRYMILKRLVERYAEYDQTGFLAFHRFDCILEDTSAIKALVG---KG----SVG 408 (409) T ss_pred ccCCccEEEEe---ehh---hhheeeccceEEEEeecccccCCcEEEEEEEEeccEeechhheEEEEe---cc----CCC Confidence 111111 2222 110 0111110011112233655333444443 48877 677765422111 11 111 Q ss_pred c Q lcl|NC_015280. 442 A 442 (455) Q Consensus 442 ~ 442 (455) . 
T Consensus 409 ~ 409 (409) T protein:vir:45 409 G 409 (409) T ss_pred C Confidence 1 No 32 >protein:vir:4092 Length: 390 # NCBI annotation: major capsid protein a # Family: family:all:635 # MgeID: mge:86 # MgeName: 2389 # Cross-refs: genbank:acc:NP_510986;swissprot:trembl:q8w604;genbank:gi:17488508;uniprot:Q8W604;genbank:GeneID:1260361 Probab=93.92 E-value=0.006 Score=32.78 Aligned_cols=331 Identities=11% Similarity=0.042 Sum_probs=120.0 Q ss_pred CcchHHHHHHhhHhhcCC--CCccccc--hhhHHHHHH---HhhhHHHHH-----------------------HHHHHhh Q lcl|NC_015280. 1 MYNAENLQEKWAPVLNHE--GLNDIKD--PYRKSVTAI---LLENQERAL-----------------------AEERAVL 50 (455) Q Consensus 1 m~~~~~~~~kw~~~l~~~--~~~~i~~--~~~~~v~~~---~~enq~~~~-----------------------~e~~~~l 50 (455) |-+-+++.++..-.-... .+.+... ...+.+... +-++..+.. .|+|+++ T Consensus 1 ik~L~e~~~e~~e~~~~~~~~~~~~~~~~e~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~l~~~~r~~~ 80 (390) T protein:vir:40 1 MNNLDKKDSETLNISTAFLNAIKEGATEAEQVTAFTNMAEQIQNNIIAQARKEVNREMNDNNVLASRGANALTSDESKYY 80 (390) T ss_pred CchHHHHHHHHHHHHHHHHHHHhhhhhHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCchhccHHHHHHH Confidence 444444443333211100 0000000 000111100 000000000 1222222 Q ss_pred hhhhhchhhhccccccccccccccchhhh-HHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCccccccccccc Q lcl|NC_015280. 51 TEAPTNVGPINTPTTSSGAVAGFDPILIS-LIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPDAQ 129 (455) Q Consensus 51 ~ea~~~~~~~~~~st~tg~i~~~~P~Lv~-l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~t~ 129 (455) .+ .....++++.-...-+.+.. +++.+-..-+-.+++-+.||++....|.... .. 
.++ T Consensus 81 ~~-------~~~~~~~~~gg~lvP~~~~~~I~~~~~~~s~i~~~~~~~~~~~~~~~i~~~~----~~--~~a-------- 139 (390) T protein:vir:40 81 NE-------VIAGNGFAGVTALLPPTVFERVFEDLTVEHPLLSKINFVNTTATTEWIISVG----DV--ATA-------- 139 (390) T ss_pred HH-------HHhccCcccCcccccHHHHHHHHHHHHhhhhhhhhceeeecCCceeEEEEEc----CC--cce-------- Confidence 22 22222222211111112112 2332233334567899999988665554211 00 000 Q ss_pred ccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeeccccccce Q lcl|NC_015280. 130 FSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRADY 209 (455) Q Consensus 130 fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEY 209 (455) -...+.....+.....|.+..|++.|..+- ... T Consensus 140 ----------------------------------------~~~~E~~~~~~~~~~~f~~i~l~~~k~~~~-------i~i 172 (390) T protein:vir:40 140 ----------------------------------------WWGPLCAEIKEVLDNGFDKIQTGMYKLSAY-------IPV 172 (390) T ss_pred ----------------------------------------eeeccccccCccccccceeeEeeeeeEEEe-------ehh Confidence 000000001111233477788877777653 357 Q ss_pred eHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHH---------Hhhhhe--eeeeeccccceeeeeeccccchhh Q lcl|NC_015280. 210 SVELAQDLKAIHGLDAESELANILSTEILAEINREVVRT---------VYRGAK--PGAQANVANAGVFDLDVDSNGRWS 278 (455) Q Consensus 210 TiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~---------l~~vA~--~~k~~~v~~~gv~Dl~~~~~gr~~ 278 (455) |-||.+|-- .|.|++|.+.|+..|..-+|+.||.- |...+. .+.. ...+.+.+. T Consensus 173 S~ell~ds~----~~l~~~i~~~la~~i~~~~~~a~l~G~G~~~P~Gil~~~~~~~~~~~-~~~~~~~~t---------- 237 (390) T protein:vir:40 173 CNAMLDLGP----SWLDQYVRTILGEAMALGLEAGIVNGSGKDQPIGMMRDLNNVTAGEH-PVKTATPLT---------- 237 (390) T ss_pred hHHHHhcch----HHHHHHHHHHHHHHHHHHHHhhhhcccCCCccceeeecccccccccc-ccccccccc---------- Confidence 889998864 47899999999999999999998852 111110 0000 000111110 Q ss_pred HHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEE Q lcl|NC_015280. 
279 VEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVY 358 (455) Q Consensus 279 ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy 358 (455) -+....++..+..-......+. .+++.|++-....+..|...-++. |.+|....+.+.-+++|+ T Consensus 238 ~~~~~~~~~~l~~~~~~~~~~~-~~~a~~i~n~~t~~~~l~~~~~~~---------------d~~G~~v~~~~~~g~pvv 301 (390) T protein:vir:40 238 DLTPATLATKVMLPLTDNGKKS-VSDAILVINPADYWSKIYAATSYM---------------TPQGVWVTGILPVPLEIV 301 (390) T ss_pred hhhHHHHHHHHHHHhhcchhhh-hcCceEEEcchhHHHHHHHHhhcc---------------CCCCccccccCCCceeEE Confidence 0112223333332222222222 234555544445566665433332 233333333334477888 Q ss_pred EeccccccCCcceEEEE--------EecCcccccee--EEc-cccccc-----ceeecCCccccceeeeeeecc-eeecc Q lcl|NC_015280. 359 IDPYSANVSDNQYYVVG--------YKGTNAYDAGL--FYC-PYVPLQ-----MYRAIGQDTFQPRIGFKTRYG-MVLNP 421 (455) Q Consensus 359 ~D~y~~~~s~~dY~~vG--------~KG~~~~dagl--fya-PYv~l~-----~~~~~Dp~s~qP~~g~~tRY~-l~~nP 421 (455) ++.++. .+=++.| -.+....+.+- +|. ..+-.- -.+++||++|. ++=++.==| -.+.| T Consensus 302 ~~~~~p----~~~i~~Gd~s~~~i~~~~~~~v~~~~~~~f~~~~~~~r~~~r~dg~v~~~~A~~-~l~~~~~~~~~~~~~ 376 (390) T protein:vir:40 302 QSVAVP----VGKAVAGRAKDYFMGIGSEQVIRTSTEYRLLDDETLYYAKQYANGRPKDNSSFL-VFDITGLEGSPAIDV 376 (390) T ss_pred EcCCCC----CCcEEEEeeceEEEEeecceEEEecchhhhhcCcEEEEEEEEeCCEEecccceE-EEEeeccCCCCCCCc Confidence 886643 2223333 22222211110 000 000000 00123444443 000110000 02222 Q ss_pred cccccccccccCchhhhhccchh Q lcl|NC_015280. 
422 FAKGLTALSDSDPQAAGNLNANA 444 (455) Q Consensus 422 ~~~~~~~~~~~~~~~~~~~~~n~ 444 (455) |..-.+ ...++.+ - T Consensus 377 ~~~~~~--~~~~~~~-------~ 390 (390) T protein:vir:40 377 NVVNNA--TPSETPA-------E 390 (390) T ss_pred ceeeCC--CCCCCCC-------C Confidence 222110 0111111 1 No 33 >protein:vir:81227 Length: 413 # NCBI annotation: gp6, major capsid protein # Family: family:all:585 # MgeID: mge:1893 # MgeName: BFK20 # Cross-refs: genbank:acc:YP_001456736;genbank:gi:157168379;hssp:P49861;interpro:IPR006444;uniprot:Q9MBJ9;genbank:GeneID:5580350 Probab=93.84 E-value=0.0062 Score=32.69 Aligned_cols=337 Identities=13% Similarity=-0.002 Sum_probs=117.6 Q ss_pred CcchHHHHHHhhHhhcCCCC-----ccccchhhHHHHH-------HHhhhHHHHHHH---HH-----------Hhhhhhh Q lcl|NC_015280. 1 MYNAENLQEKWAPVLNHEGL-----NDIKDPYRKSVTA-------ILLENQERALAE---ER-----------AVLTEAP 54 (455) Q Consensus 1 m~~~~~~~~kw~~~l~~~~~-----~~i~~~~~~~v~~-------~~~enq~~~~~e---~~-----------~~l~ea~ 54 (455) |...+++.++=..+.+.... ....+..+++... .+-++..+...+ +. ....+-. T Consensus 31 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 110 (413) T protein:vir:81 31 EDAKRERAKSVKANQDFLRELQEATAGSVDSEKSGELTRKGEGYKSIGEFFAKRAGDQIKQQAGGAQLNYSVGEYVAPRV 110 (413) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHhHHhHHHhhhHhhhhhhhhhhhhhhhhhhhhHHHHHHHHHHhhhhhhhhhhhHH Confidence 11111111111111000000 0000000000000 000000000000 00 0000000 Q ss_pred hchhhhccccccccccccccchhh--hHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCcccccccccccccc Q lcl|NC_015280. 55 TNVGPINTPTTSSGAVAGFDPILI--SLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPDAQFSG 132 (455) Q Consensus 55 ~~~~~~~~~st~tg~i~~~~P~Lv--~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~t~fSg 132 (455) .........++++.+....=|..+ .+++..-+..+..+++.|+||++++.-+.-.+ -.. .+. 
T Consensus 111 ~~~~~~~~~~~~~~~~~~~vp~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~-~~~--------------~~~- 174 (413) T protein:vir:81 111 KAASDPASTATLTDEFQGGYGTTWNRNIIYRRREKLVVADLMDNLTMTNTTIKYLMEK-ANR--------------VVE- 174 (413) T ss_pred HhhhhhhhhcccccccccccchhhHHHHHHHHhhhhhHHhhcceeeccCCceeEEEec-ccc--------------ccc- Confidence 001111111111111111112211 24444445667789999999999875432221 000 000 Q ss_pred cccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeeccccccceeHH Q lcl|NC_015280. 133 TDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRADYSVE 212 (455) Q Consensus 133 ~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiE 212 (455) . .+-...+++..+++....|.+..|.+.|.. -....|-| T Consensus 175 ----------------------------~------~a~~v~Eg~~~~~~~~~~f~~i~~~~~k~~-------~~~~iS~e 213 (413) T protein:vir:81 175 ----------------------------G------GFKTVAEGGKKPYMRFADFDIVTESLSKIA-------GLTKITDE 213 (413) T ss_pred ----------------------------c------ccceecCcccccccCcccceeeEeeeeeEE-------EeehhhHH Confidence 0 000011111122222223555555555554 44568899 Q ss_pred HHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeeccc-----cchhhHHHHHHHHH Q lcl|NC_015280. 213 LAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDVDS-----NGRWSVEKFKGLLF 287 (455) Q Consensus 213 LAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~~~-----~gr~~ve~~k~l~~ 287 (455) |.+|-- +.++.|.+-|+..|..-+|+.||.- .-.+-...|++...... ++-+. + T Consensus 214 ll~ds~-----~l~~~i~~~la~~~~~~~d~~~l~G--------~G~~~~~~Gi~~~~~~~~~~~~~~~~~--------~ 272 (413) T protein:vir:81 214 MIEDYD-----FLVSYINARLLEELAIEEERQLLLG--------DGTGNNLTGLLKRDGIQTLAVSNKDEL--------A 272 (413) T ss_pred HHHHHH-----HHHHHHHHHHHHHHHHHHHHHHhcc--------CCCCCcccccccccccccccccccchh--------H Confidence 999862 2577777777777777777776631 11111123443221110 01111 2 Q ss_pred HHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhh----cccccccccccccccccccccCCceeEEEecCceEEEEeccc Q lcl|NC_015280. 
288 QIERDANAIAQETRRGKGNIIITSADVASALAMS----GVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYS 363 (455) Q Consensus 288 qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~s----G~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~ 363 (455) .....+-..+..-..+..+-+|+++.....|..- |-+-+.+...+..+ ..+....++|. +++|+++... T Consensus 273 ~~i~~~~~~~~~~~~~~~~~~vmn~~~~~~l~~lkd~~G~~l~~~~~~~~~~------~~~~~~~~~l~-G~pv~~s~~~ 345 (413) T protein:vir:81 273 DSIYKAMTNISLATPFQADALVINPLDYQELRLAKDANGQYYGGGVFQGQYG------SGGIMLDPAPW-GLRTVQSQVV 345 (413) T ss_pred HHHHHHHHHhhhhccCCCcEEEEcHHHHHHHHHhhccCCceecccccccccc------ccccccCceec-ceeeEEcCCC Confidence 2111111222222344556678899888777631 11111111111111 00111224454 6688888653 Q ss_pred cccCCcceEEEEE-ecCccccceeEEcccccccceeecCC------ccccceeeeeeecce-eeccccccccccc-ccCc Q lcl|NC_015280. 364 ANVSDNQYYVVGY-KGTNAYDAGLFYCPYVPLQMYRAIGQ------DTFQPRIGFKTRYGM-VLNPFAKGLTALS-DSDP 434 (455) Q Consensus 364 ~~~s~~dY~~vG~-KG~~~~daglfyaPYv~l~~~~~~Dp------~s~qP~~g~~tRY~l-~~nP~~~~~~~~~-~~~~ 434 (455) |..-+++|- +- +|--+....+...+++ .+.|=.+-...||+. +.+|-.--.=++. --.| T Consensus 346 ----~~~~~~~gd~~~--------~~~~~~~~~~~v~~~~~~~~~~~~~~~~~r~~~r~d~~~~~~~a~~~l~~~~~~~p 413 (413) T protein:vir:81 346 ----PVGKPVVGAFRS--------AASVLRKGGVRIDSTNTNVDDFENNLITVRAEERVGLMVTFPEAIVQLDVAEVVTP 413 (413) T ss_pred ----CcccEEEEeccc--------EEEEEEecceEEEEeccccchhhcCcEEEEEEEeeccEEecccceEEEEecCCCCC Confidence 223333332 21 0111111111111111 233445555667776 4444332111111 1112 No 34 >protein:vir:100247 Length: 425 # NCBI annotation: gp76 # Family: family:all:21 # MgeID: mge:1619 # MgeName: Bcep176 # Cross-refs: genbank:acc:YP_355412;genbank:gi:77864702;genbank:GeneID:3725969 Probab=93.84 E-value=0.0062 Score=32.69 Aligned_cols=325 Identities=14% Similarity=0.128 Sum_probs=124.5 Q ss_pred Ccch--------------HHHHHH-----------------hhHhhcCCC-----Ccccc-chhhHHHHHHHhhhHHHHH Q lcl|NC_015280. 
1 MYNA--------------ENLQEK-----------------WAPVLNHEG-----LNDIK-DPYRKSVTAILLENQERAL 43 (455) Q Consensus 1 m~~~--------------~~~~~k-----------------w~~~l~~~~-----~~~i~-~~~~~~v~~~~~enq~~~~ 43 (455) +... .+..++ ...-+..+. ...+. ..+|+. +.+.-+. T Consensus 50 ~k~~~~~~~~~~~~~~~~~e~~~~~~~~~~ei~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a-----f~~~l~~- 123 (425) T protein:vir:10 50 FKAEHTKQLDAVKAGLPTSDALAKVDKVSADLEALQAAVDEANIKIAAAQMGANGVKPLRDPEYTEA-----FKAHVKR- 123 (425) T ss_pred HHHHHHHHHHHHHhhhccHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccccccHHHHHH-----HHHHhhh- Confidence 0000 000000 000000000 00000 000111 1110000 Q ss_pred HHHHHhhhhhhhchhhhccccccccccccccchhh-hHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCcccc Q lcl|NC_015280. 44 AEERAVLTEAPTNVGPINTPTTSSGAVAGFDPILI-SLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAF 122 (455) Q Consensus 44 ~e~~~~l~ea~~~~~~~~~~st~tg~i~~~~P~Lv-~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAl 122 (455) .|+++.+.+ ..++.|.+. .-+.+. .+++++-+..+..+++.|-||+++.+-+.-.. ++.. T Consensus 124 ~e~~~al~~----------~t~~~gG~l-vP~~~~~~ii~~~~~~s~l~~l~~~~~~~~~~~~~~~~~------~~~~-- 184 (425) T protein:vir:10 124 GDVQAALNK----------GEDSEGGYL-TPIEWDRTITNKLVLISPMRQLCRVQPVSKAGFSKLFNM------GGTT-- 184 (425) T ss_pred hhhHHHhhc----------CcCCCCcee-ccHhHHHHHHHHHHhhhhhhhhceeeeccCCceEEEEEc------CCcc-- Confidence 011111111 111222221 122222 25555555667788999999987765443110 0000 Q ss_pred cccccccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeec Q lcl|NC_015280. 
123 FDEPDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKG 202 (455) Q Consensus 123 fnEa~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKS 202 (455) +-..++.+..+++....|.++.|++-|..+ T Consensus 185 ----------------------------------------------a~wv~E~~~~~~~~~~~f~~v~~~~~k~~~---- 214 (425) T protein:vir:10 185 ----------------------------------------------SGWVGEASQRPQTNAATFQPLSFASGEIYA---- 214 (425) T ss_pred ----------------------------------------------eeeeccccccccccccccceeeeeheeeEe---- Confidence 000111111222222246677777766655 Q ss_pred cccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHH--------Hhhhheeeeeeccccceeee-eeccc Q lcl|NC_015280. 203 RALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRT--------VYRGAKPGAQANVANAGVFD-LDVDS 273 (455) Q Consensus 203 RaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~--------l~~vA~~~k~~~v~~~gv~D-l~~~~ 273 (455) ...+|-||.+|-. .|.+++|.+-|+..|..-+|+-||.- |.+....+........|... ..... T Consensus 215 ---~i~iS~ell~ds~----~~l~~~i~~~la~ai~~~~d~~~l~G~G~~~p~Gil~~~~~~~~~~~~~~~~~~~~~~~~ 287 (425) T protein:vir:10 215 ---NPAATQQILDDAE----IDLESWLATEVQTEFAKQEGKAFLAGDGTNKPNGLLTYIAGGANAAKHPFGAIEVVNSGA 287 (425) T ss_pred ---ehHhHHHHHhcch----hHHHHHHHHHHHHHHHHHHHhhhhcccCCCCcceeeeccccccccccccccccccccccc Confidence 4568999998853 56888999999999999998888752 11111101000000000000 00000 Q ss_pred cchhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecC Q lcl|NC_015280. 274 NGRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNG 353 (455) Q Consensus 274 ~gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~ 353 (455) .+--..+....|.+.+... -+..+ .+|+++.....|.. +.. .+|..=+ ..+.+.. ..++|. 
T Consensus 288 ~~~~~~d~l~~l~~~l~~~--------~~~~a-~~vmn~~~~~~L~~---lkD---~~G~~l~--~~~~~~g-~~~~l~- 348 (425) T protein:vir:10 288 AADITSDGIIDLVYDLPSA--------FTGNA-RFAMNRNTQRQVRK---LKD---GQGNYLW--QPSYVAG-QPATLA- 348 (425) T ss_pred cccccHHHHHHHHhhhhhh--------hccCC-EEEEchHHHHHHHH---hhc---CCCceee--ccCccCC-CCceec- Confidence 1111122233343333211 22333 45789988888874 221 1111101 1111111 124665 Q ss_pred ceEEEEecccccc-CCcceEEEEEecCccccceeEEcccccccceeecCCccccceee--eeeecce-eecccccccccc Q lcl|NC_015280. 354 RFKVYIDPYSANV-SDNQYYVVGYKGTNAYDAGLFYCPYVPLQMYRAIGQDTFQPRIG--FKTRYGM-VLNPFAKGLTAL 429 (455) Q Consensus 354 ~~~vy~D~y~~~~-s~~dY~~vG~KG~~~~daglfyaPYv~l~~~~~~Dp~s~qP~~g--~~tRY~l-~~nP~~~~~~~~ 429 (455) +++|+++.++... +..+.+++| +-. ...+.. ....+....||-.-+-.++ ...||+. +.+|-+.-.-+. T Consensus 349 G~PV~~~~~~p~~~~~~~~i~~G---d~~--~~~~i~--~~~~~~v~~d~~~~~~~~~~~~~~r~d~~v~~~~A~~~l~~ 421 (425) T protein:vir:10 349 GYPVTEVPDMPDVAANSTPILFG---DFQ--QTYLII--DRIGVRVLRDPYTAKPYVLFYTTKRVGGGLLNPEPMRAMKV 421 (425) T ss_pred ceeeEEecCcCCccCCccEEEEE---ehh--ccEEEE--EecceEEEecccccCCcEEEEEEEEeccEeecccceEEEEe Confidence 4688887553211 112334443 111 001111 1112222335443233333 4458887 777776532222 Q ss_pred cccC Q lcl|NC_015280. 430 SDSD 433 (455) Q Consensus 430 ~~~~ 433 (455) .-.+ T Consensus 422 ~as~ 425 (425) T protein:vir:10 422 AASE 425 (425) T ss_pred eccC Confidence 2112 No 35 >protein:vir:41 Length: 299 # NCBI annotation: major capsid protein # Family: family:all:507 # MgeID: mge:2 # MgeName: A118 # Cross-refs: genbank:acc:NP_463467;swissprot:trembl:q9t1b7;genbank:gi:16798789;uniprot:Q9T1B7;genbank:GeneID:922353 Probab=92.06 E-value=0.013 Score=30.93 Aligned_cols=275 Identities=13% Similarity=0.016 Sum_probs=126.8 Q ss_pred hhhhc--cccccccccccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCcccccccccccccccc Q lcl|NC_015280. 
57 VGPIN--TPTTSSGAVAGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPDAQFSGTD 134 (455) Q Consensus 57 ~~~~~--~~st~tg~i~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~t~fSg~~ 134 (455) ++..- ..+++++...--....-.+++++....+-.+++-+-||++.+.-+- .. ++.++ . T Consensus 1 ~g~~a~~~~~~~~~~~~iP~~~~~~ii~~~~~~s~l~~~~~~~~~~~~~~~~~-~~------~~~~a-------~----- 61 (299) T protein:vir:41 1 MGFNPDTTTMQSAKTGSIPINISEQIITGVKNGSAAMKLAKAVPMTKPEEEFT-FM------SGVGA-------F----- 61 (299) T ss_pred CCcCCCcccccCCCceecchhHHHHHHHHHHhcchhhhhceeeecCCCcEEEE-EE------cCCce-------e----- Confidence 22111 1112222221111122356667777888899999999988763221 10 00000 0 Q ss_pred cccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeeccccccceeHHHH Q lcl|NC_015280. 135 GATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRADYSVELA 214 (455) Q Consensus 135 ~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELA 214 (455) ..+ | +..++|...++++++...|..+-...+|-||. T Consensus 62 ------------------------------------~v~--E------~~~~~~~~~~f~~v~l~~~k~~~~~~is~ell 97 (299) T protein:vir:41 62 ------------------------------------WVD--E------AERIQTSKPTFTKAKMRSKKMGVIIPTTKENL 97 (299) T ss_pred ------------------------------------eee--c------CccccccccceeEEEEeeEEEEEeehhhHHHH Confidence 000 1 12244555566788888888888889999999 Q ss_pred HhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeee-----ccccchhhHHHHHHHHHHH Q lcl|NC_015280. 215 QDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLD-----VDSNGRWSVEKFKGLLFQI 289 (455) Q Consensus 215 QDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~-----~~~~gr~~ve~~k~l~~qi 289 (455) +|-. .|.++.|.+.|+..|...+++.||.---+ ++ ..|++-.. 
....+--..+....++.++ T Consensus 98 ~ds~----~~~~~~i~~~l~~a~~~~~d~a~l~G~g~----~~-----~~gil~~~~~~~~~~~~~~~~~~~l~~~~~~l 164 (299) T protein:vir:41 98 NYSV----TNFFSLMQAEIVEAFYKKFDQAVFTGVES----PY-----NWNILKSATDASNLVEETANKYDDLNEAIGLI 164 (299) T ss_pred hcCH----HHHHHHHHHHHHHHHHHHHHHHHhhcccC----cc-----cccccccccccceeeccccccHHHHHHHHHhh Confidence 9754 46788899999999999888888742110 01 01111000 0000111122334444444 Q ss_pred HHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEEeccccccCCc Q lcl|NC_015280. 290 ERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDN 369 (455) Q Consensus 290 ~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~ 369 (455) .. --++++.++|+++....|.. +.. .+|. .....+.++. .++|. +++|++.......... T Consensus 165 ~~---------~~~~~~~~v~n~~~~~~L~~---lkd---~~G~--~l~~~~~~~~--~~~l~-G~PV~~~~~~~~~~~~ 224 (299) T protein:vir:41 165 EA---------EDLEPNGIATIRKQRVKYRS---TKD---GNGM--PIFNTATSNG--VDDVL-GLPIAYTPKYTFGDKD 224 (299) T ss_pred hc---------ccCCcCEEEEcHHHHHHHHH---hhc---cCCc--eeecCCcCCC--Cceec-ceeeEEecccCCCCCc Confidence 32 23456678999999988885 221 1111 1111112211 24565 5788777654321111 Q ss_pred ceEE--------EEEecCccccceeEEcccccccceeecCCcc-----ccc-eeee--eeecce-eeccccccccccccc Q lcl|NC_015280. 370 QYYV--------VGYKGTNAYDAGLFYCPYVPLQMYRAIGQDT-----FQP-RIGF--KTRYGM-VLNPFAKGLTALSDS 432 (455) Q Consensus 370 dY~~--------vG~KG~~~~daglfyaPYv~l~~~~~~Dp~s-----~qP-~~g~--~tRY~l-~~nP~~~~~~~~~~~ 432 (455) =.++ +|..+..+.+- -.+.......||+. ||- .++| ..|+|. +.||-+-..-+... T Consensus 225 ~~~~~gdfs~~~i~~~~~~~i~~------~~~~~~~~~~~~~~~~~~~~~~~~~~~r~~~~~d~~v~~~~A~~~l~~~a- 297 (299) T protein:vir:41 225 ISELVGDWNQAYYGILRGVEYEI------LTEATLTTVADETGKPLNLAERDMAAIKATFEVGFMVVKDEAFSAVQPKA- 297 (299) T ss_pred eEEEEEecccEEEEEecCcEEEE------eecccccccccccccchhhhhcCcEEEEEEEEeccEEecccceEEEEecc- Confidence 1122 22222211000 00000111123321 332 2333 467887 56664432211111 Q ss_pred Cchhhhhcc Q lcl|NC_015280. 
433 DPQAAGNLN 441 (455) Q Consensus 433 ~~~~~~~~~ 441 (455) +| T Consensus 298 -------a~ 299 (299) T protein:vir:41 298 -------GN 299 (299) T ss_pred -------CC Confidence 11 No 36 >protein:vir:97053 Length: 390 # NCBI annotation: putative head protein # Family: family:all:585 # MgeID: mge:1653 # MgeName: OP1 # Cross-refs: genbank:acc:YP_453565;genbank:gi:84662600;genbank:GeneID:5142468 Probab=91.66 E-value=0.015 Score=30.61 Aligned_cols=319 Identities=14% Similarity=0.060 Sum_probs=121.1 Q ss_pred Cc------------chHHHHHHhhHh------hcCCCC---ccccc--------hhhHHHHHHHhhhHHHHHHHHHHhhh Q lcl|NC_015280. 1 MY------------NAENLQEKWAPV------LNHEGL---NDIKD--------PYRKSVTAILLENQERALAEERAVLT 51 (455) Q Consensus 1 m~------------~~~~~~~kw~~~------l~~~~~---~~i~~--------~~~~~v~~~~~enq~~~~~e~~~~l~ 51 (455) ++ ..+.|+++=..+ ++.++. ++.++ ...++.+....+-.-+.-.+.+. T Consensus 32 ~~~e~~~~~~~~~~e~~~l~~~i~~~e~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~--- 108 (390) T protein:vir:97 32 LNASARSKVDELFATVGNLSAEVQAARQRVAELEGNGAGGDVQHVSVGDMFVASEQFQASTGRWNDRSARATMNIKA--- 108 (390) T ss_pred CCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcccccccccccchhhhhhhHHHHHHHHHhhhhhhhhhhHHHH--- Confidence 11 001111111110 000000 00000 00011111111000000000000 Q ss_pred hhhhchhhhccccccccccccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCccccccccccccc Q lcl|NC_015280. 52 EAPTNVGPINTPTTSSGAVAGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPDAQFS 131 (455) Q Consensus 52 ea~~~~~~~~~~st~tg~i~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~t~fS 131 (455) ........+++++...-....+-.+++++.+..+..+++.+-||++++.-+.-.. ..++. 
T Consensus 109 ----~~~~~~~~~~~~~g~lip~~~~~~ii~~~~~~~~i~~~~~~~~~~~~~~~~~~~~----~~~~~------------ 168 (390) T protein:vir:97 109 ----ALNTASTDAAGSAGALTTPNRLPGFITPPDARLTVRDLIGSGRTDSALIEYVQET----GFVNN------------ 168 (390) T ss_pred ----HHHhhhcccccccccccchhhhHHHHHHHhhhhhhHhhcceeeccCCceEEEEEe----cCCcc------------ Confidence 0111111122222221111223344444555666778899999988764332211 00000 Q ss_pred ccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeeccccccceeH Q lcl|NC_015280. 132 GTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRADYSV 211 (455) Q Consensus 132 g~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTi 211 (455) +-..+++ ..+++-..++++++...|.-+-...+|- T Consensus 169 -------------------------------------a~~v~Eg--------~~~~~~~~~~~~i~~~~~k~~~~~~is~ 203 (390) T protein:vir:97 169 -------------------------------------AAIVAEG--------ALKPESSLKFAKKTDTTHVIAHTMKATR 203 (390) T ss_pred -------------------------------------eeeecCC--------ccccccccceeEEEEeeeeEEEeehhhH Confidence 0000011 1122223334444444554445678999 Q ss_pred HHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeec------cccchhhHHHHHHH Q lcl|NC_015280. 212 ELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDV------DSNGRWSVEKFKGL 285 (455) Q Consensus 212 ELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~------~~~gr~~ve~~k~l 285 (455) ||.+|-- +.++.|.+-|+..|...+|+.||.- .-.+-...|++.... ...+--..+....+ T Consensus 204 ell~ds~-----~l~~~i~~~la~a~~~~~d~a~l~G--------~g~~~~p~Gi~~~~~~~~~~~~~~~~~~~d~~~~~ 270 (390) T protein:vir:97 204 QILSDAP-----QLASYMNNRLIRGLKVKEDAEILRG--------TGANDGLLGLIPQATTYAAPTTIAGATRVDQLRLA 270 (390) T ss_pred HHHHhHH-----HHHHHHHHHHHHHHHHHHHHHHhhc--------CCCCccccceeeccccccccccccccchHHHHHHH Confidence 9999842 4788888888888888888877632 111112334432110 00111111222222 Q ss_pred HHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEEeccccc Q lcl|NC_015280. 
286 LFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSAN 365 (455) Q Consensus 286 ~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~ 365 (455) ++.+ ...-...+-+|+||+....|.. +... +|..=+ .+... .-.++|. +++|+++... T Consensus 271 ~~~~---------~~~~~~~~~~v~n~~~~~~L~~---lkd~---~G~~l~---~~~~~-~~~~~l~-G~pV~~~~~~-- 328 (390) T protein:vir:97 271 MLQA---------SLAEYPASGIVINPIDWAAIEL---AKDA---NNQYLI---GNARG-TLTPTLW-GLPVVATQAM-- 328 (390) T ss_pred HHhh---------ccccCCCCEEEEcHHHHHHHHH---hhcC---CCceee---cCccC-CCCceec-ceeeEEcCCC-- Confidence 2222 2233456668899999888874 2211 111101 01111 1123553 6788887653 Q ss_pred cCCcceEEEE-EecCccccceeEEcccccccceeecCC---ccccceeeeeeecce-eeccccccccccc Q lcl|NC_015280. 366 VSDNQYYVVG-YKGTNAYDAGLFYCPYVPLQMYRAIGQ---DTFQPRIGFKTRYGM-VLNPFAKGLTALS 430 (455) Q Consensus 366 ~s~~dY~~vG-~KG~~~~daglfyaPYv~l~~~~~~Dp---~s~qP~~g~~tRY~l-~~nP~~~~~~~~~ 430 (455) |..-+++| ++. ++++...-.+.....-++ .+-+=.+-...||++ +.+|-+--.-.+. T Consensus 329 --~~~~~~~gd~~~------~~~~~~~~~~~i~~~~~~~~f~~~~~~~r~~~r~d~~v~~~~a~v~~~~a 390 (390) T protein:vir:97 329 --APGEFLVGAFDL------AAQIFDQWDARVEIGYVNDDFQRNMVTVLAEERLALVVYRPEALITGSFA 390 (390) T ss_pred --CCCcEEEEeccc------eEEEEEecceEEEEeecccccccCcEEEEEEEeeccEEeccccEEEEEeC Confidence 33334444 220 111111111111111111 123333445568887 5556444221111 No 37 >protein:vir:100172 Length: 394 # NCBI annotation: putative major head protein # Family: family:all:21 # MgeID: mge:1524 # MgeName: phi AT3 # Cross-refs: genbank:acc:YP_025031;genbank:gi:48697264;genbank:GeneID:2948270 Probab=91.48 E-value=0.016 Score=30.48 Aligned_cols=334 Identities=14% Similarity=0.131 Sum_probs=125.1 Q ss_pred CcchHHHHHHhhHhhcCCC-----CccccchhhHH-HHHHHhhhHHHHHH-----------------------------H Q lcl|NC_015280. 
1 MYNAENLQEKWAPVLNHEG-----LNDIKDPYRKS-VTAILLENQERALA-----------------------------E 45 (455) Q Consensus 1 m~~~~~~~~kw~~~l~~~~-----~~~i~~~~~~~-v~~~~~enq~~~~~-----------------------------e 45 (455) -...++++++=.-.++.+. +.++.+...+. -...-|+.|.+++. + T Consensus 12 ~~~~~e~~~~~~~~~~~~~~~~ee~~~~~~~~~~~~~~~~~l~~~i~~~e~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 91 (394) T protein:vir:10 12 SAKCADLNAQLNAKLQDENASVDDFQKIKDDLTAAKARRDAINDQIKDLEAENKANSDPDKPVDNAQPNGTDLKKKPIDA 91 (394) T ss_pred HHHHHHHHHHHHHHHhhhhccHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhcchhhhhhhhcccccchhhhHHHH Confidence 0000011111100111100 00000000000 00001111111111 1 Q ss_pred HHHhhhhhhhchh-----hhccccccccccccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCcc Q lcl|NC_015280. 46 ERAVLTEAPTNVG-----PINTPTTSSGAVAGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNE 120 (455) Q Consensus 46 ~~~~l~ea~~~~~-----~~~~~st~tg~i~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~E 120 (455) +++.+.+-..... .....+++.|.+.--.+..-.++++..+..+-.+++.+.||+++++-+--.+ .. ++. T Consensus 92 ~~~~~~~~l~~~~~~~~~~~~~~t~~~gg~~vP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~--~~--~~~- 166 (394) T protein:vir:10 92 KKKAINDFIHSHGKVIDNAAGHVTSTEAGVLIPEEIIYDPTAEVNSVVDLSTLVTKTPVTTPKGTYPILK--RA--TDR- 166 (394) T ss_pred HHHHHHHHHhccchhhhhhhcccccccCceeccHHHHHHHHHHHHhhhhhhhhceeeeccCCceEEEEEe--cC--CCc- Confidence 1111111100000 0001112222222222223345666666667789999999999876665444 00 000 Q ss_pred cccccccccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEe Q lcl|NC_015280. 121 AFFDEPDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEA 200 (455) Q Consensus 121 AlfnEa~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtA 200 (455) .. ...+.+..++.....|.+..|++.|. 
T Consensus 167 -------~~-----------------------------------------~~~E~~~~~~~~~~~~~~v~l~~~k~---- 194 (394) T protein:vir:10 167 -------FS-----------------------------------------SVAELAENPALAEPEFEQVDWSVSTY---- 194 (394) T ss_pred -------cc-----------------------------------------cccccccccccccccceeEEeeeeee---- Confidence 00 00000011111122355555555555 Q ss_pred eccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeeccccchhhHH Q lcl|NC_015280. 201 KGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDVDSNGRWSVE 280 (455) Q Consensus 201 KSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~~~~gr~~ve 280 (455) +-...+|-||.+|- ..|.++.|.+-|+..|..-+|+.|+.-.- .+...++...- ... T Consensus 195 ---~~~~~iS~ell~ds----~~~l~~~i~~~la~~~~~~~~~~il~g~g----~~~~~~~~~~~------------~~d 251 (394) T protein:vir:10 195 ---RGAIPLSEEAIADS----AVDLTSLVGQSINEKSVNTYNAMIAPVLQ----SFTAKATTTDT------------LVD 251 (394) T ss_pred ---EeeehhHHHHHhhh----hHHHHHHHHHHHHHHHHHHHHHHHhhccc----ccccccccccc------------cHH Confidence 44567999999984 25788899999999999999998875432 12111111100 011 Q ss_pred HHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhccccccccccccccccc-ccccCCceeEEEecCceEEEE Q lcl|NC_015280. 281 KFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIG-EIDDTGNTFVGTLNGRFKVYI 359 (455) Q Consensus 281 ~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~-~~d~t~~~~~G~l~~~~~vy~ 359 (455) ....++...... .+. ..+|+++.....|... ... +|..-+.. -...+.....++|. +++|++ T Consensus 252 ~l~~~~~~~~~~---------~~~-a~~vmn~~~~~~l~~l---kd~---~G~~i~~~~~~~~~~~~~~~~L~-G~PV~~ 314 (394) T protein:vir:10 252 SLKHILNVDLDP---------AYS-RALVVTQSLFNTLDTL---KDK---NGRYLLHDASDSITDGTAKGTVL-GVPVYV 314 (394) T ss_pred HHHHHHHhhhhh---------hcc-CEEEecHHHHHHHHHh---hcc---CCCeeeeccccccccCCcccccc-cceeEE Confidence 122222111111 122 2477899888877752 211 11100000 01112222334564 456654 Q ss_pred -e-ccccccCCcce-EEEEE-ecCccccceeEEcccccccceeecCCccccceeeeeeecce-eecccccccc---cccc Q lcl|NC_015280. 
360 -D-PYSANVSDNQY-YVVGY-KGTNAYDAGLFYCPYVPLQMYRAIGQDTFQPRIGFKTRYGM-VLNPFAKGLT---ALSD 431 (455) Q Consensus 360 -D-~y~~~~s~~dY-~~vG~-KG~~~~daglfyaPYv~l~~~~~~Dp~s~qP~~g~~tRY~l-~~nP~~~~~~---~~~~ 431 (455) | .+... ...+. +++|- +. ++....- ........|...|.-.+-...|++. +.||-....- .... T Consensus 315 ~~~~~~~~-~~~~~~i~~gd~s~------~~~~~~~-~~~~v~~~~~~~~~~~~~~~~r~d~~~~~~~ai~~~~~~~~~~ 386 (394) T protein:vir:10 315 VGDALLGS-AAGDQKAFVGDLKR------GVLFADR-QQVTLAWEDSKIYGRYLGAAFRFGVKQADSNAGYFVTNTDAAS 386 (394) T ss_pred ecccccCC-CCCceEEEEeeccc------cEEEEee-cceEEEEecccccceeEEEEEEeccEEeccccEEEEEeecccC Confidence 3 11111 00111 22220 00 0000000 0001122345556666777789888 6666554111 1111 Q ss_pred cCchhhhhccc Q lcl|NC_015280. 432 SDPQAAGNLNA 442 (455) Q Consensus 432 ~~~~~~~~~~~ 442 (455) |..- .-|+ T Consensus 387 ~~~~---~~~~ 394 (394) T protein:vir:10 387 GSTS---GTGK 394 (394) T ss_pred CCCC---CCCC Confidence 1111 1122 No 38 >protein:vir:3033 Length: 272 # NCBI annotation: major capsid protein # Family: family:all:522 # MgeID: mge:61 # MgeName: PhiNIH1.1 # Cross-refs: genbank:acc:NP_438146;genbank:gi:16271809;genbank:GeneID:929235 Probab=91.15 E-value=0.017 Score=30.25 Aligned_cols=260 Identities=11% Similarity=0.089 Sum_probs=119.8 Q ss_pred CCCcceeeeEEEeeecCCCCcccccccc-----------cccccccccccccccccccCcccCCCCCCCCcccccccccc Q lcl|NC_015280. 99 MTGPTGLIFAMRSRYTNQSGNEAFFDEP-----------DAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLAS 167 (455) Q Consensus 99 mTGPTGLIFAMRsrY~~qsG~EAlfnEa-----------~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~ 167 (455) |. ...++.+ ..+.+|- ...|+.-... +....+ ..+... ... 
T Consensus 1 MA-----------~~~T~~~-~~~iPev~s~~v~~~~~~~~~~~~~~~~--------------~~~~~g-~~G~tv-~iP 52 (272) T protein:vir:30 1 MA-----------VGTTKMA-QMLDPEVLADMIDAEVGKAIRFAPLAEV--------------DTTLEG-QPGTTL-TVP 52 (272) T ss_pred CC-----------Cccccch-heechHHHHHHHHHHHHHHhhhhccccc--------------cccccC-CCCCEE-EEE Confidence 11 0000100 1111110 0001000000 000000 000000 000 Q ss_pred cccchhhhhhcCCCCCCccccceeEEEEEEEEeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHH Q lcl|NC_015280. 168 SKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVR 247 (455) Q Consensus 168 ~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~ 247 (455) .=-....++.++++...+.. ..+.+..+++.|.++-.-++|=|++.+ -+-|.++++.+-|+..|+.+|+++|+. T Consensus 53 ~~~~~~~a~~v~eg~~i~~~--~~~~~~~~~~~~~~~~~~~itd~~~~~----s~~d~~~~~~~~~~~~~a~~~d~~i~~ 126 (272) T protein:vir:30 53 KWDYIGDAEDVAEGEAIPMT--QLGFKKTTMTIKKAGKGVEITDEAILS----GYGDPVGQAAKQIVEAIDHKVDADVLD 126 (272) T ss_pred EecCCCCcccccCCCccccc--ccccceEEEEeeeeeeeeeecHHHHhh----ccccHHHHHHHHHHHHHHHHHHHHHHH Confidence 00011223333433333333 345677777888877666777666533 347999999999999999999999998 Q ss_pred HHhhhheeeeeeccccceeeeeeccccchhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccc Q lcl|NC_015280. 248 TVYRGAKPGAQANVANAGVFDLDVDSNGRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYD 327 (455) Q Consensus 248 ~l~~vA~~~k~~~v~~~gv~Dl~~~~~gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~ 327 (455) .+...... ++.... + +-.-.+..++..+ -...++++++|++++.|......++. 
T Consensus 127 ~~~~a~~~-----~~~~~t--~----------d~i~da~~~l~~~---------~~~~~~~vv~p~~~~~L~k~~~~~~~ 180 (272) T protein:vir:30 127 ALSKSTQT-----VEATAT--V----------DGVSKALDIFNDE---------DDAETVIVMNPADASTLRLDAAKEWL 180 (272) T ss_pred Hhcccccc-----cccccC--H----------HHHHHHHHHHhcc---------CCCccEEEEcHHHHHHHHHhcccccc Confidence 76543321 111111 1 1112222333322 24567999999999999876554433 Q ss_pred cccccccccccccccCCceeEEEecCceEEEEeccccccCCcceEEEEEecCccccceeEEcccccccceeecCCccccc Q lcl|NC_015280. 328 SGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQYYVVGYKGTNAYDAGLFYCPYVPLQMYRAIGQDTFQP 407 (455) Q Consensus 328 ~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~KG~~~~daglfyaPYv~l~~~~~~Dp~s~qP 407 (455) .......+ ...+-..|++. |++|+++.+. |..=+++.-+|.- +++-..-+..... =|+.+++- T Consensus 181 ~~~~~~~~------~~~~g~ig~i~-G~~Vi~s~~~----p~~t~~~~~~~a~----~~~~~~~~~ve~~--r~~~~~~~ 243 (272) T protein:vir:30 181 GATEVGAN------RVVSGVYGEVL-GVQIVRSRKC----PKGTAYMVRKGAL----RIMLKRNTMVETD--RDITKAIN 243 (272) T ss_pred cccccccc------ccccccchhhc-CeeEEEcCCC----CcceEEEEcCCeE----EEEecCCceeeec--ccccccee Confidence 32221111 11112356774 5799999764 3222222222211 1121222222221 27888888 Q ss_pred eeeeeeecce-eecccccccccccccCchhhhhccch Q lcl|NC_015280. 
408 RIGFKTRYGM-VLNPFAKGLTALSDSDPQAAGNLNAN 443 (455) Q Consensus 408 ~~g~~tRY~l-~~nP~~~~~~~~~~~~~~~~~~~~~n 443 (455) .+-..-|||+ +.||-..-.- +-.+++ |- T Consensus 244 ~i~~~~~~~~~v~~~~~vv~~---t~~~a~-----~~ 272 (272) T protein:vir:30 244 QIVANKHYGVYLYKAEKAVKI---TLKDAA-----KK 272 (272) T ss_pred EEEEEEEEEEEEEcCCceEEE---Eecccc-----cC Confidence 8888889998 6777533111 111212 11 No 39 >protein:vir:9820 Length: 272 # NCBI annotation: putative major capsid/head protein # Family: family:all:522 # MgeID: mge:176 # MgeName: 315.4 # Cross-refs: genbank:acc:NP_795582;genbank:gi:28876339;genbank:GeneID:1257858 Probab=91.15 E-value=0.017 Score=30.25 Aligned_cols=260 Identities=11% Similarity=0.089 Sum_probs=119.8 Q ss_pred CCCcceeeeEEEeeecCCCCcccccccc-----------cccccccccccccccccccCcccCCCCCCCCcccccccccc Q lcl|NC_015280. 99 MTGPTGLIFAMRSRYTNQSGNEAFFDEP-----------DAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLAS 167 (455) Q Consensus 99 mTGPTGLIFAMRsrY~~qsG~EAlfnEa-----------~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~ 167 (455) |. ...++.+ ..+.+|- ...|+.-... +....+ ..+... ... T Consensus 1 MA-----------~~~T~~~-~~~iPev~s~~v~~~~~~~~~~~~~~~~--------------~~~~~g-~~G~tv-~iP 52 (272) T protein:vir:98 1 MA-----------VGTTKMA-QMLDPEVLADMIDAEVGKAIRFAPLAEV--------------DTTLEG-QPGTTL-TVP 52 (272) T ss_pred CC-----------Cccccch-heechHHHHHHHHHHHHHHhhhhccccc--------------cccccC-CCCCEE-EEE Confidence 11 0000100 1111110 0001000000 000000 000000 000 Q ss_pred cccchhhhhhcCCCCCCccccceeEEEEEEEEeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHH Q lcl|NC_015280. 168 SKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVR 247 (455) Q Consensus 168 ~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~ 247 (455) .=-....++.++++...+.. ..+.+..+++.|.++-.-++|=|++.+ -+-|.++++.+-|+..|+.+|+++|+. 
T Consensus 53 ~~~~~~~a~~v~eg~~i~~~--~~~~~~~~~~~~~~~~~~~itd~~~~~----s~~d~~~~~~~~~~~~~a~~~d~~i~~ 126 (272) T protein:vir:98 53 KWDYIGDAEDVAEGEAIPMT--QLGFKKTTMTIKKAGKGVEITDEAILS----GYGDPVGQAAKQIVEAIDHKVDADVLD 126 (272) T ss_pred EecCCCCcccccCCCccccc--ccccceEEEEeeeeeeeeeecHHHHhh----ccccHHHHHHHHHHHHHHHHHHHHHHH Confidence 00011223333433333333 345677777888877666777666533 347999999999999999999999998 Q ss_pred HHhhhheeeeeeccccceeeeeeccccchhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccc Q lcl|NC_015280. 248 TVYRGAKPGAQANVANAGVFDLDVDSNGRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYD 327 (455) Q Consensus 248 ~l~~vA~~~k~~~v~~~gv~Dl~~~~~gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~ 327 (455) .+...... ++.... + +-.-.+..++..+ -...++++++|++++.|......++. T Consensus 127 ~~~~a~~~-----~~~~~t--~----------d~i~da~~~l~~~---------~~~~~~~vv~p~~~~~L~k~~~~~~~ 180 (272) T protein:vir:98 127 ALSKSTQT-----VEATAT--V----------DGVSKALDIFNDE---------DDAETVIVMNPADASTLRLDAAKEWL 180 (272) T ss_pred Hhcccccc-----cccccC--H----------HHHHHHHHHHhcc---------CCCccEEEEcHHHHHHHHHhcccccc Confidence 76543321 111111 1 1112222333322 24567999999999999876554433 Q ss_pred cccccccccccccccCCceeEEEecCceEEEEeccccccCCcceEEEEEecCccccceeEEcccccccceeecCCccccc Q lcl|NC_015280. 328 SGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQYYVVGYKGTNAYDAGLFYCPYVPLQMYRAIGQDTFQP 407 (455) Q Consensus 328 ~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~KG~~~~daglfyaPYv~l~~~~~~Dp~s~qP 407 (455) .......+ ...+-..|++. |++|+++.+. |..=+++.-+|.- +++-..-+..... =|+.+++- T Consensus 181 ~~~~~~~~------~~~~g~ig~i~-G~~Vi~s~~~----p~~t~~~~~~~a~----~~~~~~~~~ve~~--r~~~~~~~ 243 (272) T protein:vir:98 181 GATEVGAN------RVVSGVYGEVL-GVQIVRSRKC----PKGTAYMVRKGAL----RIMLKRNTMVETD--RDITKAIN 243 (272) T ss_pred cccccccc------ccccccchhhc-CeeEEEcCCC----CcceEEEEcCCeE----EEEecCCceeeec--ccccccee Confidence 32221111 11112356774 5799999764 3222222222211 1121222222221 27888888 Q ss_pred eeeeeeecce-eecccccccccccccCchhhhhccch Q lcl|NC_015280. 
408 RIGFKTRYGM-VLNPFAKGLTALSDSDPQAAGNLNAN 443 (455) Q Consensus 408 ~~g~~tRY~l-~~nP~~~~~~~~~~~~~~~~~~~~~n 443 (455) .+-..-|||+ +.||-..-.- +-.+++ |- T Consensus 244 ~i~~~~~~~~~v~~~~~vv~~---t~~~a~-----~~ 272 (272) T protein:vir:98 244 QIVANKHYGVYLYKAEKAVKI---TLKDAA-----KK 272 (272) T ss_pred EEEEEEEEEEEEEcCCceEEE---Eecccc-----cC Confidence 8888889998 6777533111 111212 11 No 40 >protein:vir:8420 Length: 477 # NCBI annotation: gp15 # Family: family:all:21 # MgeID: mge:155 # MgeName: Omega # Cross-refs: genbank:acc:NP_818316;genbank:gi:29566752;genbank:GeneID:1260033 Probab=90.98 E-value=0.018 Score=30.14 Aligned_cols=352 Identities=14% Similarity=0.090 Sum_probs=133.3 Q ss_pred CcchHHHHHHhhHhhcCC-------C---CccccchhhHHHHHHHhh---hHHHHH------HHH-H------------- Q lcl|NC_015280. 1 MYNAENLQEKWAPVLNHE-------G---LNDIKDPYRKSVTAILLE---NQERAL------AEE-R------------- 47 (455) Q Consensus 1 m~~~~~~~~kw~~~l~~~-------~---~~~i~~~~~~~v~~~~~e---nq~~~~------~e~-~------------- 47 (455) -...+++.++=......+ . ..+.....++......+. ++.+.. .+. + T Consensus 67 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 146 (477) T protein:vir:84 67 DEQIRELESEIERSGKLEAETKTVRKATVEVNEALTYEKGNGQSYFRDLAMQTVGMADEPAKERLRRHMVDVESDKEIRK 146 (477) T ss_pred HHHHHHHHHHHHHhhcchhhhhhhcccccccccchhhhhhHHHHHHHHHHHHHhhhhhhHHHHHHHHHHhhhhhhhhHHH Confidence 011111111111000000 0 000000111110000110 000000 000 0 Q ss_pred --Hhhhhhhhchhhhccccccccccccccchhh--hHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCccccc Q lcl|NC_015280. 48 --AVLTEAPTNVGPINTPTTSSGAVAGFDPILI--SLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFF 123 (455) Q Consensus 48 --~~l~ea~~~~~~~~~~st~tg~i~~~~P~Lv--~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlf 123 (455) ....++ ..+..++++|. ...-|..+ .++...-+..+..+++++.||++.+|-+-=-|..- |. 
T Consensus 147 ~~~~~~~~-----~~~~~~~~~gg-~lv~~~~~~~~ii~~l~~~~~i~~~~~~~~~~~~~~~~~ip~~~~----~~---- 212 (477) T protein:vir:84 147 IAKVGEEY-----RDLDRNGGTGG-YAVPPLWMMNRFIELARAGRTYANLCPTEPLPGGTSSINIPKILT----GT---- 212 (477) T ss_pred HHHhhhhh-----ccccccCCCcc-eeeccchhHHHHHHHhhhcchHHHhhceeeecCCcceeEEEEEec----Cc---- Confidence 000110 01111111111 11223322 25555556777789999999999988653222100 00 Q ss_pred ccccccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeecc Q lcl|NC_015280. 124 DEPDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGR 203 (455) Q Consensus 124 nEa~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSR 203 (455) .. +....++... .....++...+++.++..+|.- T Consensus 213 ---~~----------------------------------------a~~~~Eg~~~---~~~~~~~s~~~f~~i~~~~~k~ 246 (477) T protein:vir:84 213 ---ST----------------------------------------AIQAADNAAL---TAPSAHEVDLTDGFVQANVKTI 246 (477) T ss_pred ---ce----------------------------------------eeeeccCccc---ccccccccccceeeEEEeeeeE Confidence 00 0000000000 0112445556677788888888 Q ss_pred ccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeee------cccc-ch Q lcl|NC_015280. 204 ALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLD------VDSN-GR 276 (455) Q Consensus 204 aLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~------~~~~-gr 276 (455) +-...+|-||.+|-. .|.++.|.+-|+..|..-+++.||.- .-.+....|++-.. .... .- T Consensus 247 ~~~~~iS~ell~ds~----~~l~~~i~~~l~~~~~~~~d~~~l~G--------~Gt~~~p~Gi~~~~~~~~~~~~~~~~t 314 (477) T protein:vir:84 247 AGQQGIAIQLLDQAA----VSVDEFVFRDLAADYANKLNVQVISG--------TGSNNQVVGVRATAGITQVTATSAGSA 314 (477) T ss_pred EeeeHHHHHHHhccc----hhHHHHHHHHHHHHHHHHHHHHHhcc--------CCCCCccceeeeccccccccccccccc Confidence 888889999999843 56899999999999999999887742 11111234444221 1110 11 Q ss_pred hhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHh----hcccccccccccccccccccccCCceeEEEec Q lcl|NC_015280. 
277 WSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAM----SGVLDYDSGISGAVGGIGEIDDTGNTFVGTLN 352 (455) Q Consensus 277 ~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~----sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~ 352 (455) |. ....+ +.-...++.-.....+-.+..+|.+|.....|.. .|..-+.|...+........+.-.....|+|. T Consensus 315 ~~--~~~~~-~~~i~~~~~~~~~~~~~~~~~~v~~~~~~~~l~~lkd~~G~~l~~~~~~~~~~~~~~~~~~~~~~~~~l~ 391 (477) T protein:vir:84 315 LE--KHQII-YQKIADAIQRVHTSRFLEPEVIVMHPRRWASFHAIFAGDDRPLIVPSGPGFNNLGVLTEVASQRVVGQMH 391 (477) T ss_pred hh--hHHHH-HHHHHHHHhhccccccCCccEEEEcHHHHHHHHHhhccCCCeeeecCcccccccccccccccccccchhc Confidence 11 11111 1111122221122233345567778876665543 22222222211111110111112223356674 Q ss_pred CceEEEEecccccc----CCcceEEEEEecCccccceeEEcccccccceeecCCcc--ccceeeeeeecce-----eecc Q lcl|NC_015280. 353 GRFKVYIDPYSANV----SDNQYYVVGYKGTNAYDAGLFYCPYVPLQMYRAIGQDT--FQPRIGFKTRYGM-----VLNP 421 (455) Q Consensus 353 ~~~~vy~D~y~~~~----s~~dY~~vG~KG~~~~daglfyaPYv~l~~~~~~Dp~s--~qP~~g~~tRY~l-----~~nP 421 (455) +++|+++++...+ .+..-+++|--.+.- . .+..+...++|.+ -...+.|.+ ||+ +-+| T Consensus 392 -G~pVv~s~~~p~~~~~~~d~~~i~~gd~~~~~------i---~~~~~~~~~~~~~~~~~~~~~~~v-~~~~~~~~~r~~ 460 (477) T protein:vir:84 392 -GLPVVTDPTLPTTLGTGTDQDVIHVLRASDLA------L---FESSVRMRALQETRAENLSVLLQV-YGYLAFTAARFP 460 (477) T ss_pred -ccceEecCcccccccccCCcceEEEEEeceEE------E---EeeceeEEeccccccccceeeeee-hhhhhhhhhccc Confidence 6799999765211 112234444332110 0 0111111223332 122333322 222 1234 Q ss_pred cccccccccccCchhhhhcc Q lcl|NC_015280. 422 FAKGLTALSDSDPQAAGNLN 441 (455) Q Consensus 422 ~~~~~~~~~~~~~~~~~~~~ 441 (455) -.-- ...|.+..+-... 
T Consensus 461 ~afv---~~t~~~~~~~~~~ 477 (477) T protein:vir:84 461 QSVV---EIGGTALTAPTFA 477 (477) T ss_pred cceE---EeecccccccccC Confidence 3221 1123332222211 No 41 >protein:vir:7771 Length: 330 # NCBI annotation: gp17 # Family: family:all:507 # MgeID: mge:149 # MgeName: Bxz2 # Cross-refs: genbank:acc:NP_817605;genbank:gi:29566035;genbank:GeneID:1259229 Probab=90.46 E-value=0.021 Score=29.81 Aligned_cols=297 Identities=13% Similarity=0.034 Sum_probs=121.5 Q ss_pred hhhhhhhchhhhccccccccccccccchh-hhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCccccccccc Q lcl|NC_015280. 49 VLTEAPTNVGPINTPTTSSGAVAGFDPIL-ISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPD 127 (455) Q Consensus 49 ~l~ea~~~~~~~~~~st~tg~i~~~~P~L-v~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~ 127 (455) |=.|- .......+|.++... .-|.+ -.+++++.++.+-.+++-+.||+++.--| -. .. .+.++ T Consensus 1 m~~~~---~~a~~~~~t~~~g~~-i~~~~~~~ii~~~~~~s~l~~~~~~~~~~~~~~~~-p~---~~--~~~~a------ 64 (330) T protein:vir:77 1 MAGST---VPSTQVALTGDFSAF-LTPEQSQDYFAEIEKTSIVQRIARKVPMGPTGISI-PH---WT--GAVSA------ 64 (330) T ss_pred Ccccc---cchhhccccCCCcce-echhHHHHHHHHHHhccchhhhcceeeccCCceEE-EE---Ec--CCcce------ Confidence 21110 000111112121111 11222 23566677778888899999998865322 11 10 00000 Q ss_pred ccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeecccccc Q lcl|NC_015280. 128 AQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRA 207 (455) Q Consensus 128 t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKA 207 (455) -..++ +..+++-..++++++...|..+-.. 
T Consensus 65 ------------------------------------------~~v~E--------g~~~~~~~~~f~~i~~~~~k~~~~~ 94 (330) T protein:vir:77 65 ------------------------------------------SWTGE--------AERKPITKGSFGKQELEPVKITTIF 94 (330) T ss_pred ------------------------------------------eEecC--------CCccccccceeeEEEEeEEEEEEee Confidence 00011 1223444455677777777777777 Q ss_pred ceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHH---------Hhhhheeeeeeccccceeeeeeccccchhh Q lcl|NC_015280. 208 DYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRT---------VYRGAKPGAQANVANAGVFDLDVDSNGRWS 278 (455) Q Consensus 208 EYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~---------l~~vA~~~k~~~v~~~gv~Dl~~~~~gr~~ 278 (455) .+|-||.+|- +.|.|+.|.+-|+..|...||+-||.- +...+... ..+......+..... .-. T Consensus 95 ~is~ell~ds----~~~~~~~i~~~l~~ai~~~~~~~~l~G~g~~~~~~g~~~~~~~~--~~~~~~~~~~~~~~~--~~~ 166 (330) T protein:vir:77 95 AESAEVVRLN----PLNYLNTMRTKIAEAIALKFDAAAIHGIDKPSAFKGYLAETTKV--VSLADTNLTTASGPQ--GNA 166 (330) T ss_pred hhhHHHHhcc----hHHHHHHHHHHHHHHHHHHHHHHhhcccCCCCcccccccccccc--ceeeccccccccccc--chh Confidence 8999999984 468999999999999999999988831 11111100 000111111111000 001 Q ss_pred HHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccc--cccccCCceeEEEecCceE Q lcl|NC_015280. 279 VEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGI--GEIDDTGNTFVGTLNGRFK 356 (455) Q Consensus 279 ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~--~~~d~t~~~~~G~l~~~~~ 356 (455) .+....++..+. + .-...+.+|++++....|.. +...- |..-+. ...........++|. +++ T Consensus 167 ~~~l~~~~~~~~-------~--~~~~~~~~vmn~~~~~~l~~---lkd~~---G~~l~~~~~~~~~~~~~~~~~l~-G~P 230 (330) T protein:vir:77 167 YLAVNNALSLLV-------N--SGKKWTGTLLDNVTEPILNT---AVDGN---GRPLFVESTYTEQVGAIREGRIL-GRP 230 (330) T ss_pred HHHHHHHHHhhh-------h--cCCCccEEEEcHHHHHHHHH---HhccC---CceeecCccccccccccCCceec-cee Confidence 112222222221 1 12344568999999988875 22111 100000 000001111234454 488 Q ss_pred EEEeccccccC----------CcceEEEEEecCccc----cceeEEcccccccceeecCCcc-c---cceeeeeeecce- Q lcl|NC_015280. 
357 VYIDPYSANVS----------DNQYYVVGYKGTNAY----DAGLFYCPYVPLQMYRAIGQDT-F---QPRIGFKTRYGM- 417 (455) Q Consensus 357 vy~D~y~~~~s----------~~dY~~vG~KG~~~~----daglfyaPYv~l~~~~~~Dp~s-~---qP~~g~~tRY~l- 417 (455) |++........ ++.++++|-.+..+. ++.+.+.- +-.......+.+ | +=.+=...|++. T Consensus 231 V~~~~~~p~~~~~~~~~~~~gd~s~~~i~~~~~~~i~~~~e~~~~~~~--~~~~~~~~~~~~~f~~~~~~~r~~~r~d~~ 308 (330) T protein:vir:77 231 TYVADNVVNGTVGNRVVGVMGDFSQVIWGQIGGLSFDVTDQATLDFGE--EQGGVWVPKLISLWQHNMVAVRCEAEFAFM 308 (330) T ss_pred eEEeccccCCCCCCccEEEEEecceEEEEEecCcEEEEeecceeeecc--cccccccccccchhhcCcEEEEEEEEeccE Confidence 88886542211 112222333322221 11111100 000000000001 1 222223446666 Q ss_pred eecccccc-cc-cccccCchhh Q lcl|NC_015280. 418 VLNPFAKG-LT-ALSDSDPQAA 437 (455) Q Consensus 418 ~~nP~~~~-~~-~~~~~~~~~~ 437 (455) +.+|-.-- .+ +....+|--. T Consensus 309 v~~~~a~~~i~~~~~~~~~~~~ 330 (330) T protein:vir:77 309 VNDKDAFVKLTDQVAGTDPEEE 330 (330) T ss_pred EecccceEEEEeccCCcCCCCC Confidence 55553221 11 1111111111 No 42 >protein:vir:81070 Length: 390 # NCBI annotation: p09 # Family: family:all:585 # MgeID: mge:1889 # MgeName: Xop411 # Cross-refs: genbank:acc:YP_001285679;genbank:gi:148727187;genbank:GeneID:5247115 Probab=90.26 E-value=0.022 Score=29.69 Aligned_cols=320 Identities=14% Similarity=0.032 Sum_probs=118.9 Q ss_pred Ccch-HHHHHHhhHhhcCCC---CccccchhhH----HHHHHHhhhHHHHHHHHHHhhhhhhhchhhhcccccccccccc Q lcl|NC_015280. 1 MYNA-ENLQEKWAPVLNHEG---LNDIKDPYRK----SVTAILLENQERALAEERAVLTEAPTNVGPINTPTTSSGAVAG 72 (455) Q Consensus 1 m~~~-~~~~~kw~~~l~~~~---~~~i~~~~~~----~v~~~~~enq~~~~~e~~~~l~ea~~~~~~~~~~st~tg~i~~ 72 (455) +... +.++++=.. ++... .++-...... .-+..++.+.......++.-+..+ ........++++. 
.- T Consensus 50 l~~~i~~~e~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~---~~~~~~~~~~~~g-~~ 124 (390) T protein:vir:81 50 LSAEVQAARQRVAE-LEGNGAGGDVQHVSVGDMFVASEQFQASAGRWNDRSARATMNIKAA---LNTASTDAAGSAG-AL 124 (390) T ss_pred HHHHHHHHHHHHHH-HHhcccccccccccchhhhhhhHHHHHHHHHHhhhhhhhhhHHHHH---HHhhccccccCCc-ce Confidence 0000 111111000 11100 0000000000 011111111100000000000000 0001111111111 11 Q ss_pred ccch-hhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCcccccccccccccccccccccccccccCcccCC Q lcl|NC_015280. 73 FDPI-LISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPDAQFSGTDGATPPTATTEKNPALIN 151 (455) Q Consensus 73 ~~P~-Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~t~fSg~~~~~~~~~~~~~~~~~~~ 151 (455) ..|. .-.++++.-+..+-.+++.+.||++++.-+.-.. +..+. T Consensus 125 ~~~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~----~~~~~-------------------------------- 168 (390) T protein:vir:81 125 TTPNRLPGFITPPDARLTVRDLIGSGRTDSALIEYVQET----GFVNN-------------------------------- 168 (390) T ss_pred echhhhHHHHHHHhhhhhhhhhcceeeccCCceEEEEEe----cCCcc-------------------------------- Confidence 1221 2234444445566788999999998774332211 00000 Q ss_pred CCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeeccccccceeHHHHHhHHHhhCCChhHHHHH Q lcl|NC_015280. 152 DATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKAIHGLDAESELAN 231 (455) Q Consensus 152 ~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELan 231 (455) +-..+++ ..+++-..++++++.+.|.-+-...+|-||.+|- . +.++.|.+ T Consensus 169 -----------------a~~v~Eg--------~~~~~~~~~~~~i~~~~~k~~~~~~is~ell~d~--~---~~~~~i~~ 218 (390) T protein:vir:81 169 -----------------AAIVAEG--------ALKPESSLKFAKKTDTTHVIAHTMKATRQILSDA--P---QLASYMNN 218 (390) T ss_pred -----------------eeeecCC--------cccccccceeeEEEEeeeEEEEeehhhHHHHHhH--H---HHHHHHHH Confidence 0000011 1122223334444444444445566799999984 2 47888888 Q ss_pred HHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeec------cccchhhHHHHHHHHHHHHHHHHHHHHHhcCCCc Q lcl|NC_015280. 
232 ILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDV------DSNGRWSVEKFKGLLFQIERDANAIAQETRRGKG 305 (455) Q Consensus 232 ILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~------~~~gr~~ve~~k~l~~qi~~ean~i~~~T~~~~g 305 (455) -|+..|...+|+-||.- .-.+-...|++.... ...+-..++....+++++. ..-... T Consensus 219 ~l~~~~~~~~d~a~l~G--------~g~~~~~~Gi~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~---------~~~~~~ 281 (390) T protein:vir:81 219 RLIRGLKVKEDAEILRG--------TGANDGLLGLIPQATTYAAPTTIAGATRVDQLRLAMLQAS---------LAEYNP 281 (390) T ss_pred HHHHHHHHHHHHHHHhc--------CCCCCcccceeecccccccccccccchhHHHHHHHHHhhc---------cccCCC Confidence 88888888888877632 111112333332111 1112223333444444433 223455 Q ss_pred cEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEEeccccccCCcceEEEEEecCccccce Q lcl|NC_015280. 306 NIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQYYVVGYKGTNAYDAG 385 (455) Q Consensus 306 n~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~KG~~~~dag 385 (455) +.+|+||.....|.. +... +|..=+ .+.+. ...++| .|++|++.... |..-+++|---. . T Consensus 282 ~~~v~~~~~~~~l~~---lkd~---~G~~l~---~~~~~-~~~~~l-~G~pv~~~~~~----p~~~~~~gd~~~-----~ 341 (390) T protein:vir:81 282 SGIVINPIDWAAIEL---AKDA---NNQYLI---GNARG-TLTPTL-WGLPVVATQAM----APGEFLVGAFDL-----A 341 (390) T ss_pred CEEEEcHHHHHHHHH---hhcC---CCceee---cCccc-ccCcee-cceeeEEcCCC----CCCcEEEEehhc-----e Confidence 668899999888874 2211 111000 01111 112345 36688887654 333344442100 0 Q ss_pred eEEccccccccee-ecC-Cc---cccceeeeeeecce-eeccccccccccc Q lcl|NC_015280. 386 LFYCPYVPLQMYR-AIG-QD---TFQPRIGFKTRYGM-VLNPFAKGLTALS 430 (455) Q Consensus 386 lfyaPYv~l~~~~-~~D-p~---s~qP~~g~~tRY~l-~~nP~~~~~~~~~ 430 (455) ++. +.-..... ..+ +. +-+=.+=...|++. +.+|-+--.-.+. 
T Consensus 342 ~~~--~~~~~~~v~~~~~~~~~~~~~v~~r~~~r~d~~v~~~~a~v~~t~a 390 (390) T protein:vir:81 342 AQI--FDQWDARVEIGYVGEDFQRNMITVLAEERLALVVYRPEALISGSFA 390 (390) T ss_pred EEE--EEecceEEEEecccchhhcCcEEEEEEEeeccEEecccceEEEEeC Confidence 000 10011110 111 11 12223334556666 5555444211111 No 43 >protein:vir:81160 Length: 371 # NCBI annotation: major capsid protein # Family: family:all:21 # MgeID: mge:1892 # MgeName: Geobacillus virus E2 # Cross-refs: genbank:acc:YP_001285811;genbank:gi:148747732;genbank:GeneID:5247203 Probab=90.16 E-value=0.022 Score=29.63 Aligned_cols=315 Identities=16% Similarity=0.103 Sum_probs=130.3 Q ss_pred CcchHHHHHHhhHhhcCCCCccccchhhHHH--HHHHhhhHHHH-------------HHHHHHhhhhhh-hchhhhcccc Q lcl|NC_015280. 1 MYNAENLQEKWAPVLNHEGLNDIKDPYRKSV--TAILLENQERA-------------LAEERAVLTEAP-TNVGPINTPT 64 (455) Q Consensus 1 m~~~~~~~~kw~~~l~~~~~~~i~~~~~~~v--~~~~~enq~~~-------------~~e~~~~l~ea~-~~~~~~~~~s 64 (455) ..++ .-.++|..+.. ||.+. ++++ ...+.|-+.+. ..+++..+.... .........+ T Consensus 22 ~~~~-~~~e~~~~~~~-----ei~~l-~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~l~~~~~~a~~~~ 94 (371) T protein:vir:81 22 LLAE-NKIEEAKKLKE-----EIVAL-QEKFDVAKELYEEQKQTIEDKEPLKPTVQVKENEVEAFVNHIRTRFRNAMSEG 94 (371) T ss_pred HhhH-HHHHHHHHHHH-----HHHHH-HHHHHHHHHHHHHHHHhhccccccccchhhHHHHHHHHHHHHHHHHHHhhccC Confidence 1111 11122322111 11110 0000 00111100000 011111111100 0001111122 Q ss_pred -ccccccccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCccccccccccccccccccccccccc Q lcl|NC_015280. 65 -TSSGAVAGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPDAQFSGTDGATPPTATT 143 (455) Q Consensus 65 -t~tg~i~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~t~fSg~~~~~~~~~~~ 143 (455) +++|++.--....-.+++.+.++....+++++.||++.++-+.-.+ ..+. 
.+ T Consensus 95 t~~~gg~~vP~~~~~~ii~~~~~~s~i~~~~~~~~~~~~~~~~~~~~--~~~~--~~----------------------- 147 (371) T protein:vir:81 95 SNQDGGYTVPQDIQTRINELRESKDALQNLITVEPVTTLSGSRVFKK--RSQQ--TG----------------------- 147 (371) T ss_pred CCccCceeecHhHHHHHHHHHHhhhhhhhhceeeeccCCceeEEEEe--ecCC--cc----------------------- Confidence 2223222111122245666667888889999999998887765443 1000 00 Q ss_pred ccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeeccccccceeHHHHHhHHHhhCC Q lcl|NC_015280. 144 EKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKAIHGL 223 (455) Q Consensus 144 ~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkAiHGL 223 (455) +-..++++..++.+...|.+..++..|..+ ...+|-||.+|-. . T Consensus 148 -------------------------a~~v~Eg~~~~~~~~~~f~~i~~~~~k~~~-------~~~iS~ell~ds~----~ 191 (371) T protein:vir:81 148 -------------------------FVEVAEGAAIGEKATPQFTLLQYQVKKYAG-------FFRVTNELLNDST----E 191 (371) T ss_pred -------------------------eeeeccccccccccccceeeEEeeeeEEEE-------eehhhHHHHhhhh----H Confidence 000011111121122235555555555554 4579999999853 4 Q ss_pred ChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeeccccchhhHHHHHHHHH-HHHHHHHHHHHHhcC Q lcl|NC_015280. 224 DAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDVDSNGRWSVEKFKGLLF-QIERDANAIAQETRR 302 (455) Q Consensus 224 DAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~~~~gr~~ve~~k~l~~-qi~~ean~i~~~T~~ 302 (455) |.++.|.+.|...|..-+|+.|+.-.-+.+ ..|+..++ ..+.++. .+. ..-+ T Consensus 192 ~l~~~i~~~l~~a~~~~~~~~i~~g~g~~~---------~~~~~~~~----------~i~~~~~~~l~--------~~~~ 244 (371) T protein:vir:81 192 AIVNTLVRWIGDESRVTRNGLIINVLNTKA---------KTAIADLD----------GLKQIINVQLD--------PVFR 244 (371) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHhhccccc---------ccccccHH----------HHHHHHHhhcc--------hhhh Confidence 678889999999998888888877543322 22222111 1222211 111 0111 Q ss_pred CCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEEeccccccCCcceEEEEEec---C Q lcl|NC_015280. 
303 GKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQYYVVGYKG---T 379 (455) Q Consensus 303 ~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~KG---~ 379 (455) ....+|+++.....|.. +... +|.. ....+.+. ...|+|. +++|++..+ +..|..+ . T Consensus 245 -~~a~~vmn~~~~~~L~~---lkd~---~g~~--l~~~~~~~-~~~~~l~-G~pV~~~~~---------~~~~~~~~~~~ 304 (371) T protein:vir:81 245 -STSSVIVNQDAFNWLDT---LKDQ---NGQY--LLQPSISS-PTGRQLL-GLPVVIVSN---------KVLANRVDGGT 304 (371) T ss_pred -cCCEEEEcHHHHHHHHH---hhcc---CCCe--eeecccCC-CCCceec-ceeEEEecc---------cccCccccccc Confidence 22357889988888875 2211 1111 11111111 1236664 567777633 2333321 1 Q ss_pred ccccceeEEccccc-------ccceeecCCc------cccceeeeeeecce-eeccccccccccccc Q lcl|NC_015280. 380 NAYDAGLFYCPYVP-------LQMYRAIGQD------TFQPRIGFKTRYGM-VLNPFAKGLTALSDS 432 (455) Q Consensus 380 ~~~daglfyaPYv~-------l~~~~~~Dp~------s~qP~~g~~tRY~l-~~nP~~~~~~~~~~~ 432 (455) ..-...++|+.+.. ..+...+++. ..|=.+-...|++. +.||-.--.-+.... T Consensus 305 ~~~~~~i~~Gd~~~~~~~~~~~~~~i~~~~~~~~~f~~~~v~~~~~~r~d~~~~~~~a~~~~~~~~A 371 (371) T protein:vir:81 305 GAQFAPIIVGDLKEAVVMFDRQRTEIMSSNVAMDAFETDATLWRAIERMDVKMRDDEAFVFGEVQLA 371 (371) T ss_pred cCCcceEEEEehhceEEEEeecceEEEEeccccchhhcCceEEEEEEeeccEEecccceEEEEEecC Confidence 11122344554322 1122222333 23445666678887 666644422222221 No 44 >protein:vir:3991 Length: 404 # NCBI annotation: major structural protein # Family: family:all:21 # MgeID: mge:319 # MgeName: BK5-T # Cross-refs: genbank:acc:NP_116499;genbank:gi:14251132;genbank:GeneID:921252 Probab=90.10 E-value=0.023 Score=29.60 Aligned_cols=328 Identities=11% Similarity=0.067 Sum_probs=128.3 Q ss_pred Ccc-------hHHHHHHhhHhhcCCCCccccchhhHHHHHHH----------hhhHHHHH-HHH-HHhhhhhhhch---- Q lcl|NC_015280. 1 MYN-------AENLQEKWAPVLNHEGLNDIKDPYRKSVTAIL----------LENQERAL-AEE-RAVLTEAPTNV---- 57 (455) Q Consensus 1 m~~-------~~~~~~kw~~~l~~~~~~~i~~~~~~~v~~~~----------~enq~~~~-~e~-~~~l~ea~~~~---- 57 (455) +.+ .+++.++|........ ++....++...... .+...... 
.++ +++........ T Consensus 32 ~~~~~~~~ee~~~~~~~~~~~~~~~~--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 109 (404) T protein:vir:39 32 LNDDNFSAEAMSELKNKRDNEKVRRD--ALREQLVEAQAEQVVNMREEEKGPLNKSEYELKDKFVKEFVNMVRNPMAFLN 109 (404) T ss_pred hccccccHHHHHHHHHHHHHHHHHHH--HHHHHHHHHHHHHHhccccccccccccchhhhHHHHHHHHHHHHhcchhhhh Confidence 221 1233334432111100 00000000000000 00000000 001 11111110000 Q ss_pred ----hhhccccccccccc---cccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCcccccccccccc Q lcl|NC_015280. 58 ----GPINTPTTSSGAVA---GFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPDAQF 130 (455) Q Consensus 58 ----~~~~~~st~tg~i~---~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~t~f 130 (455) ..+...++++|... .+.+.++ +.+-+.....+++.++||++++|-+--.| ..+..+ T Consensus 110 ~~e~~a~~~~t~~~gg~~iP~~~~~~ii---~~~~~~~~l~~~~~~~~~~~~~~~~~~~~--~~~~~~------------ 172 (404) T protein:vir:39 110 TVSSKTETSGSDSAAGLTIPQDIRTMIN---TLVRQYDSLQQYVRVESVSTSNGSRVYEK--WTDVTP------------ 172 (404) T ss_pred hhhhhhhhcccccCCceeccHHHHHHHH---HHHHhhhhHHhhcceeeccCCcceEEEEe--ecCCcc------------ Confidence 01111122222221 2223333 33445567788899999999987664333 100000 Q ss_pred cccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeecccccccee Q lcl|NC_015280. 131 SGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRADYS 210 (455) Q Consensus 131 Sg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYT 210 (455) .+...++++..++.....|.++.|++.|..+-. .+| T Consensus 173 -------------------------------------~a~~v~Eg~~~~~~~~~~f~~i~~~~~k~~~~~-------~iS 208 (404) T protein:vir:39 173 -------------------------------------LTVMDAEDGKIPDLDNPRLTIIKYLIKRYAGII-------TAT 208 (404) T ss_pred -------------------------------------ceeeecCccccccccccceeeEEeeeeeEEeee-------hhH Confidence 000011111112222334777777777776654 489 Q ss_pred HHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeeccccchhhHHHHHHHHHHHH Q lcl|NC_015280. 
211 VELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDVDSNGRWSVEKFKGLLFQIE 290 (455) Q Consensus 211 iELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~~~~gr~~ve~~k~l~~qi~ 290 (455) -||.+|-. .|.++.|.+-|+..|..-+|+.||.-.- .+....+..+++ ....+++... T Consensus 209 ~ell~ds~----~~l~~~i~~~l~~~~~~~~d~~il~g~g--------~~~~~~~~~~~~----------~i~~~~~~~~ 266 (404) T protein:vir:39 209 NTLLKDTA----ENILAWLSSWIAKKVVVTRNQAIIAAMG--------TVPKKPTIAKFD----------DVITMINTSV 266 (404) T ss_pred HHHHhhch----HHHHHHHHHHHHHHHHHHHHHHHHhccc--------ccccccccccHH----------HHHHHHHHhh Confidence 99999842 5778999999999999999998875321 122222333221 1112221111 Q ss_pred HHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEEeccccccCCcc Q lcl|NC_015280. 291 RDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQ 370 (455) Q Consensus 291 ~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~d 370 (455) .. . -+ ....+|+|+.....|... ... +|..- ...+.+.. ..++|. +++|++-.+ T Consensus 267 ~~---~----~~-~~a~~v~n~~~~~~L~~l---kd~---~G~~l--~~~~~~~~-~~~~l~-G~pV~~~~~-------- 320 (404) T protein:vir:39 267 DP---A----II-ATSSLLTNQSGLNKLALV---KTA---EGKYL--LEPDPTKP-NSYLIK-GKKVIVVAD-------- 320 (404) T ss_pred hh---h----hc-cCCEEEEcHHHHHHHHHh---hcc---CCcee--eccCcCCC-Ccceec-ceeEEEecc-------- Confidence 00 0 11 123588999998888852 211 11110 00111111 113453 445554211 Q ss_pred eEEEEEecCccccceeEEccccc-------ccceeecCCc------cccceeeeeeecce-eeccccccccccccc-Cch Q lcl|NC_015280. 371 YYVVGYKGTNAYDAGLFYCPYVP-------LQMYRAIGQD------TFQPRIGFKTRYGM-VLNPFAKGLTALSDS-DPQ 435 (455) Q Consensus 371 Y~~vG~KG~~~~daglfyaPYv~-------l~~~~~~Dp~------s~qP~~g~~tRY~l-~~nP~~~~~~~~~~~-~~~ 435 (455) ..++-.+.. +..+||.-+-. ......+++. ..|-.+-...|||. +.+|-.--.-++... 
..- T Consensus 321 -~~~~~~~~~--~~~~~~gd~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~r~~~r~d~~~~~~~a~~~~~~~~~a~~~ 397 (404) T protein:vir:39 321 -RWLPNSGST--VYPLYYGDMSQAITLFDRENMSLLPTNIGAGAFETDTTKIRVIDRFDVKTTDSEALVAGSFTAIADQV 397 (404) T ss_pred -cccCccCCC--ccEEEEEeccccEEEEeecceEEEEeccchhhhhhceeeEEEEeeeccEEecccceEEEEeeccccCC Confidence 111111111 11123222211 0011112222 34455667788988 677754421111111 111 Q ss_pred hhhhccc Q lcl|NC_015280. 436 AAGNLNA 442 (455) Q Consensus 436 ~~~~~~~ 442 (455) +...+|| T Consensus 398 ~~~~~~~ 404 (404) T protein:vir:39 398 GNFTAGK 404 (404) T ss_pred CCCCCCC Confidence 1224455 No 45 >protein:vir:3845 Length: 395 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:322 # MgeName: phi adh # Cross-refs: genbank:acc:NP_050151;swissprot:trembl:q9t1f6;genbank:gi:9633043;uniprot:Q9T1F6;genbank:GeneID:1262163 Probab=89.87 E-value=0.024 Score=29.48 Aligned_cols=332 Identities=14% Similarity=0.088 Sum_probs=120.8 Q ss_pred CcchHHHHHHhhHhhcCCCCcc-------ccc-------h-hhHHHHHHHhhhHHHHH----------------HHHHHh Q lcl|NC_015280. 1 MYNAENLQEKWAPVLNHEGLND-------IKD-------P-YRKSVTAILLENQERAL----------------AEERAV 49 (455) Q Consensus 1 m~~~~~~~~kw~~~l~~~~~~~-------i~~-------~-~~~~v~~~~~enq~~~~----------------~e~~~~ 49 (455) ....+.|.+|=.-....+...+ +.. . .+.+......+.+.... .+.+++ T Consensus 14 ~~~~~~l~e~~~~~~~~~~~~~~~~~~ee~~~l~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 93 (395) T protein:vir:38 14 GQKVQDLEDKRAQFAIDLGNDASSHSVDDINKLNASLKNAKMAQELAKSAYEDARANLNAEPVNKKPLPVKDGKPDAQAM 93 (395) T ss_pred HHHHHHHHHHHHHHHHHHhhhHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhccccccccchhhhhHHHHHH Confidence 1111111111111110000000 000 0 00000000000000000 000111 Q ss_pred hhhhhhchhhhccccc-cccccccccchhh--hHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCcccccccc Q lcl|NC_015280. 
50 LTEAPTNVGPINTPTT-SSGAVAGFDPILI--SLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEP 126 (455) Q Consensus 50 l~ea~~~~~~~~~~st-~tg~i~~~~P~Lv--~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa 126 (455) ...-........+.++ ++++-...=|.-+ .+++.+.+..+..+++.++||++++|-+-=.+ -.+ T Consensus 94 ~~~~~~~~~~~~~~~~~~~~~gg~~vP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~--~~~----------- 160 (395) T protein:vir:38 94 KNQFVKDFKNLVTSGTTGTGNAGLTIPEDIQLQIRTLTRSFTSLESLANVENVTTSHGSRVYEK--LAD----------- 160 (395) T ss_pred HHHHHHHHHHHHhhccCccCCCceecchhHhhHHHHHHHhhcchhhhcceeeccCCcceEEEEe--ecc----------- Confidence 1000001111112222 2222111122222 35555556667888999999999998652222 000 Q ss_pred cccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeeccccc Q lcl|NC_015280. 127 DAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALR 206 (455) Q Consensus 127 ~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLK 206 (455) ++. .+....+.+..++.....|.+..|+..|..+ . T Consensus 161 ---~~~-----------------------------------~a~~v~E~~~~~~~~~~~f~~v~~~~~k~~~-------~ 195 (395) T protein:vir:38 161 ---ITP-----------------------------------LKDLDDESALIGDNDDPELTVVKYLIHRYAG-------I 195 (395) T ss_pred ---CCc-----------------------------------cccccccccccccccccceeeEEeeeeeeEe-------e Confidence 000 0000111111122222335555555555554 4 Q ss_pred cceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeeccccchhhHHHHHHHH Q lcl|NC_015280. 
207 ADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDVDSNGRWSVEKFKGLL 286 (455) Q Consensus 207 AEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~~~~gr~~ve~~k~l~ 286 (455) ..+|-||.+|- +.|-++.|.+-|+..|..-||+.|+.-.= .+....+...++ ....++ T Consensus 196 ~~iS~ell~ds----~~~l~~~i~~~la~~~~~~~~~~il~g~g--------~~~~~~~~~~~~----------~i~~~~ 253 (395) T protein:vir:38 196 TTVTNTLLKDT----VDNIIQWLVNWAAKKDVVTRNAKILEVMG--------KAPKKPTISQFD----------NIKDLE 253 (395) T ss_pred hhhHHHHHhhh----HHHHHHHHHHHHHHHHHHHHHHHHhhccc--------ccccccccccHH----------HHHHHH Confidence 45999999983 34678888888888888888887775221 111112222111 122222 Q ss_pred HHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEEecccccc Q lcl|NC_015280. 287 FQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANV 366 (455) Q Consensus 287 ~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~ 366 (455) +. ....--+. ...+||++.....|.. +... +|..-+ ..+.+ ....++|. +++|++..... T Consensus 254 ~~-------~l~~~~~~-~a~~v~n~~~~~~L~~---lkd~---~G~~l~--~~~~~-~~~~~~l~-G~pV~~~~~~~-- 313 (395) T protein:vir:38 254 NN-------TLDPAIES-TSSFITNQSGYNILSK---VKDA---DGRYLM--QPDVT-SPDKYLID-GKPVIRIADKW-- 313 (395) T ss_pred HH-------hhhhhhcC-CCEEEEcHHHHHHHHH---hhcc---CCceee--ccCcC-CCCcceec-cceeEEecccc-- Confidence 21 11111122 2347899999888865 2211 111110 11111 11123453 55666543210 Q ss_pred CCcceEEEEEecCccccceeEEccccc-------ccceeecCC------ccccceeeeeeecce-eecccccccccc--c Q lcl|NC_015280. 367 SDNQYYVVGYKGTNAYDAGLFYCPYVP-------LQMYRAIGQ------DTFQPRIGFKTRYGM-VLNPFAKGLTAL--S 430 (455) Q Consensus 367 s~~dY~~vG~KG~~~~daglfyaPYv~-------l~~~~~~Dp------~s~qP~~g~~tRY~l-~~nP~~~~~~~~--~ 430 (455) ++-- .-+..+||+-+-. ......+++ ...+-.+-+..||+. +.+|-.--.-.. . 
T Consensus 314 -------~~~~---~~~~~i~~gd~~~~~~i~~~~~~~i~~~~~~~~~~~~~~~~~r~~~r~d~~~~~~~a~~~~~~~~~ 383 (395) T protein:vir:38 314 -------LPDV---SGSHPLYFGDLKQGITLFDRQQMQIDTTNVGAGSFEHDTTKLRFIDRFDVQLIDDGAFAAASFKTV 383 (395) T ss_pred -------cCcC---CCcceEEEEeccccEEEEEecceEEEEeccccchhhcCceEEEEEEeeccEEecccceEEEEeecc Confidence 0000 0011122221110 001111111 233445666778877 445544311111 1 Q ss_pred ccCchhhhhccc Q lcl|NC_015280. 431 DSDPQAAGNLNA 442 (455) Q Consensus 431 ~~~~~~~~~~~~ 442 (455) .-++......|| T Consensus 384 ~~~~~~~~~~~~ 395 (395) T protein:vir:38 384 ANQAQGTAGTGK 395 (395) T ss_pred cCCCCCccCCCC Confidence 223323333455 No 46 >protein:vir:7409 Length: 408 # NCBI annotation: major structural protein # Family: family:all:21 # MgeID: mge:146 # MgeName: P335 # Cross-refs: genbank:acc:NP_839926;genbank:gi:30089896;genbank:GeneID:1260683 Probab=89.30 E-value=0.027 Score=29.17 Aligned_cols=334 Identities=10% Similarity=0.034 Sum_probs=122.7 Q ss_pred CcchHHHHHHhhHhhcCCCCccccchhhHHHHHHHh----------hhHHHHH-HHH-HHhhhhhhhc--------hhhh Q lcl|NC_015280. 1 MYNAENLQEKWAPVLNHEGLNDIKDPYRKSVTAILL----------ENQERAL-AEE-RAVLTEAPTN--------VGPI 60 (455) Q Consensus 1 m~~~~~~~~kw~~~l~~~~~~~i~~~~~~~v~~~~~----------enq~~~~-~e~-~~~l~ea~~~--------~~~~ 60 (455) +-.-+++..++..+...-. ++....+........ +...+.. .+. +++....... ...+ T Consensus 39 ~e~i~e~~~~~~~~~~~~~--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~ 116 (408) T protein:vir:74 39 AEAMSELKNKRDNEKVRRD--ALREQLVEAQAEQVVNMREEEKGPLNKSENELKDKFVKDFVNMVRNPMAFLNTVSSKTE 116 (408) T ss_pred HHHHHHHHHHHHHHHHHHH--HHHHHHHHHHHHHHhhccccccccccchhhhhHHHHHHHHHHHHhcchhhhhhhhhhhh Confidence 1112333444433211100 000000000000000 0000000 011 1111111000 0011 Q ss_pred ccccccccccccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCcccccccccccccccccccccc Q lcl|NC_015280. 
61 NTPTTSSGAVAGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPDAQFSGTDGATPPT 140 (455) Q Consensus 61 ~~~st~tg~i~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~t~fSg~~~~~~~~ 140 (455) ...++..|.+.--....-.+++.+.+.....++++++||++.+|-+--.+ ..+. . . T Consensus 117 ~~~~~~~gg~~vP~~~~~~Ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~--~~~~--------------~---~----- 172 (408) T protein:vir:74 117 TSGSDSAAGLTIPQDIRTMINTLVRQYDSLQQYVRVESVSTSSGSRVYEK--WTDV--------------T---P----- 172 (408) T ss_pred cccccCCCceeechhHhhHHHHHHhhhcchhhhcceeeccCCcceEEEEe--ecCC--------------c---c----- Confidence 11222222222111111234444445566788999999999887664333 1000 0 0 Q ss_pred cccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccce-eEEEEEEEEeeccccccceeHHHHHhHHH Q lcl|NC_015280. 141 ATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMA-FSIDKIAVEAKGRALRADYSVELAQDLKA 219 (455) Q Consensus 141 ~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMa-FsIEK~tVtAKSRaLKAEYTiELAQDLkA 219 (455) .+...++... .++.+ .+++++++..+.-+-...+|-||.+|- T Consensus 173 ---------------------------~~~~v~E~~~--------~~~~~~~~~~~i~~~~~k~~~~~~iS~ell~ds-- 215 (408) T protein:vir:74 173 ---------------------------LKAMDEEDGK--------IPDLDNPRLTIIKYLIKRYAGIITATNTLLKDT-- 215 (408) T ss_pred ---------------------------cccccccccc--------cccccccceeeEEeeeeeEEeeehhHHHHHhhc-- Confidence 0000011111 12221 334444445555555566999999983 Q ss_pred hhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeeccccchhhHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015280. 220 IHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDVDSNGRWSVEKFKGLLFQIERDANAIAQE 299 (455) Q Consensus 220 iHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~~~~gr~~ve~~k~l~~qi~~ean~i~~~ 299 (455) .+|.++.|.+-|+..|..-+|+.||.- .-.+....++.+++ ....+++ ..... 
T Consensus 216 --~~~l~~~i~~~l~~~~~~~~d~~il~G--------~G~~~~~~~~~~~~----------~i~~~~~-------~~l~~ 268 (408) T protein:vir:74 216 --AENILAWLSSWIAKKVVVTRNQAIIAA--------MGTVPKKPTIANFD----------DVITMIN-------TSVDP 268 (408) T ss_pred --hHHHHHHHHHHHHHHHHHHHHHHHhhc--------ccccccccccccHH----------HHHHHHH-------Hhhhh Confidence 357888999999999998888887742 11222222333221 1111111 11111 Q ss_pred hcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEEecc--cccc-CCcceEEEE- Q lcl|NC_015280. 300 TRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPY--SANV-SDNQYYVVG- 375 (455) Q Consensus 300 T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y--~~~~-s~~dY~~vG- 375 (455) .-+..+ .+||++.....|.. +... +|. .....+.++. ..++| .+++|++-.. .... ++..-+++| T Consensus 269 ~~~~~a-~~v~n~~~~~~l~~---lkd~---~G~--~l~~~~~~~~-~~~~l-~G~pV~~~~~~~~~~~~~~~~~i~~gd 337 (408) T protein:vir:74 269 AIIATS-SLLTNQSGLNKLAL---VKTA---EGK--YLLEPDPTKP-NSYLI-KGKQVIVVADRWLPNSGSTVYPLYYGD 337 (408) T ss_pred hhcCCC-EEEEcHHHHHHHHH---hhcC---CCc--eEeccCcCCC-CCcee-cceeeEEecCcccccccCCcceEEEEe Confidence 112222 46789999988885 2211 111 1111122221 12455 3556665221 1000 001112221 Q ss_pred EecCc----cccceeEEcccccccceeecCCccccceeeeeeecce-eecccccc------cccccccCchhhhhcc Q lcl|NC_015280. 376 YKGTN----AYDAGLFYCPYVPLQMYRAIGQDTFQPRIGFKTRYGM-VLNPFAKG------LTALSDSDPQAAGNLN 441 (455) Q Consensus 376 ~KG~~----~~daglfyaPYv~l~~~~~~Dp~s~qP~~g~~tRY~l-~~nP~~~~------~~~~~~~~~~~~~~~~ 441 (455) ++..- --.-.+=..||.- .+-...+-.+-+..||+. 
+.+|-.-- .+.....-+..+-++- T Consensus 338 ~~~~~~~~~~~~~~i~~~~~~~------~~f~~~~~~~r~~~r~d~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~ 408 (408) T protein:vir:74 338 MSQAITLFDRENMSLLPTNIGA------GAFETDTTKIRVIDRFDVKATDSEALVAGSFTAIADQVGNFKTTTSTAV 408 (408) T ss_pred hhccEEEEEecceEEEEecccc------chhhcceeeEEEEEeeCcEEecccceEEEEeecccCCCCCCCCCccccC Confidence 01000 0000111112110 011345666777788887 66664321 1111111111110110 No 47 >protein:vir:9574 Length: 300 # NCBI annotation: gp40 # Family: family:all:966 # MgeID: mge:171 # MgeName: SM1 # Cross-refs: genbank:acc:NP_862879;genbank:gi:32469471;genbank:GeneID:1461316 Probab=89.27 E-value=0.027 Score=29.15 Aligned_cols=283 Identities=10% Similarity=0.024 Sum_probs=121.9 Q ss_pred ccccccccccccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCcccccccccccccccccccccc Q lcl|NC_015280. 61 NTPTTSSGAVAGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPDAQFSGTDGATPPT 140 (455) Q Consensus 61 ~~~st~tg~i~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~t~fSg~~~~~~~~ 140 (455) -+++|+++...--....-.++.++.+..+..+++.+.||++-..- |-.. .. +.+| T Consensus 1 ma~~t~~~G~lip~~~~~~ii~~l~~~s~i~~l~~~~~~~~~~~~-~p~~---~~--~~~a------------------- 55 (300) T protein:vir:95 1 MSEAQLSKGNLFNPELVTKVINKVKGHSSIAKLSPQKPIPFNGQR-EFVF---DF--DSDI------------------- 55 (300) T ss_pred CcccccCCcceechhhHHHHHHHHHhhhhhhhhcceeeccCCceE-EEEE---ec--Ccce------------------- Confidence 455665554432222233334444455566789999999764322 2211 00 0000 Q ss_pred cccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeeccccccceeHHHHHhHHHh Q lcl|NC_015280. 141 ATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKAI 220 (455) Q Consensus 141 ~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkAi 220 (455) -..++ +...++...+++.++..+|.=+-...+|-||.+.... 
T Consensus 56 -----------------------------~wv~E--------g~~~~~s~~~f~~v~l~~~k~~~~~~iS~ell~~~~d- 97 (300) T protein:vir:95 56 -----------------------------DIVAE--------NGKKTHGGVSLDPVTIVPLKVEYGARVSDEFLHASEE- 97 (300) T ss_pred -----------------------------EEeeC--------CcccccccccceeeEeeeEEEEEeehhhHHHhccCCC- Confidence 00111 1123344445556666666556667788898753222 Q ss_pred hCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeecc----ccceeeeeeccccchhhHHHHHHHHHHHHHHHHHH Q lcl|NC_015280. 221 HGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANV----ANAGVFDLDVDSNGRWSVEKFKGLLFQIERDANAI 296 (455) Q Consensus 221 HGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v----~~~gv~Dl~~~~~gr~~ve~~k~l~~qi~~ean~i 296 (455) ..+|-+++|.+-|...|...+++.++.-... ..|.-.++ ...+.........+--.-+....++..+. T Consensus 98 ~~~~l~~~i~~~l~~aia~~~d~~~l~G~~~--~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~~~------ 169 (300) T protein:vir:95 98 AKVDMLTDFVEGFSKKLARGLDIMSIHGINP--RTKQASTIIGDNCFDKKVTQTVPFKDTNPDESMEDAVGMID------ 169 (300) T ss_pred CHHHHHHHHHHHHHHHHHHHHHHhhhhcccC--CCCCCcccccccccccccceeecccccchHHHHHHHHHHhh------ Confidence 2356778888888888888888888754311 01110000 00111111111111111111222222221 Q ss_pred HHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEEecccccc--CCcceEEE Q lcl|NC_015280. 297 AQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANV--SDNQYYVV 374 (455) Q Consensus 297 ~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~--s~~dY~~v 374 (455) .-.++.+-+|++|+....|.. +.. .+|.. ....+.++ -..|+|. +++|+++.+.... 
.+.+.+++ T Consensus 170 ---~~~~~~~~~vmn~~~~~~L~~---lkd---~~G~~--i~~~~~~~-~~~~~l~-G~Pv~~s~~v~~~~~~~~~~~~~ 236 (300) T protein:vir:95 170 ---GSERDITGAILDPIFTTALSK---MKN---AEGGK--LYPELAWG-GVPDAIN-GLAVDKNRTVSYSQTDPKNTAIV 236 (300) T ss_pred ---hcCCCccEEEECHHHHHHHHH---hhc---cCCCe--eccCcccc-CCCceec-ceeeEEecCCCCCCCCCccEEEE Confidence 124556668899999888865 221 11111 11111111 1246674 4688887654211 12223333 Q ss_pred EEecCccccceeEEcccccccce--eecCCcc-----c---cceeeeeeecce-eecccccccccccccCchhhhhccch Q lcl|NC_015280. 375 GYKGTNAYDAGLFYCPYVPLQMY--RAIGQDT-----F---QPRIGFKTRYGM-VLNPFAKGLTALSDSDPQAAGNLNAN 443 (455) Q Consensus 375 G~KG~~~~daglfyaPYv~l~~~--~~~Dp~s-----~---qP~~g~~tRY~l-~~nP~~~~~~~~~~~~~~~~~~~~~n 443 (455) |= +..+++|...-...+. .-.|+++ | |=.+-+..|+|. +.||-+--. +.+ T Consensus 237 GD-----f~~~~~~~~~~~~~~~v~~~~~~d~~~~~~f~~~~v~~r~~~r~d~~v~~~~a~~~--l~~------------ 297 (300) T protein:vir:95 237 GD-----FETMFKWGYAKEVPMEIIKYGDPDNSGRDLKGYNQIYIRCEAYIGWGIMDAASFAR--IVK------------ 297 (300) T ss_pred ee-----ccceEEEEEecccEEEEeeccCCCCcchhhhhcCcEEEEEEEeecceeecccceEE--Eec------------ Confidence 31 0111223222222221 1123332 2 133334558886 556644421 111 Q ss_pred hhhhhhhhhc Q lcl|NC_015280. 444 AYYRRVRVAN 453 (455) Q Consensus 444 ~y~r~~~v~~ 453 (455) +.| T Consensus 298 -------~~g 300 (300) T protein:vir:95 298 -------TGG 300 (300) T ss_pred -------CCC Confidence 111 No 48 >protein:vir:78223 Length: 333 # NCBI annotation: Putative major head protein # Family: family:all:966 # MgeID: mge:1849 # MgeName: Bethlehem # Cross-refs: genbank:acc:YP_001491666;genbank:gi:157786490;genbank:GeneID:5625701 Probab=88.77 E-value=0.03 Score=28.91 Aligned_cols=309 Identities=13% Similarity=0.045 Sum_probs=115.7 Q ss_pred HHhhhhhhhchhhh--ccccccccccccccc-hhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCccccc Q lcl|NC_015280. 
47 RAVLTEAPTNVGPI--NTPTTSSGAVAGFDP-ILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFF 123 (455) Q Consensus 47 ~~~l~ea~~~~~~~--~~~st~tg~i~~~~P-~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlf 123 (455) -+.|+|-..+.... .+..++.++. ..-+ ..-.+++.+.+..+..+++-+.||++..--|.-.. . T Consensus 1 ~a~l~el~~~~~~~~~~g~~~~~~~~-liP~~~~~~ii~~l~~~s~l~~~~~~~~~~~~~~~~p~~~--~---------- 67 (333) T protein:vir:78 1 MATLNELLPNSAGSNHQGRLAHVPSD-LLPKEIVGPIFDKAQESSLVLRMGEQIPISYGETIIPTTV--K---------- 67 (333) T ss_pred CchhHHhhhhcccccccCceecCCcc-ccchhHHHHHHHHHHhhchhhhhcceeeccCCceEEEEEe--C---------- Confidence 22233321111100 0000111111 1111 12235555556777888999999886433322221 0 Q ss_pred ccccccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeecc Q lcl|NC_015280. 124 DEPDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGR 203 (455) Q Consensus 124 nEa~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSR 203 (455) .+...|-+. +......+++.... ....|.+..++..|..+ T Consensus 68 -~~~a~~v~e---------------------------------g~~~~~~e~~~~~~-~~~~f~~i~l~~~kl~~----- 107 (333) T protein:vir:78 68 -RPEVGQVGV---------------------------------GTSNEQREGGLKPL-SGTAWDTRSVSPIKLAT----- 107 (333) T ss_pred -CceeEeecC---------------------------------cccccccccccccc-cccceeEEEEeeEEEEE----- Confidence 000001000 00000011111111 12235555555555544 Q ss_pred ccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeecc-ccceeeeeec-cccchhhHHH Q lcl|NC_015280. 204 ALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANV-ANAGVFDLDV-DSNGRWSVEK 281 (455) Q Consensus 204 aLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v-~~~gv~Dl~~-~~~gr~~ve~ 281 (455) -...|-||.+|-. .|.+++|.+.|+..|...|+..+|.---.....+- .++ ...++..... ...+-. 
T Consensus 108 --~~~is~ell~~s~----~~~~~~i~~~la~ai~~~~d~~~l~G~g~~~~~~~-~g~~~~~~~~~~~~~~~~~~~---- 176 (333) T protein:vir:78 108 --IVTVSEEFARMNP----SGLYTKLQGDLAYAIGRGIDLAVFHGKSPLTGSAL-QGIDTDNVIANTTNVDYLQET---- 176 (333) T ss_pred --eehhhHHHHhcCH----HHHHHHHHHHHHHHHHHHHHHHHhcccCCCCCccc-ccccccccccccccccccccc---- Confidence 3457778887754 47899999999999999999988742211100000 000 0001100000 000000 Q ss_pred HHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEEec Q lcl|NC_015280. 282 FKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDP 361 (455) Q Consensus 282 ~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~ 361 (455) ....|.-...+-.....-....++.+|++|+-...|.....+... +|..- ...+..+ .-.|+|. +++|+++. T Consensus 177 -~~~~~~~i~~~~~~~~~~~~~~~~~~vmn~~~~~~L~~~~~~~d~---~G~~i--~~~~~~~-~~~~~l~-G~Pv~~~~ 248 (333) T protein:vir:78 177 -GDPLLDRLLDGYDLVSANTDVEFNGWAVDPRFRAHLLRAQAYRDA---NGNVD--PSRINLA-AQTGDVL-GLPAQFGR 248 (333) T ss_pred -cchhHHHHHHHHHhhccccccCceEEEEcchHHHHHHHHhhhcCC---CCcee--ecCcccc-CCCceee-ceeeEEcc Confidence 001111111111111222456677788899887777654333211 11100 0111111 1125665 46888875 Q ss_pred ccccc-----CCcceEEE--------EEecCccccceeEEcccccccceeecCCcc-cc-cee--eeeeecce-eecccc Q lcl|NC_015280. 362 YSANV-----SDNQYYVV--------GYKGTNAYDAGLFYCPYVPLQMYRAIGQDT-FQ-PRI--GFKTRYGM-VLNPFA 423 (455) Q Consensus 362 y~~~~-----s~~dY~~v--------G~KG~~~~daglfyaPYv~l~~~~~~Dp~s-~q-P~~--g~~tRY~l-~~nP~~ 423 (455) +...+ .+...+++ |..+..+.+ ..+|.-...... .+.+ || -.+ =...|++. +.+|-. T Consensus 249 ~i~~~~~~~~~~~~~~~~gD~~~~~~g~~~~~~i~----~~~~~~~~~~~~-~~~~~~~~~~v~~r~~~r~d~~v~~~~a 323 (333) T protein:vir:78 249 AVGGDLGAAVDSKTRIIGGDFSQLKFGFADEIRIK----MSDTATLTDSGS-ATVSMWQTNQIAILIEVTFGWLLGDKQA 323 (333) T ss_pred ccCCCccccCCCccEEEEEecccEEEEEeeccEEE----Eecccccccccc-ceeehhhcCcEEEEEEEEEccEEecccc Confidence 43211 01112333 333222211 122210000000 0000 11 112 23457776 566622 Q ss_pred ccccccccc-Cc Q lcl|NC_015280. 
424 KGLTALSDS-DP 434 (455) Q Consensus 424 ~~~~~~~~~-~~ 434 (455) - ..+.+. .| T Consensus 324 ~--~~l~~~~a~ 333 (333) T protein:vir:78 324 F--VKFVDDEQP 333 (333) T ss_pred e--EEEeccCCC Confidence 2 122221 22 No 49 >protein:vir:107593 Length: 392 # NCBI annotation: major capsid protein, HK97 family # Family: family:all:21 # MgeID: mge:1491 # MgeName: Gamma # Cross-refs: genbank:acc:YP_338188;genbank:gi:77020144;genbank:GeneID:3703724 Probab=88.57 E-value=0.032 Score=28.82 Aligned_cols=320 Identities=17% Similarity=0.112 Sum_probs=134.8 Q ss_pred CcchHHHHHHhhHhh----------cC----C-CCccccchhhHHHHHHHhhhHHHHHHHHHHhhhhhhhchhhhccccc Q lcl|NC_015280. 1 MYNAENLQEKWAPVL----------NH----E-GLNDIKDPYRKSVTAILLENQERALAEERAVLTEAPTNVGPINTPTT 65 (455) Q Consensus 1 m~~~~~~~~kw~~~l----------~~----~-~~~~i~~~~~~~v~~~~~enq~~~~~e~~~~l~ea~~~~~~~~~~st 65 (455) +...+.|+++....- .. + ...+....+|+..... |.+++.. .++++++..... .......| T Consensus 35 ~~e~~~l~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-l~~~~~~-~~~~~~~~~~~~--~~~~~~~t 110 (392) T protein:vir:10 35 MEEVRSLQKKIDLQRSLDEAETEERNNGREVETRNVDGEMEYRDVFMKA-LRNKPLN-AEEREFLEDDLE--QRAMSGLT 110 (392) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHhhccccccccCccchHHHHHHHHHH-Hhccccc-HHHHHHHhhhhh--hhhccccc Confidence 222233333332110 00 0 0111222233333322 2222210 122333222110 01111122 Q ss_pred c-ccccc---cccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCccccccccccccccccccccccc Q lcl|NC_015280. 66 S-SGAVA---GFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPDAQFSGTDGATPPTA 141 (455) Q Consensus 66 ~-tg~i~---~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~t~fSg~~~~~~~~~ 141 (455) + .|... .+.+.++.+. .....-.+++++.||++++|-+.-.+ .. 
++.+ T Consensus 111 ~~~gg~~vP~~~~~~ii~~~---~~~s~l~~~~~~~~~~~~~~~~~~~~--~~--~~~~--------------------- 162 (392) T protein:vir:10 111 GEDGGLVIPQDIQTQINELA---RSFDALEQYVTVEPVRTRSGSRVLEK--NS--DMIP--------------------- 162 (392) T ss_pred cCCCceecchhHHHHHHHHH---HhhhhhhhhceeeeccCCceeEEEEe--ec--CCcc--------------------- Confidence 2 12211 2233344444 44555668999999999887543222 10 0000 Q ss_pred ccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeeccccccceeHHHHHhHHHhh Q lcl|NC_015280. 142 TTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKAIH 221 (455) Q Consensus 142 ~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkAiH 221 (455) +-..+++....+.....|.++.|...|. +-...+|-||.+|- T Consensus 163 ---------------------------a~~v~E~~~~~~~~~~~~~~v~l~~~k~-------~~~~~iS~ell~ds---- 204 (392) T protein:vir:10 163 ---------------------------FAEITEMGEIPETDNPKFSNVQYAVKDR-------AGILPLSRSLLQDS---- 204 (392) T ss_pred ---------------------------ceeecccccccccccccceeEEeeeeeE-------EEeehhhHHHHhhh---- Confidence 0000001111111122355555555554 44556899999984 Q ss_pred CCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeeccccchhhHHHHHHHH-HHHHHHHHHHHHHh Q lcl|NC_015280. 222 GLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDVDSNGRWSVEKFKGLL-FQIERDANAIAQET 300 (455) Q Consensus 222 GLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~~~~gr~~ve~~k~l~-~qi~~ean~i~~~T 300 (455) ..|.+++|.+-|...|..-++.-|+.-.-+.. ..++..+ +....++ +.+... T Consensus 205 ~~~l~~~i~~~l~~~i~~~~d~~~~~g~g~~~---------~~~~~~~----------d~i~~~~~~~l~~~-------- 257 (392) T protein:vir:10 205 DQNILKYVTKWLGKKSKVTRNVLILGVIEKLT---------KQAIKSL----------DDIKDVLNVKLDPA-------- 257 (392) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHhhcccccc---------ccCccCH----------HHHHHHHHHhhhhh-------- Confidence 25678899999999999999888875332211 2222211 1222222 222111 Q ss_pred cCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEEeccccccCCcceEEEEEecCc Q lcl|NC_015280. 
301 RRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQYYVVGYKGTN 380 (455) Q Consensus 301 ~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~KG~~ 380 (455) -+ ..-..|+|+.....|... ..+ +|..- ...+.+ ....++|.|...|+++.. ..++.+|.. T Consensus 258 ~~-~~a~~vm~~~~~~~L~~l---kd~---~G~~l--~~~~~~-~~~~~tllG~~~v~~~~~---------~~~~~~~~~ 318 (392) T protein:vir:10 258 IS-PNAILLTNQDGFNYLDKL---KDK---DGKYI--LQSDPT-QKNKKLFAGTNPVVVVSN---------RFLKSKGTT 318 (392) T ss_pred hc-cCCEEEEcHHHHHHHHHh---hcc---CCCeE--eecCcc-CCccccccCcccEEEecc---------cccCCCccc Confidence 11 223368899998888752 211 11111 111111 123456777777776531 122223333 Q ss_pred cccceeEEccccc-------ccceeecCCc------cccceeeeeeecce-eecccccccccc----cccCchh Q lcl|NC_015280. 381 AYDAGLFYCPYVP-------LQMYRAIGQD------TFQPRIGFKTRYGM-VLNPFAKGLTAL----SDSDPQA 436 (455) Q Consensus 381 ~~daglfyaPYv~-------l~~~~~~Dp~------s~qP~~g~~tRY~l-~~nP~~~~~~~~----~~~~~~~ 436 (455) .-+..++|+.+-. ..+.-.++|. +.|=.+-...|+|. +.+|-.--.-+. ..-.|+| T Consensus 319 ~~~~~~~~gdfs~~~~i~~~~~~~~~~~~~~~~~f~~~~~~~r~~~r~d~~v~~~~a~~~l~~~~~a~~~~~~~ 392 (392) T protein:vir:10 319 AKKAPLIIGDLKEAIVLFKREDMELASTDVGGKAFTRNTLDLRAIQRDDVQMWDNEAAVYGEIDLSAPVEQPQG 392 (392) T ss_pred CCceEEEEEehhceEEEEeecceEEEEeccccchhhcCceEEEEEEeeccEEecccceEEEEecccccccCCCC Confidence 3334445544321 1111122332 34556777788887 566654422111 1223333 No 50 >protein:vir:102873 Length: 392 # NCBI annotation: major capsid protein, HK97 family # Family: family:all:21 # MgeID: mge:1492 # MgeName: Cherry # Cross-refs: genbank:acc:YP_338137;genbank:gi:77020198;genbank:GeneID:3703782 Probab=88.57 E-value=0.032 Score=28.82 Aligned_cols=320 Identities=17% Similarity=0.112 Sum_probs=134.8 Q ss_pred CcchHHHHHHhhHhh----------cC----C-CCccccchhhHHHHHHHhhhHHHHHHHHHHhhhhhhhchhhhccccc Q lcl|NC_015280. 
1 MYNAENLQEKWAPVL----------NH----E-GLNDIKDPYRKSVTAILLENQERALAEERAVLTEAPTNVGPINTPTT 65 (455) Q Consensus 1 m~~~~~~~~kw~~~l----------~~----~-~~~~i~~~~~~~v~~~~~enq~~~~~e~~~~l~ea~~~~~~~~~~st 65 (455) +...+.|+++....- .. + ...+....+|+..... |.+++.. .++++++..... .......| T Consensus 35 ~~e~~~l~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-l~~~~~~-~~~~~~~~~~~~--~~~~~~~t 110 (392) T protein:vir:10 35 MEEVRSLQKKIDLQRSLDEAETEERNNGREVETRNVDGEMEYRDVFMKA-LRNKPLN-AEEREFLEDDLE--QRAMSGLT 110 (392) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHhhccccccccCccchHHHHHHHHHH-Hhccccc-HHHHHHHhhhhh--hhhccccc Confidence 222233333332110 00 0 0111222233333322 2222210 122333222110 01111122 Q ss_pred c-ccccc---cccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCccccccccccccccccccccccc Q lcl|NC_015280. 66 S-SGAVA---GFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPDAQFSGTDGATPPTA 141 (455) Q Consensus 66 ~-tg~i~---~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~t~fSg~~~~~~~~~ 141 (455) + .|... .+.+.++.+. .....-.+++++.||++++|-+.-.+ .. ++.+ T Consensus 111 ~~~gg~~vP~~~~~~ii~~~---~~~s~l~~~~~~~~~~~~~~~~~~~~--~~--~~~~--------------------- 162 (392) T protein:vir:10 111 GEDGGLVIPQDIQTQINELA---RSFDALEQYVTVEPVRTRSGSRVLEK--NS--DMIP--------------------- 162 (392) T ss_pred cCCCceecchhHHHHHHHHH---HhhhhhhhhceeeeccCCceeEEEEe--ec--CCcc--------------------- Confidence 2 12211 2233344444 44555668999999999887543222 10 0000 Q ss_pred ccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeeccccccceeHHHHHhHHHhh Q lcl|NC_015280. 142 TTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKAIH 221 (455) Q Consensus 142 ~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkAiH 221 (455) +-..+++....+.....|.++.|...|. 
+-...+|-||.+|- T Consensus 163 ---------------------------a~~v~E~~~~~~~~~~~~~~v~l~~~k~-------~~~~~iS~ell~ds---- 204 (392) T protein:vir:10 163 ---------------------------FAEITEMGEIPETDNPKFSNVQYAVKDR-------AGILPLSRSLLQDS---- 204 (392) T ss_pred ---------------------------ceeecccccccccccccceeEEeeeeeE-------EEeehhhHHHHhhh---- Confidence 0000001111111122355555555554 44556899999984 Q ss_pred CCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeeccccchhhHHHHHHHH-HHHHHHHHHHHHHh Q lcl|NC_015280. 222 GLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDVDSNGRWSVEKFKGLL-FQIERDANAIAQET 300 (455) Q Consensus 222 GLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~~~~gr~~ve~~k~l~-~qi~~ean~i~~~T 300 (455) ..|.+++|.+-|...|..-++.-|+.-.-+.. ..++..+ +....++ +.+... T Consensus 205 ~~~l~~~i~~~l~~~i~~~~d~~~~~g~g~~~---------~~~~~~~----------d~i~~~~~~~l~~~-------- 257 (392) T protein:vir:10 205 DQNILKYVTKWLGKKSKVTRNVLILGVIEKLT---------KQAIKSL----------DDIKDVLNVKLDPA-------- 257 (392) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHhhcccccc---------ccCccCH----------HHHHHHHHHhhhhh-------- Confidence 25678899999999999999888875332211 2222211 1222222 222111 Q ss_pred cCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEEeccccccCCcceEEEEEecCc Q lcl|NC_015280. 301 RRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQYYVVGYKGTN 380 (455) Q Consensus 301 ~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~KG~~ 380 (455) -+ ..-..|+|+.....|... ..+ +|..- ...+.+ ....++|.|...|+++.. ..++.+|.. T Consensus 258 ~~-~~a~~vm~~~~~~~L~~l---kd~---~G~~l--~~~~~~-~~~~~tllG~~~v~~~~~---------~~~~~~~~~ 318 (392) T protein:vir:10 258 IS-PNAILLTNQDGFNYLDKL---KDK---DGKYI--LQSDPT-QKNKKLFAGTNPVVVVSN---------RFLKSKGTT 318 (392) T ss_pred hc-cCCEEEEcHHHHHHHHHh---hcc---CCCeE--eecCcc-CCccccccCcccEEEecc---------cccCCCccc Confidence 11 223368899998888752 211 11111 111111 123456777777776531 122223333 Q ss_pred cccceeEEccccc-------ccceeecCCc------cccceeeeeeecce-eecccccccccc----cccCchh Q lcl|NC_015280. 
381 AYDAGLFYCPYVP-------LQMYRAIGQD------TFQPRIGFKTRYGM-VLNPFAKGLTAL----SDSDPQA 436 (455) Q Consensus 381 ~~daglfyaPYv~-------l~~~~~~Dp~------s~qP~~g~~tRY~l-~~nP~~~~~~~~----~~~~~~~ 436 (455) .-+..++|+.+-. ..+.-.++|. +.|=.+-...|+|. +.+|-.--.-+. ..-.|+| T Consensus 319 ~~~~~~~~gdfs~~~~i~~~~~~~~~~~~~~~~~f~~~~~~~r~~~r~d~~v~~~~a~~~l~~~~~a~~~~~~~ 392 (392) T protein:vir:10 319 AKKAPLIIGDLKEAIVLFKREDMELASTDVGGKAFTRNTLDLRAIQRDDVQMWDNEAAVYGEIDLSAPVEQPQG 392 (392) T ss_pred CCceEEEEEehhceEEEEeecceEEEEeccccchhhcCceEEEEEEeeccEEecccceEEEEecccccccCCCC Confidence 3334445544321 1111122332 34556777788887 566654422111 1223333 No 51 >protein:vir:102082 Length: 392 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:1503 # MgeName: Fah # Cross-refs: genbank:acc:YP_512315;genbank:gi:89152484;genbank:GeneID:3953075 Probab=88.57 E-value=0.032 Score=28.82 Aligned_cols=320 Identities=17% Similarity=0.112 Sum_probs=134.8 Q ss_pred CcchHHHHHHhhHhh----------cC----C-CCccccchhhHHHHHHHhhhHHHHHHHHHHhhhhhhhchhhhccccc Q lcl|NC_015280. 1 MYNAENLQEKWAPVL----------NH----E-GLNDIKDPYRKSVTAILLENQERALAEERAVLTEAPTNVGPINTPTT 65 (455) Q Consensus 1 m~~~~~~~~kw~~~l----------~~----~-~~~~i~~~~~~~v~~~~~enq~~~~~e~~~~l~ea~~~~~~~~~~st 65 (455) +...+.|+++....- .. + ...+....+|+..... |.+++.. .++++++..... .......| T Consensus 35 ~~e~~~l~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-l~~~~~~-~~~~~~~~~~~~--~~~~~~~t 110 (392) T protein:vir:10 35 MEEVRSLQKKIDLQRSLDEAETEERNNGREVETRNVDGEMEYRDVFMKA-LRNKPLN-AEEREFLEDDLE--QRAMSGLT 110 (392) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHhhccccccccCccchHHHHHHHHHH-Hhccccc-HHHHHHHhhhhh--hhhccccc Confidence 222233333332110 00 0 0111222233333322 2222210 122333222110 01111122 Q ss_pred c-ccccc---cccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCccccccccccccccccccccccc Q lcl|NC_015280. 
66 S-SGAVA---GFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPDAQFSGTDGATPPTA 141 (455) Q Consensus 66 ~-tg~i~---~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~t~fSg~~~~~~~~~ 141 (455) + .|... .+.+.++.+. .....-.+++++.||++++|-+.-.+ .. ++.+ T Consensus 111 ~~~gg~~vP~~~~~~ii~~~---~~~s~l~~~~~~~~~~~~~~~~~~~~--~~--~~~~--------------------- 162 (392) T protein:vir:10 111 GEDGGLVIPQDIQTQINELA---RSFDALEQYVTVEPVRTRSGSRVLEK--NS--DMIP--------------------- 162 (392) T ss_pred cCCCceecchhHHHHHHHHH---HhhhhhhhhceeeeccCCceeEEEEe--ec--CCcc--------------------- Confidence 2 12211 2233344444 44555668999999999887543222 10 0000 Q ss_pred ccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeeccccccceeHHHHHhHHHhh Q lcl|NC_015280. 142 TTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKAIH 221 (455) Q Consensus 142 ~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkAiH 221 (455) +-..+++....+.....|.++.|...|. +-...+|-||.+|- T Consensus 163 ---------------------------a~~v~E~~~~~~~~~~~~~~v~l~~~k~-------~~~~~iS~ell~ds---- 204 (392) T protein:vir:10 163 ---------------------------FAEITEMGEIPETDNPKFSNVQYAVKDR-------AGILPLSRSLLQDS---- 204 (392) T ss_pred ---------------------------ceeecccccccccccccceeEEeeeeeE-------EEeehhhHHHHhhh---- Confidence 0000001111111122355555555554 44556899999984 Q ss_pred CCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeeccccchhhHHHHHHHH-HHHHHHHHHHHHHh Q lcl|NC_015280. 222 GLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDVDSNGRWSVEKFKGLL-FQIERDANAIAQET 300 (455) Q Consensus 222 GLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~~~~gr~~ve~~k~l~-~qi~~ean~i~~~T 300 (455) ..|.+++|.+-|...|..-++.-|+.-.-+.. ..++..+ +....++ +.+... 
T Consensus 205 ~~~l~~~i~~~l~~~i~~~~d~~~~~g~g~~~---------~~~~~~~----------d~i~~~~~~~l~~~-------- 257 (392) T protein:vir:10 205 DQNILKYVTKWLGKKSKVTRNVLILGVIEKLT---------KQAIKSL----------DDIKDVLNVKLDPA-------- 257 (392) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHhhcccccc---------ccCccCH----------HHHHHHHHHhhhhh-------- Confidence 25678899999999999999888875332211 2222211 1222222 222111 Q ss_pred cCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEEeccccccCCcceEEEEEecCc Q lcl|NC_015280. 301 RRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQYYVVGYKGTN 380 (455) Q Consensus 301 ~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~KG~~ 380 (455) -+ ..-..|+|+.....|... ..+ +|..- ...+.+ ....++|.|...|+++.. ..++.+|.. T Consensus 258 ~~-~~a~~vm~~~~~~~L~~l---kd~---~G~~l--~~~~~~-~~~~~tllG~~~v~~~~~---------~~~~~~~~~ 318 (392) T protein:vir:10 258 IS-PNAILLTNQDGFNYLDKL---KDK---DGKYI--LQSDPT-QKNKKLFAGTNPVVVVSN---------RFLKSKGTT 318 (392) T ss_pred hc-cCCEEEEcHHHHHHHHHh---hcc---CCCeE--eecCcc-CCccccccCcccEEEecc---------cccCCCccc Confidence 11 223368899998888752 211 11111 111111 123456777777776531 122223333 Q ss_pred cccceeEEccccc-------ccceeecCCc------cccceeeeeeecce-eecccccccccc----cccCchh Q lcl|NC_015280. 381 AYDAGLFYCPYVP-------LQMYRAIGQD------TFQPRIGFKTRYGM-VLNPFAKGLTAL----SDSDPQA 436 (455) Q Consensus 381 ~~daglfyaPYv~-------l~~~~~~Dp~------s~qP~~g~~tRY~l-~~nP~~~~~~~~----~~~~~~~ 436 (455) .-+..++|+.+-. ..+.-.++|. +.|=.+-...|+|. +.+|-.--.-+. 
..-.|+| T Consensus 319 ~~~~~~~~gdfs~~~~i~~~~~~~~~~~~~~~~~f~~~~~~~r~~~r~d~~v~~~~a~~~l~~~~~a~~~~~~~ 392 (392) T protein:vir:10 319 AKKAPLIIGDLKEAIVLFKREDMELASTDVGGKAFTRNTLDLRAIQRDDVQMWDNEAAVYGEIDLSAPVEQPQG 392 (392) T ss_pred CCceEEEEEehhceEEEEeecceEEEEeccccchhhcCceEEEEEEeeccEEecccceEEEEecccccccCCCC Confidence 3334445544321 1111122332 34556777788887 566654422111 1223333 No 52 >protein:vir:105004 Length: 392 # NCBI annotation: putative major capsid protein # Family: family:all:21 # MgeID: mge:1490 # MgeName: W Beta # Cross-refs: genbank:acc:YP_459969;genbank:gi:85701384;genbank:GeneID:3882145 Probab=88.57 E-value=0.032 Score=28.82 Aligned_cols=320 Identities=17% Similarity=0.112 Sum_probs=134.8 Q ss_pred CcchHHHHHHhhHhh----------cC----C-CCccccchhhHHHHHHHhhhHHHHHHHHHHhhhhhhhchhhhccccc Q lcl|NC_015280. 1 MYNAENLQEKWAPVL----------NH----E-GLNDIKDPYRKSVTAILLENQERALAEERAVLTEAPTNVGPINTPTT 65 (455) Q Consensus 1 m~~~~~~~~kw~~~l----------~~----~-~~~~i~~~~~~~v~~~~~enq~~~~~e~~~~l~ea~~~~~~~~~~st 65 (455) +...+.|+++....- .. + ...+....+|+..... |.+++.. .++++++..... .......| T Consensus 35 ~~e~~~l~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-l~~~~~~-~~~~~~~~~~~~--~~~~~~~t 110 (392) T protein:vir:10 35 MEEVRSLQKKIDLQRSLDEAETEERNNGREVETRNVDGEMEYRDVFMKA-LRNKPLN-AEEREFLEDDLE--QRAMSGLT 110 (392) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHhhccccccccCccchHHHHHHHHHH-Hhccccc-HHHHHHHhhhhh--hhhccccc Confidence 222233333332110 00 0 0111222233333322 2222210 122333222110 01111122 Q ss_pred c-ccccc---cccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCccccccccccccccccccccccc Q lcl|NC_015280. 66 S-SGAVA---GFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPDAQFSGTDGATPPTA 141 (455) Q Consensus 66 ~-tg~i~---~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~t~fSg~~~~~~~~~ 141 (455) + .|... .+.+.++.+. .....-.+++++.||++++|-+.-.+ .. 
++.+ T Consensus 111 ~~~gg~~vP~~~~~~ii~~~---~~~s~l~~~~~~~~~~~~~~~~~~~~--~~--~~~~--------------------- 162 (392) T protein:vir:10 111 GEDGGLVIPQDIQTQINELA---RSFDALEQYVTVEPVRTRSGSRVLEK--NS--DMIP--------------------- 162 (392) T ss_pred cCCCceecchhHHHHHHHHH---HhhhhhhhhceeeeccCCceeEEEEe--ec--CCcc--------------------- Confidence 2 12211 2233344444 44555668999999999887543222 10 0000 Q ss_pred ccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeeccccccceeHHHHHhHHHhh Q lcl|NC_015280. 142 TTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKAIH 221 (455) Q Consensus 142 ~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkAiH 221 (455) +-..+++....+.....|.++.|...|. +-...+|-||.+|- T Consensus 163 ---------------------------a~~v~E~~~~~~~~~~~~~~v~l~~~k~-------~~~~~iS~ell~ds---- 204 (392) T protein:vir:10 163 ---------------------------FAEITEMGEIPETDNPKFSNVQYAVKDR-------AGILPLSRSLLQDS---- 204 (392) T ss_pred ---------------------------ceeecccccccccccccceeEEeeeeeE-------EEeehhhHHHHhhh---- Confidence 0000001111111122355555555554 44556899999984 Q ss_pred CCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeeccccchhhHHHHHHHH-HHHHHHHHHHHHHh Q lcl|NC_015280. 222 GLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDVDSNGRWSVEKFKGLL-FQIERDANAIAQET 300 (455) Q Consensus 222 GLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~~~~gr~~ve~~k~l~-~qi~~ean~i~~~T 300 (455) ..|.+++|.+-|...|..-++.-|+.-.-+.. ..++..+ +....++ +.+... T Consensus 205 ~~~l~~~i~~~l~~~i~~~~d~~~~~g~g~~~---------~~~~~~~----------d~i~~~~~~~l~~~-------- 257 (392) T protein:vir:10 205 DQNILKYVTKWLGKKSKVTRNVLILGVIEKLT---------KQAIKSL----------DDIKDVLNVKLDPA-------- 257 (392) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHhhcccccc---------ccCccCH----------HHHHHHHHHhhhhh-------- Confidence 25678899999999999999888875332211 2222211 1222222 222111 Q ss_pred cCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEEeccccccCCcceEEEEEecCc Q lcl|NC_015280. 
301 RRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQYYVVGYKGTN 380 (455) Q Consensus 301 ~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~KG~~ 380 (455) -+ ..-..|+|+.....|... ..+ +|..- ...+.+ ....++|.|...|+++.. ..++.+|.. T Consensus 258 ~~-~~a~~vm~~~~~~~L~~l---kd~---~G~~l--~~~~~~-~~~~~tllG~~~v~~~~~---------~~~~~~~~~ 318 (392) T protein:vir:10 258 IS-PNAILLTNQDGFNYLDKL---KDK---DGKYI--LQSDPT-QKNKKLFAGTNPVVVVSN---------RFLKSKGTT 318 (392) T ss_pred hc-cCCEEEEcHHHHHHHHHh---hcc---CCCeE--eecCcc-CCccccccCcccEEEecc---------cccCCCccc Confidence 11 223368899998888752 211 11111 111111 123456777777776531 122223333 Q ss_pred cccceeEEccccc-------ccceeecCCc------cccceeeeeeecce-eecccccccccc----cccCchh Q lcl|NC_015280. 381 AYDAGLFYCPYVP-------LQMYRAIGQD------TFQPRIGFKTRYGM-VLNPFAKGLTAL----SDSDPQA 436 (455) Q Consensus 381 ~~daglfyaPYv~-------l~~~~~~Dp~------s~qP~~g~~tRY~l-~~nP~~~~~~~~----~~~~~~~ 436 (455) .-+..++|+.+-. ..+.-.++|. +.|=.+-...|+|. +.+|-.--.-+. ..-.|+| T Consensus 319 ~~~~~~~~gdfs~~~~i~~~~~~~~~~~~~~~~~f~~~~~~~r~~~r~d~~v~~~~a~~~l~~~~~a~~~~~~~ 392 (392) T protein:vir:10 319 AKKAPLIIGDLKEAIVLFKREDMELASTDVGGKAFTRNTLDLRAIQRDDVQMWDNEAAVYGEIDLSAPVEQPQG 392 (392) T ss_pred CCceEEEEEehhceEEEEeecceEEEEeccccchhhcCceEEEEEEeeccEEecccceEEEEecccccccCCCC Confidence 3334445544321 1111122332 34556777788887 566654422111 1223333 No 53 >protein:vir:3870 Length: 400 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:82 # MgeName: A2 # Cross-refs: genbank:acc:NP_680487;swissprot:trembl:q8ltc0;genbank:gi:22296527;interpro:IPR006444;uniprot:Q8LTC0;genbank:GeneID:951713 Probab=88.29 E-value=0.033 Score=28.69 Aligned_cols=312 Identities=13% Similarity=0.063 Sum_probs=125.3 Q ss_pred CcchHHHHHHhhHh----------hcCCCC-----cccc-chhhHHHHHHHhhhHHHHH---HHHHHhhh------hhhh Q lcl|NC_015280. 
1 MYNAENLQEKWAPV----------LNHEGL-----NDIK-DPYRKSVTAILLENQERAL---AEERAVLT------EAPT 55 (455) Q Consensus 1 m~~~~~~~~kw~~~----------l~~~~~-----~~i~-~~~~~~v~~~~~enq~~~~---~e~~~~l~------ea~~ 55 (455) ....+.+.++...+ .+.+.. +... ....+.....-.+...+.. +.++.... .... T Consensus 48 ~~~~~~l~~ei~~l~e~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 127 (400) T protein:vir:38 48 RAKYDKAGKEIKDLEEKRDLYEAALKGNEQSSGKKPDHPEEHSYRDALNAYLHTRGRNTDGVNFEKTDVGTFAVLRAVPT 127 (400) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHhhcccccccchhhhhHHHHHHHHHhhHHHHHHHHHHHHHHHHHHhhhhhhhH Confidence 11112222222211 111000 0000 0011111111111111110 00000000 0000 Q ss_pred chhhhcccccc--ccccccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCccccccccccccccc Q lcl|NC_015280. 56 NVGPINTPTTS--SGAVAGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPDAQFSGT 133 (455) Q Consensus 56 ~~~~~~~~st~--tg~i~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~t~fSg~ 133 (455) ........+++ +|.+.--.+..-.++++..+..+..+++.+.||++.++-+--++.. ++.-+ + T Consensus 128 ~~~~~~~~~~~~~~gg~~vP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~----~~~~~--------~--- 192 (400) T protein:vir:38 128 DASDAVNAGVKAADAASTIPETISNTPQRELQTVVDLKPFTNVFQASTQKGTYPTVANA----TTKMV--------T--- 192 (400) T ss_pred HHHHHHhhcccccCCcccccHHHHHHHHHHHHhhhhhhhcceeEeccCcceEEEEEecC----CCccc--------c--- Confidence 00111111111 1222111112233444445666788899999999988755444310 00000 0 Q ss_pred ccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccc-eeEEEEEEEEeeccccccceeHH Q lcl|NC_015280. 134 DGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEM-AFSIDKIAVEAKGRALRADYSVE 212 (455) Q Consensus 134 ~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EM-aFsIEK~tVtAKSRaLKAEYTiE 212 (455) ..+. ...++. 
..+++.++..++.-+-...+|-| T Consensus 193 --------------------------------------~~E~--------~~~~~~~~~~f~~i~~~~~k~~~~~~is~e 226 (400) T protein:vir:38 193 --------------------------------------VAEL--------EKNPAMAKPEFKPVNWSVETYRQALPVSQE 226 (400) T ss_pred --------------------------------------cccc--------ccccccccccceeeEeehhheeeehhhHHH Confidence 0000 001111 12334445555555557789999 Q ss_pred HHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeeccccchhhHHHHHHHHHHHHHH Q lcl|NC_015280. 213 LAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDVDSNGRWSVEKFKGLLFQIERD 292 (455) Q Consensus 213 LAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~~~~gr~~ve~~k~l~~qi~~e 292 (455) |.+|- ..|.+++|.+-|+..|...+|+-|+.-.-.. ...++..++ ....+ +..... T Consensus 227 ll~ds----~~~~~~~i~~~l~~~~~~~~~~~i~~~~~~~---------~~~~~~~~~----------~~~~~-~~~~~~ 282 (400) T protein:vir:38 227 SIDDS----AIDLVGLIAQNGQQIKVNTTNGAVATLLKGF---------TAKTISSVD----------DLKHI-NNVDLD 282 (400) T ss_pred HHhhh----HHHHHHHHHHHHHHHHHHHHHHhhhhccccc---------cccccccHH----------HHHHH-HHhhhh Confidence 99985 3478889999999999988888887543221 122222111 11112 111111 Q ss_pred HHHHHHHhcCCCccEEEEchhHHHHHHhh----cccccccccccccccccccccCCceeEEEecCceEEEEeccccccCC Q lcl|NC_015280. 293 ANAIAQETRRGKGNIIITSADVASALAMS----GVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSD 368 (455) Q Consensus 293 an~i~~~T~~~~gn~~v~S~~va~~L~~s----G~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~ 368 (455) ....+ .+|+||.....|... |-+-+.|.. ++ ...|+|. |++|++..++. T Consensus 283 --------~~~~a-~~v~~~~~~~~l~~lkd~~G~~i~~~~~------------~~-~~~~~l~-G~pv~~~~~~~---- 335 (400) T protein:vir:38 283 --------PAYSR-VIIASQSFYNFLDTVKDGNGRYLLQDSI------------LT-PSGKSVL-GMPIAVVSDDT---- 335 (400) T ss_pred --------hhhCc-EEEEcHHHHHHHHHhhccCCCeeeecCc------------CC-CCccccc-cceeEEecccc---- Confidence 11233 467788888877742 222122211 11 1123554 55666553321 Q ss_pred cceEEEEEecCccccceeEEccccc--------ccceeecCCccccceeeeeeecce-eecccccccccccccC Q lcl|NC_015280. 
369 NQYYVVGYKGTNAYDAGLFYCPYVP--------LQMYRAIGQDTFQPRIGFKTRYGM-VLNPFAKGLTALSDSD 433 (455) Q Consensus 369 ~dY~~vG~KG~~~~daglfyaPYv~--------l~~~~~~Dp~s~qP~~g~~tRY~l-~~nP~~~~~~~~~~~~ 433 (455) .+-.| +.-++|+.+-. ....+..|-..|+..+-...|+|. +.+|-.--.-...... T Consensus 336 -----~~~~g----~~~~~~gd~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~r~d~~~~~~~a~~~l~~~~~a 400 (400) T protein:vir:38 336 -----LGAAG----EAHAFLGDIKRAILFANRADFMVRWVDDQIYGQFLQAGMRFGVSVADEKAGYFLTYTPKA 400 (400) T ss_pred -----cCCCC----ceEEEEEeccccEEEEeecceEEEEecccccceeEEEEEEeccEEecccceEEEEeecCC Confidence 11111 11223322211 112234466677778888899998 6666554221121111 No 54 >protein:vir:4339 Length: 395 # NCBI annotation: major head protein # Family: family:all:585 # MgeID: mge:93 # MgeName: D3 # Cross-refs: genbank:acc:NP_061502;genbank:gi:9635591;genbank:GeneID:1262860 Probab=86.77 E-value=0.043 Score=28.06 Aligned_cols=321 Identities=12% Similarity=0.054 Sum_probs=119.9 Q ss_pred CcchHHHHHHhhHhhcCCCCccccchhhHHH----------HHHHhhhHHHH----------------------HHHHHH Q lcl|NC_015280. 1 MYNAENLQEKWAPVLNHEGLNDIKDPYRKSV----------TAILLENQERA----------------------LAEERA 48 (455) Q Consensus 1 m~~~~~~~~kw~~~l~~~~~~~i~~~~~~~v----------~~~~~enq~~~----------------------~~e~~~ 48 (455) +....+..+.... .+.+..+..++++ -+++.+-+.+. ....++ T Consensus 22 ~~~~~e~~~~~~~-----~~~~~~~e~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 96 (395) T protein:vir:43 22 IKSQAEQVNTQIA-----NFGEMNKETRAKVDELLTAQGELQARLSAAEQAMLANEKRDGGEEAPKTAGQMVAESLKEQG 96 (395) T ss_pred HHHHHHHHHHHHH-----HHhhhhHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccchhhhHHHHHHHHHHHHH Confidence 1000000000000 0000001111111 01110000000 000011 Q ss_pred hhhhhhhchhhh----cccccccccc-ccccc-hhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCcccc Q lcl|NC_015280. 49 VLTEAPTNVGPI----NTPTTSSGAV-AGFDP-ILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAF 122 (455) Q Consensus 49 ~l~ea~~~~~~~----~~~st~tg~i-~~~~P-~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAl 122 (455) +..... ..... 
-+.++++++- .-..| ..-.++++..+..+..+++.++||.+++.-+.-.. ..++. T Consensus 97 ~~~~~~-~~~~~~~~~~~~~~~~~~~g~~vp~~~~~~ii~~~~~~~~l~~l~~~~~~~~~~~~~~~~~----~~~~~--- 168 (395) T protein:vir:43 97 VTSSLR-GSHRVSMPRSAITSIDGSGGALVAPDRRPGVVAAPQRRLTIRDLVAPGTTESNSVEYVRET----GFVNN--- 168 (395) T ss_pred HHHHhh-hhhhhhhhhhhhcccCCCCccccchhhHHHHHHHHHhhhhHHhhccceecCCCceEEEEEe----cCCCc--- Confidence 111100 00000 0001111111 01112 12334555556677889999999988754331111 00000 Q ss_pred cccccccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeec Q lcl|NC_015280. 123 FDEPDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKG 202 (455) Q Consensus 123 fnEa~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKS 202 (455) +...++ +..+++-..+++++++..+. T Consensus 169 ----------------------------------------------a~~v~E--------~~~~~~~~~~~~~i~~~~~k 194 (395) T protein:vir:43 169 ----------------------------------------------AAPVSE--------GTQKPYSDLTFELENAPVRT 194 (395) T ss_pred ----------------------------------------------eeeecC--------CccccccccceeEEEEeeee Confidence 000000 11233344455566666666 Q ss_pred cccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeee--------ecccc Q lcl|NC_015280. 203 RALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDL--------DVDSN 274 (455) Q Consensus 203 RaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl--------~~~~~ 274 (455) -+-...+|-||.||.- +.++.|.+-|+..+...+|+.||.- +-.+ -...|++-. ..... 
T Consensus 195 ~~~~~~is~ell~d~~-----~l~~~v~~~la~a~~~~~d~~~l~G----~g~~----~~~~Gi~~~~~~~~~~~~~~~~ 261 (395) T protein:vir:43 195 IAHLFKASRQILDDAS-----ALQSYIDARARYGLMLVEECQLLYG----NGTG----ANLHGIIPQAQAYAPPSGVVVT 261 (395) T ss_pred EEEeehhhHHHHHhHH-----HHHHHHHHHHHHHHHHHHHHHHHhc----cCCC----Cccccccccccccccccccccc Confidence 6667789999999852 3678888889989998888888742 1000 111222210 00000 Q ss_pred chhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCc Q lcl|NC_015280. 275 GRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGR 354 (455) Q Consensus 275 gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~ 354 (455) +--.++....+++.+ ...-+.+..+|+||.....|.. +.. .+|..= -.+.. ..-.++|. + T Consensus 262 ~~~~~~~i~~~~~~~---------~~~~~~~~~~vmn~~~~~~l~~---lkd---~~G~~i---~~~~~-~~~~~~l~-G 321 (395) T protein:vir:43 262 AEQRIDRIRLAILQA---------QLAEFPASGIVLNPIDWALIEL---NKD---AENRYI---IGSPQ-NGTTPTLW-R 321 (395) T ss_pred cchhHHHHHHHHHhh---------ccccCCCcEEEEcHHHHHHHHH---hhc---cCCcee---ccccc-cCCCceec-c Confidence 000111222222222 1223345678999999888764 221 111111 11111 11234665 4 Q ss_pred eEEEEeccccccCCcceEEEEEecCccccceeEEcccccccceeecCCc---cccc---eeeeeeecce-eecccccccc Q lcl|NC_015280. 355 FKVYIDPYSANVSDNQYYVVGYKGTNAYDAGLFYCPYVPLQMYRAIGQD---TFQP---RIGFKTRYGM-VLNPFAKGLT 427 (455) Q Consensus 355 ~~vy~D~y~~~~s~~dY~~vG~KG~~~~daglfyaPYv~l~~~~~~Dp~---s~qP---~~g~~tRY~l-~~nP~~~~~~ 427 (455) ++|+++.+.. ..=+++|--.. +|--+.-..+..-+++. .|+- .+=+..|++. +.+|-+--.- T Consensus 322 ~pVv~~~~~~----~~~~~~gd~~~-------~~~~~~~~~~~i~~~~~~~~~f~~~~~~~r~~~r~d~~v~~~~a~~~~ 390 (395) T protein:vir:43 322 LPVVETQAIT----QDEFLTGAFSL-------GAQIFDRMDIEVLVSTENDKDFENNMVTIRAEERLAFAVYRPEAFVTG 390 (395) T ss_pred eeeEEcCCCC----CCcEEEEeccc-------eEEEEEecceEEEEeccccchhhcCcEEEEEEEeeccEEecccceEEE Confidence 7999987643 22233332110 00001001111111211 1322 3333457777 4445433111 Q ss_pred ccccc Q lcl|NC_015280. 
428 ALSDS 432 (455) Q Consensus 428 ~~~~~ 432 (455) +.+.. T Consensus 391 ~~taa 395 (395) T protein:vir:43 391 SLTAS 395 (395) T ss_pred EeccC Confidence 11111 No 55 >protein:vir:102119 Length: 404 # NCBI annotation: phage major capsid protein, HK97 family # Family: family:all:21 # MgeID: mge:1641 # MgeName: phiSM101 # Cross-refs: genbank:acc:YP_699941;genbank:gi:110804052;genbank:GeneID:4206662 Probab=86.56 E-value=0.045 Score=27.98 Aligned_cols=329 Identities=12% Similarity=0.029 Sum_probs=118.7 Q ss_pred CcchHHHHHHhhHhhcCC---------CC---c---cc-cchhhHHHHHHHhhhHHHHHHHHHHhhhhhhhchhhhcccc Q lcl|NC_015280. 1 MYNAENLQEKWAPVLNHE---------GL---N---DI-KDPYRKSVTAILLENQERALAEERAVLTEAPTNVGPINTPT 64 (455) Q Consensus 1 m~~~~~~~~kw~~~l~~~---------~~---~---~i-~~~~~~~v~~~~~enq~~~~~e~~~~l~ea~~~~~~~~~~s 64 (455) .-.-+.|.++..-..+.+ .. . +. ....++.....+.+++.+.. ++..+.....-...+...+ T Consensus 37 ~~e~~~l~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~--~~~~~~~~~~e~~a~~~~~ 114 (404) T protein:vir:10 37 SNEIDILQAKIEAQKRKENIENNFNEDNVKSLNTGKEENVIYNGALFVRAIADNLLKQK--NQRGLNLSEKEINAISENI 114 (404) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHhhhhccccccccchhhHHHHHHHHHHHHHHHHHHH--HhhhhcchhhHHhhhcccc Confidence 111123333332111000 00 0 00 00011111111222221100 0000000000000111112 Q ss_pred ccccccccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCcccccccccccccccccccccccccc Q lcl|NC_015280. 65 TSSGAVAGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPDAQFSGTDGATPPTATTE 144 (455) Q Consensus 65 t~tg~i~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~t~fSg~~~~~~~~~~~~ 144 (455) +++|.+.--....-.+++.+-......+++++.||+++.|-+-=.| .... 
+...| T Consensus 115 ~~~gg~~vP~~~~~~ii~~~~~~~~l~~l~~~~~~~~~~g~~~~~~--~~~~---------~~~~~-------------- 169 (404) T protein:vir:10 115 DEDGGYAVPEDIQTKINTRLKDTTDLYNMVDYEPVFTRSGSRTYEK--RSKQ---------KPMKP-------------- 169 (404) T ss_pred CCCCceeechhHHHHHHHHHhhhhhHhhhhceeeccCCccceEEEE--ecCC---------cceee-------------- Confidence 2223222111111234444445567788999999999998643222 1000 00000 Q ss_pred cCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeeccccccceeHHHHHhHHHhhCCC Q lcl|NC_015280. 145 KNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKAIHGLD 224 (455) Q Consensus 145 ~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLD 224 (455) ..+++....+. ...++++++.+.|.-+-...+|-||.+|-. .+ T Consensus 170 ---------------------------v~e~~~~~~~~------~~~~f~~i~~~~~k~~~~~~iS~ell~ds~----~~ 212 (404) T protein:vir:10 170 ---------------------------LSENQQIPTNG------DNGKLERFNFKLKDLADFMSIPNDLLKFAD----KS 212 (404) T ss_pred ---------------------------ccccccccccc------cccceeeeEeeheeeEeeehhhHHHHhhcH----HH Confidence 00001111110 112234444444444445678999998843 35 Q ss_pred hhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeee------ccccchhhHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015280. 225 AESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLD------VDSNGRWSVEKFKGLLFQIERDANAIAQ 298 (455) Q Consensus 225 AE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~------~~~~gr~~ve~~k~l~~qi~~ean~i~~ 298 (455) .++.|.+.|+..|...+|+.||.-.- .+-...|+.... ..... .....+.++ +... T Consensus 213 l~~~i~~~la~~~~~~~~~~il~G~g--------~~~~~~gi~~~~~~~~~~~~~~~--~~~~~~~~~-------~~~l- 274 (404) T protein:vir:10 213 LEDWIINWFVDKVRITRNAEILYGAG--------GDEHATGIMTANKFKKITLPKSP--ALKDFKKCK-------NVEL- 274 (404) T ss_pred HHHHHHHHHHHHHHHHHHHHHhhcCC--------CCCcccceeeccccceeeccccc--cHHHHHHHH-------Hhhh- Confidence 67778888888888888887763211 111122222211 11111 111222111 1111 Q ss_pred HhcCCCcc-EEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEE-eccccccCCcceEEEEE Q lcl|NC_015280. 
299 ETRRGKGN-IIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYI-DPYSANVSDNQYYVVGY 376 (455) Q Consensus 299 ~T~~~~gn-~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~-D~y~~~~s~~dY~~vG~ 376 (455) . ..+.+| .+|||++....|... ... +|..- ...+.++ ...++|. +++|++ +....... T Consensus 275 ~-~~~~~~~~~v~n~~~~~~L~~l---kd~---~G~~l--~~~~~~~-~~~~~l~-G~PV~~~~~~~~~~~--------- 334 (404) T protein:vir:10 275 L-NVFKATSSWIVNQDGFNYLDSL---EDK---TGRPY--LQPDPKD-PTQYRFL-GLPVIELPNDLLLST--------- 334 (404) T ss_pred h-ccccCCCEEEEcHHHHHHHHHh---hcc---CCcee--eccCcCC-CCCcccc-ceeeEEecccccCCC--------- Confidence 1 223333 368999998888762 211 11111 1111111 1123554 457764 32111000 Q ss_pred ecCccccceeEEccccc---------ccceeecCC----ccccceeeeeeecce-eecccccccccccc-cCch Q lcl|NC_015280. 377 KGTNAYDAGLFYCPYVP---------LQMYRAIGQ----DTFQPRIGFKTRYGM-VLNPFAKGLTALSD-SDPQ 435 (455) Q Consensus 377 KG~~~~daglfyaPYv~---------l~~~~~~Dp----~s~qP~~g~~tRY~l-~~nP~~~~~~~~~~-~~~~ 435 (455) ..+..++|+.+-. +.....-++ ...+-.+-...|++. +.+|-.--.-+... ..|- T Consensus 335 ----~~~~~~~~gd~s~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~~~~r~d~~v~~~~a~~~~~~~~aa~~~ 404 (404) T protein:vir:10 335 ----ESAIPVLLGDTKEAYKYVSDGAYELATTNIGAGAFETNTTKARIIMRIDGNVKDSEALLIAEIPVESVQA 404 (404) T ss_pred ----CCccEEEEEeccccEEEEEecceEEEEeccccchhhcCceEEEEEEeeccEEecccceEEEEeecccCCC Confidence 0111122222211 111111122 234455667788888 56654432111111 1110 No 56 >protein:vir:9309 Length: 324 # NCBI annotation: head protein # Family: family:all:507 # MgeID: mge:165 # MgeName: phi 11 # Cross-refs: genbank:acc:NP_803287;genbank:gi:29028597;genbank:GeneID:1258044 Probab=86.20 E-value=0.047 Score=27.85 Aligned_cols=301 Identities=9% Similarity=0.005 Sum_probs=127.7 Q ss_pred HhhhHHHHHHHHHHhhhhhhhchhhhccccc---cccccccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEe Q lcl|NC_015280. 35 LLENQERALAEERAVLTEAPTNVGPINTPTT---SSGAVAGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRS 111 (455) Q Consensus 35 ~~enq~~~~~e~~~~l~ea~~~~~~~~~~st---~tg~i~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRs 111 (455) ..++|+.. 
.+.+++..+......+-+..+ ++++..--....-.+++.+..+.+..+++.+-||++++--|.-.. T Consensus 1 ~~~~~~~~--~~~~~f~~~~~~~~~~~a~~~~~~~~~~~liP~~~~~~ii~~~~~~s~l~~l~~~~~~~~~~~~ip~~~- 77 (324) T protein:vir:93 1 MEQTQKLK--LNLQHFASNNVKPQVFNPDNVMMHEKKDGTLLNDFTTPILQEVMENSKIMQLGKYEPMEGTEKKFTFWA- 77 (324) T ss_pred CchhHHHH--HHHHHHHHhhhhhhhcccccccccCCCcceechhHHHHHHHHHHhhchhhhhcceeeccCCceEEEEEe- Confidence 22222221 111222221111112212221 122211112233335555667778889999999998764332110 Q ss_pred eecCCCCcccccccccccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCcccccee Q lcl|NC_015280. 112 RYTNQSGNEAFFDEPDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAF 191 (455) Q Consensus 112 rY~~qsG~EAlfnEa~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaF 191 (455) .+.+ +-..+++ ..+++..- T Consensus 78 -----~~~~------------------------------------------------a~~v~Eg--------~~~~~~~~ 96 (324) T protein:vir:93 78 -----DKPG------------------------------------------------AYWVGEG--------QKIETSKA 96 (324) T ss_pred -----cCcc------------------------------------------------eeeecCC--------cccccccc Confidence 0000 0000111 12333344 Q ss_pred EEEEEEEEeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeec Q lcl|NC_015280. 192 SIDKIAVEAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDV 271 (455) Q Consensus 192 sIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~ 271 (455) ++++++++.+..+-....|-||.+|-. .|.++.|.+-|+..|...+++.+|.---.. ....|+++... 
T Consensus 97 ~f~~i~~~~~k~~~~~~iS~ell~ds~----~~l~~~i~~~l~~aia~~~d~a~l~G~g~~--------~~~~~~~~~~~ 164 (324) T protein:vir:93 97 TWVNATMRAFKLGVILPVTKEFLNYTY----SQFFEEMKPMIAEAFYKKFDEAGILNQGNN--------PFGKSIAQSIE 164 (324) T ss_pred ceeEEEEEeEEEEEeehhhHHHHhcch----HHHHHHHHHHHHHHHHHHHHHHHhcCCCCC--------CcCcccccccc Confidence 456666666666667788999999953 467889999999999999988887532111 11112221110 Q ss_pred c----ccchhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCcee Q lcl|NC_015280. 272 D----SNGRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTF 347 (455) Q Consensus 272 ~----~~gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~ 347 (455) . ..+.-..+....++.+++. .-+....++|++.....|... ... +|..- . .+.. T Consensus 165 ~~~~~~~~~~~~~~i~~~~~~l~~---------~~~~~~~~v~n~~~~~~L~~l---~d~---~G~~~--~-~~~~---- 222 (324) T protein:vir:93 165 KTNKVIKGDFTQDNIIDLEALLED---------DELEANAFISKTQNRSLLRKI---VDP---ETKER--I-YDRN---- 222 (324) T ss_pred ccceeccccccHHHHHHHHHhhhh---------ccCCCCEEEEcHHHHHHHHHh---hCC---CCCee--e-cCCC---- Confidence 0 0111112233334333331 223445689999999988752 211 11110 0 1111 Q ss_pred EEEecCceEEEEeccccccCCcceEEEE--------EecCccccceeEEcccccccceeecCC------ccccceeeeee Q lcl|NC_015280. 348 VGTLNGRFKVYIDPYSANVSDNQYYVVG--------YKGTNAYDAGLFYCPYVPLQMYRAIGQ------DTFQPRIGFKT 413 (455) Q Consensus 348 ~G~l~~~~~vy~D~y~~~~s~~dY~~vG--------~KG~~~~daglfyaPYv~l~~~~~~Dp------~s~qP~~g~~t 413 (455) .+.|. +++|++.+... .+...+++| ..+..+.+ ...+..+...+-.|. ..-|=.+=... T Consensus 223 ~~~l~-G~PVv~~~~~~--~~~~~i~~gdfs~~~~~~~~~~~i~----~~~~~~~~~~~~~~~~~~~~f~~n~~~~r~~~ 295 (324) T protein:vir:93 223 SDSLD-GLPVVNLKSSN--LKRGELITGDFDKLIYGIPQLIEYK----IDETAQLSTVKNEDGTPVNLFEQDMVALRATM 295 (324) T ss_pred CCccc-ceeeEeecCCC--CCcceEEEEecceEEEEEecCcEEE----EeecccccccccccccchhhhhcCcEEEEEEE Confidence 23343 46777755321 222233333 32222111 011100000000010 11123444556 Q ss_pred ecce-eecccccccccccccCchh-hhhccch Q lcl|NC_015280. 
414 RYGM-VLNPFAKGLTALSDSDPQA-AGNLNAN 443 (455) Q Consensus 414 RY~l-~~nP~~~~~~~~~~~~~~~-~~~~~~n 443 (455) |||. +.+|-+- ..+.. .+|+ .-..++- T Consensus 296 r~d~~v~~~~a~--~~l~~-a~~~~~~~~~~~ 324 (324) T protein:vir:93 296 HVALHIADDKAF--AKLVP-ADKRTDSVPGEV 324 (324) T ss_pred EeccEEecccce--EEEec-ccccCCCCCCCC Confidence 8887 5666322 11111 1111 1122222 No 57 >protein:vir:2504 Length: 305 # NCBI annotation: major capsid subunit gp9 # Family: family:all:507 # MgeID: mge:53 # MgeName: TM4 # Cross-refs: genbank:acc:NP_569745;genbank:gi:18496895;genbank:GeneID:932268 Probab=85.10 E-value=0.055 Score=27.47 Aligned_cols=286 Identities=11% Similarity=0.035 Sum_probs=116.9 Q ss_pred cccccc-ccccccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCccccccccccccccccccccc Q lcl|NC_015280. 61 NTPTTS-SGAVAGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPDAQFSGTDGATPP 139 (455) Q Consensus 61 ~~~st~-tg~i~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~t~fSg~~~~~~~ 139 (455) .+..++ ++...--....-.+++++.+..+..+++-+.||++++--|.-.. .+.+| .| T Consensus 1 ma~~t~~~gg~liP~~~~~~Ii~~~~~~s~l~~l~~~~~~~~~~~~~p~~~------~~~~a-------~w--------- 58 (305) T protein:vir:25 1 MADISRAEVASLIQEAYSDTLLAAAKQGSTVLSAFQNVNMGTKTTHLPVLA------TLPEA-------DW--------- 58 (305) T ss_pred CCCccCCccceecCHHHHHHHHHHHHhhchhhhhcceeeccCCcEEEEEEe------CCcce-------EE--------- Confidence 222332 23322222222445666667777888999999988763332211 00000 00 Q ss_pred ccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeeccccccceeHHHHHhHHH Q lcl|NC_015280. 140 TATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKA 219 (455) Q Consensus 140 ~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkA 219 (455) .++++... ...++.-..++++++..++..+-...+|-||.+|-. 
T Consensus 59 --------------------------------v~E~~~~~---~~~~~~s~~~f~~i~~~~~k~~~~~~is~ell~ds~- 102 (305) T protein:vir:25 59 --------------------------------VGESATDP---KGVKPTSKVTWANRTLVAEEIAVIIPVHENVIDDAT- 102 (305) T ss_pred --------------------------------eecccccc---cccccccccceeeEEeeeEEEEEeehhhHHHHhcch- Confidence 00000000 001111223344555555555556779999999843 Q ss_pred hhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeee-----eccccchhhHHHHHHHHHHHHHHHH Q lcl|NC_015280. 220 IHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDL-----DVDSNGRWSVEKFKGLLFQIERDAN 294 (455) Q Consensus 220 iHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl-----~~~~~gr~~ve~~k~l~~qi~~ean 294 (455) .|.|++|.+-|+..|+..+++.+|.--- +..+....++... ......- .....-.++.-+. .+. T Consensus 103 ---~~~~~~i~~~l~~~~a~~~d~a~~~G~g------~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~-~~~ 171 (305) T protein:vir:25 103 ---VAVLTEVAELGGQAIGKKLDQAVIFGTD------KPASWVSPALIPAAVTAGQAVEVVG-GVANESDIVGATN-RAA 171 (305) T ss_pred ---HHHHHHHHHHHHHHHHHHHhhhheeccC------CCCCccccccccccccccccccccc-cchhhhHHHHHHH-HHH Confidence 5789999999999999999998884211 1000000000000 0000000 0000001111111 111 Q ss_pred HHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEEeccccccCCcceEEE Q lcl|NC_015280. 295 AIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQYYVV 374 (455) Q Consensus 295 ~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~~v 374 (455) .... .-.+..|=+++++.-...|.. +.+. 
+|..=+ + -++| .+++|+|..+.......--+++ T Consensus 172 ~~~~-~~~~~~~~~v~~~~~~~~l~~---lkd~---~G~~i~------~----~~~l-~G~Pv~~~~~~~~~~~~~~~~~ 233 (305) T protein:vir:25 172 KAVA-SAGWAPDTLLSSLALRYEVAN---IRDA---NGNPVF------R----DDSF-AGFRTFFNRNGAWDADAAIEVI 233 (305) T ss_pred Hhhh-hcccccceeEecHHHHHHHHH---hhcc---CCceee------c----CCcc-cccceEEcCccCCCCCccEEEE Confidence 1111 112333447888888887764 2211 111100 0 1345 3577777755322111111222 Q ss_pred --------EEecCccccceeEEcccccccceeecCCcc-ccc-eee--eeeecce-eecccccccccccccCchhhhhcc Q lcl|NC_015280. 375 --------GYKGTNAYDAGLFYCPYVPLQMYRAIGQDT-FQP-RIG--FKTRYGM-VLNPFAKGLTALSDSDPQAAGNLN 441 (455) Q Consensus 375 --------G~KG~~~~daglfyaPYv~l~~~~~~Dp~s-~qP-~~g--~~tRY~l-~~nP~~~~~~~~~~~~~~~~~~~~ 441 (455) |..+.-+.+ ...+. .+...-.|.+ ||- .++ ...|||+ +.||-+--.- ++-+|.....- T Consensus 234 gd~s~~~i~~~~~~~i~----~~~~~--~~~~~~~~~~~~~~~~~~~R~~~r~~~~v~~p~a~v~~---~~~~~~~~~pa 304 (305) T protein:vir:25 234 ADSSRVKIGVRQDITVK----FLDQA--TLGTGENQINLAERDMVALRLKARFAYVLGVSATAQGA---NKTPVAVVAPA 304 (305) T ss_pred EecceEEEEEecCeEEE----Eeeee--eeecCCceeeeeecCcEEEEEEEeecceeeCcccEEEE---ccccccccCCC Confidence 222211110 00110 0000011111 221 233 3668997 7788765222 23333321111 Q ss_pred c Q lcl|NC_015280. 442 A 442 (455) Q Consensus 442 ~ 442 (455) . T Consensus 305 ~ 305 (305) T protein:vir:25 305 A 305 (305) T ss_pred C Confidence 1 No 58 >protein:vir:94142 Length: 304 # NCBI annotation: ORF013 # Family: family:all:507 # MgeID: mge:1494 # MgeName: 96 # Cross-refs: genbank:acc:YP_240234;genbank:gi:66395898;genbank:GeneID:5133311 Probab=84.31 E-value=0.061 Score=27.23 Aligned_cols=279 Identities=13% Similarity=0.061 Sum_probs=123.8 Q ss_pred hhhhhhhchhhhccccccccccccccchh-hhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCccccccccc Q lcl|NC_015280. 49 VLTEAPTNVGPINTPTTSSGAVAGFDPIL-ISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPD 127 (455) Q Consensus 49 ~l~ea~~~~~~~~~~st~tg~i~~~~P~L-v~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~ 127 (455) |--+ .+...-...|+++.. 
..-+.+ -.+++++.++.+..+++-+=||++.+--|. ++.. +.++ T Consensus 1 ma~~---~~~~~~~~~t~~gg~-lip~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~ip----~~~~--~~~a------ 64 (304) T protein:vir:94 1 MATP---TYTPGNVILSDFKNG-VIPAEQGTLIMKDIMANSAIMKLAKNEPMTAQKKKFT----YLAK--GVGA------ 64 (304) T ss_pred Cccc---ccccccccccCCCce-ecchhHHHHHHHHHHhccchhhhcceeeccCCceEEE----EEeC--Ccce------ Confidence 2111 011111111222221 122222 346666666777788888888887543221 1100 0000 Q ss_pred ccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeecccccc Q lcl|NC_015280. 128 AQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRA 207 (455) Q Consensus 128 t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKA 207 (455) -..++ +..+++-.-+++++++..|..+-.. T Consensus 65 ------------------------------------------~~v~E--------~~~~~~~~~~~~~i~~~~~k~~~~~ 94 (304) T protein:vir:94 65 ------------------------------------------YWVSE--------TERIQTSKPEYAQAEMEAKKIGVII 94 (304) T ss_pred ------------------------------------------EEeec--------CcccccccceeeEEEEEEEEEEEee Confidence 00001 1123344455667777777777788 Q ss_pred ceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeee-----ccccchhhHHHH Q lcl|NC_015280. 208 DYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLD-----VDSNGRWSVEKF 282 (455) Q Consensus 208 EYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~-----~~~~gr~~ve~~ 282 (455) .+|-||.+|- .+|.++.|.+-|...|...||+.+|.---+. +-.+....+++.-. ...++....+.. 
T Consensus 95 ~iS~ell~ds----~~~l~~~i~~~l~~~ia~~~d~~~l~G~g~~----~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~i 166 (304) T protein:vir:94 95 PLSKEFLKWT----AKDFFNEVKPLIAEAFYKAFDQAVIFGTKSP----YNTSTSGKPLVEGAEEKGNVVTDTNNLYVDL 166 (304) T ss_pred hhhHHHHhcc----hHHHHHHHHHHHHHHHHHHHHhhheeccCCC----cccccccccccccccccccccccccchHHHH Confidence 8999999875 3678888999898888888888887532110 00111111111100 001111222233 Q ss_pred HHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEEecc Q lcl|NC_015280. 283 KGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPY 362 (455) Q Consensus 283 k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y 362 (455) ..++.++. . .-....-++|++.....|.. +... +|..-+ .+. .|+|. +++||++.+ T Consensus 167 ~~~~~~l~--------~-~~~~~~~~v~~~~~~~~L~~---lkd~---~G~~l~----~~~----~~~l~-G~PV~~~~~ 222 (304) T protein:vir:94 167 SALMATIE--------D-EELDPNGVLTTRSFRSKMRN---ALDA---NDRPLF----DAN----GNEIM-GLPLSYTGA 222 (304) T ss_pred HHHHHHhh--------h-ccCCcCEEEEcHHHHHHHHH---hhcc---CCcEee----cCC----Ccccc-ceeeEEecc Confidence 33333332 1 12233457899999998875 2211 111111 111 24554 578888866 Q ss_pred ccccCCc--------ceEEEEEecCccccceeEEccccccc--ceeecCCcc-----cc---ceeeeeeecce-eecccc Q lcl|NC_015280. 363 SANVSDN--------QYYVVGYKGTNAYDAGLFYCPYVPLQ--MYRAIGQDT-----FQ---PRIGFKTRYGM-VLNPFA 423 (455) Q Consensus 363 ~~~~s~~--------dY~~vG~KG~~~~daglfyaPYv~l~--~~~~~Dp~s-----~q---P~~g~~tRY~l-~~nP~~ 423 (455) ....++. .++++|..+..+.+ ...+.. +....|++. || =.+=...||++ +.||-+ T Consensus 223 ~~~~~~~~~~~~gd~~~~~~~~~~~~~i~------~~~e~~~~~~~~~~~~g~~~~~f~~~~~~~r~~~r~~~~v~~~~a 296 (304) T protein:vir:94 223 DVYDKKKSLALMGDWDYARYGILQGIEYA------ISEDATLTTLQASDASGQPVSLFERDMFALRATMHIAYMNVKPEA 296 (304) T ss_pred cccCCCCcEEEEEehhhEEEEEecceEEE------EeecceeeeecccccCccchhhhhcCcEEEEEEEEeccEeecccc Confidence 5322211 12233333222110 011111 111112221 32 33334568888 566554 Q ss_pred cccccccccC Q lcl|NC_015280. 
424 KGLTALSDSD 433 (455) Q Consensus 424 ~~~~~~~~~~ 433 (455) - ..+...| T Consensus 297 ~--~~l~~a~ 304 (304) T protein:vir:94 297 F--ATLKPTE 304 (304) T ss_pred e--EEEEecC Confidence 4 2233333 No 59 >protein:vir:105905 Length: 304 # NCBI annotation: major capsid protein # Family: family:all:507 # MgeID: mge:1514 # MgeName: phiETA3 # Cross-refs: genbank:acc:YP_001004375;genbank:gi:122891830;genbank:GeneID:4712376 Probab=84.31 E-value=0.061 Score=27.23 Aligned_cols=279 Identities=13% Similarity=0.061 Sum_probs=123.8 Q ss_pred hhhhhhhchhhhccccccccccccccchh-hhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCccccccccc Q lcl|NC_015280. 49 VLTEAPTNVGPINTPTTSSGAVAGFDPIL-ISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPD 127 (455) Q Consensus 49 ~l~ea~~~~~~~~~~st~tg~i~~~~P~L-v~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~ 127 (455) |--+ .+...-...|+++.. ..-+.+ -.+++++.++.+..+++-+=||++.+--|. ++.. +.++ T Consensus 1 ma~~---~~~~~~~~~t~~gg~-lip~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~ip----~~~~--~~~a------ 64 (304) T protein:vir:10 1 MATP---TYTPGNVILSDFKNG-VIPAEQGTLIMKDIMANSAIMKLAKNEPMTAQKKKFT----YLAK--GVGA------ 64 (304) T ss_pred Cccc---ccccccccccCCCce-ecchhHHHHHHHHHHhccchhhhcceeeccCCceEEE----EEeC--Ccce------ Confidence 2111 011111111222221 122222 346666666777788888888887543221 1100 0000 Q ss_pred ccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeecccccc Q lcl|NC_015280. 128 AQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRA 207 (455) Q Consensus 128 t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKA 207 (455) -..++ +..+++-.-+++++++..|..+-.. 
T Consensus 65 ------------------------------------------~~v~E--------~~~~~~~~~~~~~i~~~~~k~~~~~ 94 (304) T protein:vir:10 65 ------------------------------------------YWVSE--------TERIQTSKPEYAQAEMEAKKIGVII 94 (304) T ss_pred ------------------------------------------EEeec--------CcccccccceeeEEEEEEEEEEEee Confidence 00001 1123344455667777777777788 Q ss_pred ceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeee-----ccccchhhHHHH Q lcl|NC_015280. 208 DYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLD-----VDSNGRWSVEKF 282 (455) Q Consensus 208 EYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~-----~~~~gr~~ve~~ 282 (455) .+|-||.+|- .+|.++.|.+-|...|...||+.+|.---+. +-.+....+++.-. ...++....+.. T Consensus 95 ~iS~ell~ds----~~~l~~~i~~~l~~~ia~~~d~~~l~G~g~~----~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~i 166 (304) T protein:vir:10 95 PLSKEFLKWT----AKDFFNEVKPLIAEAFYKAFDQAVIFGTKSP----YNTSTSGKPLVEGAEEKGNVVTDTNNLYVDL 166 (304) T ss_pred hhhHHHHhcc----hHHHHHHHHHHHHHHHHHHHHhhheeccCCC----cccccccccccccccccccccccccchHHHH Confidence 8999999875 3678888999898888888888887532110 00111111111100 001111222233 Q ss_pred HHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEEecc Q lcl|NC_015280. 283 KGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPY 362 (455) Q Consensus 283 k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y 362 (455) ..++.++. . .-....-++|++.....|.. +... +|..-+ .+. .|+|. +++||++.+ T Consensus 167 ~~~~~~l~--------~-~~~~~~~~v~~~~~~~~L~~---lkd~---~G~~l~----~~~----~~~l~-G~PV~~~~~ 222 (304) T protein:vir:10 167 SALMATIE--------D-EELDPNGVLTTRSFRSKMRN---ALDA---NDRPLF----DAN----GNEIM-GLPLSYTGA 222 (304) T ss_pred HHHHHHhh--------h-ccCCcCEEEEcHHHHHHHHH---hhcc---CCcEee----cCC----Ccccc-ceeeEEecc Confidence 33333332 1 12233457899999998875 2211 111111 111 24554 578888866 Q ss_pred ccccCCc--------ceEEEEEecCccccceeEEccccccc--ceeecCCcc-----cc---ceeeeeeecce-eecccc Q lcl|NC_015280. 
363 SANVSDN--------QYYVVGYKGTNAYDAGLFYCPYVPLQ--MYRAIGQDT-----FQ---PRIGFKTRYGM-VLNPFA 423 (455) Q Consensus 363 ~~~~s~~--------dY~~vG~KG~~~~daglfyaPYv~l~--~~~~~Dp~s-----~q---P~~g~~tRY~l-~~nP~~ 423 (455) ....++. .++++|..+..+.+ ...+.. +....|++. || =.+=...||++ +.||-+ T Consensus 223 ~~~~~~~~~~~~gd~~~~~~~~~~~~~i~------~~~e~~~~~~~~~~~~g~~~~~f~~~~~~~r~~~r~~~~v~~~~a 296 (304) T protein:vir:10 223 DVYDKKKSLALMGDWDYARYGILQGIEYA------ISEDATLTTLQASDASGQPVSLFERDMFALRATMHIAYMNVKPEA 296 (304) T ss_pred cccCCCCcEEEEEehhhEEEEEecceEEE------EeecceeeeecccccCccchhhhhcCcEEEEEEEEeccEeecccc Confidence 5322211 12233333222110 011111 111112221 32 33334568888 566554 Q ss_pred cccccccccC Q lcl|NC_015280. 424 KGLTALSDSD 433 (455) Q Consensus 424 ~~~~~~~~~~ 433 (455) - ..+...| T Consensus 297 ~--~~l~~a~ 304 (304) T protein:vir:10 297 F--ATLKPTE 304 (304) T ss_pred e--EEEEecC Confidence 4 2233333 No 60 >protein:vir:6242 Length: 390 # NCBI annotation: gp36 # Family: family:all:21 # MgeID: mge:131 # MgeName: phi-BT1 # Cross-refs: genbank:acc:NP_813696;swissprot:trembl:q859c1;genbank:gi:29366756;interpro:IPR006444;uniprot:Q859C1;genbank:GeneID:1258897 Probab=83.61 E-value=0.067 Score=27.02 Aligned_cols=320 Identities=15% Similarity=0.103 Sum_probs=118.4 Q ss_pred CcchHHHHHHhhHhhcCCCCccccchhhHHHHHHH-----hhhHHHHHH---------------HHHHh-----hhhhhh Q lcl|NC_015280. 1 MYNAENLQEKWAPVLNHEGLNDIKDPYRKSVTAIL-----LENQERALA---------------EERAV-----LTEAPT 55 (455) Q Consensus 1 m~~~~~~~~kw~~~l~~~~~~~i~~~~~~~v~~~~-----~enq~~~~~---------------e~~~~-----l~ea~~ 55 (455) ++ ++.+++|..+...-. .-.+.|-... ++.+...+. ++.+. +.|-.. 
T Consensus 32 lt--~e~~~~~~~l~~e~~------~l~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~r~~~~~~~r~ 103 (390) T protein:vir:62 32 MT--DEAREKEERLITAVS------DYDARIKRGIEAIKAIDPVTSLLSGLQGSGSGAQRSADVDDDATLRAGNLGEARS 103 (390) T ss_pred cc--HHHHHHHHHHHHHHH------HHHHHHHHHHHHHHHHHHHHHHHhhcccccccchhhcchHHHHHHhhhhhhhhHH Confidence 11 122233322211000 0000000000 000000000 00000 000000 Q ss_pred -chhhhcccccccccc-ccccchhhhHHHHHH-hhhhhhheeeeccCCCcceeeeEEEeeecCCCCcccccccccccccc Q lcl|NC_015280. 56 -NVGPINTPTTSSGAV-AGFDPILISLIRRAM-PKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPDAQFSG 132 (455) Q Consensus 56 -~~~~~~~~st~tg~i-~~~~P~Lv~l~RRa~-p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~t~fSg 132 (455) .........+++++- .-..+..-.++..+. ...+...++-|-||++...+-+.... + T Consensus 104 ~~~~~~~~~~t~~~~g~~~~~~~~~~~i~~~~~~~~~l~~~~~~~~~~~~~~~~~p~~~--------------------~ 163 (390) T protein:vir:62 104 FEFAPEKRDGTKAGNPNVLSRTLYGQLIAQAVERSAIMRGGATTFTTSDANPLDFTVIT--------------------G 163 (390) T ss_pred HHhhhhhhcccccCCCccccccchHHHHHHHHhhhhhhhhcceeeecCCCceeEEEEEc--------------------C Confidence 000000111222111 111112212222221 12244566667677665544443320 0 Q ss_pred cccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeeccccccceeHH Q lcl|NC_015280. 133 TDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRADYSVE 212 (455) Q Consensus 133 ~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiE 212 (455) .. .+...++.+ .+++-.-++++++..+|.-+-...+|-| T Consensus 164 ~~---------------------------------~a~wv~E~~--------~~~~~~~~f~~i~~~~~k~~~~~~iS~e 202 (390) T protein:vir:62 164 RS---------------------------------SASIVGETA--------EIPESYPATAQRSMGGFKYGFASVVSYE 202 (390) T ss_pred Cc---------------------------------ceeeecccc--------cccccccceeeeEeeeeeEEeehHHHHH Confidence 00 000111111 2333334456666666767777789999 Q ss_pred HHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeecccc--------chhhHHHHHH Q lcl|NC_015280. 
213 LAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDVDSN--------GRWSVEKFKG 284 (455) Q Consensus 213 LAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~~~~--------gr~~ve~~k~ 284 (455) |.+|- .+|.+++|.+-|+..|..-+|..||.- -| .+.|++....... +.-....... T Consensus 203 ll~ds----~~~l~~~i~~~l~~~i~~~~d~~~l~G------~G-----~p~Gi~~~~~~~~~~~~~~~~~~~~~~~l~~ 267 (390) T protein:vir:62 203 FATDQ----VLDLVGFLVSDAGPAIGDAMGRHFITG------TG-----QPRGILTDASPATATFLATDTDSKVSDALID 267 (390) T ss_pred HHhhh----hHHHHHHHHHHHHHHHHHHHHhhhhcc------CC-----ccccccccccccccceecccccccchHHHHH Confidence 99993 367899999999999999999998842 01 1222222110000 0000111222 Q ss_pred HHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEEecccc Q lcl|NC_015280. 285 LLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSA 364 (455) Q Consensus 285 l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~ 364 (455) |.+.+. .. -+..+. .|+++.....|.. |... +|. .....+.+.. .-++|.| ++|+++.+. T Consensus 268 ~~~~l~-------~~-~~~~a~-~vmn~~~~~~L~~---lkd~---~g~--~l~~~~~~~g-~~~~l~G-~Pv~~~~~~- 327 (390) T protein:vir:62 268 LFHEVP-------SA-YRANAK-YVVNDLRAAQMRK---LKDA---NGQ--YLWQSGLTVG-APSLFNG-KVVETDDGM- 327 (390) T ss_pred HHHhhh-------hh-hhcCCE-EEEchHHHHHHHH---hhcc---CCC--eeecCCcCCC-ccceecc-cceEEecCC- Confidence 322222 11 223344 5778887777764 2211 111 0111111111 1135654 688887654 Q ss_pred ccCCcceEEEEEecCccccceeEEcccccccceeecCCcc--ccceeeeeeecce-eecccccccccccccC Q lcl|NC_015280. 365 NVSDNQYYVVGYKGTNAYDAGLFYCPYVPLQMYRAIGQDT--FQPRIGFKTRYGM-VLNPFAKGLTALSDSD 433 (455) Q Consensus 365 ~~s~~dY~~vG~KG~~~~daglfyaPYv~l~~~~~~Dp~s--~qP~~g~~tRY~l-~~nP~~~~~~~~~~~~ 433 (455) |.+=+++|- -. -.+...--.+...+..|+-. -|=.+=+..|++. +.||-+.-.-.+..+. 
T Consensus 328 ---p~~~i~~gd---~s---~~~i~~~~~~~v~~~~~~~~~~~~~~~~~~~r~d~~~~~~~A~~~l~~~~~a 390 (390) T protein:vir:62 328 ---PADKILFAD---LS---KYRVRFAGSLRVDRSVDAKFSTDQIVYRFLQRADGLLVDARGAKVLTVTPGA 390 (390) T ss_pred ---CCccEEEee---cc---ceeEEeecceEEEeeccccccCCcEEEEEEEEeCcEeechhheEEEEeecCC Confidence 333333331 10 00000000111112223322 2223334567776 6666665332222222 No 61 >protein:vir:7855 Length: 497 # NCBI annotation: gp12 # Family: family:all:585 # MgeID: mge:150 # MgeName: CJW1 # Cross-refs: genbank:acc:NP_817462;genbank:gi:29565891;genbank:GeneID:1259081 Probab=83.60 E-value=0.067 Score=27.01 Aligned_cols=349 Identities=14% Similarity=0.048 Sum_probs=130.7 Q ss_pred CcchHHHHHHhhHhhc-------------CCCCcc-ccchhhHHHHHHHhhhHHHH----------------------HH Q lcl|NC_015280. 1 MYNAENLQEKWAPVLN-------------HEGLND-IKDPYRKSVTAILLENQERA----------------------LA 44 (455) Q Consensus 1 m~~~~~~~~kw~~~l~-------------~~~~~~-i~~~~~~~v~~~~~enq~~~----------------------~~ 44 (455) ..+.+++..++..++. .....+ .....+......-+.++... .. T Consensus 53 ~~~~~~~~~~~~~~~a~~~~~~~~~~~~e~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 132 (497) T protein:vir:78 53 HERAQEMLKSLGGADAAKDGLDNDIPEVEVRNLKQIRKHLARAVIMNPELKNATSFEKGTKFDVSFNVSAKAADPGTAAA 132 (497) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhHHHHHHHHHhhhHHHHhhhhhhhhhhhhhhhhhhhhhhhhHHHHH Confidence 1111111111111111 000000 00000000000000000000 00 Q ss_pred HHHHhhhh---hhhchhhhccccccccccc---cccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCC Q lcl|NC_015280. 45 EERAVLTE---APTNVGPINTPTTSSGAVA---GFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSG 118 (455) Q Consensus 45 e~~~~l~e---a~~~~~~~~~~st~tg~i~---~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG 118 (455) +.+..... +..-...+...+++++... .+.+.++.+.| +.....+++.+-||+++..- |... .. 
T Consensus 133 ~~~~~~~~~~~~~~~~~~~~~~~~~~gg~~vp~~~~~~ii~~~~---~~~~i~~l~~~~~~~~~~~~-~~~~--~~---- 202 (497) T protein:vir:78 133 ELMGAFADGETAPAAIGQNPFGSTGTFAPGILPTFLPGIVEQLF---YELSLADLISSRPVTSPNLS-YLTE--SA---- 202 (497) T ss_pred HHHHHHhhhhhhHHHHHhhhcccCcccccccchhhhHHHHHHHH---hhhhHHhhccccccCCCceE-EEEE--cC---- Confidence 00000000 0000111112222333321 23344444444 45567899999999987532 2221 00 Q ss_pred cccccccccccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEE Q lcl|NC_015280. 119 NEAFFDEPDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAV 198 (455) Q Consensus 119 ~EAlfnEa~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tV 198 (455) .++ . +-..++ +..++|...+++++++ T Consensus 203 -----------~~~--~---------------------------------a~wv~E--------~~~~~~s~~~f~~i~~ 228 (497) T protein:vir:78 203 -----------AHN--N---------------------------------AAAVAE--------AGTYPFSSEEFARVYE 228 (497) T ss_pred -----------CCC--c---------------------------------ceeecc--------CcccccccccceeeEe Confidence 000 0 000011 1224455566677777 Q ss_pred EeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHH--------Hhhhheeeeeecccc------- Q lcl|NC_015280. 199 EAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRT--------VYRGAKPGAQANVAN------- 263 (455) Q Consensus 199 tAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~--------l~~vA~~~k~~~v~~------- 263 (455) .+|.-+-...+|-||++|-- +.++.|.+-|+..|..-+|+.||.- |.+.+....+..... T Consensus 229 ~~~k~a~~~~iS~ell~d~~-----~l~~~i~~~l~~~i~~~~d~~~l~G~G~~~p~Gil~~~~~~~~~~~~~~~~~~~~ 303 (497) T protein:vir:78 229 QVGKVANALTITDEGLRDAP-----ELFNFVQGRLLEGIQRKEEVQLLAGGGYPGVNGLLQRSTGFTASSASSLFGATSA 303 (497) T ss_pred eeeeeEeecHhHHHHHHhHH-----HHHHHHHHHHHHHHHHHHHHHhhcCCCcccccccccccccccccccccchhhhhh Confidence 77777777889999999942 3789999999999999999988862 222211111110000 Q ss_pred -ceeeeeeccccchhhHH-----HHHHHH----------------------HHHHHHHHHHHHHhcCCCccEEEEchhHH Q lcl|NC_015280. 
264 -AGVFDLDVDSNGRWSVE-----KFKGLL----------------------FQIERDANAIAQETRRGKGNIIITSADVA 315 (455) Q Consensus 264 -~gv~Dl~~~~~gr~~ve-----~~k~l~----------------------~qi~~ean~i~~~T~~~~gn~~v~S~~va 315 (455) .+..++..+..+.|.+. ..+... ..-...+-...+.+....++-+|.++.-. T Consensus 304 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~vmn~~~~ 383 (497) T protein:vir:78 304 TVSNVKFPADGTNGAFVGQDTVASLKYGRVVTGAAGSGSGVAGSYPTAAEIAENVFDAFVDIQLTLFQTPNAVVMNPRDW 383 (497) T ss_pred hhhhhhhhcccccchhhhhhHHHHHHHHHhhhhhhhhccchhccccchhhhhhHHHHHHhhhhhhcccCCCeEEEchHHH Confidence 00001111111111111 000000 00011112223345556667778888877 Q ss_pred HHHHhh----cccccccccccccccccccccCCceeEEEecCceEEEEeccccccCCcceEEEEEecCcc------ccce Q lcl|NC_015280. 316 SALAMS----GVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQYYVVGYKGTNA------YDAG 385 (455) Q Consensus 316 ~~L~~s----G~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~KG~~~------~dag 385 (455) ..|... |-+-+.|...+..+ ......++|. +++|++.+... .+=+++|--.... .+-. T Consensus 384 ~~l~~lkd~~G~~i~~~~~~~~~~-------~~~~~~~~l~-G~pV~~t~~~~----~~~~~~Gd~~~~~~~i~~r~~~~ 451 (497) T protein:vir:78 384 ELLRLTKDANGQYMGGNFFGNAYG-------NPVNGGKNIW-GVPVVTTPLIP----LGTILVGHFAPSVIQTARREGVT 451 (497) T ss_pred HHHHHhhcCCCceeccCccccccc-------ccccCCceee-ceeeEecCCCC----CCceEEeecccceEEEEEecccE Confidence 776542 22222121111111 0011122454 57887776542 2222233110000 0011 Q ss_pred eEEcccccccceeecCCccccceeeeeeecce-eecccccccccccccCchhhhhccch Q lcl|NC_015280. 386 LFYCPYVPLQMYRAIGQDTFQPRIGFKTRYGM-VLNPFAKGLTALSDSDPQAAGNLNAN 443 (455) Q Consensus 386 lfyaPYv~l~~~~~~Dp~s~qP~~g~~tRY~l-~~nP~~~~~~~~~~~~~~~~~~~~~n 443 (455) +-..||..- +=.+.|=.+=+..|+++ +.+|-+--.-++..+. 
-.| T Consensus 452 v~~~~~~~~------~f~~n~v~~r~~~r~~~~v~~p~A~~~l~~~~~~-------~~~ 497 (497) T protein:vir:78 452 MQMTNSNGT------DFVDGKVTVRAEERLGLLVYRPSAFQLIQLKKGA-------TGS 497 (497) T ss_pred EEeecccch------hhhcCcEEEEEEEeecceeeccccEEEEEecCCc-------cCC Confidence 111222100 01122334444678977 7788665322221110 111 No 62 >protein:vir:101650 Length: 497 # NCBI annotation: gp13 # Family: family:all:585 # MgeID: mge:1515 # MgeName: 244 # Cross-refs: genbank:acc:YP_654768;genbank:gi:109302766;genbank:GeneID:4156084 Probab=83.60 E-value=0.067 Score=27.01 Aligned_cols=349 Identities=14% Similarity=0.048 Sum_probs=130.7 Q ss_pred CcchHHHHHHhhHhhc-------------CCCCcc-ccchhhHHHHHHHhhhHHHH----------------------HH Q lcl|NC_015280. 1 MYNAENLQEKWAPVLN-------------HEGLND-IKDPYRKSVTAILLENQERA----------------------LA 44 (455) Q Consensus 1 m~~~~~~~~kw~~~l~-------------~~~~~~-i~~~~~~~v~~~~~enq~~~----------------------~~ 44 (455) ..+.+++..++..++. .....+ .....+......-+.++... .. T Consensus 53 ~~~~~~~~~~~~~~~a~~~~~~~~~~~~e~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 132 (497) T protein:vir:10 53 HERAQEMLKSLGGADAAKDGLDNDIPEVEVRNLKQIRKHLARAVIMNPELKNATSFEKGTKFDVSFNVSAKAADPGTAAA 132 (497) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhHHHHHHHHHhhhHHHHhhhhhhhhhhhhhhhhhhhhhhhhHHHHH Confidence 1111111111111111 000000 00000000000000000000 00 Q ss_pred HHHHhhhh---hhhchhhhccccccccccc---cccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCC Q lcl|NC_015280. 45 EERAVLTE---APTNVGPINTPTTSSGAVA---GFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSG 118 (455) Q Consensus 45 e~~~~l~e---a~~~~~~~~~~st~tg~i~---~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG 118 (455) +.+..... +..-...+...+++++... .+.+.++.+.| +.....+++.+-||+++..- |... .. 
T Consensus 133 ~~~~~~~~~~~~~~~~~~~~~~~~~~gg~~vp~~~~~~ii~~~~---~~~~i~~l~~~~~~~~~~~~-~~~~--~~---- 202 (497) T protein:vir:10 133 ELMGAFADGETAPAAIGQNPFGSTGTFAPGILPTFLPGIVEQLF---YELSLADLISSRPVTSPNLS-YLTE--SA---- 202 (497) T ss_pred HHHHHHhhhhhhHHHHHhhhcccCcccccccchhhhHHHHHHHH---hhhhHHhhccccccCCCceE-EEEE--cC---- Confidence 00000000 0000111112222333321 23344444444 45567899999999987532 2221 00 Q ss_pred cccccccccccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEE Q lcl|NC_015280. 119 NEAFFDEPDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAV 198 (455) Q Consensus 119 ~EAlfnEa~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tV 198 (455) .++ . +-..++ +..++|...+++++++ T Consensus 203 -----------~~~--~---------------------------------a~wv~E--------~~~~~~s~~~f~~i~~ 228 (497) T protein:vir:10 203 -----------AHN--N---------------------------------AAAVAE--------AGTYPFSSEEFARVYE 228 (497) T ss_pred -----------CCC--c---------------------------------ceeecc--------CcccccccccceeeEe Confidence 000 0 000011 1224455566677777 Q ss_pred EeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHH--------Hhhhheeeeeecccc------- Q lcl|NC_015280. 199 EAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRT--------VYRGAKPGAQANVAN------- 263 (455) Q Consensus 199 tAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~--------l~~vA~~~k~~~v~~------- 263 (455) .+|.-+-...+|-||++|-- +.++.|.+-|+..|..-+|+.||.- |.+.+....+..... T Consensus 229 ~~~k~a~~~~iS~ell~d~~-----~l~~~i~~~l~~~i~~~~d~~~l~G~G~~~p~Gil~~~~~~~~~~~~~~~~~~~~ 303 (497) T protein:vir:10 229 QVGKVANALTITDEGLRDAP-----ELFNFVQGRLLEGIQRKEEVQLLAGGGYPGVNGLLQRSTGFTASSASSLFGATSA 303 (497) T ss_pred eeeeeEeecHhHHHHHHhHH-----HHHHHHHHHHHHHHHHHHHHHhhcCCCcccccccccccccccccccccchhhhhh Confidence 77777777889999999942 3789999999999999999988862 222211111110000 Q ss_pred -ceeeeeeccccchhhHH-----HHHHHH----------------------HHHHHHHHHHHHHhcCCCccEEEEchhHH Q lcl|NC_015280. 
264 -AGVFDLDVDSNGRWSVE-----KFKGLL----------------------FQIERDANAIAQETRRGKGNIIITSADVA 315 (455) Q Consensus 264 -~gv~Dl~~~~~gr~~ve-----~~k~l~----------------------~qi~~ean~i~~~T~~~~gn~~v~S~~va 315 (455) .+..++..+..+.|.+. ..+... ..-...+-...+.+....++-+|.++.-. T Consensus 304 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~vmn~~~~ 383 (497) T protein:vir:10 304 TVSNVKFPADGTNGAFVGQDTVASLKYGRVVTGAAGSGSGVAGSYPTAAEIAENVFDAFVDIQLTLFQTPNAVVMNPRDW 383 (497) T ss_pred hhhhhhhhcccccchhhhhhHHHHHHHHHhhhhhhhhccchhccccchhhhhhHHHHHHhhhhhhcccCCCeEEEchHHH Confidence 00001111111111111 000000 00011112223345556667778888877 Q ss_pred HHHHhh----cccccccccccccccccccccCCceeEEEecCceEEEEeccccccCCcceEEEEEecCcc------ccce Q lcl|NC_015280. 316 SALAMS----GVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQYYVVGYKGTNA------YDAG 385 (455) Q Consensus 316 ~~L~~s----G~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~KG~~~------~dag 385 (455) ..|... |-+-+.|...+..+ ......++|. +++|++.+... .+=+++|--.... .+-. T Consensus 384 ~~l~~lkd~~G~~i~~~~~~~~~~-------~~~~~~~~l~-G~pV~~t~~~~----~~~~~~Gd~~~~~~~i~~r~~~~ 451 (497) T protein:vir:10 384 ELLRLTKDANGQYMGGNFFGNAYG-------NPVNGGKNIW-GVPVVTTPLIP----LGTILVGHFAPSVIQTARREGVT 451 (497) T ss_pred HHHHHhhcCCCceeccCccccccc-------ccccCCceee-ceeeEecCCCC----CCceEEeecccceEEEEEecccE Confidence 776542 22222121111111 0011122454 57887776542 2222233110000 0011 Q ss_pred eEEcccccccceeecCCccccceeeeeeecce-eecccccccccccccCchhhhhccch Q lcl|NC_015280. 386 LFYCPYVPLQMYRAIGQDTFQPRIGFKTRYGM-VLNPFAKGLTALSDSDPQAAGNLNAN 443 (455) Q Consensus 386 lfyaPYv~l~~~~~~Dp~s~qP~~g~~tRY~l-~~nP~~~~~~~~~~~~~~~~~~~~~n 443 (455) +-..||..- +=.+.|=.+=+..|+++ +.+|-+--.-++..+. 
-.| T Consensus 452 v~~~~~~~~------~f~~n~v~~r~~~r~~~~v~~p~A~~~l~~~~~~-------~~~ 497 (497) T protein:vir:10 452 MQMTNSNGT------DFVDGKVTVRAEERLGLLVYRPSAFQLIQLKKGA-------TGS 497 (497) T ss_pred EEeecccch------hhhcCcEEEEEEEeecceeeccccEEEEEecCCc-------cCC Confidence 111222100 01122334444678977 7788665322221110 111 No 63 >protein:vir:8102 Length: 543 # NCBI annotation: gp6 # Family: family:all:21 # MgeID: mge:152 # MgeName: Che9c # Cross-refs: genbank:acc:NP_817683;genbank:gi:29566114;genbank:GeneID:1259308 Probab=83.60 E-value=0.067 Score=27.01 Aligned_cols=324 Identities=13% Similarity=0.033 Sum_probs=116.2 Q ss_pred Ccc-----hHHHHHHhhHhh-----------------cCCCCccccchhhHHHHHHHhhhHHHHHHH-HHHhhhhhhhch Q lcl|NC_015280. 1 MYN-----AENLQEKWAPVL-----------------NHEGLNDIKDPYRKSVTAILLENQERALAE-ERAVLTEAPTNV 57 (455) Q Consensus 1 m~~-----~~~~~~kw~~~l-----------------~~~~~~~i~~~~~~~v~~~~~enq~~~~~e-~~~~l~ea~~~~ 57 (455) |.. .+.+..+...+- .......-....++.....+.+.+...++. ++..+.++... T Consensus 173 ~~~~~~~~~e~l~~~~e~~~~~~~~~~~~~d~~e~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~l~~~e~~~~~~~~~~- 251 (543) T protein:vir:81 173 LRARALSAIEKMQGASDNVRAAATKIIERFDDEDSTLARQCLATSSPAYLRAWSKMARNPHAAILTEEEKRAINEVRAM- 251 (543) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhhhhhHHHHHHHhhHHHHhhhhhhhhhhhhhhc- Confidence 110 011111111100 000000001112222222222222222221 12222222100 Q ss_pred hhhccccccccccccccchhhhHHHHHHhh-hhhhheeeeccCCCcceeeeEEEeeecCCCCcccccccccccccccccc Q lcl|NC_015280. 58 GPINTPTTSSGAVAGFDPILISLIRRAMPK-LIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPDAQFSGTDGA 136 (455) Q Consensus 58 ~~~~~~st~tg~i~~~~P~Lv~l~RRa~p~-LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~t~fSg~~~~ 136 (455) ..++++|.+.--....-.++.+.... -+...++-|.|++|..- +- + . ..+. 
T Consensus 252 ----~~t~~~gg~lip~~~~~~ii~~~~~~~~~l~~~~~~~~~~g~~~--~~-~--~--~~~~----------------- 303 (543) T protein:vir:81 252 ----GLTKADGGYLVPFQLDPTVIITSNGSLNDIRRFARQVVATGDVW--HG-V--S--SAAV----------------- 303 (543) T ss_pred ----ccccccCcccCchhhhhHHHHHHHhhhchhhhhcccccCCcceE--EE-E--e--cCCc----------------- Confidence 01111222111111111222222211 23344455555544321 10 1 0 0000 Q ss_pred cccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeeccccccceeHHHHHh Q lcl|NC_015280. 137 TPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQD 216 (455) Q Consensus 137 ~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQD 216 (455) .+...++++ .+++-..+++.++++++.-+=...+|-||.+| T Consensus 304 -------------------------------~a~~v~Eg~--------~~~~~~~~~~~i~~~~~k~~~~~~is~ell~d 344 (543) T protein:vir:81 304 -------------------------------QWSWDAEFE--------EVSDDSPEFGQPEIPVKKAQGFVPISIEALQD 344 (543) T ss_pred -------------------------------ceeecccCc--------cccccccccceeeeeeeeeEeeehhhHHHHhc Confidence 000011111 12223334556666666666677899999987 Q ss_pred HHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeee--------eeccccchhhHHHHHHHHHH Q lcl|NC_015280. 217 LKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFD--------LDVDSNGRWSVEKFKGLLFQ 288 (455) Q Consensus 217 LkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~D--------l~~~~~gr~~ve~~k~l~~q 288 (455) - + |.++.|.+-|...|...+|+-||.- .-.+-...|++- ......+-...+....+... T Consensus 345 ~-~----~~~~~i~~~l~~~~~~~~d~ail~G--------~Gt~~~p~Gi~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 411 (543) T protein:vir:81 345 E-A----NVTETVALLFAEGKDELEAVTLTTG--------TGQGNQPTGIVTALAGTAAEIAPVTAETFALADVYAVYEQ 411 (543) T ss_pred c-H----HHHHHHHHHHHHHHHHHHHHHHhcc--------CCCCcccccchhhcccccccccccccccccHHHHHHHHHh Confidence 3 2 6899999999999999999888732 000001111110 11111111122333344333 Q ss_pred HHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEEeccccccC- Q lcl|NC_015280. 
289 IERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVS- 367 (455) Q Consensus 289 i~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s- 367 (455) +.-. -+.. ..+|+|+.+...|.. +... +|..-+ .....+. -++|. +++|++..++..+. T Consensus 412 l~~~--------~~~~-~~~v~n~~~~~~l~~---lkd~---~G~~l~--~~~~~g~--~~~l~-G~pv~~~~~~~~~~~ 471 (543) T protein:vir:81 412 LAAR--------HRRQ-GAWLANNLIYNKIRQ---FDTQ---GGAGLW--TTIGNGE--PSQLL-GRPVGEAEAMDANWN 471 (543) T ss_pred hhcc--------ccCC-cEEEEcHHHHHHHHH---hhcC---CCceec--cCcCCCC--Ccccc-ceeeEEecccccccc Confidence 3311 1112 246889999888875 2211 111111 0011111 24564 47888776532110 Q ss_pred ----Ccce-EEEEEecCccccceeEEcccccccceeecCCc--------cccceeeeeeecce-eecccccccccccccC Q lcl|NC_015280. 368 ----DNQY-YVVGYKGTNAYDAGLFYCPYVPLQMYRAIGQD--------TFQPRIGFKTRYGM-VLNPFAKGLTALSDSD 433 (455) Q Consensus 368 ----~~dY-~~vG~KG~~~~daglfyaPYv~l~~~~~~Dp~--------s~qP~~g~~tRY~l-~~nP~~~~~~~~~~~~ 433 (455) ..++ +++|-- +.+++... ..+...+||. ..+=.+=+..|+|. +.||-+--.-++.... T Consensus 472 ~~~~~~~~~i~~gd~------~~~~i~~~--~~~~i~~~~~~~~~~~~~~~~~~~~~~~r~d~~v~~~~A~~~l~~~~~a 543 (543) T protein:vir:81 472 TSASADNFVLLYGNF------QNYVIADR--IGMTVEFIPHLFGTNRRPNGSRGWFAYYRMGADVVNPNAFRLLNVETAS 543 (543) T ss_pred ccccCCcceEEEeec------cceeEEee--cccEEEEeccccccchhhcCceEEEEEEeeccEeecccceEEEEecccC Confidence 0111 111110 00111000 0111122332 23344555667887 6666554322222211 No 64 >protein:vir:1433 Length: 435 # NCBI annotation: putative major capsid protein # Family: family:all:21 # MgeID: mge:30 # MgeName: phiE125 # Cross-refs: genbank:acc:NP_536362;genbank:gi:17975167;genbank:GeneID:929171 Probab=83.57 E-value=0.067 Score=27.01 Aligned_cols=335 Identities=12% Similarity=0.049 Sum_probs=115.6 Q ss_pred Cc-chHHHHHHhhHh-------------hcCC-------CCccc-----cchhhHHHHHHHhhhHH---HHHHHH-HHh- Q lcl|NC_015280. 
1 MY-NAENLQEKWAPV-------------LNHE-------GLNDI-----KDPYRKSVTAILLENQE---RALAEE-RAV- 49 (455) Q Consensus 1 m~-~~~~~~~kw~~~-------------l~~~-------~~~~i-----~~~~~~~v~~~~~enq~---~~~~e~-~~~- 49 (455) |- +-+.|.++..-+ .+.. +.+.. .+..|..-+.+.+.... .+.... +.. T Consensus 41 l~~ei~~l~~~I~~~e~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 120 (435) T protein:vir:14 41 LSSKFSELTAQIERAEAAERMAAAAAVPVDPNPTAVAAPAAAPVHAQPKALEVKGAKMARMVRALAAARGDAQLASKLAI 120 (435) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHhhcccccchhhhhhhccccccccccchhhhhHHHHHHHHHHHHhhcchhhHHHHHHH Confidence 11 112222221111 0000 00000 00011111111111000 000000 000 Q ss_pred ---hhhhhhchhhhccccccccccccccchhhhHHHHHHhhhhhhhe-eeeccCCCcceeeeEEEeeecCCCCccccccc Q lcl|NC_015280. 50 ---LTEAPTNVGPINTPTTSSGAVAGFDPILISLIRRAMPKLIAYDI-AGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDE 125 (455) Q Consensus 50 ---l~ea~~~~~~~~~~st~tg~i~~~~P~Lv~l~RRa~p~LIa~DI-~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnE 125 (455) +.+...+ .+...++..|...--....-.++.++.++.+..++ +=+-||+... +-+... . ++.+ T Consensus 121 ~~~~~~~~~~--~~~~~t~~~gg~~vP~~~~~~ii~~l~~~~~i~~~~~~~~~~~~~~-~~~p~~---~--~~~~----- 187 (435) T protein:vir:14 121 ERGFGEEVAM--SLNTLSPGAGGVLVPENLSSEVIELLRPKSVVRKLGARTLPLSNGN-ITIPRL---K--GGAI----- 187 (435) T ss_pred hhhhhhhhhh--hcccCCcCCCccccchhHHHHHHHHHhhhchhhhhcceeeecCCCc-eEEEEE---e--CCcc----- Confidence 0010000 01111111121111111111244444455555554 2233433221 111110 0 0000 Q ss_pred ccccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeecccc Q lcl|NC_015280. 
126 PDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRAL 205 (455) Q Consensus 126 a~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaL 205 (455) +...++ +..+++..-++++++..++.-+- T Consensus 188 -------------------------------------------a~~v~E--------~~~~~~~~~~f~~i~~~~~k~~~ 216 (435) T protein:vir:14 188 -------------------------------------------VGYIGA--------DTDIPTTQQQFDDLKLTAKKMAA 216 (435) T ss_pred -------------------------------------------eeeecc--------CccccccccceeEEEeeeEEEEE Confidence 000000 11233344455666666666666 Q ss_pred ccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeeccc------c-chh- Q lcl|NC_015280. 206 RADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDVDS------N-GRW- 277 (455) Q Consensus 206 KAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~~~------~-gr~- 277 (455) ....|-||.+|-. .+.+.|+.|.+-|+..|...+|+-||.- .-.+-...|++...... + +.+ T Consensus 217 ~~~iS~ell~ds~--~~~~l~~~i~~~l~~ai~~~~d~a~l~G--------~G~~~~p~Gi~~~~~~~~~~~~~~~~~~~ 286 (435) T protein:vir:14 217 LVPIANDLIKYAG--VNPNVDQIVVGDLTAAIGAREDKAFIRD--------DGTANTPKGLRFWALPSNVITASDASTLQ 286 (435) T ss_pred eehhhHHHHHhhc--cCHHHHHHHHHHHHHHHHHHHHHHhhcc--------CCCCccccceeecccccceeccccccchh Confidence 7789999999932 1234677777777777777777777621 11111133333211110 0 000 Q ss_pred -hHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceE Q lcl|NC_015280. 278 -SVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFK 356 (455) Q Consensus 278 -~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~ 356 (455) .......|+..+. .--......-+|+++.....|... .. .+|..-+ .+.+ .|+|. 
+++ T Consensus 287 ~~~~~~~~l~~~~~-------~~~~~~~~~~~v~n~~~~~~L~~l---kd---~~G~~l~---~~~~----~g~l~-G~P 345 (435) T protein:vir:14 287 KIETDLGKVILALE-------NADANLTQPGWIMAPRTFRFLEGL---RD---GNGNKVY---PELA----NGMLK-GYP 345 (435) T ss_pred hHHHHHHHHHHHhh-------hccccccCCEEEEcHHHHHHHHHh---hc---cCCceec---cCCC----CCeee-cce Confidence 0111122222211 111122334568899999888752 21 1111111 1112 35664 478 Q ss_pred EEEecccccc----CCcceEE--------EEEecCccccceeEEcccccccceeecCCccc---cceeeeeeecce-eec Q lcl|NC_015280. 357 VYIDPYSANV----SDNQYYV--------VGYKGTNAYDAGLFYCPYVPLQMYRAIGQDTF---QPRIGFKTRYGM-VLN 420 (455) Q Consensus 357 vy~D~y~~~~----s~~dY~~--------vG~KG~~~~daglfyaPYv~l~~~~~~Dp~s~---qP~~g~~tRY~l-~~n 420 (455) |+++.+...+ .+..-++ +|..+.-+ +-..||.-........-..| |=.+=...|++. +.+ T Consensus 346 v~~~~~~p~~~~~~~~~~~i~~gd~s~~~i~~~~~~~----~~~~~~~~~~~~~~~~~~~f~~~~~~~r~~~r~d~~~~~ 421 (435) T protein:vir:14 346 VGKTTQVPINLGETGKESEIYFTDFGDVFIGEEETLE----IDYSKEATYKDADGHMVSAFQRDQTLIRVIAKNDFGPRH 421 (435) T ss_pred eEeeccccccccCCCccceEEEeecccEEEEEecccE----EEEeccccccccccchhhhhhcChhheeeeeeeCceeec Confidence 8887553110 0111122 23222222 23333321110000000001 223335567777 555 Q ss_pred ccccccccccccCchhh Q lcl|NC_015280. 421 PFAKGLTALSDSDPQAA 437 (455) Q Consensus 421 P~~~~~~~~~~~~~~~~ 437 (455) |-+.. ..+|-+|++ T Consensus 422 ~~a~~---~l~~~~~~~ 435 (435) T protein:vir:14 422 VESIA---VLAGVAWGA 435 (435) T ss_pred ccceE---EEecCCCCC Confidence 55432 236777886 No 65 >protein:vir:485 Length: 407 # NCBI annotation: putative major capsid protein # Family: family:all:21 # MgeID: mge:11 # MgeName: P27 # Cross-refs: genbank:acc:NP_543092;swissprot:trembl:q8w627;genbank:gi:18249904;uniprot:Q8W627;genbank:GeneID:929693 Probab=83.24 E-value=0.07 Score=26.91 Aligned_cols=335 Identities=13% Similarity=0.156 Sum_probs=127.9 Q ss_pred Ccch-----------HHHHHHhhHhhcC-CC-----------------------------CccccchhhHHHHHHHhhhH Q lcl|NC_015280. 
1 MYNA-----------ENLQEKWAPVLNH-EG-----------------------------LNDIKDPYRKSVTAILLENQ 39 (455) Q Consensus 1 m~~~-----------~~~~~kw~~~l~~-~~-----------------------------~~~i~~~~~~~v~~~~~enq 39 (455) +..- +++++++.-+... +. ..+....+|+.+...+...+ T Consensus 15 ~~~~~~~k~~~~~~~~~~e~~~~~l~~~~e~~~~~~~~~e~~~~~~~~~~~~~~~~~~~~~~~~~~e~~~a~~~~l~~g~ 94 (407) T protein:vir:48 15 QRKFDDFKEKNDKRIDAIEQEKGKLAGEVETLNGKLAELENLKSDLEAELAEVKRPAGGTQNKVASEHKEAFIGFMRKGR 94 (407) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhccccccccchhhHHHHHHHHHHhccc Confidence 0000 1111111111000 00 00111112222222222111 Q ss_pred HHHHHH-HHHhhhhhhhchhhhcccccccccc---ccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecC Q lcl|NC_015280. 40 ERALAE-ERAVLTEAPTNVGPINTPTTSSGAV---AGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTN 115 (455) Q Consensus 40 ~~~~~e-~~~~l~ea~~~~~~~~~~st~tg~i---~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~ 115 (455) ...+++ |++.| ...+.++|.+ ..+.+.++.+.| ...+-.+++.+-||++++.-++-.. T Consensus 95 ~~~~~~~e~~a~----------~~~t~~~gG~~iP~~~~~~I~~~~~---~~~~l~~~~~~~~~~~~~~~~~~~~----- 156 (407) T protein:vir:48 95 EDGLRELERKAL----------QVGNDEDGGYAIPEELDRTILTLLK---DEVVMRQEATVITLGGSDYKKLVNL----- 156 (407) T ss_pred hhhhhHHHHHhh----------hcccCCCCcccccHhHHHHHHHHHH---hhhhhhhhceeeecCCCceEEEEec----- Confidence 111111 11111 1111222221 223444555555 3445667888888887764443111 Q ss_pred CCCcccccccccccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEE Q lcl|NC_015280. 116 QSGNEAFFDEPDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDK 195 (455) Q Consensus 116 qsG~EAlfnEa~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK 195 (455) ++.. 
+-..++.+..++.....|.+..|.+.| T Consensus 157 -~~~~------------------------------------------------a~~v~E~~~~~~~~~~~f~~i~~~~~k 187 (407) T protein:vir:48 157 -GGTT------------------------------------------------SGWVGETDARPETATSKLGLIEPFMGE 187 (407) T ss_pred -CCcc------------------------------------------------eeeecccccccccccccceeEEeeeee Confidence 0000 000111111111112235555555544 Q ss_pred EEEEeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHH--------Hhhhheeeeeeccccceee Q lcl|NC_015280. 196 IAVEAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRT--------VYRGAKPGAQANVANAGVF 267 (455) Q Consensus 196 ~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~--------l~~vA~~~k~~~v~~~gv~ 267 (455) .. -...+|-||.+|-. .|.+++|.+-|+..|...+++-||.- |.+.+...........|.. T Consensus 188 ~~-------~~~~iS~ell~ds~----~~l~~~i~~~l~~~i~~~~~~a~l~G~G~~~p~Gil~~~~~~~~~~~~~~~~~ 256 (407) T protein:vir:48 188 IY-------GNPQATQKMLDDAF----FNVEDWINSELALEFAEQEEIAFTSGDGSKKPKGFLAYESTDEDDKTRAFGKL 256 (407) T ss_pred eE-------eehhhHHHHHhcch----HHHHHHHHHHHHHHHHHHHHhhhhccCCCCccceeeecccccccccccccccc Confidence 44 44579999999843 56888999999999998888877642 1111111110000000000 Q ss_pred e-eeccccchhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCce Q lcl|NC_015280. 268 D-LDVDSNGRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNT 346 (455) Q Consensus 268 D-l~~~~~gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~ 346 (455) . ......+.-.......|.+.+.. . -+..+. +|+++.....|.. |... +|..-+ ..+.+.. 
T Consensus 257 ~~~~~~~~~~~~~d~i~~l~~~l~~-------~-~~~~a~-~v~n~~~~~~L~~---lkD~---~Gr~l~--~~~~~~g- 318 (407) T protein:vir:48 257 QHIASGAASGVTADAIIKLIYTLRK-------A-HRSGAK-FMMNNSSLFAIRL---LKDN---DGNYLW--RPGIELG- 318 (407) T ss_pred cccccccccccChHHHHHHHHhhch-------h-hhcCCE-EEEcHHHHHHHHH---hhcc---CCceee--ccCcCCC- Confidence 0 00000111111222333333321 1 223333 5789998888865 2211 111111 1111111 Q ss_pred eEEEecCceEEEEeccccccC-CcceEEEEEecCccccceeEEcccccccceeecCCccccceeeeee--ecce-eeccc Q lcl|NC_015280. 347 FVGTLNGRFKVYIDPYSANVS-DNQYYVVGYKGTNAYDAGLFYCPYVPLQMYRAIGQDTFQPRIGFKT--RYGM-VLNPF 422 (455) Q Consensus 347 ~~G~l~~~~~vy~D~y~~~~s-~~dY~~vG~KG~~~~daglfyaPYv~l~~~~~~Dp~s~qP~~g~~t--RY~l-~~nP~ 422 (455) ..++|. +++|+++.+..... ..+.+++| +-. ..++. +--..+....||-.-+..++|.. |++. +.+|- T Consensus 319 ~~~~l~-G~PV~~~~~~p~~~~~~~~i~~G---d~~--~~~~i--~~~~~~~i~~d~~~~~~~~~~~~~~r~d~~v~~~~ 390 (407) T protein:vir:48 319 QPSSLA-GYGIVENEQMPDIAADAKAIAFG---NFK--RGYTI--VDRIGTRILRDPYTNKPFVGFYTTKRTGGMLVDSQ 390 (407) T ss_pred CCceec-ceeeEEecCcCCccCCccEEEEE---ecc--ccEEE--EEeeceEEEeeccccCCcEEEEEEEEeccEEeccc Confidence 124564 57888886532111 11223333 110 00111 10011222335554445555554 8888 67776 Q ss_pred ccccccccccCchhhhhc Q lcl|NC_015280. 423 AKGLTALSDSDPQAAGNL 440 (455) Q Consensus 423 ~~~~~~~~~~~~~~~~~~ 440 (455) +--.-+.....+..+ ++ T Consensus 391 a~~~l~~~aa~~~~~-~~ 407 (407) T protein:vir:48 391 AIKLMKIGAATRQKA-AA 407 (407) T ss_pred ceEEEEeeccCCCCC-CC Confidence 653333333343332 22 No 66 >protein:vir:4456 Length: 401 # NCBI annotation: Major capsid protein precursor # Family: family:all:21 # MgeID: mge:96 # MgeName: ST64B # Cross-refs: genbank:acc:NP_700379;genbank:gi:23505451;genbank:GeneID:955658 Probab=82.38 E-value=0.077 Score=26.68 Aligned_cols=328 Identities=13% Similarity=0.148 Sum_probs=121.5 Q ss_pred Ccc-----------hHHHHHHhhHhh-------------------------cCC-----CCccccchhhHHHHHHHhhhH Q lcl|NC_015280. 
1 MYN-----------AENLQEKWAPVL-------------------------NHE-----GLNDIKDPYRKSVTAILLENQ 39 (455) Q Consensus 1 m~~-----------~~~~~~kw~~~l-------------------------~~~-----~~~~i~~~~~~~v~~~~~enq 39 (455) +.. .+.+.+....+. +.+ ........+|+.+...|..-+ T Consensus 16 ~~~~~~~k~~~~~~~~~~e~~~~~l~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~e~~~a~~~~lr~~~ 95 (401) T protein:vir:44 16 QQKFDDFKAKNDKRVEAIEQEKGKLAGQVETLNGKLSELENLKSDLEKELLELKRPARGAQNKVAAEHKDAFVGFLRKGR 95 (401) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhccccccccchhHHHHHHHHHHHhhhh Confidence 000 000111111100 000 000111112222222211111 Q ss_pred HHHHHH-HHHhhhhhhhchhhhcccccccccc---ccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecC Q lcl|NC_015280. 40 ERALAE-ERAVLTEAPTNVGPINTPTTSSGAV---AGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTN 115 (455) Q Consensus 40 ~~~~~e-~~~~l~ea~~~~~~~~~~st~tg~i---~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~ 115 (455) ...+++ |++ .+...+.+.|.+ ..+.+.++.+.|. ..+..+++-+.||++++..+.-.. T Consensus 96 ~~~~~~~e~~----------a~~~~~~~~GG~~iP~~~~~~ii~~~~~---~~~l~~~~~~~~~~~~~~~~~~~~----- 157 (401) T protein:vir:44 96 EDGLRDLERK----------ALQVGTDEDGGYAVPEELDRSILSLLKD---EVVMRQEATVITVGGSDYKKLVNL----- 157 (401) T ss_pred hhhhHHHHHH----------HhhcCCCCCCceeccHhHHHHHHHHHHh---hhhhhhhceeeecCCCceEEEEec----- Confidence 111110 000 011111112221 3445556666653 334577899999998864433211 Q ss_pred CCCcccccccccccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEE Q lcl|NC_015280. 116 QSGNEAFFDEPDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDK 195 (455) Q Consensus 116 qsG~EAlfnEa~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK 195 (455) .+.. . 
-..++.+..+......|.+..|.+.| T Consensus 158 -~~~~-------a-----------------------------------------~wv~E~~~~~~~~~~~~~~v~~~~~k 188 (401) T protein:vir:44 158 -GGTA-------S-----------------------------------------GWVGETDTRSQTATSRLGLIEPFMGE 188 (401) T ss_pred -CCcc-------c-----------------------------------------eeeccccccCccccccceeeeeehhh Confidence 0000 0 00011111111112235555555555 Q ss_pred EEEEeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHH--------Hhhhheeeeeeccccceee Q lcl|NC_015280. 196 IAVEAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRT--------VYRGAKPGAQANVANAGVF 267 (455) Q Consensus 196 ~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~--------l~~vA~~~k~~~v~~~gv~ 267 (455) ..+ -..+|-||.+|- .+|.+++|.+-|+..|...+++.+|.- |.+.+...........+.. T Consensus 189 ~~~-------~~~iS~ell~ds----~~~l~~~i~~~la~ai~~~~~~~~l~G~G~~~p~Gil~~~~~~~~~~~~~~~~~ 257 (401) T protein:vir:44 189 IYG-------NPQATQKMLDDA----FFNVEAWINSELATEFAEQEEIAFTTGDGTKKPKGFLAYESTEESDKARAFGKL 257 (401) T ss_pred eee-------ehhhhHHHHhcc----hHHHHHHHHHHHHHHHHHHHHhhhhccCCCCccceeeccccccccccccccccc Confidence 443 456889999984 357888999999999988888888742 1111111111000000000 Q ss_pred eeecc-ccchhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCce Q lcl|NC_015280. 268 DLDVD-SNGRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNT 346 (455) Q Consensus 268 Dl~~~-~~gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~ 346 (455) +.... ..+.-..+....|.+.+..+ -+..+ ..|+++.....|.. +... +|.. ....+.+.. 
T Consensus 258 ~~~~t~~~~~~~~d~i~~~~~~l~~~--------~~~~a-~~v~n~~~~~~L~~---lkd~---~G~~--l~~~~~~~g- 319 (401) T protein:vir:44 258 QHIVSGEATAVTADAIIKLIYTLRKA--------HRTGA-KFMMNNNSLFAIRL---LKDT---EGNY--LWRPGLELG- 319 (401) T ss_pred cccccccccccCHHHHHHHHHhcchh--------hhcCC-EEEEcHHHHHHHHH---hhcc---CCce--eecCCcCCC- Confidence 00000 01111122333343433321 12222 46788888888874 2211 1111 011111110 Q ss_pred eEEEecCceEEEEecccccc-CCcceEEEEEecCccccceeEEcccccccceeecCCccccceeeeee--ecce-eeccc Q lcl|NC_015280. 347 FVGTLNGRFKVYIDPYSANV-SDNQYYVVGYKGTNAYDAGLFYCPYVPLQMYRAIGQDTFQPRIGFKT--RYGM-VLNPF 422 (455) Q Consensus 347 ~~G~l~~~~~vy~D~y~~~~-s~~dY~~vG~KG~~~~daglfyaPYv~l~~~~~~Dp~s~qP~~g~~t--RY~l-~~nP~ 422 (455) --++|. +++|+++...... +..+.+++| +-. -+|-=+-...+....||-.-+-.++|.. |+|. +.+|- T Consensus 320 ~~~~l~-G~PVv~~~~~p~~~~~~~~i~~G---d~~----~~~~i~~~~~~~~~~~~~~~~~~v~~~a~~r~d~~~~~~~ 391 (401) T protein:vir:44 320 QPSSLA-GYGIAENEQMPDIAADAKAIAFG---NFK----RGYTIVDRIGTRILRDPYTNKPFVGFYTTKRTGGMLVDSQ 391 (401) T ss_pred CCceec-ceeeEEecCcCCccCCccEEEEe---ehh----ccEEEEEecceEEeeeccccCCcEEEEEEEEeccEEeccc Confidence 123553 6677766432111 111222222 110 0011000111222335544344455544 7776 56665 Q ss_pred cccccccccc Q lcl|NC_015280. 423 AKGLTALSDS 432 (455) Q Consensus 423 ~~~~~~~~~~ 432 (455) +.-.-+.... T Consensus 392 a~~~l~~~aa 401 (401) T protein:vir:44 392 AIKLLKIAAA 401 (401) T ss_pred ceEEEEeecC Confidence 5532222211 No 67 >protein:vir:96762 Length: 632 # NCBI annotation: putative phage-related protein # Family: family:all:21 # MgeID: mge:1628 # MgeName: VP882 # Cross-refs: genbank:acc:YP_001039818;genbank:gi:126010917;genbank:GeneID:5076272 Probab=81.74 E-value=0.083 Score=26.51 Aligned_cols=317 Identities=14% Similarity=0.155 Sum_probs=124.0 Q ss_pred CcchH-----------HHHHHhhHhhcCCCCccccch-----hhHHHHHHHh-------hhHHHH-------H------- Q lcl|NC_015280. 
1 MYNAE-----------NLQEKWAPVLNHEGLNDIKDP-----YRKSVTAILL-------ENQERA-------L------- 43 (455) Q Consensus 1 m~~~~-----------~~~~kw~~~l~~~~~~~i~~~-----~~~~v~~~~~-------enq~~~-------~------- 43 (455) ..+++ ......+. .....+++.+. .++.....-| .++... + T Consensus 260 ~~ra~~ld~l~~~~~a~~~~~~a~--~~~~~~~~~~~~~i~~~~re~~~~~l~rai~a~a~~~~~~a~~~~e~a~~~a~~ 337 (632) T protein:vir:96 260 QFRALVLERMNPGQPGNFEKPGAG--DLPGKPAIHSARDLGIQHKELQQYSLMRAINAAATGDWSKAGFEREVSLAIADA 337 (632) T ss_pred HHHHHHHHHHhhhhhhhhhhhhhh--hhhhhhhhhhhhhhhhhHHHHHHHHHHHHHHhhhccchhhhhhhhHHHHHHHHh Confidence 11111 11111111 11111222111 1111111111 111100 0 Q ss_pred --HHHHHhhhhhhhch-hhhccccccccccccccchh-hhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCC-C Q lcl|NC_015280. 44 --AEERAVLTEAPTNV-GPINTPTTSSGAVAGFDPIL-ISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQS-G 118 (455) Q Consensus 44 --~e~~~~l~ea~~~~-~~~~~~st~tg~i~~~~P~L-v~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qs-G 118 (455) ++.+++.+....-. ..+...++++|...-....+ -.++....|..|...+ |++.+++.+|-+ + +..+. + T Consensus 338 ~G~~arg~~~~~~~l~~ra~~~~t~~~gg~lvp~~~~~~~iie~lr~~s~i~~l-~~~~~~~~~g~~---~--ip~~~~~ 411 (632) T protein:vir:96 338 SGKEARGFYMPHEVLVQRQLEKKTAGKGGELVATELLSEEFIDILRNKAIIGQM-GARMLPGLVGDV---D--IPKKTSG 411 (632) T ss_pred hhhhhhhhhhhHHHHHHhhhhcccccccccccccccchHHHHHHHhhcchhhhh-cceEeecCCcce---E--EEEEeCC Confidence 01111111000000 01111122222211111111 1234433456676665 776666665532 1 11000 0 Q ss_pred cccccccccccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEE Q lcl|NC_015280. 119 NEAFFDEPDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAV 198 (455) Q Consensus 119 ~EAlfnEa~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tV 198 (455) .+ +.... | +..+++-..++++++. 
T Consensus 412 ~~------------------------------------------------a~wv~--E------~~~~~~s~~~f~~i~l 435 (632) T protein:vir:96 412 AN------------------------------------------------FYWIG--E------DEDVQDSDFDFTTLSF 435 (632) T ss_pred ce------------------------------------------------eEeec--C------CccccccccceeeEEe Confidence 00 00000 0 1224445566777777 Q ss_pred EeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeee------ecc Q lcl|NC_015280. 199 EAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDL------DVD 272 (455) Q Consensus 199 tAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl------~~~ 272 (455) .+|+=+-...+|-||..| -++|.|++|.+-|...|...+++.+|.-- -.+-...|++-. ..+ T Consensus 436 ~~~k~~~~v~iS~ell~d----s~~~~~~~i~~~l~~a~~~~~d~a~l~G~--------G~~~~p~Gi~~~~~~~~~~~~ 503 (632) T protein:vir:96 436 SPKTIAGAVPVTRKLRKQ----SSIHVENLIREDLIEGIGVALDLAMLTGT--------GLANDPVGLLNMTGVPALTYP 503 (632) T ss_pred eeeEEEEehhhHHHHHhc----cchHHHHHHHHHHHHHHHHHHHHHhhccc--------CCCCccceeeecccccceecc Confidence 777767777888888776 25789999999999999999999987421 101112233321 111 Q ss_pred ccc-hhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCcee--EE Q lcl|NC_015280. 273 SNG-RWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTF--VG 349 (455) Q Consensus 273 ~~g-r~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~--~G 349 (455) ..+ -| +....|...| ............++++.....|...-..+ .+|... -| T Consensus 504 ~~~~~~--~~i~~~~~~i-------~~~~~~~~~~~~~~~~~~~~~l~~~~l~d----------------~~G~~i~~~~ 558 (632) T protein:vir:96 504 AGGVDW--ASVVDMETKI-------STFNADAGRLAYLTSVTQRGAAKKAQVFD----------------NTGERIWQNN 558 (632) T ss_pred cccCCH--HHHHHHHHHH-------hhcccccCccEEEEchhHHHHHHHHhccC----------------CCCceeecCC Confidence 111 12 1222232222 12222333445678988877776542221 111111 14 Q ss_pred EecCceEEEEeccccccCCcceEEEEEecCccccceeEEcccccccceeecCC----ccccceeeeeeecce-eeccccc Q lcl|NC_015280. 
350 TLNGRFKVYIDPYSANVSDNQYYVVGYKGTNAYDAGLFYCPYVPLQMYRAIGQ----DTFQPRIGFKTRYGM-VLNPFAK 424 (455) Q Consensus 350 ~l~~~~~vy~D~y~~~~s~~dY~~vG~KG~~~~daglfyaPYv~l~~~~~~Dp----~s~qP~~g~~tRY~l-~~nP~~~ 424 (455) +|+ +|+|++..+. |.+=+++|-- +-+|+.-+- .+.-.+|| .+.|=.+=...|+++ +.+|-.- T Consensus 559 ~l~-G~pv~~s~~i----p~~~~~~gd~------s~~~i~~~~--~~~i~~~~~~~~~~~~v~~~~~~~~d~~v~~~~af 625 (632) T protein:vir:96 559 EVN-GYRAEASNQI----PADTWIFGDW------SQIVIAMWG--VLDLKVDPYTKAASDGLVLRVFQDVDAGVRRKEAF 625 (632) T ss_pred eec-ccceEecccc----ccCcEEEeec------ceEEEEEec--ceEEEEccccccccCceEEEEEeecCceeechhhh Confidence 564 6788887553 2222333311 001111111 11112233 334444445677777 4455322 Q ss_pred ccccccccCchhhhhc Q lcl|NC_015280. 425 GLTALSDSDPQAAGNL 440 (455) Q Consensus 425 ~~~~~~~~~~~~~~~~ 440 (455) -. ...++ T Consensus 626 ~~---------~k~~A 632 (632) T protein:vir:96 626 CI---------AKKGA 632 (632) T ss_pred hh---------eeecC Confidence 11 11111 No 68 >protein:vir:96123 Length: 274 # NCBI annotation: ORF013 # Family: family:all:522 # MgeID: mge:1602 # MgeName: 37 # Cross-refs: genbank:acc:YP_240078;genbank:gi:66395742;genbank:GeneID:5133103 Probab=81.59 E-value=0.084 Score=26.47 Aligned_cols=257 Identities=8% Similarity=0.053 Sum_probs=115.4 Q ss_pred ecCCC-C-cccccccc-----------cccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcC Q lcl|NC_015280. 113 YTNQS-G-NEAFFDEP-----------DAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALG 179 (455) Q Consensus 113 Y~~qs-G-~EAlfnEa-----------~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG 179 (455) -.+.. . ...+.+|- .--|++-... +.... ..+|.+-. ...-. .+..++.+. 
T Consensus 1 ma~~~T~~~d~i~Pev~s~~v~~~~~~~~~~~~~~~~---------~~~l~--g~~G~tv~----ip~~~-~~g~~~~~~ 64 (274) T protein:vir:96 1 MAQGTTKVSNLIVPEVLAPMMQAELDKKLRFAQFADI---------DSTLV--GQPGDTLT----FPAFT-YSGDAQVIA 64 (274) T ss_pred CCccccchhhhhhhHHHHHHHHHHHHhhhhhcccccc---------ccccc--CCCCCEEE----EEeec-cCCCccccC Confidence 11000 0 01111110 0001110000 00000 00000000 00000 123344444 Q ss_pred CCCCCccccceeEEEEEEEEeeccccccceeHHHHHhHH-HhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeee Q lcl|NC_015280. 180 DGASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLK-AIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQ 258 (455) Q Consensus 180 ~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLk-AiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~ 258 (455) ++...++.++.++= .+++.+-|+-.-+++ |+. +..+-|.-.+..+-++..++.+++++|+..+....... T Consensus 65 ~g~~i~~~~it~~~--~~~~i~~~~~~~~i~-----D~~~~~~~~d~~~~~~~~~~~~~a~~~d~~i~~~l~~a~~~~-- 135 (274) T protein:vir:96 65 EGEKIPVDQIGTSK--REAKVRKIGKGTELT-----DEAVLSGFGDPQGEAVRQHGLAIANKVDNDVLEALKGATLTV-- 135 (274) T ss_pred CCCcCchhhcccce--eEEEEEeeeceeeec-----HHHHHhhcchHHHHHHHHHHHHHHHHHHHHHHHHHhcCCCCc-- Confidence 44444555555443 334445554222333 322 34567899999999999999999999998875533211 Q ss_pred eccccceeeeeeccccchhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhccccccccccccccccc Q lcl|NC_015280. 
259 ANVANAGVFDLDVDSNGRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIG 338 (455) Q Consensus 259 ~~v~~~gv~Dl~~~~~gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~ 338 (455) ....+ | .+.+-..+.++..+ ....++++|+|.+++.|..-...+|.+......+ T Consensus 136 ----~~~~~---------~-~d~i~dA~~~l~d~---------~~~~~~ivv~p~~~~~L~k~~~~~f~~~~~~g~~--- 189 (274) T protein:vir:96 136 ----EADIT---------K-LDGLQTAIDKFNDE---------DLEPMVLFVNPLDAGGLRTSASDNFTRPTQLGDN--- 189 (274) T ss_pred ----Ccccc---------c-HHHHHHHHHHhccc---------CCCceEEEeCHHHHHHHHhccccccccccccccc--- Confidence 11111 1 22222232333321 2367899999999999987654444443322111 Q ss_pred ccccCCceeEEEecCceEEEEeccccccCCcc-eEEEEEecCccccceeEEcccccccceeec-CCccccceeeeeeecc Q lcl|NC_015280. 339 EIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQ-YYVVGYKGTNAYDAGLFYCPYVPLQMYRAI-GQDTFQPRIGFKTRYG 416 (455) Q Consensus 339 ~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~d-Y~~vG~KG~~~~daglfyaPYv~l~~~~~~-Dp~s~qP~~g~~tRY~ 416 (455) ...+-..|.+. |++|++|.. -|.. -+++| +|.-. |+.. .+.. +..- ||.+++-.+-...+|| T Consensus 190 ---~~~~g~ig~~~-G~~Vi~s~~----~p~~t~~l~~-~gA~~-----~~~~-~~~~-vE~~Rd~~~~~d~i~~~~~yg 253 (274) T protein:vir:96 190 ---IIVKGAFGEAL-GAVIVRSNK----LNKGEALLAK-KGAVK-----LITK-RDFF-LEKDRDASRKSTALYSDKHYV 253 (274) T ss_pred ---ceeecccceec-CeeEEEcCC----CCcceEEEEe-Cccee-----eeec-CCcc-cccccchhhcccEEEEeeEEE Confidence 11222467774 689999954 3322 12222 22211 1111 0111 2222 8889999999999999 Q ss_pred e-eecccccccccccccCchhhhhccchhhhhhhhhhcCC Q lcl|NC_015280. 417 M-VLNPFAKGLTALSDSDPQAAGNLNANAYYRRVRVANLM 455 (455) Q Consensus 417 l-~~nP~~~~~~~~~~~~~~~~~~~~~n~y~r~~~v~~~~ 455 (455) . 
..||-..- .+.++ ++++ +| T Consensus 254 ~~~~~~~~vv--~~t~~------~~~~-----------~~ 274 (274) T protein:vir:96 254 AYLYDESKVV--KITKG------AGDE-----------VM 274 (274) T ss_pred EEEEcCccEE--EEEcC------cccc-----------cC Confidence 9 66774331 11111 1111 11 No 69 >protein:vir:4226 Length: 326 # NCBI annotation: observed 35.2Kd protein # Family: family:all:507 # MgeID: mge:89 # MgeName: L5 # Cross-refs: genbank:acc:NP_039681;swissprot:sw:q05223;genbank:gi:9625447;uniprot:Q05223;genbank:GeneID:2942929 Probab=80.48 E-value=0.094 Score=26.20 Aligned_cols=303 Identities=12% Similarity=-0.011 Sum_probs=126.6 Q ss_pred HhhhHHHHH----HHHHHhhhhhhhchhhhccccccccccccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEE Q lcl|NC_015280. 35 LLENQERAL----AEERAVLTEAPTNVGPINTPTTSSGAVAGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMR 110 (455) Q Consensus 35 ~~enq~~~~----~e~~~~l~ea~~~~~~~~~~st~tg~i~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMR 110 (455) +.=|-+|.. .+|++.+.. +++++.-.--.+.+-.+++.+.+..+-..++-+-||++++.-+.-.. T Consensus 1 ~~~~~~r~~~~~~~~e~~a~~~-----------~~~~~g~~ip~~~~~~ii~~~~~~s~i~~~~~~~~~~~~~~~~p~~~ 69 (326) T protein:vir:42 1 MAVNPDRTTPFLGVNDPKVAQT-----------GDSMFEGYLEPEQAQDYFAEAEKISIVQQFAQKIPMGTTGQKIPHWT 69 (326) T ss_pred CCCCccchhhhcCcchhhheec-----------cccCCcceechhhHHHHHHHHHhcchhhhhcceeeccCCceEEEEEe Confidence 222322210 112222222 12111111123333445555556666777888999987653221110 Q ss_pred eeecCCCCcccccccccccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccce Q lcl|NC_015280. 111 SRYTNQSGNEAFFDEPDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMA 190 (455) Q Consensus 111 srY~~qsG~EAlfnEa~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMa 190 (455) ++.+ +...++ +..++|-. 
T Consensus 70 ------~~~~------------------------------------------------a~~v~E--------g~~~~~~~ 87 (326) T protein:vir:42 70 ------GDVS------------------------------------------------ASWIGE--------GDMKPITK 87 (326) T ss_pred ------CCcc------------------------------------------------eEEecC--------Cccccccc Confidence 0000 000001 12344555 Q ss_pred eEEEEEEEEeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeee Q lcl|NC_015280. 191 FSIDKIAVEAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLD 270 (455) Q Consensus 191 FsIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~ 270 (455) .+++++++.+|..+-.-.+|-||.+|-. .|.++.|.+-|+..|+..+++.+|.---+-...|-.......+..... T Consensus 88 ~~f~~i~~~~~k~~~~v~iS~ell~~s~----~~~~~~i~~~l~~a~~~~~d~a~l~G~gs~~p~gi~~~~~~~~~~~~~ 163 (326) T protein:vir:42 88 GNMTSQTIAPHKIATIFVASAETVRANP----ANYLGTMRTKVATAFAMAFDNAAINGTDSPFPTFLAQTTKEVSLVDPD 163 (326) T ss_pred cceeEEEEeeEEEEEeehhhHHHHhcCH----HHHHHHHHHHHHHHHHHHHHHHhhcccCCCccccccccccccceeecc Confidence 6677777777777778889999999843 578999999999999999999888421100000000000000000000 Q ss_pred ccccchhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccc--cccccCCceeE Q lcl|NC_015280. 271 VDSNGRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGI--GEIDDTGNTFV 348 (455) Q Consensus 271 ~~~~gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~--~~~d~t~~~~~ 348 (455) ..+-+........ .+...... ........+.+|+++.....|.. |.... |..-+. ........... 
T Consensus 164 --~~~~~~~~~~~~~--~~~~~~~~--~~~~~~~~a~~v~n~~~~~~L~~---lkd~~---G~~l~~~~~~~~~~~~~~~ 231 (326) T protein:vir:42 164 --GTGSNADLTVYDA--VAVNALSL--LVNAGKKWTHTLLDDITEPILNG---AKDKS---GRPLFIESTYTEENSPFRL 231 (326) T ss_pred --cccccccchhHHH--HHHHHHhh--hhhhccCccEEEEeHHHHHHHHH---hhccC---CceeeccccccCccccccC Confidence 0000000000000 00000011 12234456678899999998885 22111 110000 00000111123 Q ss_pred EEecCceEEEEeccccccCCcceEEEEEecCccccceeEEcccccccce--------eecCCcc-----cc---ceeeee Q lcl|NC_015280. 349 GTLNGRFKVYIDPYSANVSDNQYYVVGYKGTNAYDAGLFYCPYVPLQMY--------RAIGQDT-----FQ---PRIGFK 412 (455) Q Consensus 349 G~l~~~~~vy~D~y~~~~s~~dY~~vG~KG~~~~daglfyaPYv~l~~~--------~~~Dp~s-----~q---P~~g~~ 412 (455) |+| .+++|+++.+... . +. +++-|+-. -+||...-.+... ...|+.. || =.+=.. T Consensus 232 ~~l-~G~pv~~~~~~~~--~-~~--~~~~Gd~s---~~~~~~~~~~~v~~~~e~~~~~~~~~~~~~~~~~~~d~~~~r~~ 302 (326) T protein:vir:42 232 GRI-VARPTILSDHVAS--G-TV--VGYQGDFR---QLVWGQVGGLSFDVTDQATLNLGTPQAPNFVSLWQHNLVAVRVE 302 (326) T ss_pred cee-eeeeEEEcCCCCC--C-ce--EEEEeecc---eEEEEEecceEEEEeecceeeecccccccchhhhhcCcEEEEEE Confidence 334 3688998876532 1 11 12222211 1223222121111 1112221 22 333456 Q ss_pred eecce-eecccccccccccccCchhhhhccch Q lcl|NC_015280. 413 TRYGM-VLNPFAKGLTALSDSDPQAAGNLNAN 443 (455) Q Consensus 413 tRY~l-~~nP~~~~~~~~~~~~~~~~~~~~~n 443 (455) .|++. +.+|-+- ..+.+ -.|. ++ T Consensus 303 ~~~d~~v~~~~a~--~~l~~-~~~~-----~~ 326 (326) T protein:vir:42 303 AEYAFHCNDKDAF--VKLTN-VDAT-----EA 326 (326) T ss_pred EEeccEEecccce--EEEee-cccc-----CC Confidence 78887 6666433 22221 1111 11 No 70 >protein:vir:10364 Length: 390 # NCBI annotation: head protein; major capsid subunit precursor # Family: family:all:585 # MgeID: mge:183 # MgeName: Xp10 # Cross-refs: genbank:acc:NP_858956;genbank:gi:32128421;genbank:GeneID:2648357 Probab=79.00 E-value=0.11 Score=25.87 Aligned_cols=323 Identities=13% Similarity=0.047 Sum_probs=123.9 Q ss_pred CcchHHHHHHhhHhhcCCCCccccchhhHHHHHHHhhhHHH--------------HHHHH---HHhh---hhhh----hc Q lcl|NC_015280. 
1 MYNAENLQEKWAPVLNHEGLNDIKDPYRKSVTAILLENQER--------------ALAEE---RAVL---TEAP----TN 56 (455) Q Consensus 1 m~~~~~~~~kw~~~l~~~~~~~i~~~~~~~v~~~~~enq~~--------------~~~e~---~~~l---~ea~----~~ 56 (455) -...++.++++..+...-. +. ....+++..+ ++.-++ .+.+. ++++ .+.. .. T Consensus 30 ~~~~~e~~~~~~~~~~e~~--~l-~~~i~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 105 (390) T protein:vir:10 30 GELNASARSKVDELFATVG--NL-SAEVQAARQR-VAELEGNGAGGDVQHVSVGDLFVASEQFQASAGRWNDRSARATMN 105 (390) T ss_pred cccCHHHHHHHHHHHHHHH--HH-HHHHHHHHHH-HHHHHhhcccccccccchhhhhhhhHHHHHHHHhhhhhhhhhhhH Confidence 0011222233332211000 00 0000000000 000000 00000 0000 0000 00 Q ss_pred hhhhcc---ccccccccccccch-hhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCcccccccccccccc Q lcl|NC_015280. 57 VGPINT---PTTSSGAVAGFDPI-LISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPDAQFSG 132 (455) Q Consensus 57 ~~~~~~---~st~tg~i~~~~P~-Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~t~fSg 132 (455) ...... .++++.+-.-.-|. +-.++.+.-....-.+++.+.||++++.-+.-.. +..+. + . T Consensus 106 ~~~~~~~~~~~~~~~~g~~~~~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~----~~~~~-a-------~--- 170 (390) T protein:vir:10 106 IKAALNTASTDAAGSAGALTTPNRLPGFITQPDARLTVRDLIGSGRTDSALIEYVQET----GFVNN-A-------A--- 170 (390) T ss_pred HHHHHHhhhcccccccccccchhHHHHHHHHHHhhchhhhhcceeeccCCceEEEEEe----cCCcc-e-------e--- Confidence 000000 01111111111111 1223333334445667899999987753332111 00000 0 0 Q ss_pred cccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeeccccccceeHH Q lcl|NC_015280. 133 TDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRADYSVE 212 (455) Q Consensus 133 ~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiE 212 (455) ... 
| +...++-..+++++++.+|..+....+|-| T Consensus 171 --------------------------------------~v~--E------g~~~~~~~~~~~~i~~~~~k~~~~~~is~e 204 (390) T protein:vir:10 171 --------------------------------------IVA--E------GALKPESSLKFAKKTDTTHVIAHTMKATRQ 204 (390) T ss_pred --------------------------------------eec--C------CccccccccceeEEEEeeEEEEEeehhhHH Confidence 000 0 112334445666777777777778899999 Q ss_pred HHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeecc------ccchhhHHHHHHHH Q lcl|NC_015280. 213 LAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDVD------SNGRWSVEKFKGLL 286 (455) Q Consensus 213 LAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~~------~~gr~~ve~~k~l~ 286 (455) |.||-- |.++.|.+-|+..|...+|+.||.- .-.+-...|++..... ..+-...+....++ T Consensus 205 ll~d~~-----~l~~~i~~~l~~~~~~~~~~~il~G--------~G~~~~p~Gi~~~~~~~~~~~~~~~~~~~~~~~~~~ 271 (390) T protein:vir:10 205 ILSDAP-----QLASYMNNRLIRGLKVKEDAEILRG--------TGANDGLLGLIPQATTYAAPTTIAGATRVDQLRLAM 271 (390) T ss_pred HHHhHH-----HHHHHHHHHHHHHHHHHHHHHHhhc--------CCCCccccccccccccccccccccccchHHHHHHHH Confidence 999852 4678899999999999998888732 1111123333321110 11111122233333 Q ss_pred HHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEEecccccc Q lcl|NC_015280. 287 FQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANV 366 (455) Q Consensus 287 ~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~ 366 (455) +++. .....++-+|+||.....|.. +... +|..=+ ..+.++ -.++| .+++|+++... 
T Consensus 272 ~~l~---------~~~~~~~~~v~n~~~~~~L~~---lkd~---~g~~l~--~~~~~~--~~~~l-~G~pv~~~~~~--- 328 (390) T protein:vir:10 272 LQAS---------LAEYPASGIVINPIDWAAIEL---AKDA---NNQYLI--GNARGT--LTPTL-WGLPVVATQAM--- 328 (390) T ss_pred Hhhc---------cccCCCCEEEEcHHHHHHHHH---hhcC---CCceee--cCCcCc--CCcee-cceeeEEcCCC--- Confidence 3332 223456678899998888874 2211 111100 000111 12344 36789988764 Q ss_pred CCcceEEEEEecCccccceeEEcccccccceeecCC---ccccceeeeeeecce-eeccccccccccc Q lcl|NC_015280. 367 SDNQYYVVGYKGTNAYDAGLFYCPYVPLQMYRAIGQ---DTFQPRIGFKTRYGM-VLNPFAKGLTALS 430 (455) Q Consensus 367 s~~dY~~vG~KG~~~~daglfyaPYv~l~~~~~~Dp---~s~qP~~g~~tRY~l-~~nP~~~~~~~~~ 430 (455) |..-+++|-- ..+++.+...-+......+. .+.+=.+-...|++. +.+|-+--.-.+. T Consensus 329 -p~~~~~~gdf-----~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~r~~~r~d~~v~~~~a~~~~~~a 390 (390) T protein:vir:10 329 -APGEFLVGAF-----DLAAQIFDQWDARVEIGYVNDDFQRNMVTVLAEERLALVVYRPEALISGSFA 390 (390) T ss_pred -CCCcEEEEec-----cceEEEEEecceEEEEeecccccccCcEEEEEEEeeccEEeccccEEEEEeC Confidence 3333444421 01122211111111111111 122223334457777 5566433211111 No 71 >protein:vir:99749 Length: 324 # NCBI annotation: head protein # Family: family:all:507 # MgeID: mge:1497 # MgeName: phiETA2 # Cross-refs: genbank:acc:YP_001004307;genbank:gi:122891761;genbank:GeneID:4712304 Probab=78.97 E-value=0.11 Score=25.86 Aligned_cols=302 Identities=10% Similarity=0.018 Sum_probs=124.3 Q ss_pred HHHHhhhHHHHHHHHHHhhhhhhhchhhhccccccccccccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEe Q lcl|NC_015280. 32 TAILLENQERALAEERAVLTEAPTNVGPINTPTTSSGAVAGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRS 111 (455) Q Consensus 32 ~~~~~enq~~~~~e~~~~l~ea~~~~~~~~~~st~tg~i~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRs 111 (455) |.+- ||.+..++.....+.+. .+.+.....++.+++..--....-.+++.+..+.+..+++.+.||++.+.-|. . 
T Consensus 1 ~~k~-~~~~~~~~~~~~~~~~~-~~~~a~~~~~~~~~~~lip~~~~~~ii~~~~~~s~l~~~~~~~~~~~~~~~~p-~-- 75 (324) T protein:vir:99 1 MEQT-QKLKLNLQHFASNNVKP-QVFNPDNVMMHEKKDGTLLNDFTTPILQEVMENSKIMRLGKYEPMEGTEKKFT-F-- 75 (324) T ss_pred CCCc-hHhhHHHHHHHHHhhhh-hhccccceeccCCCcceechhHHHHHHHHHHhhchhhhhcceeeccCCceEEE-E-- Confidence 1111 11111111111111100 00111111111221111111122233444455667788899999987763322 1 Q ss_pred eecCCCCcccccccccccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCcccccee Q lcl|NC_015280. 112 RYTNQSGNEAFFDEPDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAF 191 (455) Q Consensus 112 rY~~qsG~EAlfnEa~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaF 191 (455) +.. +.+ +-..++ +..+++... T Consensus 76 -~~~--~~~------------------------------------------------a~~v~E--------g~~~~~~~~ 96 (324) T protein:vir:99 76 -WAD--KPG------------------------------------------------AYWVGE--------GQKIETSKA 96 (324) T ss_pred -Eec--Ccc------------------------------------------------eeEecc--------Ccccccccc Confidence 100 000 000011 122444555 Q ss_pred EEEEEEEEeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeec Q lcl|NC_015280. 192 SIDKIAVEAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDV 271 (455) Q Consensus 192 sIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~ 271 (455) ++++++++.|.-+---..|-||.+|-. .|.+++|.+.|+..|...+++.||.---. +-...|++.... T Consensus 97 ~~~~v~~~~~k~~~~~~iS~ell~ds~----~~l~~~i~~~l~~ai~~~~d~~~l~G~g~--------~~~~~~~~~~~~ 164 (324) T protein:vir:99 97 TWVNATMRAFKLGVILPVTKEFLNYTY----SQFFEEMKPMIAEAFYKKFDEAGILNQGN--------NPFGKSIAQSIE 164 (324) T ss_pred ceeEEEEeeEEEEEeehhhHHHHhcch----HHHHHHHHHHHHHHHHHHHHHHhhhcCCC--------CccCcccccccc Confidence 566666666666666779999999974 46899999999999999999999843111 111111111000 Q ss_pred ----cccchhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCcee Q lcl|NC_015280. 
272 ----DSNGRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTF 347 (455) Q Consensus 272 ----~~~gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~ 347 (455) ...+--..+....++..+. ..-...+.+|+|+.....|... ... ++..- .. +.+ T Consensus 165 ~~~~~~~~~~~~~~i~~~~~~l~---------~~~~~~~~~v~n~~~~~~L~~l---~d~---~g~~~--~~-~~~---- 222 (324) T protein:vir:99 165 KTNKVIKGDFTQDNIIDLEALLE---------DDELEANAFISKTQNRSLLRKI---VDP---ETKER--IY-DRN---- 222 (324) T ss_pred ccceeccccCCHHHHHHHHHhhh---------hccCCCCEEEEcHHHHHHHHHh---hcC---CCcee--ec-CCC---- Confidence 0011112233444433332 2234455689999999988853 211 11111 10 111 Q ss_pred EEEecCceEEEEeccccccCCcceEEEEEecCccccceeEEcccccccc--------eeecCCc--------cccceeee Q lcl|NC_015280. 348 VGTLNGRFKVYIDPYSANVSDNQYYVVGYKGTNAYDAGLFYCPYVPLQM--------YRAIGQD--------TFQPRIGF 411 (455) Q Consensus 348 ~G~l~~~~~vy~D~y~~~~s~~dY~~vG~KG~~~~daglfyaPYv~l~~--------~~~~Dp~--------s~qP~~g~ 411 (455) .++|. +++|++.+... .+...+++|-... +++..--...+ ....|+. +-|=.+=. T Consensus 223 ~~~l~-G~PVv~~~~~~--~~~~~~i~gd~~~------~~~~~~~~~~i~~~~~~~~~~~~~~~~~~~~~f~~~~~~~r~ 293 (324) T protein:vir:99 223 SDTLD-GLPVVNLKSSN--LKRGELITGDFDK------LIYGIPQLIEYKIDETAQLSTVKNEDGTPVNLFEQDMVALRA 293 (324) T ss_pred Ccccc-ceeEEeecCCC--CCcceEEEEeccc------EEEEEecCcEEEEeecccccccccccccchhhhhcCcEEEEE Confidence 23454 46777765432 2233344442210 11111111110 1111111 11223333 Q ss_pred eeecce-eecccccc-cccccccCchhhhhccch Q lcl|NC_015280. 412 KTRYGM-VLNPFAKG-LTALSDSDPQAAGNLNAN 443 (455) Q Consensus 412 ~tRY~l-~~nP~~~~-~~~~~~~~~~~~~~~~~n 443 (455) ..|+|. +.||-+-- .+...-+... 
..++= T Consensus 294 ~~r~d~~v~~~~a~~~lt~a~~~~~~---~~~~~ 324 (324) T protein:vir:99 294 TMHVALHIADDKAFAKLVPADKKTDS---VPGEV 324 (324) T ss_pred EEEEccEEecccceEEEEeccCCCCC---CCCCC Confidence 467776 55655431 1111101110 11111 No 72 >protein:vir:105334 Length: 276 # NCBI annotation: putative phage major capsid protein # Family: family:all:522 # MgeID: mge:1679 # MgeName: PH15 # Cross-refs: genbank:acc:YP_950669;genbank:gi:119967839;genbank:GeneID:4643213 Probab=78.42 E-value=0.11 Score=25.74 Aligned_cols=259 Identities=11% Similarity=0.056 Sum_probs=115.9 Q ss_pred CC-CcceeeeEEEeeecCCCCcccccccc-----------cccccccccccccccccccCcccCCCCCCCCccccccccc Q lcl|NC_015280. 99 MT-GPTGLIFAMRSRYTNQSGNEAFFDEP-----------DAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLA 166 (455) Q Consensus 99 mT-GPTGLIFAMRsrY~~qsG~EAlfnEa-----------~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~ 166 (455) |. +.|-| ..-+.+|- ..-|++-... +.... ..+|.+-.-..+ T Consensus 1 Ma~~~T~l-------------~d~i~Pev~~~~v~~~~~~~~~~~~~~~~---------~~~l~--g~~G~ti~iP~~-- 54 (276) T protein:vir:10 1 MAQGTTTK-------------STQIVPEVLAPMMQAELDKKLRFAQFADI---------DSTLV--GQPGDTLTFPAF-- 54 (276) T ss_pred CCcceeeh-------------hhhhchHHHHHHHHHHHHhhhhhccccee---------ccccc--CCCCCEEEeeee-- Confidence 11 00100 00111110 0011110000 00000 000110000000 Q ss_pred ccccchhhhhhcCCCCCCccccceeEEEEEEEEeeccccccceeHHHHHhHHH-hhCCChhHHHHHHHHHHHHHHhhHHH Q lcl|NC_015280. 167 SSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKA-IHGLDAESELANILSTEILAEINREV 245 (455) Q Consensus 167 ~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkA-iHGLDAE~ELanILStEImlEINReI 245 (455) . 
...++|.++++..-+..++.+ .+.+++.+-|.-.=++| |+-+ .-+.|.-.+..+-++.-|+..++.++ T Consensus 55 -~--~igda~~~~eg~~i~~~~lt~--~~~~a~i~~~~k~~~~t-----D~a~~~~~~dp~~~~~~~~~~~~a~~~d~~~ 124 (276) T protein:vir:10 55 -V--YSGDATVVPEGQKIPVDKIET--NRREAKIHKIGKGTDIT-----DEALLSGYGDPQGEAVRQHGLAIANKVDNDV 124 (276) T ss_pred -c--CCCccccccCCCccCcccccc--ceeeEEeehcccccccc-----HHHHHhhccchHHHHHHHHHHHHHHHHHHHH Confidence 0 113444555544434444444 44455555554333333 3333 23679999999999999999999999 Q ss_pred HHHHhhhheeeeeeccccceeeeeeccccchhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccc Q lcl|NC_015280. 246 VRTVYRGAKPGAQANVANAGVFDLDVDSNGRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLD 325 (455) Q Consensus 246 I~~l~~vA~~~k~~~v~~~gv~Dl~~~~~gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~ 325 (455) +..+....... ..+.+.+ +.+-..+..+.. .-.+.++++++|++++.|.-....+ T Consensus 125 ~~~l~~~~~~~------~~~~~t~----------d~i~~A~~~lgd---------~~~~~~~ivv~p~~~~~L~k~~~~~ 179 (276) T protein:vir:10 125 LEALRGTKLTV------SADIGTL----------AGLEAAIDTFDD---------EDLEPMVLFINPKDAGKLRSSASDN 179 (276) T ss_pred HHHHhcccccc------cccccCH----------HHHHHHHHHhcc---------ccCcccEEEEcHHHHHHHHHhcccc Confidence 98876543321 1112211 112122222221 1346889999999999997544444 Q ss_pred cccccccccccccccccCCceeEEEecCceEEEEeccccccCCcceEEEEEe-cCccccceeEEcccccccceeec-CCc Q lcl|NC_015280. 326 YDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQYYVVGYK-GTNAYDAGLFYCPYVPLQMYRAI-GQD 403 (455) Q Consensus 326 ~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~K-G~~~~daglfyaPYv~l~~~~~~-Dp~ 403 (455) |....+.... ...+-..|++. |++|++|.. .|. |-.+-++ |.-. ++... +. 
....- |++ T Consensus 180 f~~~s~~g~~------~~~~G~ig~~~-G~~Vi~s~~----~p~-~t~~l~~~gAi~----~~~~~--~~-~vE~dRd~~ 240 (276) T protein:vir:10 180 FTRATELGDN------IIVKGAFGEAL-GAVIVRSKK----LDE-GEAILAKRGAVK----LITKR--DF-FLETDRDPS 240 (276) T ss_pred cccccccccc------ceeccccceec-ceeEEEcCC----CCc-ceEEEEecccee----eeecC--Cc-eeecccchh Confidence 4333222111 11223467774 689999954 232 2222222 2221 11111 11 12222 888 Q ss_pred cccceeeeeeecce-eeccccc-ccccccccCchhh Q lcl|NC_015280. 404 TFQPRIGFKTRYGM-VLNPFAK-GLTALSDSDPQAA 437 (455) Q Consensus 404 s~qP~~g~~tRY~l-~~nP~~~-~~~~~~~~~~~~~ 437 (455) .++-.+--.-+||. ..||--. -.++.....|.+| T Consensus 241 ~~~d~i~~~~~y~~~~~~~~~vv~~t~~~~~~~~~~ 276 (276) T protein:vir:10 241 TKTTALYSDKHYVAYLYDESKAVKVTKGAGTTDSGA 276 (276) T ss_pred hcccEEEEeeEEEEEEEcCcceEEEecCCcCCcCCC Confidence 89888888889998 6666432 1111122233332 No 73 >protein:vir:98635 Length: 377 # NCBI annotation: major coat protein # Family: family:all:635 # MgeID: mge:1601 # MgeName: phi3396 # Cross-refs: genbank:acc:YP_001039923;genbank:gi:126011098;genbank:GeneID:4818471 Probab=78.39 E-value=0.11 Score=25.74 Aligned_cols=329 Identities=9% Similarity=-0.013 Sum_probs=111.3 Q ss_pred Ccch----HHHHHHhhHhhcCCCCccccchhhHHHHHHHhhhHHHHH----------------------HHHHHhhhhhh Q lcl|NC_015280. 1 MYNA----ENLQEKWAPVLNHEGLNDIKDPYRKSVTAILLENQERAL----------------------AEERAVLTEAP 54 (455) Q Consensus 1 m~~~----~~~~~kw~~~l~~~~~~~i~~~~~~~v~~~~~enq~~~~----------------------~e~~~~l~ea~ 54 (455) |.-+ +++.+|=..+.++... +.....+.+....+++.-..++ .|||++++++. 
T Consensus 1 M~i~~k~~~~~~~~~~~l~~~~~~-~~~~ee~~~~~~~~~~~~~~~~~~~~~~e~~~~~~~~~~~~~lt~ee~~~~~~~~ 79 (377) T protein:vir:98 1 MAINLKELPKYREAVAELSAKISA-GATSEEQEKLFEAAFTTMGDEILAKNEEEMERMFDLRDKNRELTAEEIKFFNDID 79 (377) T ss_pred CCCcHHHHHHHHHHHHHHHHHHHh-hhhhHHHHHHHHHHHHhHHHHHHHHHHHHHHHHHHhccCCcccCHHHHHHHHHHH Confidence 4433 2333333332221110 0000011111111111111111 13344444332 Q ss_pred hchhhhccccccccccccccchhhh-HHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCccccccccccccccc Q lcl|NC_015280. 55 TNVGPINTPTTSSGAVAGFDPILIS-LIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPDAQFSGT 133 (455) Q Consensus 55 ~~~~~~~~~st~tg~i~~~~P~Lv~-l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~t~fSg~ 133 (455) ..++.++.-...-+.++. ++++....-.-..+|-|+|++|.+-++. .. .+ . ...| T Consensus 80 -------~~~~~~~gg~~vP~~~~~~I~~~l~~~s~i~~~~~v~~~~~~~~~~~-----~~-~~-~-------~a~w--- 135 (377) T protein:vir:98 80 -------KNVGGKDKFKLLPEETMVQVFDDLVAEHPLLKVINFKNTSLRLKALT-----AE-TS-G-------TAVW--- 135 (377) T ss_pred -------hccCCCCCccccCHHHHHHHHHHHHHhhhhhhheeeEecCcceEEEE-----ec-CC-c-------ceeE--- Confidence 112222111112122221 2222222223345688999887653321 00 00 0 0000 Q ss_pred ccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeeccccccceeHHH Q lcl|NC_015280. 134 DGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRADYSVEL 213 (455) Q Consensus 134 ~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiEL 213 (455) ..+.+...+.....|.++.|..-|... ....|-|| T Consensus 136 --------------------------------------~~e~~~~~~~~~~~f~~i~l~~~kl~a-------~~~is~el 170 (377) T protein:vir:98 136 --------------------------------------GDIFGEIKGQLKQAFKEQDFSQFKLTA-------FVVIPKDA 170 (377) T ss_pred --------------------------------------eecccccCcccCccceeEeecceeEEe-------eecccHHh Confidence 000001111123457777777777654 23467777 Q ss_pred HHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHH--------Hhh-h--heeeeeeccccceeeeee-ccccchhhH-H Q lcl|NC_015280. 
214 AQDLKAIHGLDAESELANILSTEILAEINREVVRT--------VYR-G--AKPGAQANVANAGVFDLD-VDSNGRWSV-E 280 (455) Q Consensus 214 AQDLkAiHGLDAE~ELanILStEImlEINReII~~--------l~~-v--A~~~k~~~v~~~gv~Dl~-~~~~gr~~v-e 280 (455) .+|- .+|.|+.|.+-|+..|..-++..||.- |.+ . ....+..+....++.+.. .-.+--++. . T Consensus 171 L~ds----~~~ie~~i~~~la~~~a~~~~~a~i~G~G~~qP~Gil~~~~~~~~~~~~~~~~~~~~~~~~~~~~l~~~~~~ 246 (377) T protein:vir:98 171 LKFG----PKWIKQFITEQLKEAIAVALELAIVKGDGLLQPVGLLKDLSQPTVDQSTGRDITTYKTDKEAIADLSDLTPD 246 (377) T ss_pred hhcc----HhHHHHHHHHHHHHHHHHHHhhceEeccCCCcceeeeecccccccccccccccccccchhhhHhhhhhhchh Confidence 7663 467899999999999999999888752 111 1 111111111112211100 000000000 0 Q ss_pred -HHHHHHHHHHHHHHHHHHHhcCCCccEEE-EchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEE Q lcl|NC_015280. 281 -KFKGLLFQIERDANAIAQETRRGKGNIII-TSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVY 358 (455) Q Consensus 281 -~~k~l~~qi~~ean~i~~~T~~~~gn~~v-~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy 358 (455) ..+...+-+.+....-.++-..+.|+++. +.|.-.-.+. |...... ..-.++..|.=.++|. T Consensus 247 ~~~~~a~~~m~~~t~~~~~klkd~~G~~i~~~n~~~~~~~~--------p~~~~~~--------~~G~~~t~lg~p~~vv 310 (377) T protein:vir:98 247 NAPKKLVPVMKHLSVNDKKRPLKIAGQVKLILNPEDRWALE--------AQFTSRN--------QFGEYVTVLPHGITIL 310 (377) T ss_pred HHHHHHHHHHHHHHHHHHhhhhccCCceEEEecccchhhcc--------ccccccC--------CCCccccccCCCceEE Confidence 01112222333333334566678898876 3443221111 1110000 0000111121012222 Q ss_pred EeccccccCCcceEEEEEecCcc--ccceeEEcccccccceeecCCcccc-ceeeeee--ec-ceeeccccccccccccc Q lcl|NC_015280. 359 IDPYSANVSDNQYYVVGYKGTNA--YDAGLFYCPYVPLQMYRAIGQDTFQ-PRIGFKT--RY-GMVLNPFAKGLTALSDS 432 (455) Q Consensus 359 ~D~y~~~~s~~dY~~vG~KG~~~--~daglfyaPYv~l~~~~~~Dp~s~q-P~~g~~t--RY-~l~~nP~~~~~~~~~~~ 432 (455) .+.+ -|..-++.|.....- ...++-+..|-+ .-|. -.++|.. 
|+ |-..||-+--.=+++-| T Consensus 311 ~s~~----~p~~~i~fgdf~~Y~i~~r~~~~i~~~~~---------~~~~~d~~~f~~~~r~dg~~~~~~a~~vl~i~~~ 377 (377) T protein:vir:98 311 ESLA----VETGKAIAFVANRYDAFMATASTIEEYDQ---------TFAMEDLQLYLTKNYFYGKAKDNHTAALLTLAGG 377 (377) T ss_pred ecCC----CCcccEEEEEecceeEEeecceEEEeech---------hhhhcCceEEEEEEEEcCEEeccCcEEEEEEecC Confidence 2222 122223334321110 001111111111 1111 1122222 22 22444444322122222 No 74 >protein:vir:93742 Length: 274 # NCBI annotation: ORF013 # Family: family:all:522 # MgeID: mge:1475 # MgeName: 55 # Cross-refs: genbank:acc:YP_240459;genbank:gi:66396126;genbank:GeneID:5133511 Probab=77.72 E-value=0.12 Score=25.60 Aligned_cols=264 Identities=11% Similarity=0.025 Sum_probs=116.1 Q ss_pred EEeeecCCC---Cccccccc------ccccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcC Q lcl|NC_015280. 109 MRSRYTNQS---GNEAFFDE------PDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALG 179 (455) Q Consensus 109 MRsrY~~qs---G~EAlfnE------a~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG 179 (455) |=...+.-+ -.|-+-+. ...-|++-... +.... ..+|.+-.-... . .+..++.+. T Consensus 1 ma~~~T~~~~~iiPev~~~~v~~~~~~~~~~~~~~~~---------~~~l~--g~~G~tv~ip~~----~-~~g~~~~~~ 64 (274) T protein:vir:93 1 MPQGITKTSNQIIPEVLAPMMQAQLEKKLRFASFAEV---------DSTLQ--GQPGDTLTFPAF----V-YSGDAQVVA 64 (274) T ss_pred CCccceehhheechHHHHHHHHHHHHhhhhhcccccc---------ccccc--CCCCCEEEEEee----c-cCCCccccc Confidence 110000000 00100000 00011110000 00000 000000000000 0 112334444 Q ss_pred CCCCCccccceeEEEEEEEEeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeee Q lcl|NC_015280. 180 DGASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQA 259 (455) Q Consensus 180 ~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~ 259 (455) ++...++.++. ....+++-|-|+-.-+++=|. .+ .-+-|.-.+..+-++..+...++++++..+.+..... 
T Consensus 65 eg~~i~~~~it--~~~~~~~i~~~~~~~~i~D~~--~~--~~~~d~~~~~~~~~~~~~a~~~d~~~~~~~~~a~~~~--- 135 (274) T protein:vir:93 65 EGEKIPTDILE--TKKREAKIRKIAKGTSITDEA--LL--SGYGDPQGEQVRQHGLAHANKVDNDVLEALMGAKLTV--- 135 (274) T ss_pred CCCcccccccc--cceeEEEeeeecccccccHHH--HH--hhccchHHHHHHHHHHHHHHHHHHHHHHHHhcccccc--- Confidence 44444455554 444455556665322333222 22 2357889999999999999999999998875543211 Q ss_pred ccccceeeeeeccccchhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccc Q lcl|NC_015280. 260 NVANAGVFDLDVDSNGRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGE 339 (455) Q Consensus 260 ~v~~~gv~Dl~~~~~gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~ 339 (455) +...+ ..+-+-..+.++..+ -..+++++|+|.+++.|......+|.+...... T Consensus 136 ---~~~~~----------~~d~i~dA~~~l~d~---------~~~~~~ivv~p~~~~~L~k~~~~~f~~~s~~g~----- 188 (274) T protein:vir:93 136 ---NADIT----------KLNGLQSAIDKFNDE---------DLEPMVLFINPLDAGKLRGDASTNFTRATELGD----- 188 (274) T ss_pred ---ccccc----------CHHHHHHHHHHhhhc---------cCCccEEEeCHHHHHHHHhhhhhcccccccccc----- Confidence 11111 112232233333321 246789999999999998654444433322110 Q ss_pred cccCCceeEEEecCceEEEEeccccccCCcceEEEEEecCccccceeEEcccccccceeecCCccccceeeeeeecce-e Q lcl|NC_015280. 340 IDDTGNTFVGTLNGRFKVYIDPYSANVSDNQYYVVGYKGTNAYDAGLFYCPYVPLQMYRAIGQDTFQPRIGFKTRYGM-V 418 (455) Q Consensus 340 ~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~KG~~~~daglfyaPYv~l~~~~~~Dp~s~qP~~g~~tRY~l-~ 418 (455) +...+-..|.+. |++||+|.. .|..-.++.-+|.-. .+..+=++.... =||.+++=.+-...+||. . T Consensus 189 -~~~~~G~ig~~~-G~~Vi~s~~----~p~~t~~l~~~gai~----~~~~~~~~vE~~--Rd~~~~~d~i~~~~~y~~~~ 256 (274) T protein:vir:93 189 -DIIVKGAFGEAL-GAIIVRTNK----LEAGTAILAKKGAVK----LILKRDFFLEVA--RDASTKTTALYSDKHYVAYL 256 (274) T ss_pred -cceeecccceec-CeeEEEcCC----CCcceEEEEeCCeEE----EEecCCcccccc--cchhhcccEEEEEEEEEEEE Confidence 111223467775 689999954 443322222223211 111111111111 289999999999999999 6 Q ss_pred ecccccccccccccCchhhh Q lcl|NC_015280. 
419 LNPFAKGLTALSDSDPQAAG 438 (455) Q Consensus 419 ~nP~~~~~~~~~~~~~~~~~ 438 (455) .||-..-. +......-.+ T Consensus 257 ~~~~~~v~--~t~~~~s~~~ 274 (274) T protein:vir:93 257 YDESKAVK--ITKGSGSLEM 274 (274) T ss_pred EcCCceEE--EeeCccccCC Confidence 67733311 1111110010 No 75 >protein:vir:6212 Length: 434 # NCBI annotation: prohead protease # Family: family:all:21 # MgeID: mge:128 # MgeName: phBC6A52 # Cross-refs: genbank:acc:NP_852592;genbank:gi:31415852;genbank:GeneID:1489210 Probab=77.14 E-value=0.13 Score=25.48 Aligned_cols=343 Identities=12% Similarity=0.080 Sum_probs=123.3 Q ss_pred Ccch--HHHHHHhhHhh-----------cCCCCcc--ccchhhHHHHHHHhhhH-------HHHHHHHHHhhhhhhh-ch Q lcl|NC_015280. 1 MYNA--ENLQEKWAPVL-----------NHEGLND--IKDPYRKSVTAILLENQ-------ERALAEERAVLTEAPT-NV 57 (455) Q Consensus 1 m~~~--~~~~~kw~~~l-----------~~~~~~~--i~~~~~~~v~~~~~enq-------~~~~~e~~~~l~ea~~-~~ 57 (455) |-.. ..-.++..... .++...+ +....++......+.+. .....|+|..+.+-.. +. T Consensus 56 i~~le~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~e~~~~~~~~~~~~~~~~~~~~~~~~e~r~a~~~~l~~~~ 135 (434) T protein:vir:62 56 LAKLEEKEKEEDPAKKKDDDPEKKEDPTAKENPNEKTELSEEQRSAISASIAAALSTKGHRTNKETEIRSVFANYIVGNI 135 (434) T ss_pred HHHHHHHHHHHHHHhhhcchhhhhcchhhhcchhhhHHHHHHHHHHHHHHHHhhhhhccccchHHHHHHHHHHHHhcccc Confidence 1111 01111111110 1111000 00001111111111111 1111233333222110 00 Q ss_pred h--hhccccccccccccccchhh--hHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCccccccccccccccc Q lcl|NC_015280. 58 G--PINTPTTSSGAVAGFDPILI--SLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPDAQFSGT 133 (455) Q Consensus 58 ~--~~~~~st~tg~i~~~~P~Lv--~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~t~fSg~ 133 (455) . ..-+-+++|++-.-.=|.-+ .+++..-+..+...++-|.|++|..- |-. +.... 
T Consensus 136 ~~~e~~a~~~~t~~GG~lvP~~~~~~Ii~~l~~~~~i~~~~~~~~~~~~~~--~p~---~~~~~---------------- 194 (434) T protein:vir:62 136 DEKEARALGLVTGNGSVTIPDFLSKEIITYAQEENFLRRLGTGVKTKENIK--YPV---LVKKA---------------- 194 (434) T ss_pred chhhhhhhcccccccceecchhhHHHHHHhhhhhhhhhhhcceeccCCceE--EEE---EecCC---------------- Confidence 0 00111222222111113222 25555556667778888888765311 111 10000 Q ss_pred ccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeeccccccceeHHH Q lcl|NC_015280. 134 DGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRADYSVEL 213 (455) Q Consensus 134 ~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiEL 213 (455) .+.... ....+...++-..++++++..+|.-+-...+|-|| T Consensus 195 -~a~~~~--------------------------------------~~~e~~~~~~~~~~f~~v~~~~~k~~~~~~iS~el 235 (434) T protein:vir:62 195 -EAQGHK--------------------------------------NERTNNEMPETDIEFDEIELSPTEFDALATVTKKL 235 (434) T ss_pred -ccccee--------------------------------------cccccccccccccceeeEEeeheeeEeehhhHHHH Confidence 000000 00001112223335566666677777778899999 Q ss_pred HHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeeccccchhhHHHHHHHHHHHHHHH Q lcl|NC_015280. 214 AQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDVDSNGRWSVEKFKGLLFQIERDA 293 (455) Q Consensus 214 AQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~~~~gr~~ve~~k~l~~qi~~ea 293 (455) .+|- ++|.++.|.+-|+..|..-+++.||.-==+- ....++.......+...... ..+....|.+.+... T Consensus 236 l~ds----~~~l~~~i~~~la~~~~~~~d~~~l~G~G~~---~~~~g~~~~~~~~~~~~~~~--~~d~l~~l~~~l~~~- 305 (434) T protein:vir:62 236 LART----GLPIEQIVMDELKKAYVRKETQYMVNGDEAN---NINDGALAKKAVEFKTDEKN--LYDALVKMKNTPVKE- 305 (434) T ss_pred Hhcc----hHHHHHHHHHHHHHHHHHHHHHHHhccCCCC---ccccceeecccccccccccc--hhhHHHHHHhhcchh- Confidence 9995 3578899999999999999998888411000 00001100000111111111 122333344444321 Q ss_pred HHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCC-ceeEEEecCceEEEEeccccccC--Ccc Q lcl|NC_015280. 
294 NAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTG-NTFVGTLNGRFKVYIDPYSANVS--DNQ 370 (455) Q Consensus 294 n~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~-~~~~G~l~~~~~vy~D~y~~~~s--~~d 370 (455) -+..+- .|+++.....|.. |... +|..=+ ..+.+. .-.-.+|. +++|+++.+..... ... T Consensus 306 -------~~~~a~-~v~n~~~~~~L~~---lkd~---~G~~l~--~~~~~~~~g~~~tl~-G~pV~~~~~~~~~~~~~~~ 368 (434) T protein:vir:62 306 -------VRKKAR-WVLNTAALTKIET---MKTD---DGFPLL--RPFNQAEGGIGYTLL-GFPVEEEDAIDIPDSPDTP 368 (434) T ss_pred -------hhcCCE-EEEcHHHHHHHHH---hhcc---CCCEee--ccCCCccCCCCceec-ceeeEEecCccCccCCCce Confidence 233443 4778888888864 2211 111111 000000 00012353 57888775542100 001 Q ss_pred eEEEEEecCccccceeE-Ecccc-cccceeecCCc--cccceeeeeeecc-eee-cccccccccccccCchhhhhcc Q lcl|NC_015280. 371 YYVVGYKGTNAYDAGLF-YCPYV-PLQMYRAIGQD--TFQPRIGFKTRYG-MVL-NPFAKGLTALSDSDPQAAGNLN 441 (455) Q Consensus 371 Y~~vG~KG~~~~daglf-yaPYv-~l~~~~~~Dp~--s~qP~~g~~tRY~-l~~-nP~~~~~~~~~~~~~~~~~~~~ 441 (455) -+++| +- +-| ..... ++.+.+..++- .-|=.+..+.|.+ ..+ .|+....=+. .+..-.+ + T Consensus 369 ~i~~G---df----s~~~i~~~~g~~~i~~~~~~~~~~~~v~~~~~~r~Dgk~i~~~~~~~~~~~-~~~~~~~---~ 434 (434) T protein:vir:62 369 VFYFG---DF----SKFYIQDVIGSLEVQKLVELFSRTNRVGFRIWNLLDAQLIHSPFEVPVYKY-VLKAPTG---A 434 (434) T ss_pred EEEEe---ec----cceEEEEeeceeEEEeehhhhcccCceEEEEEeeecceeecCcccceEEEE-EeccCCC---C Confidence 11111 11 000 00011 11122222332 2233345557774 444 4887732111 1111010 1 No 76 >protein:vir:80376 Length: 435 # NCBI annotation: gp6, major capsid head protein # Family: family:all:21 # MgeID: mge:1881 # MgeName: phi644-2 # Cross-refs: genbank:acc:YP_001111085;genbank:gi:134288639;genbank:GeneID:4960624 Probab=76.88 E-value=0.13 Score=25.43 Aligned_cols=333 Identities=13% Similarity=0.072 Sum_probs=117.3 Q ss_pred CcchHHHHHHhhHhhc---------CCCCcc---c-------------cchhhHHHHH----HHhhhHHHHHHHHHHh-- Q lcl|NC_015280. 
1 MYNAENLQEKWAPVLN---------HEGLND---I-------------KDPYRKSVTA----ILLENQERALAEERAV-- 49 (455) Q Consensus 1 m~~~~~~~~kw~~~l~---------~~~~~~---i-------------~~~~~~~v~~----~~~enq~~~~~e~~~~-- 49 (455) +.+-+.|.++..-+-+ .+..+. . .+..|..-.. .+..++...-...+.+ T Consensus 42 ~~ei~~l~~~i~~~e~~e~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 121 (435) T protein:vir:80 42 SSKFNELTAQIERAEAAERMAAAAAVPVDPNPAAVTASAAAPVYAQPKAPEVKGAKMARMVRALAAARGDAQLASKLAIE 121 (435) T ss_pred HHHHHHHHHHHHHHHHHHHHHHhhcccccchhhhhccccccccccccchhhhhHHHHHHHHHHHHhccchhHHHHHHHHh Confidence 2222333333322110 000000 0 0001111111 1111110000000000 Q ss_pred --hhhhhhchhhhccccccccccccccchhhhHHHHHHhhhhhhhe-eeeccCCCcceeeeEEEeeecCCCCcccccccc Q lcl|NC_015280. 50 --LTEAPTNVGPINTPTTSSGAVAGFDPILISLIRRAMPKLIAYDI-AGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEP 126 (455) Q Consensus 50 --l~ea~~~~~~~~~~st~tg~i~~~~P~Lv~l~RRa~p~LIa~DI-~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa 126 (455) +.+...+ .+...++..|...--....-.++.++-+..+...+ +=+-||+.+. +-+... . ++.++ T Consensus 122 ~~~~~~~~~--~~~~~~~~~gg~lvP~~~~~~ii~~l~~~~~i~~~~~~~v~~~~~~-~~~p~~---~--~~~~a----- 188 (435) T protein:vir:80 122 RGFGEEVAM--SLNTLSPGAGGVLVPENLSSEVIELLRPKSVVRKLGARTLPLSNGN-ITIPRL---K--GGAIV----- 188 (435) T ss_pred hhhhhhhhh--hhcccCCCCCccccchhHHHHHHHHHhhhchhhhccceeeecCCCc-eEEEEE---e--CCcce----- Confidence 0010000 01011111121111111111133333344444444 2234443332 111111 0 00000 Q ss_pred cccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeeccccc Q lcl|NC_015280. 127 DAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALR 206 (455) Q Consensus 127 ~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLK 206 (455) -..++ +..+++...++++++...+.-+-. 
T Consensus 189 -------------------------------------------~~v~E--------~~~~~~~~~~f~~i~~~~~k~~~~ 217 (435) T protein:vir:80 189 -------------------------------------------GYIGA--------DTDIPTTQQQFDDLKLTAKKMAAL 217 (435) T ss_pred -------------------------------------------eeecc--------CccccccccceeeEEEeeEEEEEe Confidence 00000 112344555666666666666667 Q ss_pred cceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeee------ccccchhhHH Q lcl|NC_015280. 207 ADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLD------VDSNGRWSVE 280 (455) Q Consensus 207 AEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~------~~~~gr~~ve 280 (455) ...|-||.+|-.- +.|.|+.|.+-|+..|...+++-||.- .-.++ ...|++... ...++ . T Consensus 218 ~~is~ell~ds~~--~~~l~~~i~~~l~~a~~~~~d~a~l~G----~G~~~----~p~Gi~~~~~~~~~~~~~~~----~ 283 (435) T protein:vir:80 218 VPIANDLIKYAGV--NPNVDQIVVGDLTAAIGAREDKAFIRD----DGTAN----TPKGLRFWALPGNVITASDG----S 283 (435) T ss_pred ehhhHHHHHhhcc--cHHHHHHHHHHHHHHHHHHHHHHhhcc----CCCCC----cccceeecccccceeecccc----c Confidence 7889999998432 356778888888888888887777643 11111 122332111 01111 0 Q ss_pred HHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEEe Q lcl|NC_015280. 281 KFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYID 360 (455) Q Consensus 281 ~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D 360 (455) .+......+.+-...+...........+|+++.....|.. +... +|..-+ .+.++ |+|. +++||++ T Consensus 284 ~~~~~~~d~~~~~~~~~~~~~~~~~~~~vmn~~~~~~L~~---lkd~---~G~~l~---~~~~~----~~l~-G~pv~~~ 349 (435) T protein:vir:80 284 TLQKIETDLGKAILALENADANLTQPGWIMAPRTFRFLEG---LRDG---NGNKVY---PELAN----GMLK-GYPVGKT 349 (435) T ss_pred chhhHHHHHHHHHHHhhccccccccCEEEEcHHHHHHHHh---hhcc---CCceec---cCCCC----CeEe-eeeeEEe Confidence 1111111111111111111112234557899999988876 2211 121111 12222 3453 4788887 Q ss_pred ccccccC----CcceEE--------EEEecCccccceeEEcccccccceeecCCc-----cc---cceeeeeeecce-ee Q lcl|NC_015280. 
361 PYSANVS----DNQYYV--------VGYKGTNAYDAGLFYCPYVPLQMYRAIGQD-----TF---QPRIGFKTRYGM-VL 419 (455) Q Consensus 361 ~y~~~~s----~~dY~~--------vG~KG~~~~daglfyaPYv~l~~~~~~Dp~-----s~---qP~~g~~tRY~l-~~ 419 (455) .+...+. +.--++ +|-.+.... -..+|.-+ .|+. .| +=.+=..-|++. +. T Consensus 350 ~~~p~~~~~~~~~~~i~~gd~s~~~i~~~~~~~i----~~~~~~~~-----~~~~~~~~~~f~~n~~~~r~~~r~d~~~~ 420 (435) T protein:vir:80 350 TQVPINLGEAGKESEIYFTDFGDVFIGEEETLEI----DYSKEATY-----KDADGHMVSAFQRDQTLIRVIAKNDFGPR 420 (435) T ss_pred ccccccccCCCCcceEEEEEcccEEEEeecceEE----EEeccccc-----cccccchhhhhhcCcceeeeeeeeCcEee Confidence 6532110 011122 222222211 11111110 0110 01 123334567776 44 Q ss_pred cccccccccccccCchhh Q lcl|NC_015280. 420 NPFAKGLTALSDSDPQAA 437 (455) Q Consensus 420 nP~~~~~~~~~~~~~~~~ 437 (455) +|-+-. ..+|-.|++ T Consensus 421 ~~~a~~---~l~~~~~~~ 435 (435) T protein:vir:80 421 HVESIA---VLSGVAWGA 435 (435) T ss_pred cccceE---EEeccCCCC Confidence 454442 236777886 No 77 >protein:vir:100135 Length: 418 # NCBI annotation: gp5 # Family: family:all:585 # MgeID: mge:1639 # MgeName: phi1026b # Cross-refs: genbank:acc:NP_945035;genbank:gi:38707895;genbank:GeneID:2744182 Probab=76.73 E-value=0.13 Score=25.40 Aligned_cols=329 Identities=12% Similarity=0.015 Sum_probs=124.2 Q ss_pred Cc-----chHHHHHHhhHhhcC--------CCCccccchhhHHHHHHH-hhhHHHH-----HHHHHHh--hhhhhhchhh Q lcl|NC_015280. 1 MY-----NAENLQEKWAPVLNH--------EGLNDIKDPYRKSVTAIL-LENQERA-----LAEERAV--LTEAPTNVGP 59 (455) Q Consensus 1 m~-----~~~~~~~kw~~~l~~--------~~~~~i~~~~~~~v~~~~-~enq~~~-----~~e~~~~--l~ea~~~~~~ 59 (455) .. ..+.+..+...+-.. ...+........+..... -++.+.. +.+...+ -.+...+... 
T Consensus 55 ~~~e~~~~~~~l~~~~~~l~~~~~~~e~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 134 (418) T protein:vir:10 55 LGVETKATVDELLIKQGELQARLLEAEQKLARGGGSAELETPKTLGQLVTESEEMKGMDGSARKSVRVRVDRKSIMNVPA 134 (418) T ss_pred hhHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhcccccccchhhhhhHHhhhHHHHHHHHHHHhhhhhhhhHHHHHHHhhh Confidence 00 001122222221110 000000011111111110 0111000 0000000 0000001111 Q ss_pred hccccccccccccccchhh-hHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCcccccccccccccccccccc Q lcl|NC_015280. 60 INTPTTSSGAVAGFDPILI-SLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPDAQFSGTDGATP 138 (455) Q Consensus 60 ~~~~st~tg~i~~~~P~Lv-~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~t~fSg~~~~~~ 138 (455) ....+++++. .-.-|.+. .+++.+.+..+..+++.+-||++++.-+ .| ..+.. . ... T Consensus 135 ~~~~~~~~~g-~lvp~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~--~~--~~~~~-~-------~a~--------- 192 (418) T protein:vir:10 135 TVGSGVSGSN-SLVVADRQAGIIAPPQRKMTIRDLLMPGQTSSSSIEY--TV--ETGFT-N-------NAA--------- 192 (418) T ss_pred hccCCCCCCc-cccchhHHHHHHHHHhhhhhHHhhcceeeccCCceeE--EE--EecCC-C-------cee--------- Confidence 1111122221 12222222 3455566677788899999998875322 11 00000 0 000 Q ss_pred cccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeeccccccceeHHHHHhHH Q lcl|NC_015280. 139 PTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLK 218 (455) Q Consensus 139 ~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLk 218 (455) .+++ +...++-..++++++..+|.-+-...+|-||.||.- T Consensus 193 --------------------------------~v~E--------~~~~~~~~~~f~~v~~~~~k~~~~~~is~ell~ds~ 232 (418) T protein:vir:10 193 --------------------------------AVAE--------GAQKPTSDLKFNLKNQPVRTIAHLFKASRQILDDAP 232 (418) T ss_pred --------------------------------eecc--------CccccccccceeeEEEeeeeEEEeehhhHHHHHhHH Confidence 0000 011233334556666666666667789999999852 Q ss_pred HhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeee------ccccchhhHHHHHHHHHHHHHH Q lcl|NC_015280. 
219 AIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLD------VDSNGRWSVEKFKGLLFQIERD 292 (455) Q Consensus 219 AiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~------~~~~gr~~ve~~k~l~~qi~~e 292 (455) |.++.|.+-|+..|..-+|+-||.- .-.+....|++-.. ...++--.++....+++.+. T Consensus 233 -----~l~~~i~~~l~~a~~~~~d~a~l~G--------~g~~~~p~Gi~~~~~~~~~~~~~~~~~~~~~i~~~~~~~~-- 297 (418) T protein:vir:10 233 -----ALQSYIDGRARYGLQLTEEGQILKG--------DGTGANILGILPQASAFMPSITLANATPIDKIRLALLQAV-- 297 (418) T ss_pred -----HHHHHHHHHHHHHHHHHHHHHHhcc--------CCCCccccccccccccccccccccccccHHHHHHHHHhhc-- Confidence 4677777777777777777766631 11111122222111 01111111223333433332 Q ss_pred HHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEEeccccccCCcceE Q lcl|NC_015280. 293 ANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQYY 372 (455) Q Consensus 293 an~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~ 372 (455) ..-+..+-+|||+.....|.. +.. .+|..=+ .+.+. ...|+|. +++|+++.+. |.+=+ T Consensus 298 -------~~~~~~~~~v~n~~~~~~L~~---lkd---~~G~~i~---~~~~~-~~~~~l~-G~pV~~~~~~----p~~~~ 355 (418) T protein:vir:10 298 -------LAEFPATGIVLNPIDWASIEL---TKD---SQGRYIV---GNPVN-GTTPRLW-NLPVVETQAM----TANEF 355 (418) T ss_pred -------cccCCCCEEEEcHHHHHHHHH---hhc---CCCceec---ccccc-CCCceec-ceeeEEcCCC----CCCcE Confidence 233455668999999988875 221 1111111 11111 1135664 4799988764 22223 Q ss_pred EEEEecCccccceeEEcccccccceeecCCcc---cc---ceeeeeeecce-eecccccccccccccCchhhhhcc Q lcl|NC_015280. 373 VVGYKGTNAYDAGLFYCPYVPLQMYRAIGQDT---FQ---PRIGFKTRYGM-VLNPFAKGLTALSDSDPQAAGNLN 441 (455) Q Consensus 373 ~vG~KG~~~~daglfyaPYv~l~~~~~~Dp~s---~q---P~~g~~tRY~l-~~nP~~~~~~~~~~~~~~~~~~~~ 441 (455) ++|---. +|-=+.-..+...+|+.. |+ =.+=+..|++. ..+|-+--.-... 
.- +.| T Consensus 356 ~~gd~s~-------~~~~~~~~~~~i~~~~~~~~~f~~~~~~~r~~~~~d~~~~~~~a~~~~~~~---~~---~~g 418 (418) T protein:vir:10 356 LVGAFSM-------AAQIFDRMEIEVLLSTENVDDFEKNMVSIRAEERLALAVYRPESFVTGALV---EQ---AGG 418 (418) T ss_pred EEeeccc-------eEEEEEecceEEEEecccchhhhcCceEEEEEEeeccEEecccceEEEEec---cC---CCC Confidence 3442100 000011111111222221 22 23334567777 5555433111111 00 111 No 78 >protein:vir:9704 Length: 394 # NCBI annotation: hypothetical protein # Family: family:all:21 # MgeID: mge:174 # MgeName: 315.2 # Cross-refs: genbank:acc:NP_795466;genbank:gi:28876225;genbank:GeneID:1257769 Probab=76.50 E-value=0.13 Score=25.35 Aligned_cols=320 Identities=14% Similarity=0.127 Sum_probs=124.6 Q ss_pred Cc------------chHHHHHHhhH----------hhcCCCCccc--------cchhhHHHHHHHhhhHHHHHHHH--HH Q lcl|NC_015280. 1 MY------------NAENLQEKWAP----------VLNHEGLNDI--------KDPYRKSVTAILLENQERALAEE--RA 48 (455) Q Consensus 1 m~------------~~~~~~~kw~~----------~l~~~~~~~i--------~~~~~~~v~~~~~enq~~~~~e~--~~ 48 (455) ++ .-+.|.++-.. ..+....+.. ....|+.+ ...+..+....+.. +. T Consensus 31 ~~~~~~~~~~~l~~eie~l~~ei~~l~~~~~~~e~~~e~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~ 109 (394) T protein:vir:97 31 LESDDLEAARSIKAEVEQAKANLVEAENDLKLYESSVEVGGAENIGGKEVTQEEKTYRESV-NDFIRSKGKIVNDSLRFE 109 (394) T ss_pred hchhhHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhccccccccccchhhHHHHHHH-HHHHHHHHHHhhhhhhhh Confidence 11 11111111110 0000000000 00011111 11111111100000 00 Q ss_pred hhhhhhhch-----hhhccccccc--cccccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCccc Q lcl|NC_015280. 49 VLTEAPTNV-----GPINTPTTSS--GAVAGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEA 121 (455) Q Consensus 49 ~l~ea~~~~-----~~~~~~st~t--g~i~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EA 121 (455) ...+..... ....+.+.++ |...--....-.+++.+.+......++.+.||+++++-+--++ ..+. 
T Consensus 110 ~~~~~~~~~~~~~~~~~~~~~~t~~~gg~liP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~-----~~~~-- 182 (394) T protein:vir:97 110 GKDEVLMPINETTPVEPQKDGIKKENAKPVSSEEILYTPAREVKTVVDLKPFTTVYQAKKASGKYPVLQ-----RATT-- 182 (394) T ss_pred hHHHHHHHHHhhhhhhhhccccccccccccChHHHHHHHHHHhhhhhhhhhhceeeeccCcceEEEEEe-----cCCC-- Confidence 000000000 0011111111 2211111222235555556667788999999988876543222 0000 Q ss_pred ccccccccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccc-eeEEEEEEEEe Q lcl|NC_015280. 122 FFDEPDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEM-AFSIDKIAVEA 200 (455) Q Consensus 122 lfnEa~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EM-aFsIEK~tVtA 200 (455) . +-..+++ ...++. ...+++++..+ T Consensus 183 -------------~---------------------------------~~~v~E~--------~~~~~~~~~~~~~v~l~~ 208 (394) T protein:vir:97 183 -------------K---------------------------------MVTVAEL--------EKNPALAKPDFKDVAWNI 208 (394) T ss_pred -------------c---------------------------------cceeccc--------ccccccccccceeEEeeh Confidence 0 0000000 011222 23456666666 Q ss_pred eccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeeccccchhhHH Q lcl|NC_015280. 201 KGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDVDSNGRWSVE 280 (455) Q Consensus 201 KSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~~~~gr~~ve 280 (455) +.-+-...+|-||.+|- +.|.+++|.+-|+..|..-+|..||.-+-+. +..+...++ T Consensus 209 ~k~~~~i~is~ell~ds----~~~~~~~i~~~la~~~~~~~~~~i~~g~~~~---------~~~~~~~~~---------- 265 (394) T protein:vir:97 209 DTYRGAIPLSQESIDDA----DVDLVGIVSESISQIKVNTTNDAIAKVLKSF---------TTKTVKNLD---------- 265 (394) T ss_pred hheeeehhhHHHHHhhh----hHHHHHHHHHHHHHHHHHHHHHHHhhccccc---------cccccccHH---------- Confidence 66666788999999986 3467788888888888888888777543221 112222111 Q ss_pred HHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEE- Q lcl|NC_015280. 
281 KFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYI- 359 (455) Q Consensus 281 ~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~- 359 (455) ....+ + +.. .. ....+. +|+|+.+...|... ... +|.. +...+.++. .-++|.| ++|++ T Consensus 266 ~~~~~-~------~~~-~~-~~~~a~-~v~n~~~~~~l~~l---kd~---~G~~--i~~~~~~~~-~~~~l~G-~pv~~~ 325 (394) T protein:vir:97 266 EIKAL-L------NGG-FD-PAYNVS-LIVSQSFYQTLDTL---KDG---NGRY--LLQDDITAV-SGKVLLG-KPVFVL 325 (394) T ss_pred HHHHH-H------Hhh-hh-hhhCCE-EEEcHHHHHHHHHh---hcc---CCCe--eeecCcCCC-CCceecc-ceeEEe Confidence 11111 1 110 11 122344 57899998888752 211 1111 011111111 1245654 66655 Q ss_pred -eccccccCCcceEEEEEecCccccceeEEcccccccceeecCCccccceeeeeeecce-eeccccccccccc-ccCch Q lcl|NC_015280. 360 -DPYSANVSDNQYYVVGYKGTNAYDAGLFYCPYVPLQMYRAIGQDTFQPRIGFKTRYGM-VLNPFAKGLTALS-DSDPQ 435 (455) Q Consensus 360 -D~y~~~~s~~dY~~vG~KG~~~~daglfyaPYv~l~~~~~~Dp~s~qP~~g~~tRY~l-~~nP~~~~~~~~~-~~~~~ 435 (455) |... +..-+++|-- ..++++..-..+. ....|...++..+-...|++. +.+|-+--.-++. ...|. T Consensus 326 ~~~~~----~~~~~~~gd~-----~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~r~d~~v~~~~a~~~~~~~~~~~p~ 394 (394) T protein:vir:97 326 SDEVL----GANKAFIGDF-----KRGVLFADRKDLG-LRWADNEIYGQYLQAVLRFGVSKVDDKAGYYVTFTPEPLPL 394 (394) T ss_pred ccccc----CCccEEEeec-----cccEEEEEecceE-EEEecccccceeEEEEEEEccEEecccceEEEEecccccCC Confidence 3322 2223333320 0111222221111 233455556666666678887 6666444221221 23342 No 79 >protein:vir:104085 Length: 320 # NCBI annotation: gp17 # Family: family:all:507 # MgeID: mge:1656 # MgeName: Che12 # Cross-refs: genbank:acc:YP_655596;genbank:gi:109392467;genbank:GeneID:4156953 Probab=75.78 E-value=0.14 Score=25.22 Aligned_cols=293 Identities=11% Similarity=0.006 Sum_probs=117.2 Q ss_pred HhhhHHHHHHHHHHhhhhhhhchhhhccccccccccccccchhhh-HHHHHHhhhhhhheeeeccCCCcceeeeEEEeee Q lcl|NC_015280. 
35 LLENQERALAEERAVLTEAPTNVGPINTPTTSSGAVAGFDPILIS-LIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRY 113 (455) Q Consensus 35 ~~enq~~~~~e~~~~l~ea~~~~~~~~~~st~tg~i~~~~P~Lv~-l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY 113 (455) +.+.+ ..=.|++.+. ..+++++.- ..-|.+.. +++.+....+-.+++-+.||++.+.-|.- . T Consensus 1 ~~~~~-~~~~~~~~~~-----------~t~~~~~~~-~ip~~~~~~ii~~~~~~s~l~~~~~~~~~~~~~~~~p~----~ 63 (320) T protein:vir:10 1 MAAGT-AFQVDHAQIA-----------QTGDTMFKG-YLEPEQAKDYFAEAEKTSIVQQFAQKVPMGTTGQKIPH----W 63 (320) T ss_pred CCCCc-cCCHHHHHhh-----------ccccccccc-cccHHHHHHHHHHHHhccchhhhcceeeccCCceEEEE----E Confidence 11111 0001111111 111111111 12233322 44444455667888999999876533321 1 Q ss_pred cCCCCcccccccccccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEE Q lcl|NC_015280. 114 TNQSGNEAFFDEPDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSI 193 (455) Q Consensus 114 ~~qsG~EAlfnEa~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsI 193 (455) . ++.+ +-..++. ..+++-..++ T Consensus 64 ~--~~~~------------------------------------------------a~~v~E~--------~~~~~~~~~f 85 (320) T protein:vir:10 64 I--GDVS------------------------------------------------AQWIGEG--------DMKPITKGNM 85 (320) T ss_pred e--CCcc------------------------------------------------eEEecCC--------ccccccccce Confidence 0 0000 0000111 1133333445 Q ss_pred EEEEEEeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHh-------hhheeeeeecccccee Q lcl|NC_015280. 194 DKIAVEAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVY-------RGAKPGAQANVANAGV 266 (455) Q Consensus 194 EK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~-------~vA~~~k~~~v~~~gv 266 (455) ++++...+..+-...+|.||.+|-. .|.++.|.+.|...|...+|+-+|.--- ..... ...+...+. 
T Consensus 86 ~~v~~~~~k~~~~~~is~ell~ds~----~~l~~~i~~~l~~a~a~~~d~a~l~G~g~~~~~~~~~~~~--~~~~~~~~~ 159 (320) T protein:vir:10 86 TSQNIAPHKIATIFVASAETVRANP----ANYLGTMRTKVATAFAMAFDSAALNGTDSPFPTYLAQTTK--SVSLADPGG 159 (320) T ss_pred eEEEEeeEEEEEeehhhHHHHhcCh----HHHHHHHHHHHHHHHHHHHHHHhhcccCCCCCcccccccc--cccceeccc Confidence 5666666777777789999999865 4688888888888888888888763210 00000 001111111 Q ss_pred eeeeccccchhhHHHH-HHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccc--cccccC Q lcl|NC_015280. 267 FDLDVDSNGRWSVEKF-KGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGI--GEIDDT 343 (455) Q Consensus 267 ~Dl~~~~~gr~~ve~~-k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~--~~~d~t 343 (455) ... +.-+..+.. -.+++++ .........+|+++.....|.. +....+ ..-+. ...... T Consensus 160 ~~~----~~~~~~~~~~~~~~~~~---------~~~~~~~~~~v~n~~~~~~L~~---lkd~~G---~~l~~~~~~~~~~ 220 (320) T protein:vir:10 160 ATA----SDLTAYDAVAVNGLSLL---------VNAKKKWTHTLLDDIVEPILNG---AKDKNG---RPLFIESTYTDEN 220 (320) T ss_pred ccc----cccccHHHHHHHHHhhh---------hcccCCCcEEEEcHHHHHHHHH---hhccCC---ceeeccccccCcc Confidence 101 111111111 1111111 1223345578999999999975 222111 10000 000011 Q ss_pred CceeEEEecCceEEEEecccccc------CCcceEEEEEecCccccceeEEcccccccceeecCCcc-----c---ccee Q lcl|NC_015280. 344 GNTFVGTLNGRFKVYIDPYSANV------SDNQYYVVGYKGTNAYDAGLFYCPYVPLQMYRAIGQDT-----F---QPRI 409 (455) Q Consensus 344 ~~~~~G~l~~~~~vy~D~y~~~~------s~~dY~~vG~KG~~~~daglfyaPYv~l~~~~~~Dp~s-----~---qP~~ 409 (455) .....++| .+++|+++.....+ .++.++++|..+..+++-+ -+.......|+.. | |=.+ T Consensus 221 ~~~~~~~i-~g~pv~~~~~~~~~~~~~~~gd~~~~~~~~~~~~~i~~~------~~~~~~~~~~~~~~~~~~f~~~~~~~ 293 (320) T protein:vir:10 221 SPFRAGRI-VSRPTILSDHVADGTTVGYMGDFRNVIWGQVGGLSFDVT------DQATLNLGTPTEPNFVSLWQHNLVAV 293 (320) T ss_pred ccccCcee-eeeeeEecCCCCCCceEEEEeecceEEEEEecCeEEEEe------ecceeeeccccccccchhhhcCcEEE Confidence 11222333 36778887654221 0111222333322211100 0000111111111 1 1122 Q ss_pred eeeeecce-eeccccccc-ccccccCchh Q lcl|NC_015280. 
410 GFKTRYGM-VLNPFAKGL-TALSDSDPQA 436 (455) Q Consensus 410 g~~tRY~l-~~nP~~~~~-~~~~~~~~~~ 436 (455) =...|++. +.+|-+-.. ++.. .|.. T Consensus 294 r~~~~~d~~v~~~~a~~~l~~~~--ap~~ 320 (320) T protein:vir:10 294 RVEAEYAFHNNDKDAFVKLTNVV--TPDA 320 (320) T ss_pred EEEEeeccEEecccceEEEEecc--CCCC Confidence 23467777 566644311 1111 1211 No 80 >protein:vir:4997 Length: 397 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:109 # MgeName: Sfi21 # Cross-refs: genbank:acc:NP_049971;genbank:gi:9632943;genbank:GeneID:1262106 Probab=74.77 E-value=0.15 Score=25.03 Aligned_cols=329 Identities=12% Similarity=0.082 Sum_probs=120.8 Q ss_pred Ccch-------------------------HHHHHHhhHhhc------CCCCccccchhhHHHHHHHhhhHHHHHHHHHHh Q lcl|NC_015280. 1 MYNA-------------------------ENLQEKWAPVLN------HEGLNDIKDPYRKSVTAILLENQERALAEERAV 49 (455) Q Consensus 1 m~~~-------------------------~~~~~kw~~~l~------~~~~~~i~~~~~~~v~~~~~enq~~~~~e~~~~ 49 (455) |.+. +.+.++=....+ .+.........++.+.. +-.+.-.+++.. T Consensus 18 ~~~l~~~~~~~~~~~~~~~ee~~~l~~ei~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~----~~~~~~~~~~~~ 93 (397) T protein:vir:49 18 VENLNEKLNVAMLDDSVSAEELQAIKNERDTAKMKRDLFKEQYTEARANEVANMSEEEKKPLTK----NEEEVKANFVKD 93 (397) T ss_pred HHHHHHHHHHHHhcchhhHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccccccccc----hhhHHHHHHHHH Confidence 1000 000000000000 00000000000000000 000000111111 Q ss_pred hhhhhh----chhhhccccc-cccccccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCcccccc Q lcl|NC_015280. 50 LTEAPT----NVGPINTPTT-SSGAVAGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFD 124 (455) Q Consensus 50 l~ea~~----~~~~~~~~st-~tg~i~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfn 124 (455) +..... 
+...-.+.++ +.|...--....-.+++..-+...-.+++.|+||++.+|-+-=.+ .....+ T Consensus 94 ~~~~l~~~~~~~~~~~~~~t~~~gg~~iP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~--~~~~~~------ 165 (397) T protein:vir:49 94 FKNLVRGRYQNLLDSKTDGSGSDAGLTIPQDIRTAINTLVRQFDSLQEYVNVENVTTLTGSRVYEK--WADITG------ 165 (397) T ss_pred HHHHhhcchhhHHHhhhccCCccCcceecHHHHHHHHHHHHhhhhHhhhcceeeccCCcceEEEEe--eccCCc------ Confidence 111110 0001111111 122221111112234555556667788999999999887543222 100000 Q ss_pred cccccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeeccc Q lcl|NC_015280. 125 EPDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRA 204 (455) Q Consensus 125 Ea~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRa 204 (455) .+...++.+..++.....|.++.|++.|. + T Consensus 166 -------------------------------------------~a~~v~E~~~~~~~~~~~~~~v~~~~~k~-------~ 195 (397) T protein:vir:49 166 -------------------------------------------LAKLDDEGGQIGQNDDPKLSLIRYAIKRY-------A 195 (397) T ss_pred -------------------------------------------ceeeeccccccccccccceeeeEeeeeee-------E Confidence 00001111111111122345555555544 4 Q ss_pred cccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeeccccchhhHHHHHH Q lcl|NC_015280. 205 LRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDVDSNGRWSVEKFKG 284 (455) Q Consensus 205 LKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~~~~gr~~ve~~k~ 284 (455) -...+|-||.+|-. +|.+++|.+-|+..|..-+|+.||.-.- .+....+++.++ .... 
T Consensus 196 ~~~~iS~ell~ds~----~~l~~~i~~~l~~~~~~~~d~ail~G~g--------~~~~~~~~~~~d----------~i~~ 253 (397) T protein:vir:49 196 GISTVTNSLLADSA----ENILAWLSGWIAKKVVVTRNKAILEAIG--------TLPNKPTLAKWD----------DIID 253 (397) T ss_pred eehhhHHHHHhhhh----HHHHHHHHHHHHHHHHHHHHHHHHhccc--------cccccccccCHH----------HHHH Confidence 45678999999853 5788999999999999999998874321 222223333221 2233 Q ss_pred HHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEE-e-cc Q lcl|NC_015280. 285 LLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYI-D-PY 362 (455) Q Consensus 285 l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~-D-~y 362 (455) +...+.. .-.....+|++|.....|... ... +|. .....+.+. ...++|.|+ +|++ + .+ T Consensus 254 ~~~~l~~---------~~~~~a~~v~n~~~~~~l~~l---kd~---~g~--~l~~~~~~~-g~~~~l~G~-pV~~~~~~~ 314 (397) T protein:vir:49 254 LQAKVDP---------AIKQTSLFLTNTSGFTALKKV---KNA---MGD--YLMERDVKS-PTGYSIDGF-VVKEISDRF 314 (397) T ss_pred HHHhhhh---------hhcCCCEEEEcHHHHHHHHHh---hcc---CCc--eeecccccC-CCCceecce-eeEEecccc Confidence 3333331 122345688999998888762 211 111 111111111 112456544 5543 2 11 Q ss_pred ccc-cCCcceEEEE---------EecCccccceeEEcccccccceeecCCccccceeeeeeecce-eeccccccccccc- Q lcl|NC_015280. 363 SAN-VSDNQYYVVG---------YKGTNAYDAGLFYCPYVPLQMYRAIGQDTFQPRIGFKTRYGM-VLNPFAKGLTALS- 430 (455) Q Consensus 363 ~~~-~s~~dY~~vG---------~KG~~~~daglfyaPYv~l~~~~~~Dp~s~qP~~g~~tRY~l-~~nP~~~~~~~~~- 430 (455) ... .++..-+++| ..+..+ +-..||.. .+-...+-.+-...|++. +.+|-.--.-... T Consensus 315 ~~~~~~~~~~~~~gd~~~~~~~~~~~~~~----i~~~~~~~------~~~~~~~~~~~~~~r~d~~~~~~~a~~~~~~~~ 384 (397) T protein:vir:49 315 LPNGTGGAMPLYFGDLKQAVTLFDRQHLS----LLSTNIGG------GAFETDTTKVRVIDRFDVVSTDTEAFVPASFKA 384 (397) T ss_pred cccccCCceeEEEeeccceEEEEeecccE----EEEecccc------chhhcCeeeEEEEEeeccEEecccceEEEEecc Confidence 100 0111112222 111111 11222211 011233445556677777 5555433111111 Q ss_pred -ccCchhhhhccc Q lcl|NC_015280. 
431 -DSDPQAAGNLNA 442 (455) Q Consensus 431 -~~~~~~~~~~~~ 442 (455) ...+...-..++ T Consensus 385 ~~~~~~~~~~~~~ 397 (397) T protein:vir:49 385 IADQKAKLSTAGA 397 (397) T ss_pred cccccCcccccCC Confidence 001111112222 No 81 >protein:vir:1084 Length: 437 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:21 # MgeName: bIL309 # Cross-refs: genbank:acc:NP_076738;genbank:gi:13095848;genbank:GeneID:920418 Probab=74.36 E-value=0.16 Score=24.96 Aligned_cols=320 Identities=13% Similarity=0.073 Sum_probs=114.0 Q ss_pred CcchHHHH----------HHhhHhhcCCCCccccchhhHHHHHHHhhhHHHHHHHHH----------------------- Q lcl|NC_015280. 1 MYNAENLQ----------EKWAPVLNHEGLNDIKDPYRKSVTAILLENQERALAEER----------------------- 47 (455) Q Consensus 1 m~~~~~~~----------~kw~~~l~~~~~~~i~~~~~~~v~~~~~enq~~~~~e~~----------------------- 47 (455) ....+..+ ++-....+.+...+ . ..-.....+.+.+...++. T Consensus 67 ~~~~e~~~~~~~~~~~e~~~~~~~~e~~~~~~----~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 141 (437) T protein:vir:10 67 ALKVEEKRDDSDLVAPELEENSADNEEDDPEK----L-KTETKSEAEKDKKTVKDEEKRDAGGLQDMKLKVGGEIADKKV 141 (437) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH----H-HHHHHHHHHHHHHHHHHHHHHhHHHHhHHHHHHHHHHHHhhh Confidence 00000000 00000000000000 0 0000111111111111100 Q ss_pred Hhhhhhhhchh-hhccccc-cccccccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCccccccc Q lcl|NC_015280. 48 AVLTEAPTNVG-PINTPTT-SSGAVAGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDE 125 (455) Q Consensus 48 ~~l~ea~~~~~-~~~~~st-~tg~i~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnE 125 (455) ..+.+...... .....++ +.+... .-..+...++.........+++.|.||+.+.+-+--.+.. +.. 
T Consensus 142 ~~~~~~~~~~e~~~~~~~~~~~~g~l-vp~~~~~~i~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~-----~~~----- 210 (437) T protein:vir:10 142 TAFADYLKTGEVRDVTGIALKDGKVI-IPETILTPEKEVHQFPRLGSLVRTESVTTTTGKLPIFNNS-----TDL----- 210 (437) T ss_pred hhhHHHHHhhhhhhhhhccccccccc-chHHHHHHHHHhhhhhhhhhcceeEeeccCceeeEEeecc-----ccc----- Confidence 00000000000 0001111 111111 1111122222211222345668888888776654433300 000 Q ss_pred ccccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeecccc Q lcl|NC_015280. 126 PDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRAL 205 (455) Q Consensus 126 a~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaL 205 (455) . -...+.+..++.....|.++.|.+.|..+ T Consensus 211 ---------~----------------------------------~~~~e~~~~~e~~~~~~~~v~~~~~k~~~------- 240 (437) T protein:vir:10 211 ---------L----------------------------------TAHTEYGQTTKNATPVITPILWDLKTYTG------- 240 (437) T ss_pred ---------c----------------------------------ccccccccccccccccceeeeeehhheee------- Confidence 0 00000011111222346666666666543 Q ss_pred ccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeeccccchhhHHHHHHH Q lcl|NC_015280. 206 RADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDVDSNGRWSVEKFKGL 285 (455) Q Consensus 206 KAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~~~~gr~~ve~~k~l 285 (455) -..+|-||.+|- ..|.+++|.+.|+..|..-+|..||.-+-+. ...+++....-| ...+ T Consensus 241 ~~~is~ell~ds----~~~~~~~i~~~l~~~~~~~~~~~i~~g~g~~----~~~~~~~~~~~~-------------~~~~ 299 (437) T protein:vir:10 241 GYVFSQELISDS----SYDWQAELQSRLIELRDNTDDSLIITALTDG----IKKTTSTYLLGD-------------LKKV 299 (437) T ss_pred ehhhhHHHHhhh----HHHHHHHHHHHHHHHHHHHHHHHHhhhhccc----ccccccccchhh-------------HHHH Confidence 467899999884 3578889999999999999999888754321 111111111111 1111 Q ss_pred -HHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEEecccc Q lcl|NC_015280. 
286 -LFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSA 364 (455) Q Consensus 286 -~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~ 364 (455) -+.+... -+..+ .+|+++.....|... .. .+|.. +...+.++. ..++|.| ++|++...+. T Consensus 300 ~~~~l~~~--------~~~~~-~~~~~~~~~~~l~~l---kd---~~g~~--~~~~~~~~~-~~~~l~G-~pv~~~~~~~ 360 (437) T protein:vir:10 300 LNVTLKPQ--------DSAAA-SIVMSQSAYNLFDMA---TD---AMGRP--LLQPNVTAA-TGYTLLG-KTVVIVDDKL 360 (437) T ss_pred HHhhhhhh--------hhcCC-EEEEcHHHHHHHHHh---hc---cCCCe--eeccCccCC-CCccccc-ceeEEecccc Confidence 0111111 11223 468899888888653 11 11111 111111111 1235654 5665532110 Q ss_pred --ccCCcceEEEEEecCccccceeEEcccccc--------cceeec-CCccccceeeeeeecce-eeccccccc-c-ccc Q lcl|NC_015280. 365 --NVSDNQYYVVGYKGTNAYDAGLFYCPYVPL--------QMYRAI-GQDTFQPRIGFKTRYGM-VLNPFAKGL-T-ALS 430 (455) Q Consensus 365 --~~s~~dY~~vG~KG~~~~daglfyaPYv~l--------~~~~~~-Dp~s~qP~~g~~tRY~l-~~nP~~~~~-~-~~~ 430 (455) +...-+ ..+||+.+-.+ ...... +-+.++..+.+..||+. +++|-+--. + +.. T Consensus 361 ~~~~~~~~-------------~~~~~gd~~~~~~~~~r~~~~~~~~~~~~~~~~~~~~~~r~d~~~~~~~a~~~l~~~~~ 427 (437) T protein:vir:10 361 FPSASAGD-------------VNIVVAPLKKAVINFKLTEITGQFQDTYDIWYKQLGIFLRQNVVQASKDLIVNLTGKLK 427 (437) T ss_pred cCCcCCCc-------------eEEEEeeccccEEEEeeeceEEEEecccccccceeeEEEEEccEEecccceEEEEeecc Confidence 000011 11333332110 011111 33455566667789987 666655411 0 000 Q ss_pred -ccCchhhhhc Q lcl|NC_015280. 
431 -DSDPQAAGNL 440 (455) Q Consensus 431 -~~~~~~~~~~ 440 (455) -.....+ ++ T Consensus 428 ~~~~~~~~-~~ 437 (437) T protein:vir:10 428 AVTVVQST-AV 437 (437) T ss_pred ccccCCCC-CC Confidence 0011111 11 No 82 >protein:vir:96223 Length: 324 # NCBI annotation: ORF011 # Family: family:all:507 # MgeID: mge:1607 # MgeName: 69 # Cross-refs: genbank:acc:YP_239571;genbank:gi:66395304;genbank:GeneID:5132771 Probab=72.55 E-value=0.18 Score=24.65 Aligned_cols=301 Identities=11% Similarity=0.060 Sum_probs=119.7 Q ss_pred ccchhhHHHHHHHhhhHHHHHHHHHHh-hhhhhhchhhhccccccccccccccchh-hhHHHHHHhhhhhhheeeeccCC Q lcl|NC_015280. 23 IKDPYRKSVTAILLENQERALAEERAV-LTEAPTNVGPINTPTTSSGAVAGFDPIL-ISLIRRAMPKLIAYDIAGVQPMT 100 (455) Q Consensus 23 i~~~~~~~v~~~~~enq~~~~~e~~~~-l~ea~~~~~~~~~~st~tg~i~~~~P~L-v~l~RRa~p~LIa~DI~GVQPmT 100 (455) |.+..+.+ .++++.... ...+..+..+ ..++++++. -.-|.+ -.+++.+..+.+..+++.+-||+ T Consensus 1 ~~~~~~~~----------~~~~~f~~~~~~~~~~~a~~--~~~~~~~~~-lip~~~~~~ii~~~~~~s~l~~l~~~~~~~ 67 (324) T protein:vir:96 1 MEQTQKLK----------LNLQHFASNNVKPQVFNPDN--VMMHEKKDG-TLLNDFTTPILQEVMENSKIMQLGKYEPME 67 (324) T ss_pred CCcchhhh----------HHHHHHHHhhhhhhhccccc--ccccCCCcc-eechhHHHHHHHHHHhhchhhhhcceeecc Confidence 11111111 111111110 1111111111 111111211 122323 23455566677788999999999 Q ss_pred CcceeeeEEEeeecCCCCcccccccccccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCC Q lcl|NC_015280. 101 GPTGLIFAMRSRYTNQSGNEAFFDEPDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGD 180 (455) Q Consensus 101 GPTGLIFAMRsrY~~qsG~EAlfnEa~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~ 180 (455) +++.-|.-.. . 
+.++ -..++.+..++ T Consensus 68 ~~~~~~p~~~----~--~~~a------------------------------------------------~~v~Eg~~~~~ 93 (324) T protein:vir:96 68 GTEKKFTFWA----D--KPGA------------------------------------------------YWVGEGQKIET 93 (324) T ss_pred CCceEEEEEe----c--Ccce------------------------------------------------eeecCCccccc Confidence 8764332111 0 0000 00011111111 Q ss_pred CCCCccccceeEEEEEEEEeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeec Q lcl|NC_015280. 181 GASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQAN 260 (455) Q Consensus 181 s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~ 260 (455) + ...|.+..+.+.|..+- ...|-||.+|-. .|.+++|.+.|...|...+++.||.--- .+ T Consensus 94 ~-~~~f~~v~~~~~k~~~~-------~~is~ell~ds~----~~l~~~i~~~l~~aia~~~d~~~l~G~g--------~~ 153 (324) T protein:vir:96 94 S-KATWVNATMRAFKLGVI-------LPVTKEFLNYTY----SQFFEEMKPMIAEAFYKKFDEAGILNQG--------NN 153 (324) T ss_pred c-ccceeEEEEEeEEEEEe-------ehhhHHHHhcch----HHHHHHHHHHHHHHHHHHHHHHhhhcCC--------CC Confidence 1 12355555555555544 458999999853 4688889999999999888888885311 11 Q ss_pred cccceeeeeecccc----chhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhccccccccccccccc Q lcl|NC_015280. 261 VANAGVFDLDVDSN----GRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGG 336 (455) Q Consensus 261 v~~~gv~Dl~~~~~----gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~ 336 (455) ....|++....... +.-..+....+...+. ..-+..+.++||+.....|+.. ... +|..- T Consensus 154 ~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~i~---------~~~~~~~~~i~n~~~~~~L~~l---kd~---~G~~~- 217 (324) T protein:vir:96 154 PFGKSIAQSIKKTNKVIKGDFTQDNIIDLEALLE---------DDELEANAFISKTQNRSLLRKI---VDP---ETKER- 217 (324) T ss_pred CcCccccccccccceecccccchHHHHHHHHhhh---------hccCCCCEEEEcHHHHHHHHHh---hCC---CCCee- Confidence 11122221110000 0001122222322222 1234556789999998888753 111 11110 Q ss_pred ccccccCCceeEEEecCceEEEEeccccccCCcceEEEE--------EecCccccceeEEcccccccceeecCCcc---- Q lcl|NC_015280. 
337 IGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQYYVVG--------YKGTNAYDAGLFYCPYVPLQMYRAIGQDT---- 404 (455) Q Consensus 337 ~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG--------~KG~~~~daglfyaPYv~l~~~~~~Dp~s---- 404 (455) . .+..+ ++| .+++|++++... .+..-+++| ..+.-+.+ ...+ ..+....|+.. T Consensus 218 -~-~~~~~----~~l-~G~PV~~~~~~~--~~~~~~~~gd~s~~~~~~~~~~~i~----~~~~--~~~~~~~~~~~~~~~ 282 (324) T protein:vir:96 218 -I-YDRNS----DSL-DGLPVVNLKSSN--LKRGELITGDFDKLIYGIPQLIEYK----IDET--AQLSTVKNEDGTPVN 282 (324) T ss_pred -e-cCCCC----Ccc-cceeeEeecCCC--CCcceEEEEecceEEEEEecCcEEE----Eeec--ccccccccccccchh Confidence 0 01112 233 357777765321 222233333 22221110 0000 00011111110 Q ss_pred -c---cceeeeeeecce-eecccccccccccccCchhhhhccch Q lcl|NC_015280. 405 -F---QPRIGFKTRYGM-VLNPFAKGLTALSDSDPQAAGNLNAN 443 (455) Q Consensus 405 -~---qP~~g~~tRY~l-~~nP~~~~~~~~~~~~~~~~~~~~~n 443 (455) | |=.+=..-|||. +.+|-+- ..+...++-..-+.++- T Consensus 283 ~~~~n~v~~r~~~r~d~~v~~~~a~--~~l~~a~~~~~~~~~~~ 324 (324) T protein:vir:96 283 LFEQDMVALRATMHVALHIADDKAF--AKLVPADKRTDSVPGEV 324 (324) T ss_pred hhhcCcEEEEEEEEeccEEecccce--EEEecccccCCCCCCCC Confidence 1 223334567777 5666332 11111111111122332 No 83 >protein:vir:962 Length: 397 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:19 # MgeName: bIL285 # Cross-refs: genbank:acc:NP_076616;genbank:gi:13095724;genbank:GeneID:920264 Probab=72.19 E-value=0.19 Score=24.59 Aligned_cols=322 Identities=15% Similarity=0.100 Sum_probs=115.1 Q ss_pred CcchHHHHHHhhHhhc------------CCCCcccc-chhhHHHHHHHhhhHHHHHHHHHHhhhhhhhchh-hhcccccc Q lcl|NC_015280. 1 MYNAENLQEKWAPVLN------------HEGLNDIK-DPYRKSVTAILLENQERALAEERAVLTEAPTNVG-PINTPTTS 66 (455) Q Consensus 1 m~~~~~~~~kw~~~l~------------~~~~~~i~-~~~~~~v~~~~~enq~~~~~e~~~~l~ea~~~~~-~~~~~st~ 66 (455) ....+.|.++..-+-. .+..++.. ...++.+. .+...++...+++..+.+...... 
......++ T Consensus 60 ~~~i~~l~~~i~~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 137 (397) T protein:vir:96 60 EKQVKDLDEKIAELQKEKQDLEDELAKAADPTDQKPKDGEKRKMK--KFKVTEEELAEKRSAINAFVKSKGAEKRDGFTS 137 (397) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhHHHHHHHHH--HHhhhhHHHHHHHHHHHHHHHhhhhhhhhcccc Confidence 0011111111111000 00000000 00000000 000011111222222222111111 11111122 Q ss_pred ccccccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCcccccccccccccccccccccccccccC Q lcl|NC_015280. 67 SGAVAGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPDAQFSGTDGATPPTATTEKN 146 (455) Q Consensus 67 tg~i~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~t~fSg~~~~~~~~~~~~~~ 146 (455) +..-...-+.+..-+.+.-.-....+.+.+.|++++.+-+--.+. .+.. ..+ T Consensus 138 ~~~~~~vp~~~~~~i~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~-----~~~~-------~~~---------------- 189 (397) T protein:vir:96 138 VEGGALIPQELLQPQLEPKDIVDLSKYVRSVPVNSASGKFPVISK-----SGSK-------MAT---------------- 189 (397) T ss_pred cccccchhHHHHHHHHHhhhhhhHHHhhhhccccccceeEEEEec-----cCCc-------ccc---------------- Confidence 111122222222222212222234678889999888766544330 0000 000 Q ss_pred cccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeeccccccceeHHHHHhHHHhhCCChh Q lcl|NC_015280. 147 PALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKAIHGLDAE 226 (455) Q Consensus 147 ~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE 226 (455) ..+.+...+.....|.+..|++.|. +-....|.||.+|- ..|.+ T Consensus 190 -------------------------~~E~~~~~~~~~~~~~~i~~~~~~~-------~~~~~~s~ell~ds----~~~l~ 233 (397) T protein:vir:96 190 -------------------------VQQLEKNPQLANPKMVEIDYSVATR-------RGYIPISQEMIDDA----SYDVT 233 (397) T ss_pred -------------------------ccccccccccccccccceeecHhHh-------hcchhhHHHHHhhh----HHHHH Confidence 0000001111122355555555444 44557899999984 34678 Q ss_pred HHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeeccccchhhHHHHHHHHHHHHHHHHHHHHHhcCCCcc Q lcl|NC_015280. 
227 SELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDVDSNGRWSVEKFKGLLFQIERDANAIAQETRRGKGN 306 (455) Q Consensus 227 ~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~~~~gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn 306 (455) +.|.+-|+..|...+|..|+.-... ....|+...+ ....+++... .. .+.+ T Consensus 234 ~~i~~~l~~~~~~~~~~~i~~g~g~---------~~~~~~~~~d----------~~~~~~~~~~--------~~-~~~a- 284 (397) T protein:vir:96 234 GLIADEIQDQSLNTKNADIAAVLKT---------ATAKSVVGVD----------GLKDLINKEI--------KK-VYDV- 284 (397) T ss_pred HHHHHHHHHHHHHHHHHHHhhcccc---------cccccccchH----------HHHHHHHHhh--------hh-hcCc- Confidence 8899999999999999988754322 1223333221 1223322211 11 1123 Q ss_pred EEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEEeccccccCCcce-EEEEEecCccccce Q lcl|NC_015280. 307 IIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQY-YVVGYKGTNAYDAG 385 (455) Q Consensus 307 ~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY-~~vG~KG~~~~dag 385 (455) -.|+++.....|... ... +|..- ...|.++. ..++|.|+=.+++|.++......++ +++| +-. . + T Consensus 285 ~~v~n~~~~~~l~~l---kd~---~G~~~--~~~~~~~~-~~~~l~G~pv~~~~~~~~~~~~~~~~~~~g---d~~-~-~ 350 (397) T protein:vir:96 285 KLFISASMYSELDKL---KDK---NGRYL--LQDSITAA-SGKQLLGKEVVVLDDDVIGKSVGNVVGFIG---DAK-A-F 350 (397) T ss_pred EEEEcHHHHHHHHHh---hcc---CCCeE--eccCccCC-CcccccccceEEecccccCCCCCceEEEEe---ehh-c-c Confidence 468888888887652 211 11110 01111111 1235544433345543322211121 2222 100 0 0 Q ss_pred eEEcccccccceeecCCccccceeeeeeecce-eeccccccccccccc Q lcl|NC_015280. 386 LFYCPYVPLQMYRAIGQDTFQPRIGFKTRYGM-VLNPFAKGLTALSDS 432 (455) Q Consensus 386 lfyaPYv~l~~~~~~Dp~s~qP~~g~~tRY~l-~~nP~~~~~~~~~~~ 432 (455) ..+...-.+. ....+-..|+-.+-...|++. 
+.+|-.--.-...-+ T Consensus 351 ~~~~~~~~~~-~~~~~~~~~~~~~~~~~r~d~~~~~~~a~~~~~~~~a 397 (397) T protein:vir:96 351 ASFFDRKQVS-VSWVDNNIYGQLLAGIIRYDVKATDKKAGFYVTFTIG 397 (397) T ss_pred eEeEeecceE-EEEecccccceeEEEEEEEccEEecccceEEEEeecC Confidence 0000000011 112233334444444568887 555544321111111 No 84 >protein:vir:105038 Length: 428 # NCBI annotation: major capsid head protein precursor # Family: family:all:21 # MgeID: mge:1465 # MgeName: phiKO2 # Cross-refs: genbank:acc:YP_006586;genbank:gi:46402092;genbank:GeneID:2777903 Probab=72.10 E-value=0.19 Score=24.58 Aligned_cols=332 Identities=15% Similarity=0.086 Sum_probs=112.6 Q ss_pred Ccc------------hHHHHHHhhHhhcCCCC----------------cccc---chhhHH----HHHHHhhhHHHHHHH Q lcl|NC_015280. 1 MYN------------AENLQEKWAPVLNHEGL----------------NDIK---DPYRKS----VTAILLENQERALAE 45 (455) Q Consensus 1 m~~------------~~~~~~kw~~~l~~~~~----------------~~i~---~~~~~~----v~~~~~enq~~~~~e 45 (455) |+. -+.|.++...+-..|.+ +... ...+.. ....+.+.. ..+.+ T Consensus 31 lt~ee~~~~~~l~~e~~~l~~~i~~~e~~e~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~ 109 (428) T protein:vir:10 31 LTAEQLTEFAGLQQQFTDISAKMDRMEATERAAALVAKPVKATQHGPAVIVKAEPKQYTGAGMTRMVMSIAAAQ-GNLQD 109 (428) T ss_pred CCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhchhhccccccccccchhhhHHHHHHHHHHHHhh-hhHHH Confidence 222 12222222211000000 0000 000000 000010000 00001 Q ss_pred HHHhhhhhh--hchhhhccccccccccc---cccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCcc Q lcl|NC_015280. 46 ERAVLTEAP--TNVGPINTPTTSSGAVA---GFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNE 120 (455) Q Consensus 46 ~~~~l~ea~--~~~~~~~~~st~tg~i~---~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~E 120 (455) .+.+..+.. .........++++|.+. .+.+-++.+.| +..+..++ |++..++++|-+-=.| ..+ +.. 
T Consensus 110 ~~~~~~~~~~~~~~~~~~~~~~~~gg~liP~~~~~~ii~~l~---~~~~l~~~-~~~~~~~~~g~~~~p~--~~~--~~~ 181 (428) T protein:vir:10 110 AAKFASDELNDQSVSMAISTAAGSGGVLIPQNIHSEVIELLR---DRTIVRKL-GARSIPLPNGNMSLPR--LAG--GAT 181 (428) T ss_pred HHHHhhhhhhhhhHhhhhcccccCCccccchhHHHHHHHHHh---hhchhhhh-cceeeecCCcceEEEE--EeC--Ccc Confidence 111110100 00001112222233221 12223333333 44455555 3333334444321111 100 000 Q ss_pred cccccccccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEe Q lcl|NC_015280. 121 AFFDEPDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEA 200 (455) Q Consensus 121 AlfnEa~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtA 200 (455) +-..++ +...++...++++++... T Consensus 182 ------------------------------------------------a~~v~E--------g~~~~~~~~~f~~i~~~~ 205 (428) T protein:vir:10 182 ------------------------------------------------ASYTGE--------NQDAKVSEARFDDVKLTA 205 (428) T ss_pred ------------------------------------------------eeeecc--------CccccccccceeeEEeee Confidence 000111 122344445556666666 Q ss_pred eccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeee----------ee Q lcl|NC_015280. 201 KGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFD----------LD 270 (455) Q Consensus 201 KSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~D----------l~ 270 (455) |.-+-...+|-||.+|- ..|.++.|.+-|...|...+|+.||.- .-.+....|++- .. T Consensus 206 ~k~~~~v~is~ell~ds----~~~l~~~i~~~l~~ai~~~~d~~~l~G--------~G~~~~p~Gi~~~~~~~~~~~~~~ 273 (428) T protein:vir:10 206 KTMIAMVPISNALIGRA----GFNVEQLVLQDILTAISVREDKAFMRD--------DGTGDTPIGMKARATQWNRLLPWA 273 (428) T ss_pred EEEEEeehhhHHHHhhh----hHHHHHHHHHHHHHHHHHHHHHHHhcc--------CCCCcccccccccccccccccccc Confidence 66666788999999884 245788888888888888888877632 111111223321 11 Q ss_pred ccccchhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEE Q lcl|NC_015280. 
271 VDSNGRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGT 350 (455) Q Consensus 271 ~~~~gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~ 350 (455) .....- ......+ .....-+...... .....-.|+++.....|.. +.. .+|..-+ .+. .-|+ T Consensus 274 ~~~~~~--~~~~~~~-~~~~~~~~~~~~~--~~~~~~~v~n~~~~~~L~~---lkd---~~G~~i~---~~~----~~g~ 335 (428) T protein:vir:10 274 ADAAVN--LDTIDTY-LDSIILMSMDGNS--NMISSGWGMSNRTYMKLFG---LRD---GNGNKVY---PEM----AQGM 335 (428) T ss_pred cccccc--HHHHHHH-HHHHHHhhhcccc--ccccCEEEEcHHHHHHHHH---hhc---cCCceec---cCC----CCCe Confidence 011110 0111111 1111111111111 1123345678888887765 221 1111111 011 1245 Q ss_pred ecCceEEEEeccccccC------------CcceEEEEEecCccccceeEEcccccccceee---cCCccccceeeeeeec Q lcl|NC_015280. 351 LNGRFKVYIDPYSANVS------------DNQYYVVGYKGTNAYDAGLFYCPYVPLQMYRA---IGQDTFQPRIGFKTRY 415 (455) Q Consensus 351 l~~~~~vy~D~y~~~~s------------~~dY~~vG~KG~~~~daglfyaPYv~l~~~~~---~Dp~s~qP~~g~~tRY 415 (455) | .+++||++.+...+. ++-++++|..|.-+.+ ..||........ ..=..-+=.+=...|+ T Consensus 336 l-~G~pv~~~~~~p~~~~~~~~~~~i~~gd~s~~~i~~~~~i~i~----~~~~~~~~~~~~~~~~~f~~~~~~~R~~~r~ 410 (428) T protein:vir:10 336 L-KGYPIQRTSAIPANLGEGGKESEIYFADFNDVVIGEDGNMKVD----FSKEASYIDTDGKLVSAFSRNQSLIRVVTEH 410 (428) T ss_pred e-eceeeEEeccccccccCCCccceEEEEecceEEEEEecceEEE----eecccccccccccccchhhcchhheeeeeee Confidence 5 367888875532110 1112333333333211 122211100000 0000011222245666 Q ss_pred ce-eecccccccccccccCch Q lcl|NC_015280. 416 GM-VLNPFAKGLTALSDSDPQ 435 (455) Q Consensus 416 ~l-~~nP~~~~~~~~~~~~~~ 435 (455) ++ +.+|-+--. 
.++-.| T Consensus 411 d~~v~~p~a~~~---~t~~~~ 428 (428) T protein:vir:10 411 DIGFRHPEGLVL---GTGVLF 428 (428) T ss_pred CceeeccceEEE---EeccCC Confidence 66 344543311 134444 No 85 >protein:vir:2344 Length: 397 # NCBI annotation: gp14 # Family: family:all:507 # MgeID: mge:51 # MgeName: Bxb1 # Cross-refs: genbank:acc:NP_075281;genbank:gi:12657868;genbank:GeneID:920118 Probab=70.58 E-value=0.21 Score=24.33 Aligned_cols=312 Identities=16% Similarity=0.026 Sum_probs=118.3 Q ss_pred HhhhHHHHHHHHHHhhhhhhhchhhhccccccccccccccchh-hhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeee Q lcl|NC_015280. 35 LLENQERALAEERAVLTEAPTNVGPINTPTTSSGAVAGFDPIL-ISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRY 113 (455) Q Consensus 35 ~~enq~~~~~e~~~~l~ea~~~~~~~~~~st~tg~i~~~~P~L-v~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY 113 (455) +=.| .|++.++.... +..+. -.-|.+ -.+++++....+-.+++-+.||++++.-|.-. T Consensus 1 ~g~~-----~e~~~~~~~~t----------~~~~g--~l~~~~~~~ii~~l~~~s~i~~l~~~~~~~~~~~~ip~~---- 59 (397) T protein:vir:23 1 MGFS-----ADHSQIAQTKD----------TMFTG--YLDPVQAKDYFAEAEKTSIVQRVAQKIPMGATGIVIPHW---- 59 (397) T ss_pred CCcC-----HHHHHHhhccC----------CCCcc--ccchhHHHHHHHHHHhccchhhhcceeeccCCceEEEEE---- Confidence 0001 12222222211 11111 111221 12333344455667788899998775322111 Q ss_pred cCCCCcccccccccccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEE Q lcl|NC_015280. 114 TNQSGNEAFFDEPDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSI 193 (455) Q Consensus 114 ~~qsG~EAlfnEa~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsI 193 (455) . 
.+.+ +-..++ +..+++-..++ T Consensus 60 ~--~~~~------------------------------------------------a~wv~E--------g~~~~~s~~~f 81 (397) T protein:vir:23 60 T--GDVS------------------------------------------------AQWIGE--------GDMKPITKGNM 81 (397) T ss_pred c--CCcc------------------------------------------------eEEecC--------Cccccccccce Confidence 0 0000 000011 11234444556 Q ss_pred EEEEEEeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeecc- Q lcl|NC_015280. 194 DKIAVEAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDVD- 272 (455) Q Consensus 194 EK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~~- 272 (455) +++++..|..+-.-.+|-||.+|-. .|.+++|.+-|...|...+|+.+|.-.-+- + ...++.+.... T Consensus 82 ~~v~l~~~k~~~~v~iS~ell~ds~----~~l~~~i~~~l~~aia~~~d~a~l~G~gt~----~----~~~~~~~~~~~~ 149 (397) T protein:vir:23 82 TKRDVHPAKIATIFVASAETVRANP----ANYLGTMRTKVATAIAMAFDNAALHGTNAP----S----AFQGYLDQSNKT 149 (397) T ss_pred eEEEEeeEEEEEeehhhHHHHhcch----HHHHHHHHHHHHHHHHHHHHHHHhhcccCC----c----ccccccccccce Confidence 6777777777777889999999863 678999999999999999999998532110 0 00001000000 Q ss_pred --ccchhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhh----cccccccccccccccccccccCCce Q lcl|NC_015280. 273 --SNGRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMS----GVLDYDSGISGAVGGIGEIDDTGNT 346 (455) Q Consensus 273 --~~gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~s----G~l~~~~~~~~~~~~~~~~d~t~~~ 346 (455) ..+-...+....++..+. .--+..+-+|++++....|... |-+-+.|...+. ..... T Consensus 150 ~~~~~~~~~~~~~~~~~~l~---------~~~~~~a~~vmn~~~~~~L~~lkd~~G~~i~~~~~~~~--------~~~~~ 212 (397) T protein:vir:23 150 QSISPNAYQGLGVSGLTKLV---------TDGKKWTHTLLDDTVEPVLNGSVDANGRPLFVESTYES--------LTTPF 212 (397) T ss_pred eeecccchhHHHHHHHHhhh---------hcccCCCEEEEcHHHHHHHHHhhccCCceeeccccccc--------ccccc Confidence 000000111112222222 1223456689999999888852 111111211111 01112 Q ss_pred eEEEecCceEEEEeccccccC------CcceEEEEEecCcccc----ceeE------Eccc-------ccccc-----ee Q lcl|NC_015280. 
347 FVGTLNGRFKVYIDPYSANVS------DNQYYVVGYKGTNAYD----AGLF------YCPY-------VPLQM-----YR 398 (455) Q Consensus 347 ~~G~l~~~~~vy~D~y~~~~s------~~dY~~vG~KG~~~~d----aglf------yaPY-------v~l~~-----~~ 398 (455) ..|+| .+++|+++.....+. ++..+++|..+.-..+ +++- ..|| +-+.. .. T Consensus 213 ~~~tl-~G~Pv~~s~~~~~g~~~~~~gDfs~~~i~~~~~i~i~~~~e~~~~~~~~~~~~~~~lf~~d~v~~ra~~r~d~~ 291 (397) T protein:vir:23 213 REGRI-LGRPTILSDHVAEGDVVGYAGDFSQIIWGQVGGLSFDVTDQATLNLGSQESPNFVSLWQHNLVAVRVEAEYGLL 291 (397) T ss_pred cCcee-eeeeEEEeCCCCCCceEEEEeecceEEEEEEeceEEEEeeeeeeeeccccccceeeeeeccceeEEEEeeeccc Confidence 23455 478888887653221 1222334443322111 1100 0011 00000 00 Q ss_pred ecCCccccceeeee--eecceeeccccccccccc-ccCchhhhhccchhhhhhhhhhcCC Q lcl|NC_015280. 399 AIGQDTFQPRIGFK--TRYGMVLNPFAKGLTALS-DSDPQAAGNLNANAYYRRVRVANLM 455 (455) Q Consensus 399 ~~Dp~s~qP~~g~~--tRY~l~~nP~~~~~~~~~-~~~~~~~~~~~~n~y~r~~~v~~~~ 455 (455) +.||+.|-...+-- .=|-+.+.|-+.+.-... +|+.-...+-|++.===+..+..|= T Consensus 292 v~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~ 351 (397) T protein:vir:23 292 INDVNAFVKLTFDPVLTTYALDLDGASAGNFTLSLDGKTSANIAYNASTATVKSAIVAID 351 (397) T ss_pred eecccceEEEeeccccceeeecccccCcceEEEEecCccccCcccccchhhhHHHhhhcc Confidence 11222221111000 001112222222111110 1111000011111100011111111 No 86 >protein:vir:4856 Length: 293 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:106 # MgeName: DT1 # Cross-refs: genbank:acc:NP_049396;genbank:gi:9632424;genbank:GeneID:1258532 Probab=68.02 E-value=0.24 Score=23.95 Aligned_cols=274 Identities=13% Similarity=0.077 Sum_probs=121.0 Q ss_pred hhhhhhhchhhhccccccccccccccchhh-hHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCccccccccc Q lcl|NC_015280. 49 VLTEAPTNVGPINTPTTSSGAVAGFDPILI-SLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPD 127 (455) Q Consensus 49 ~l~ea~~~~~~~~~~st~tg~i~~~~P~Lv-~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~ 127 (455) +|..- +.++++..-.-.-+.+. 
.+++.+-+..+-.+++.+=||++.+|-+-=.+ .....+ T Consensus 1 ~l~~~--------~~~t~~~gg~liP~~~~~~Ii~~~~~~~~l~~~~~~~~~~~~~g~~~~~~--~~~~~~--------- 61 (293) T protein:vir:48 1 MLDSK--------TDHSGSDAGLTIPQDIRTAINTLVRQYDSLQEYVNVENVTTLTGSRVYEK--WTDITG--------- 61 (293) T ss_pred Cceee--------cccccCcCceEechhHHHHHHHHHHhhhhhhhhceeeeccCCcceEEEEe--ecCCCc--------- Confidence 33321 12222222122222222 35555556667788888888888775221111 000000 Q ss_pred ccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccce-eEEEEEEEEeeccccc Q lcl|NC_015280. 128 AQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMA-FSIDKIAVEAKGRALR 206 (455) Q Consensus 128 t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMa-FsIEK~tVtAKSRaLK 206 (455) .+-..++ +..++|.+ .++++++..+|.-+-. T Consensus 62 ----------------------------------------~a~~v~E--------g~~~~~~~~~~~~~i~l~~~k~~~~ 93 (293) T protein:vir:48 62 ----------------------------------------LANIDDE--------AGKIADIDDPKLSLIKYTIKRYAGI 93 (293) T ss_pred ----------------------------------------ceeeecC--------CcccccccccceeEEEEeeeEEEEe Confidence 0000111 11233432 4566666666666667 Q ss_pred cceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeeccccchhhHHHHHHHH Q lcl|NC_015280. 207 ADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDVDSNGRWSVEKFKGLL 286 (455) Q Consensus 207 AEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~~~~gr~~ve~~k~l~ 286 (455) ..+|-||.+|.. +|.|++|.+-|+..|..-+|+.|+.-+-..+. ..+.+.+ +..+.|. T Consensus 94 ~~iS~ell~ds~----~~l~~~i~~~la~~~~~~~~~~i~~g~~~~~~--------~~~~~~~----------d~i~~~~ 151 (293) T protein:vir:48 94 STVTNSLLADSA----ENILAWLSGWIAKKVVVTRNKAILGVVDKLPT--------KPTLTKW----------DDIIDLE 151 (293) T ss_pred ehhhHHHHhhhh----HHHHHHHHHHHHHHHHHHHHhHHhhccccccc--------cccccCH----------HHHHHHH Confidence 789999999863 67899999999999999999988865433322 1222211 2234444 Q ss_pred HHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEE--ecccc Q lcl|NC_015280. 
287 FQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYI--DPYSA 364 (455) Q Consensus 287 ~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~--D~y~~ 364 (455) ..+... -+.. ...++++.....|.. +... +|.. ....+.++ -..++|. +++|++ |.+.. T Consensus 152 ~~l~~~--------~~~~-a~~vmn~~~~~~L~~---lkd~---~g~~--l~~~~~~~-~~~~~l~-G~Pv~~~~~~~~~ 212 (293) T protein:vir:48 152 AKVDPA--------IKQT-SFFLTNTSGFTALKK---VKNA---LGDY--LMERDVKS-PTGYSIA-GFAVKEISDRWLP 212 (293) T ss_pred Hhhhhh--------hcCC-CEEEEcHHHHHHHHH---hhcc---CCce--EeecCcCC-CCCceec-ceeeEEecccccC Confidence 444311 1222 356789988888875 2211 1111 11111111 1123553 446554 32221 Q ss_pred ccCCcc----------eEEEEEecCccccceeEEcccccccceeecCCccccceeeeeeecce-eecccccccccccc-c Q lcl|NC_015280. 365 NVSDNQ----------YYVVGYKGTNAYDAGLFYCPYVPLQMYRAIGQDTFQPRIGFKTRYGM-VLNPFAKGLTALSD-S 432 (455) Q Consensus 365 ~~s~~d----------Y~~vG~KG~~~~daglfyaPYv~l~~~~~~Dp~s~qP~~g~~tRY~l-~~nP~~~~~~~~~~-~ 432 (455) +....+ ++.++.++... +-..++.. -+-.+.|=.+-...||+. +.+|-.--.-+... . T Consensus 213 ~~~~~~~~~~~gd~~~~~~~~~~~~~~----i~~~~~~~------~~~~~~~~~~r~~~r~d~~~~~~~a~~~l~~~~~~ 282 (293) T protein:vir:48 213 NASSGVMPLYFGDLKQAVTLFDRQQMS----LLSTNIGG------GAFETDTTKVRVIDRFDVVATDTEAFVPASFKAIA 282 (293) T ss_pred CccCCceEEEEEeccceEEEEEecceE----EEEecccc------hhhhcCeEEEEEEEeeCcEEecccceEEEEeeccc Confidence 111111 11122111111 11111100 011233455566677776 55654332111111 0 Q ss_pred Cc---hhhhhc Q lcl|NC_015280. 433 DP---QAAGNL 440 (455) Q Consensus 433 ~~---~~~~~~ 440 (455) .+ .++.+. 
T Consensus 283 ~~~~~~~~~~~ 293 (293) T protein:vir:48 283 DQKGNIGSTAV 293 (293) T ss_pred cCCccccccCC Confidence 11 111111 No 87 >protein:vir:96262 Length: 274 # NCBI annotation: ORF013 # Family: family:all:522 # MgeID: mge:1612 # MgeName: ROSA # Cross-refs: genbank:acc:YP_240311;genbank:gi:66395978;genbank:GeneID:5133339 Probab=67.69 E-value=0.25 Score=23.90 Aligned_cols=263 Identities=10% Similarity=0.034 Sum_probs=116.0 Q ss_pred CCC-cceeeeEEEeeecCCCCccccccc------ccccccccccccccccccccCcccCCCCCCCCcccccccccccccc Q lcl|NC_015280. 99 MTG-PTGLIFAMRSRYTNQSGNEAFFDE------PDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFS 171 (455) Q Consensus 99 mTG-PTGLIFAMRsrY~~qsG~EAlfnE------a~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~ 171 (455) |.. .|- -.+.--.|-|-+. ..--|++-.... ....+ .+|.+-.-..+ . . T Consensus 1 m~~~~T~--------l~d~i~Pev~~~~v~~~~~~~l~~~~~~~~~---------~~l~g--~~G~tv~iP~~----~-~ 56 (274) T protein:vir:96 1 MAQGMTK--------LTNQIVPEVLAPMMQAELEKKLRFASFAEID---------NTLVG--QPGDTLTFPAF----I-Y 56 (274) T ss_pred CCcceee--------hhheechHHHHHHHHHHHHhhhhccccceec---------ccccC--CCCCEEEeeee----c-C Confidence 111 000 0000000100000 000111100000 00000 00000000000 0 1 Q ss_pred hhhhhhcCCCCCCccccceeEEEEEEEEeeccccccceeHHHHHhHHHhh-CCChhHHHHHHHHHHHHHHhhHHHHHHHh Q lcl|NC_015280. 172 TSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKAIH-GLDAESELANILSTEILAEINREVVRTVY 250 (455) Q Consensus 172 Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkAiH-GLDAE~ELanILStEImlEINReII~~l~ 250 (455) +.++|.+.++..-...++..+=. +++.+-|+ |+ |.+. |+-+.- +-|.-.+..+-++..++.+++++++..+. 
T Consensus 57 ig~a~~~~~g~~i~~~~lt~~~~--~~~i~~~~-~a-~~i~---D~~~~~~~~d~~~~~~~~~~~~~a~~vd~~i~~~l~ 129 (274) T protein:vir:96 57 SGDAKVVAEGEKIPTDILETKKR--EAKIRKIA-KG-TSIS---DEALLSGYGDPQGEQVRQHGLAHANKVDDDVLEALK 129 (274) T ss_pred CCccccccCCCccchhhccccee--EEEeeeee-cc-eeeh---HHHHhhccchHHHHHHHHHHHHHHHHHHHHHHHHHh Confidence 12334444333334444444333 33334443 22 2222 555544 35889999999999999999999998876 Q ss_pred hhheeeeeeccccceeeeeeccccchhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhccccccccc Q lcl|NC_015280. 251 RGAKPGAQANVANAGVFDLDVDSNGRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGI 330 (455) Q Consensus 251 ~vA~~~k~~~v~~~gv~Dl~~~~~gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~ 330 (455) +.... + +...++ .+.+-..+.++..| -..+++++++|++++.|......+|.+.. T Consensus 130 ~a~~~-----~-~~~~~~----------~d~i~~A~~~lgd~---------~~~~~~ivv~p~~~~~L~k~~~~~f~~~s 184 (274) T protein:vir:96 130 SAKLT-----V-EADITK----------LTGLQTAIDKFNDE---------DLEPMVLFISPLDAGKLRGDATTNFTRAT 184 (274) T ss_pred ccccc-----c-cccccC----------HHHHHHHHHHhccc---------cccccEEEeCHHHHHHHHhhccccccccc Confidence 54322 1 111111 11222233334322 13678999999999999876555544433 Q ss_pred ccccccccccccCCceeEEEecCceEEEEeccccccCCcceEEEEEe-cCccccceeEEcccccccceeec-CCccccce Q lcl|NC_015280. 331 SGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQYYVVGYK-GTNAYDAGLFYCPYVPLQMYRAI-GQDTFQPR 408 (455) Q Consensus 331 ~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~K-G~~~~daglfyaPYv~l~~~~~~-Dp~s~qP~ 408 (455) +...+ .-.+-..|++. +++||+|-. .| +|-.+-++ |.-. ||.. -+.. ...- ||.+++-. T Consensus 185 ~~g~~------~~~~G~ig~~~-G~~Vi~s~~----~~-~~t~~l~~~gA~~-----~~~~-~~~~-vE~~Rd~~~~~d~ 245 (274) T protein:vir:96 185 ELGDD------VIVKGAFGEAL-GAVIVRSNK----LE-AGTAILAKKGAVK-----LITK-RDFF-LETDRDPSTKTTA 245 (274) T ss_pred ccccc------ceeccccceec-CeEEEEeCC----CC-CceEEEEecccee-----eeec-CCcc-cccccccccccCE Confidence 32211 11122467774 689999943 33 22222222 2111 1111 0111 2222 89999999 Q ss_pred eeeeeecce-eecccccccccccccCchhhh Q lcl|NC_015280. 
409 IGFKTRYGM-VLNPFAKGLTALSDSDPQAAG 438 (455) Q Consensus 409 ~g~~tRY~l-~~nP~~~~~~~~~~~~~~~~~ 438 (455) +-..-+||+ +.||--. -.+.+|.+.-.+ T Consensus 246 i~~~~~y~~~~~~~~~~--v~~tk~~~~~~~ 274 (274) T protein:vir:96 246 LYSDKHYVAYLYDESKA--VKITKGSGSLEM 274 (274) T ss_pred EEEeEEEEEEEEcCCcE--EEEEcCCccccC Confidence 999999999 6777333 122233332221 No 88 >protein:vir:95898 Length: 274 # NCBI annotation: ORF014 # Family: family:all:522 # MgeID: mge:1588 # MgeName: 71 # Cross-refs: genbank:acc:YP_240385;genbank:gi:66396054;genbank:GeneID:5133409 Probab=67.69 E-value=0.25 Score=23.90 Aligned_cols=263 Identities=10% Similarity=0.034 Sum_probs=116.0 Q ss_pred CCC-cceeeeEEEeeecCCCCccccccc------ccccccccccccccccccccCcccCCCCCCCCcccccccccccccc Q lcl|NC_015280. 99 MTG-PTGLIFAMRSRYTNQSGNEAFFDE------PDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFS 171 (455) Q Consensus 99 mTG-PTGLIFAMRsrY~~qsG~EAlfnE------a~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~ 171 (455) |.. .|- -.+.--.|-|-+. ..--|++-.... ....+ .+|.+-.-..+ . . T Consensus 1 m~~~~T~--------l~d~i~Pev~~~~v~~~~~~~l~~~~~~~~~---------~~l~g--~~G~tv~iP~~----~-~ 56 (274) T protein:vir:95 1 MAQGMTK--------LTNQIVPEVLAPMMQAELEKKLRFASFAEID---------NTLVG--QPGDTLTFPAF----I-Y 56 (274) T ss_pred CCcceee--------hhheechHHHHHHHHHHHHhhhhccccceec---------ccccC--CCCCEEEeeee----c-C Confidence 111 000 0000000100000 000111100000 00000 00000000000 0 1 Q ss_pred hhhhhhcCCCCCCccccceeEEEEEEEEeeccccccceeHHHHHhHHHhh-CCChhHHHHHHHHHHHHHHhhHHHHHHHh Q lcl|NC_015280. 172 TSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKAIH-GLDAESELANILSTEILAEINREVVRTVY 250 (455) Q Consensus 172 Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkAiH-GLDAE~ELanILStEImlEINReII~~l~ 250 (455) +.++|.+.++..-...++..+=. +++.+-|+ |+ |.+. |+-+.- +-|.-.+..+-++..++.+++++++..+. 
T Consensus 57 ig~a~~~~~g~~i~~~~lt~~~~--~~~i~~~~-~a-~~i~---D~~~~~~~~d~~~~~~~~~~~~~a~~vd~~i~~~l~ 129 (274) T protein:vir:95 57 SGDAKVVAEGEKIPTDILETKKR--EAKIRKIA-KG-TSIS---DEALLSGYGDPQGEQVRQHGLAHANKVDDDVLEALK 129 (274) T ss_pred CCccccccCCCccchhhccccee--EEEeeeee-cc-eeeh---HHHHhhccchHHHHHHHHHHHHHHHHHHHHHHHHHh Confidence 12334444333334444444333 33334443 22 2222 555544 35889999999999999999999998876 Q ss_pred hhheeeeeeccccceeeeeeccccchhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhccccccccc Q lcl|NC_015280. 251 RGAKPGAQANVANAGVFDLDVDSNGRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGI 330 (455) Q Consensus 251 ~vA~~~k~~~v~~~gv~Dl~~~~~gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~ 330 (455) +.... + +...++ .+.+-..+.++..| -..+++++++|++++.|......+|.+.. T Consensus 130 ~a~~~-----~-~~~~~~----------~d~i~~A~~~lgd~---------~~~~~~ivv~p~~~~~L~k~~~~~f~~~s 184 (274) T protein:vir:95 130 SAKLT-----V-EADITK----------LTGLQTAIDKFNDE---------DLEPMVLFISPLDAGKLRGDATTNFTRAT 184 (274) T ss_pred ccccc-----c-cccccC----------HHHHHHHHHHhccc---------cccccEEEeCHHHHHHHHhhccccccccc Confidence 54322 1 111111 11222233334322 13678999999999999876555544433 Q ss_pred ccccccccccccCCceeEEEecCceEEEEeccccccCCcceEEEEEe-cCccccceeEEcccccccceeec-CCccccce Q lcl|NC_015280. 331 SGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQYYVVGYK-GTNAYDAGLFYCPYVPLQMYRAI-GQDTFQPR 408 (455) Q Consensus 331 ~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~K-G~~~~daglfyaPYv~l~~~~~~-Dp~s~qP~ 408 (455) +...+ .-.+-..|++. +++||+|-. .| +|-.+-++ |.-. ||.. -+.. ...- ||.+++-. T Consensus 185 ~~g~~------~~~~G~ig~~~-G~~Vi~s~~----~~-~~t~~l~~~gA~~-----~~~~-~~~~-vE~~Rd~~~~~d~ 245 (274) T protein:vir:95 185 ELGDD------VIVKGAFGEAL-GAVIVRSNK----LE-AGTAILAKKGAVK-----LITK-RDFF-LETDRDPSTKTTA 245 (274) T ss_pred ccccc------ceeccccceec-CeEEEEeCC----CC-CceEEEEecccee-----eeec-CCcc-cccccccccccCE Confidence 32211 11122467774 689999943 33 22222222 2111 1111 0111 2222 89999999 Q ss_pred eeeeeecce-eecccccccccccccCchhhh Q lcl|NC_015280. 
409 IGFKTRYGM-VLNPFAKGLTALSDSDPQAAG 438 (455) Q Consensus 409 ~g~~tRY~l-~~nP~~~~~~~~~~~~~~~~~ 438 (455) +-..-+||+ +.||--. -.+.+|.+.-.+ T Consensus 246 i~~~~~y~~~~~~~~~~--v~~tk~~~~~~~ 274 (274) T protein:vir:95 246 LYSDKHYVAYLYDESKA--VKITKGSGSLEM 274 (274) T ss_pred EEEeEEEEEEEEcCCcE--EEEEcCCccccC Confidence 999999999 6777333 122233332221 No 89 >protein:vir:103955 Length: 324 # NCBI annotation: head protein # Family: family:all:507 # MgeID: mge:1662 # MgeName: phiNM # Cross-refs: genbank:acc:YP_873992;genbank:gi:118430767;genbank:GeneID:4525449 Probab=66.60 E-value=0.26 Score=23.75 Aligned_cols=299 Identities=9% Similarity=0.023 Sum_probs=126.3 Q ss_pred ccchhhHHHHHHHhhhHHHHHHHHHHhhhhhhhchhhhcc---ccccccccccccchhhhHHHHHHhhhhhhheeeeccC Q lcl|NC_015280. 23 IKDPYRKSVTAILLENQERALAEERAVLTEAPTNVGPINT---PTTSSGAVAGFDPILISLIRRAMPKLIAYDIAGVQPM 99 (455) Q Consensus 23 i~~~~~~~v~~~~~enq~~~~~e~~~~l~ea~~~~~~~~~---~st~tg~i~~~~P~Lv~l~RRa~p~LIa~DI~GVQPm 99 (455) ++...+.+. |-+++ .........+-+ .++.+++..--....-.+++.+...-+..+++-+-|| T Consensus 1 ~~~~~~~~~-------------~~~~f-~~~~~~~~~~~a~~~~~~~~~~~liP~~~~~~ii~~~~~~s~l~~~~~~~~~ 66 (324) T protein:vir:10 1 MEQTQKLKL-------------NLQHF-ASNNVKPQVFNPDNVMMHEKKDGTLLNDFTTPILQEVMENSKIMQLGKYEPM 66 (324) T ss_pred CCCchHHHH-------------HHHHH-HHHhhccceecccceeccCCCcceechhHHHHHHHHHHhhchhhhhcceeec Confidence 111111110 11111 110000111111 1111111111111222344445556677888999999 Q ss_pred CCcceeeeEEEeeecCCCCcccccccccccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcC Q lcl|NC_015280. 100 TGPTGLIFAMRSRYTNQSGNEAFFDEPDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALG 179 (455) Q Consensus 100 TGPTGLIFAMRsrY~~qsG~EAlfnEa~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG 179 (455) ++.+.-|.- .. 
.+.+ +-..++ T Consensus 67 ~~~~~~~p~----~~--~~~~------------------------------------------------a~~v~E----- 87 (324) T protein:vir:10 67 EGTEKKFTF----WA--DKPG------------------------------------------------AYWVGE----- 87 (324) T ss_pred cCCceEEEE----Ee--CCcc------------------------------------------------eeEecc----- Confidence 877633321 10 0000 000001 Q ss_pred CCCCCccccceeEEEEEEEEeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeee Q lcl|NC_015280. 180 DGASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQA 259 (455) Q Consensus 180 ~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~ 259 (455) +..+++...+++++++..|.-+-.-..|-||.+|-. .|.+++|.+.|+..|...+++.+|.---+. T Consensus 88 ---g~~~~~~~~~~~~v~~~~~k~~~~~~iS~ell~ds~----~~l~~~i~~~l~~ai~~~~d~a~l~G~g~~------- 153 (324) T protein:vir:10 88 ---GQKIETSKATWVNATMRAFKLGVILPVTKEFLNYTY----SQFFEEMKPMIAEAFYKKFDEAGILNQGNN------- 153 (324) T ss_pred ---CccccccccceeEEEEeeEEEEEeehhhHHHHhcch----HHHHHHHHHHHHHHHHHHHHHHhhhcCCCC------- Confidence 122444555667777777777777889999999864 468999999999999999999888532111 Q ss_pred ccccceeeeeecc----ccchhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccc Q lcl|NC_015280. 260 NVANAGVFDLDVD----SNGRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVG 335 (455) Q Consensus 260 ~v~~~gv~Dl~~~----~~gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~ 335 (455) -...|++..... ..+--..+....++..+. ..-+..+.+++|+.....|... ... ++..- T Consensus 154 -~~~~~i~~~~~~~~~~~~~~~t~~~i~~~~~~l~---------~~~~~~~~~v~n~~~~~~L~~l---~d~---~g~~~ 217 (324) T protein:vir:10 154 -PFGKSIAQSIEKTNKVIKGDFTQDNIIDLEALLE---------DDELEANAFISKTQNRSLLRKI---VDP---ETKER 217 (324) T ss_pred -ccCccccccccccceeccccCCHHHHHHHHHhhh---------hccCCCCEEEEcHHHHHHHHHh---hcc---CCcee Confidence 111111111000 001111233444433332 1223455688999999988753 211 11111 Q ss_pred cccccccCCceeEEEecCceEEEEeccccccCCcceEEEEEecCccccceeEEcccccccc--------eeecCCc---- Q lcl|NC_015280. 
336 GIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQYYVVGYKGTNAYDAGLFYCPYVPLQM--------YRAIGQD---- 403 (455) Q Consensus 336 ~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~KG~~~~daglfyaPYv~l~~--------~~~~Dp~---- 403 (455) . .+.++ ++|. +++|++.+... .+...+++|-.. .+++...-...+ ....|+. T Consensus 218 ~---~~~~~----~~l~-G~PV~~~~~~~--~~~~~~~~gd~~------~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~~ 281 (324) T protein:vir:10 218 I---YDRNS----DTLD-GLPVVNLKSSN--LKRGELITGDFD------KLIYGIPQLIEYKIDETAQLSTVKNEDGTPV 281 (324) T ss_pred e---cCCCC----cccc-ceeEEeecCCC--CCcceEEEEecc------cEEEEEecCcEEEEeecccccccccccccch Confidence 0 01111 3343 46777766432 223334444211 111111111111 1111111 Q ss_pred ----cccceeeeeeecce-eecccccc-cccccccCchhhhhccch Q lcl|NC_015280. 404 ----TFQPRIGFKTRYGM-VLNPFAKG-LTALSDSDPQAAGNLNAN 443 (455) Q Consensus 404 ----s~qP~~g~~tRY~l-~~nP~~~~-~~~~~~~~~~~~~~~~~n 443 (455) +-+=.+=...|||. +.||-+-- .+...-+... .+++= T Consensus 282 ~~~~~~~~~~r~~~r~d~~v~~~~A~~~l~~a~~~~~~---~~~~~ 324 (324) T protein:vir:10 282 NLFEQDMVALRATMHVALHIADDKAFAKLVPADKKTDS---VPGEV 324 (324) T ss_pred hhhhcCcEEEEEEEEEccEEecccceEEEEeccCCCCC---CCCCC Confidence 11233334467887 55664321 1111111111 11111 No 90 >protein:vir:3158 Length: 321 # NCBI annotation: capsid protein gpE # Family: family:all:1377 # ACLAME annotation(s): phi:0000161 - phage head/capsid # MgeID: mge:316 # MgeName: PhiCh1 # Cross-refs: genbank:acc:NP_665929;genbank:gi:22091115;genbank:GeneID:951342 Probab=65.32 E-value=0.29 Score=23.57 Aligned_cols=296 Identities=11% Similarity=0.069 Sum_probs=117.0 Q ss_pred HHHHHhhhHHHHHHHHHHhhhhhhhchhhhccccccccccccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEE Q lcl|NC_015280. 
31 VTAILLENQERALAEERAVLTEAPTNVGPINTPTTSSGAVAGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMR 110 (455) Q Consensus 31 v~~~~~enq~~~~~e~~~~l~ea~~~~~~~~~~st~tg~i~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMR 110 (455) +-++.|.|..+.+.+ ++ .+..+..++...--.|..-.|++++..+-.....+-|.||+...|.|=.+- T Consensus 1 ~~~k~~~~~l~~~~~-~~-----------~~~~~~~~~g~~v~~~~~~~l~~~i~e~s~~l~~i~v~~v~~~~~~i~~~~ 68 (321) T protein:vir:31 1 MASRTINNDLSRITE-KN-----------ALTVDDLDAGGTLPDPLWDEFWTDMIEETPLLDAIRTETVGAKKTRIPTLN 68 (321) T ss_pred CchHHHHHHHHHHHH-hc-----------cccccccCCcceeCHHHHHHHHHHHHHhhhhhhhceeeeccCcceeeeeec Confidence 444555554332222 11 111122222222223333456666666555667788999998887763321 Q ss_pred eeecCCCCcccccccccccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccce Q lcl|NC_015280. 111 SRYTNQSGNEAFFDEPDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMA 190 (455) Q Consensus 111 srY~~qsG~EAlfnEa~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMa 190 (455) + .+..... +. .+. .+ -..+...|.++. T Consensus 69 --~-----------------~~~~~~~---~~------------------e~~-----------~~--~~~~~~~~~~~~ 95 (321) T protein:vir:31 69 --I-----------------GERHRRP---QD------------------EGE-----------WN--ENESDVSTGTID 95 (321) T ss_pred --c-----------------CCccccc---cc------------------ccc-----------cc--cccccceeeeee Confidence 0 0000000 00 000 00 000012245555 Q ss_pred eEEEEEEEEeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhhe-------eeeeeccc- Q lcl|NC_015280. 191 FSIDKIAVEAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAK-------PGAQANVA- 262 (455) Q Consensus 191 FsIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~-------~~k~~~v~- 262 (455) +...|+.+- ...|-||.+|= .||.|-|+.|.+.++..|.+.+++-++.-- .++. .|+-+... 
T Consensus 96 ~~~~k~~~~-------~~it~e~L~d~--a~~~d~e~~i~~~ia~~~a~~~~~~~~nGd-~~~~~~~~~~n~G~l~~a~~ 165 (321) T protein:vir:31 96 ISTEKATVA-------WDLPREVVQEN--PEGEALADRILNLMTDAWSADVEDLAANGD-EDAEDSFENQNDGFITVAEG 165 (321) T ss_pred eeeEEEEee-------hhccHHHHHhh--hcchhHHHHHHHHHHHHHHHHHHhheeecc-ccCCCcccccchhhhhhhcc Confidence 555555443 44678888872 367888888888888888777765554321 1111 12111000 Q ss_pred cceeeeeeccccchhhHHHHHHHHHHHHHHHHHHHHHhcCCCccE-EEEchhHHHHHHhhcccccccccccccccccccc Q lcl|NC_015280. 263 NAGVFDLDVDSNGRWSVEKFKGLLFQIERDANAIAQETRRGKGNI-IITSADVASALAMSGVLDYDSGISGAVGGIGEID 341 (455) Q Consensus 263 ~~gv~Dl~~~~~gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~-~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d 341 (455) .....+. ..+.+..+.+..|.+.|.. +- |..+++ .++|++....+... +......-+... . .. T Consensus 166 ~~~~~~~---~~~~~~~d~l~~l~~~l~~-------~y-r~~~~~v~im~~~~~~~~~~~--l~~~~~~~~~~~--l-~~ 229 (321) T protein:vir:31 166 DVETIDA---ADDILDNDLVIRTIAGLDS-------KY-RARMNPALIVSEDQLLSYHYT--LTDRDTPLGDNV--I-MG 229 (321) T ss_pred ccccccc---cccccCHHHHHHHHHhccH-------hH-hcCCCeEEEechHHHHHHHHH--HhcCCCccccch--h-hc Confidence 0111111 1122333445555555432 22 333454 47888876544321 111111000000 0 00 Q ss_pred cCCceeEEEecCceEEEEeccccccCCcceEEEEEecCccccceeEEcccccccceeecC--Ccccc-ceeee--eeecc Q lcl|NC_015280. 342 DTGNTFVGTLNGRFKVYIDPYSANVSDNQYYVVGYKGTNAYDAGLFYCPYVPLQMYRAIG--QDTFQ-PRIGF--KTRYG 416 (455) Q Consensus 342 ~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~KG~~~~daglfyaPYv~l~~~~~~D--p~s~q-P~~g~--~tRY~ 416 (455) .... +| ++++|++.+++ |.+.++++-.-. |.|.=+-...+.+..| +.+.. -++=+ ..+++ T Consensus 230 ~~~~----tl-~G~pvv~~~~m----P~~~il~t~~~n------l~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 294 (321) T protein:vir:31 230 EADV----NP-FSFPIIGSGLW----PDDKAMFTDPQN------LIYALYRDLEIDVLTESDKVSERDLHARYFMRGDDD 294 (321) T ss_pred cccc----cc-cceeEEEcCCC----CCCcEEEecccc------EEEEEeeccEEEEeecCccccccceeeEeeeeeecc Confidence 0111 22 47888888775 444444432110 0111111112222222 22222 11111 11222 Q ss_pred eeecccccccccccccCchhhhhccchhhhhhhhhhcCC Q lcl|NC_015280. 
417 MVLNPFAKGLTALSDSDPQAAGNLNANAYYRRVRVANLM 455 (455) Q Consensus 417 l~~nP~~~~~~~~~~~~~~~~~~~~~n~y~r~~~v~~~~ 455 (455) -++- + |---+.|.|+= T Consensus 295 ~~ve-------------~----------~~a~a~~~~i~ 310 (321) T protein:vir:31 295 FAIE-------------N----------TEAVVLAEGLG 310 (321) T ss_pred eeEe-------------c----------cccEEEEecCC Confidence 2111 0 11113344443 No 91 >protein:vir:78830 Length: 324 # NCBI annotation: major head protein # Family: family:all:507 # MgeID: mge:1858 # MgeName: 80alpha # Cross-refs: genbank:acc:YP_001285361;genbank:gi:148717889;genbank:GeneID:5246961 Probab=64.58 E-value=0.3 Score=23.47 Aligned_cols=299 Identities=10% Similarity=0.020 Sum_probs=123.2 Q ss_pred HhhhHHHHHHHHHHhhhhhhhchhhhc---cccccccccccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEe Q lcl|NC_015280. 35 LLENQERALAEERAVLTEAPTNVGPIN---TPTTSSGAVAGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRS 111 (455) Q Consensus 35 ~~enq~~~~~e~~~~l~ea~~~~~~~~---~~st~tg~i~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRs 111 (455) .-++|+ ++++++.+.........+- ...+.+++..--....-.+++.+.......+++-+-||++++--|.-.. T Consensus 1 ~~~~~~--~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~iP~~~~~~ii~~~~~~s~l~~l~~~~~~~~~~~~~p~~~- 77 (324) T protein:vir:78 1 MEQTQK--LKLNLQHFASNNVKPQVFNPDNVMMHEKKDGTLMNEFTTPILQEVMENSKIMQLGKYEPMEGTEKKFTFWA- 77 (324) T ss_pred CCcchh--hhHHHHHHHHHhhhhhhhccccccccCcCccccchhHHHHHHHHHHhhchhhhhcceeeccCCceEEEEEe- Confidence 111111 1222222221111110000 1111222221112222235556666777788888989988763332111 Q ss_pred eecCCCCcccccccccccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCcccccee Q lcl|NC_015280. 112 RYTNQSGNEAFFDEPDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAF 191 (455) Q Consensus 112 rY~~qsG~EAlfnEa~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaF 191 (455) .+.+ +-..+++ ..+++... 
T Consensus 78 -----~~~~------------------------------------------------a~~v~Eg--------~~~~~~~~ 96 (324) T protein:vir:78 78 -----DKPG------------------------------------------------AYWVGEG--------QKIETSKA 96 (324) T ss_pred -----cCcc------------------------------------------------eeEecCC--------cccccccc Confidence 0000 0000111 12334444 Q ss_pred EEEEEEEEeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeec Q lcl|NC_015280. 192 SIDKIAVEAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDV 271 (455) Q Consensus 192 sIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~ 271 (455) ++++++++.+.-+.-..+|-||.+|-. .|.+++|.+-|+..|...|++-+|.---+. ....|+..... T Consensus 97 ~~~~v~~~~~k~~~~~~is~ell~ds~----~~l~~~i~~~la~ai~~~~d~a~l~G~g~~--------~~~~gi~~~~~ 164 (324) T protein:vir:78 97 TWVNATMRAFKLGVILPVTKEFLNYTY----SQFFEEMKPMIAEAFYKKFDEAGILNQGNN--------PFGKSIAQSIE 164 (324) T ss_pred ceeEEEEeeEEEEEeehhhHHHHhcch----HHHHHHHHHHHHHHHHHHHHHHHhccCCCC--------CcCcccccccc Confidence 555555555555556669999999864 578999999999999999999887532111 11122221110 Q ss_pred c----ccchhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCcee Q lcl|NC_015280. 272 D----SNGRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTF 347 (455) Q Consensus 272 ~----~~gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~ 347 (455) . ..+-...+....+.+.+.. .....+.+++|++....|... ... +|..- . .+..+ T Consensus 165 ~~~~~~~~~~t~~~i~~~~~~l~~---------~~~~~~~~vmn~~~~~~L~~l---~d~---~G~~~--~-~~~~~--- 223 (324) T protein:vir:78 165 KTNKVIKGDFTQDNIIDLEALLED---------DELEANAFISKTQNRSLLRKI---VDP---ETKER--I-YDRNS--- 223 (324) T ss_pred ccceeccccccHHHHHHHHHhhhh---------ccCCCCEEEEcHHHHHHHHHh---hcc---CCCee--e-cCCCC--- Confidence 0 1111122334444444331 223445689999999888753 111 11111 0 01112 Q ss_pred EEEecCceEEEEeccccccCCcceEEEEEecCccccceeEEccccccccee--------ecCCc-----cc---cceeee Q lcl|NC_015280. 
348 VGTLNGRFKVYIDPYSANVSDNQYYVVGYKGTNAYDAGLFYCPYVPLQMYR--------AIGQD-----TF---QPRIGF 411 (455) Q Consensus 348 ~G~l~~~~~vy~D~y~~~~s~~dY~~vG~KG~~~~daglfyaPYv~l~~~~--------~~Dp~-----s~---qP~~g~ 411 (455) ++|. +++|++++... .+...+++|-. +.+++...-...+.. ..|+. -| |=.+=. T Consensus 224 -~~l~-G~PV~~~~~~~--~~~~~~~~gd~------~~~~~g~~~~~~i~~~~~~~~~~~~~~~~~~~~~f~~d~~~~r~ 293 (324) T protein:vir:78 224 -DSLD-GLPVVNLKSSN--LKRGELITGDF------DKLIYGIPQLIEYKIDETAQLSTVKNEDGTPVNLFEQDMVALRA 293 (324) T ss_pred -Cccc-ceeeEeeCCCC--CCcceEEEEec------ceEEEEEecCcEEEEeecccccccccccccchhhhhcCcEEEEE Confidence 2332 46777765431 23333444421 111121111111100 00111 01 222233 Q ss_pred eeecce-eecccccccccccccCchhh-hhccch Q lcl|NC_015280. 412 KTRYGM-VLNPFAKGLTALSDSDPQAA-GNLNAN 443 (455) Q Consensus 412 ~tRY~l-~~nP~~~~~~~~~~~~~~~~-~~~~~n 443 (455) ..||+. +.+|-+-- .+ ++.+|+. -..+.- T Consensus 294 ~~r~d~~v~~~~A~~--~l-~~a~~~~~~~~~~~ 324 (324) T protein:vir:78 294 TMHVALHIADDKAFA--KL-VPADKRTDSVPGEV 324 (324) T ss_pred EEEEccEEecccceE--EE-ecccccCCCCCCCC Confidence 457776 55554321 11 1111110 011111 No 92 >protein:vir:96392 Length: 324 # NCBI annotation: ORF011 # Family: family:all:507 # MgeID: mge:1613 # MgeName: 53 # Cross-refs: genbank:acc:YP_239648;genbank:gi:66395381;genbank:GeneID:5132868 Probab=64.58 E-value=0.3 Score=23.47 Aligned_cols=299 Identities=10% Similarity=0.020 Sum_probs=123.2 Q ss_pred HhhhHHHHHHHHHHhhhhhhhchhhhc---cccccccccccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEe Q lcl|NC_015280. 35 LLENQERALAEERAVLTEAPTNVGPIN---TPTTSSGAVAGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRS 111 (455) Q Consensus 35 ~~enq~~~~~e~~~~l~ea~~~~~~~~---~~st~tg~i~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRs 111 (455) .-++|+ ++++++.+.........+- ...+.+++..--....-.+++.+.......+++-+-||++++--|.-.. 
T Consensus 1 ~~~~~~--~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~iP~~~~~~ii~~~~~~s~l~~l~~~~~~~~~~~~~p~~~- 77 (324) T protein:vir:96 1 MEQTQK--LKLNLQHFASNNVKPQVFNPDNVMMHEKKDGTLMNEFTTPILQEVMENSKIMQLGKYEPMEGTEKKFTFWA- 77 (324) T ss_pred CCcchh--hhHHHHHHHHHhhhhhhhccccccccCcCccccchhHHHHHHHHHHhhchhhhhcceeeccCCceEEEEEe- Confidence 111111 1222222221111110000 1111222221112222235556666777788888989988763332111 Q ss_pred eecCCCCcccccccccccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCcccccee Q lcl|NC_015280. 112 RYTNQSGNEAFFDEPDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAF 191 (455) Q Consensus 112 rY~~qsG~EAlfnEa~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaF 191 (455) .+.+ +-..+++ ..+++... T Consensus 78 -----~~~~------------------------------------------------a~~v~Eg--------~~~~~~~~ 96 (324) T protein:vir:96 78 -----DKPG------------------------------------------------AYWVGEG--------QKIETSKA 96 (324) T ss_pred -----cCcc------------------------------------------------eeEecCC--------cccccccc Confidence 0000 0000111 12334444 Q ss_pred EEEEEEEEeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeec Q lcl|NC_015280. 192 SIDKIAVEAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDV 271 (455) Q Consensus 192 sIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~ 271 (455) ++++++++.+.-+.-..+|-||.+|-. .|.+++|.+-|+..|...|++-+|.---+. ....|+..... T Consensus 97 ~~~~v~~~~~k~~~~~~is~ell~ds~----~~l~~~i~~~la~ai~~~~d~a~l~G~g~~--------~~~~gi~~~~~ 164 (324) T protein:vir:96 97 TWVNATMRAFKLGVILPVTKEFLNYTY----SQFFEEMKPMIAEAFYKKFDEAGILNQGNN--------PFGKSIAQSIE 164 (324) T ss_pred ceeEEEEeeEEEEEeehhhHHHHhcch----HHHHHHHHHHHHHHHHHHHHHHHhccCCCC--------CcCcccccccc Confidence 555555555555556669999999864 578999999999999999999887532111 11122221110 Q ss_pred c----ccchhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCcee Q lcl|NC_015280. 
272 D----SNGRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTF 347 (455) Q Consensus 272 ~----~~gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~ 347 (455) . ..+-...+....+.+.+.. .....+.+++|++....|... ... +|..- . .+..+ T Consensus 165 ~~~~~~~~~~t~~~i~~~~~~l~~---------~~~~~~~~vmn~~~~~~L~~l---~d~---~G~~~--~-~~~~~--- 223 (324) T protein:vir:96 165 KTNKVIKGDFTQDNIIDLEALLED---------DELEANAFISKTQNRSLLRKI---VDP---ETKER--I-YDRNS--- 223 (324) T ss_pred ccceeccccccHHHHHHHHHhhhh---------ccCCCCEEEEcHHHHHHHHHh---hcc---CCCee--e-cCCCC--- Confidence 0 1111122334444444331 223445689999999888753 111 11111 0 01112 Q ss_pred EEEecCceEEEEeccccccCCcceEEEEEecCccccceeEEccccccccee--------ecCCc-----cc---cceeee Q lcl|NC_015280. 348 VGTLNGRFKVYIDPYSANVSDNQYYVVGYKGTNAYDAGLFYCPYVPLQMYR--------AIGQD-----TF---QPRIGF 411 (455) Q Consensus 348 ~G~l~~~~~vy~D~y~~~~s~~dY~~vG~KG~~~~daglfyaPYv~l~~~~--------~~Dp~-----s~---qP~~g~ 411 (455) ++|. +++|++++... .+...+++|-. +.+++...-...+.. ..|+. -| |=.+=. T Consensus 224 -~~l~-G~PV~~~~~~~--~~~~~~~~gd~------~~~~~g~~~~~~i~~~~~~~~~~~~~~~~~~~~~f~~d~~~~r~ 293 (324) T protein:vir:96 224 -DSLD-GLPVVNLKSSN--LKRGELITGDF------DKLIYGIPQLIEYKIDETAQLSTVKNEDGTPVNLFEQDMVALRA 293 (324) T ss_pred -Cccc-ceeeEeeCCCC--CCcceEEEEec------ceEEEEEecCcEEEEeecccccccccccccchhhhhcCcEEEEE Confidence 2332 46777765431 23333444421 111121111111100 00111 01 222233 Q ss_pred eeecce-eecccccccccccccCchhh-hhccch Q lcl|NC_015280. 412 KTRYGM-VLNPFAKGLTALSDSDPQAA-GNLNAN 443 (455) Q Consensus 412 ~tRY~l-~~nP~~~~~~~~~~~~~~~~-~~~~~n 443 (455) ..||+. +.+|-+-- .+ ++.+|+. 
-..+.- T Consensus 294 ~~r~d~~v~~~~A~~--~l-~~a~~~~~~~~~~~ 324 (324) T protein:vir:96 294 TMHVALHIADDKAFA--KL-VPADKRTDSVPGEV 324 (324) T ss_pred EEEEccEEecccceE--EE-ecccccCCCCCCCC Confidence 457776 55554321 11 1111110 011111 No 93 >protein:vir:80930 Length: 278 # NCBI annotation: Cps # Family: family:all:522 # MgeID: mge:1886 # MgeName: A500 # Cross-refs: genbank:acc:YP_001468392;genbank:gi:157324966;genbank:GeneID:5601363 Probab=62.94 E-value=0.33 Score=23.25 Aligned_cols=271 Identities=10% Similarity=0.061 Sum_probs=113.8 Q ss_pred CCCc-c--eeeeEEEe-eecCCCCcccccccccccccccccccccccccccCcccCCCCCCCCcccccccccccccchhh Q lcl|NC_015280. 99 MTGP-T--GLIFAMRS-RYTNQSGNEAFFDEPDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSE 174 (455) Q Consensus 99 mTGP-T--GLIFAMRs-rY~~qsG~EAlfnEa~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~ 174 (455) |--+ | +-+| .. .|...- .|.| . ...-|+.-. ..+.... ..+|.+-.-..+ . ...+ T Consensus 1 Ma~~~T~~~~~i--iPev~s~~v-~~~~-~-~~~v~~~~~---------~~~~~l~--g~~G~tv~ip~~----~-~~g~ 59 (278) T protein:vir:80 1 MADLTTKLANLI--DPEVMGPMI-SAKL-P-KAIKFGKIA---------PIDNSLE--GQPGSEITVPKY----K-YIGD 59 (278) T ss_pred CCCcceehhhee--cHHHHHHHH-HHHH-H-Hhhhhcccc---------eeccccc--CCCCCEEEEeee----c-cCCc Confidence 1100 0 0000 00 000000 0000 0 000010000 0000000 000000000000 0 1123 Q ss_pred hhhcCCCCCCccccceeEEEEEEEEeeccccccceeHHHHHhHHH-hhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhh Q lcl|NC_015280. 175 QEALGDGASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKA-IHGLDAESELANILSTEILAEINREVVRTVYRGA 253 (455) Q Consensus 175 aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkA-iHGLDAE~ELanILStEImlEINReII~~l~~vA 253 (455) ++.+.++..-.+.+ .+..+.+++-|-|+- + ++ .-|+.+ .-+-|.-.+..+-++.-+..+++++++..+.... 
T Consensus 60 a~~~~~g~~i~~~~--lt~~~~~~~i~~~~~-a---~~-v~D~~~~~~~~d~~~~~~~~~a~~~a~~~d~~l~~~l~~a~ 132 (278) T protein:vir:80 60 AQDVAEGAAIDYSA--LETESVKHGIKKAGK-G---VK-LTDESVLSGYGDPVEEAQKQIRMAIASKVDNDILEEALTTT 132 (278) T ss_pred ceeecCCCcCcccc--cccceeeEeeehhhc-c---cc-ccHHHHhhccccHHHHHHHHHHHHHHHHHHHHHHHHHhccc Confidence 34444433333444 345566666666652 2 22 334444 3467899999999999999999999998886542 Q ss_pred eeeeeeccccceeeeeeccccchhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccc Q lcl|NC_015280. 254 KPGAQANVANAGVFDLDVDSNGRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGA 333 (455) Q Consensus 254 ~~~k~~~v~~~gv~Dl~~~~~gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~ 333 (455) .. ++..-+.|..+. +.+.+-.++-++..+ .--...+++++|.+.+.|......++.+..... T Consensus 133 ~~-----~~~~~t~~~~~~-----~~~~~~da~~~l~~~--------~~~~~~~ivv~p~~~~~L~k~~~~~~~~~~~~g 194 (278) T protein:vir:80 133 LE-----VKGAINIGLIDK-----IENTFTDAPDAIEDE--------SITTTGVLFLNYKDTAKLREEAAGSWTKASQLG 194 (278) T ss_pred cc-----cccccccchhhh-----HHHHHHHHHHhhccc--------CCCcccEEEECHHHHHHHHhhhhhhcccccccc Confidence 21 111112221100 011111111111111 111234799999999999865544444332211 Q ss_pred cccccccccCCceeEEEecCceEEEEeccccccCCcceEEEEEecCccccceeEEcccccccceeec-CCccccceeeee Q lcl|NC_015280. 334 VGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQYYVVGYKGTNAYDAGLFYCPYVPLQMYRAI-GQDTFQPRIGFK 412 (455) Q Consensus 334 ~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~KG~~~~daglfyaPYv~l~~~~~~-Dp~s~qP~~g~~ 412 (455) .+ .--+-..|++. |++||++... |. +-.+-+ +.+. -+ |+..= +.. ...- ||..++-.+-.. T Consensus 195 ~~------~~~~G~ig~~~-G~~Vi~s~~~----p~-~t~~l~-~~gA--i~-~~~~~-~~~-vE~~Rd~~~~~d~i~~~ 256 (278) T protein:vir:80 195 DD------LLVKGAFGELL-GWEIVRTKKL----AD-GNALAV-KAGA--LK-TFLKR-NLL-AESGRDMDHKLTKFNAD 256 (278) T ss_pred cc------ceeeccceeec-ceeEEEcCCC----Cc-ceEEEE-eccc--ee-eeecC-Ccc-cccccchhhccceeeee Confidence 11 01123477774 6899999643 32 211111 1111 01 22111 111 2222 888999999889 Q ss_pred eecce-eecccccccccccccCchhhhhcc Q lcl|NC_015280. 
413 TRYGM-VLNPFAKGLTALSDSDPQAAGNLN 441 (455) Q Consensus 413 tRY~l-~~nP~~~~~~~~~~~~~~~~~~~~ 441 (455) .+||+ ..||-..- .+..+.. + T Consensus 257 ~~yg~~v~~~~~~v--~it~~a~------~ 278 (278) T protein:vir:80 257 QHYAVALVDETKAV--KVVPVAG------N 278 (278) T ss_pred eEEEEEEEcCcceE--EEeeccC------C Confidence 99999 66775441 1112111 0 No 94 >protein:vir:1638 Length: 298 # NCBI annotation: Structural protein # Family: family:all:966 # MgeID: mge:33 # MgeName: r1t # Cross-refs: genbank:acc:NP_695059;genbank:gi:23455750;genbank:GeneID:955469 Probab=62.10 E-value=0.34 Score=23.14 Aligned_cols=278 Identities=14% Similarity=0.103 Sum_probs=117.7 Q ss_pred ccccccccccccccchh-hhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCccccccccccccccccccccc Q lcl|NC_015280. 61 NTPTTSSGAVAGFDPIL-ISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPDAQFSGTDGATPP 139 (455) Q Consensus 61 ~~~st~tg~i~~~~P~L-v~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~t~fSg~~~~~~~ 139 (455) .+ +++|.. .-|.+ -.+++.+-+..+..+++.+.||++...-|. .. . ++.+ T Consensus 1 ma--~~gG~l--vp~~~~~~ii~~~~~~s~i~~l~~~~~~~~~~~~ip-~~---~--~~~~------------------- 51 (298) T protein:vir:16 1 MV--LNKGTL--FDPTLVTDLISKVAGKSSIARLSAQKPIPFNGEKVF-TF---T--MDSE------------------- 51 (298) T ss_pred Cc--ccCcce--echhHHHHHHHHHHhhhhhhhhcceeeccCCceEEE-EE---e--cCcc------------------- Confidence 22 222222 12222 233444456778899999999986432222 11 0 0000 Q ss_pred ccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeeccccccceeHHHHHhHHH Q lcl|NC_015280. 140 TATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKA 219 (455) Q Consensus 140 ~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkA 219 (455) +-..++. ..+++-..++++++..+|.-+-....|-||.++--. 
T Consensus 52 -----------------------------a~~v~E~--------~~~~~~~~~f~~v~l~~~k~a~~~~iS~ell~~s~d 94 (298) T protein:vir:16 52 -----------------------------IDVVAES--------GKKTHGGVTLAPQTMVPIKVEYGARISDEFMYASDE 94 (298) T ss_pred -----------------------------eEEecCC--------ccccccccceeEEEEeeeeEEEeehhhHHHhhcCcc Confidence 0001111 223444445566666666666677899999875432 Q ss_pred hhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeecc-ccceeee---eecccc-chhh-HHHHHHHHHHHHHHH Q lcl|NC_015280. 220 IHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANV-ANAGVFD---LDVDSN-GRWS-VEKFKGLLFQIERDA 293 (455) Q Consensus 220 iHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v-~~~gv~D---l~~~~~-gr~~-ve~~k~l~~qi~~ea 293 (455) -..|-+++|.+-|+..|...|+..++.-...- .|+..++ ...++.. ...... .-+. ......++..+. T Consensus 95 -~~~~l~~~i~~~la~ai~~~~d~~~l~G~~~~--~g~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~~~--- 168 (298) T protein:vir:16 95 -EKINILQEFNDGFAKKVARGIDLMAFHGVNPR--LGTASAVIGTNHFDSKVTQKVEAPRGIADPNGAIENAVELLT--- 168 (298) T ss_pred -cHHHHHHHHHHHHHHHHHHHHHHHhhccccCC--CCcccccccccccccccccccccccccccHHHHHHHHHHHhh--- Confidence 12456777777788888877777777542110 0111100 0001100 000011 1110 112222322222 Q ss_pred HHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEEeccccc--cCCcce Q lcl|NC_015280. 294 NAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSAN--VSDNQY 371 (455) Q Consensus 294 n~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~--~s~~dY 371 (455) ...++..-+|++++....|... .+. +|..-+ ..+.++. -.|+|.| ++|+++..... ..+.+. 
T Consensus 169 ------~~~~~~~~~vmn~~~~~~l~~l---kd~---~G~~i~--~~~~~~~-~~~~l~G-~PV~~~~~v~~~~~~~~~~ 232 (298) T protein:vir:16 169 ------GVDADVTGIAINPSFRSALAKQ---KDL---QDNALF--PELKWGA-TPDTING-LPVDVNKTVSDMSLTQRDR 232 (298) T ss_pred ------hcCCCccEEEEcHHHHHHHHHh---hcc---CCCeee--cCcccCC-CCceecc-eeeEEecccccccCCCccE Confidence 1123445588899988888752 211 111111 1111111 1256754 68888865321 123345 Q ss_pred EEEEEecCccccceeEEccccc--ccceeecCCcc-----cc-ceeee--eeecce-eecccccccccccccC Q lcl|NC_015280. 372 YVVGYKGTNAYDAGLFYCPYVP--LQMYRAIGQDT-----FQ-PRIGF--KTRYGM-VLNPFAKGLTALSDSD 433 (455) Q Consensus 372 ~~vG~KG~~~~daglfyaPYv~--l~~~~~~Dp~s-----~q-P~~g~--~tRY~l-~~nP~~~~~~~~~~~~ 433 (455) +++|-- ..++.|..--. +...+-.||++ || =.++| ..|++. +.+|-.-- .+.+.. T Consensus 233 ~~~GDf-----s~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~v~~ra~~r~d~~v~~~~a~~--~l~~at 298 (298) T protein:vir:16 233 AIIGDF-----ANGFKWGYAKEVPLEVIQYGDPDNSGLDLKGYNQVYIRAELFLGWGILDATKFA--RVTEAN 298 (298) T ss_pred EEEeec-----cceEEEEEecCceEEEeeccCCcCcchhhhhcCcEEEEEEEEEccEeecccceE--EEeecC Confidence 555510 01122222111 11222224432 32 11333 557776 66664432 222211 No 95 >protein:vir:1268 Length: 397 # NCBI annotation: hypothetical protein # Family: family:all:21 # MgeID: mge:329 # MgeName: phi-105 # Cross-refs: genbank:acc:NP_690760;genbank:gi:22855000;genbank:GeneID:955203 Probab=61.97 E-value=0.34 Score=23.13 Aligned_cols=313 Identities=15% Similarity=0.089 Sum_probs=121.4 Q ss_pred CcchHHHHHHhhHhhcC-------------------------CCCccccc----hhhHHHHHHHhhhHHHHH-HHHHHhh Q lcl|NC_015280. 1 MYNAENLQEKWAPVLNH-------------------------EGLNDIKD----PYRKSVTAILLENQERAL-AEERAVL 50 (455) Q Consensus 1 m~~~~~~~~kw~~~l~~-------------------------~~~~~i~~----~~~~~v~~~~~enq~~~~-~e~~~~l 50 (455) +..-+.|.++...+-+. +..++-.. .+++.+...+. 
++ .+ .++|..+ T Consensus 39 ~~e~~~l~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~-~~--~~~~~~~~~~ 115 (397) T protein:vir:12 39 LDEVKQLKNQIELMTEGRSLDVPDLPGGVNFVPEQERNPEGQRSQGQGNEERQQQYSKAFLKGLR-GK--RLTDEERDLL 115 (397) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhhcccccccchhhHHHHHHHHHHHHHHh-cc--CCcHHHHHHH Confidence 11112222222211100 00000000 01111100000 00 00 1222222 Q ss_pred hhhhhchhhhccccccccccccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCcccccccccccc Q lcl|NC_015280. 51 TEAPTNVGPINTPTTSSGAVAGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPDAQF 130 (455) Q Consensus 51 ~ea~~~~~~~~~~st~tg~i~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~t~f 130 (455) .... ...+...++++|.+.--....-.+++.+.+..+-.+++.+.||+++.|-+--.|.. ++.. T Consensus 116 ~~~~--~~a~~~~~~~~gg~lvP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~----~~~~---------- 179 (397) T protein:vir:12 116 DSPE--FRAMSGINDEDGGILIPEDIGRQIHEFKRQFEPLEQYVTVEPVTTRSGTRLLEKNA----DMVP---------- 179 (397) T ss_pred hhhh--hhhccccccccCcccCchhHHHHHHHhhhhhhhHHhhcceeeccCCceeEEEEEec----CCcc---------- Confidence 1111 01111222223332211222223444455666778999999999988865333300 0000 Q ss_pred cccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeecccccccee Q lcl|NC_015280. 131 SGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRADYS 210 (455) Q Consensus 131 Sg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYT 210 (455) +-..++++..++.....|.++.|+..|..+- ..+| T Consensus 180 --------------------------------------a~~v~Eg~~~~~~~~~~~~~v~~~~~k~~~~-------~~is 214 (397) T protein:vir:12 180 --------------------------------------FSPVEELGNLPEIDQPRFTKVSYSIIDYGGI-------MTLS 214 (397) T ss_pred --------------------------------------eeeecccccccccccccceeEEeeheeeEee-------ehhh Confidence 0000001111111122366666666666654 4589 Q ss_pred HHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeeccccchhhHHHHHHHHH-HH Q lcl|NC_015280. 
211 VELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDVDSNGRWSVEKFKGLLF-QI 289 (455) Q Consensus 211 iELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~~~~gr~~ve~~k~l~~-qi 289 (455) -||.+|-- +|.++.|.+.|...|...+|+-|+.-.-+ ....|+..++ ....+++ .+ T Consensus 215 ~e~l~ds~----~~l~~~i~~~l~~~~~~~~d~~il~G~g~---------~~~~g~~~~~----------~i~~~~~~~l 271 (397) T protein:vir:12 215 NSMLNDSD----QAIMTYVAKWFAKKSVVTRNNLILAAIAS---------LKKVDIDGLD----------GIKKALNVTL 271 (397) T ss_pred HHHHhhch----HHHHHHHHHHHHHHHHHHHHHHHHhcccc---------ccccccccHH----------HHHHHHhhcc Confidence 99998754 46788888888888888888887754321 1233443221 1222221 22 Q ss_pred HHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEEecc-ccccCC Q lcl|NC_015280. 290 ERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPY-SANVSD 368 (455) Q Consensus 290 ~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y-~~~~s~ 368 (455) . .. -..+..+++++.....|... ... +|.. ....+.+.. .-++|. +++|++... +..... T Consensus 272 ~---~~------~~~~a~~~~n~~~~~~L~~l---kd~---~G~~--l~~~~~~~g-~~~~l~-G~pv~~~~~~~~~~~~ 332 (397) T protein:vir:12 272 D---PM------VAPGSIVLTNQDGYDWLDTL---KDG---TGRY--LLQPDPTNP-TKKLLD-GRPVVPFTNRVLKTQK 332 (397) T ss_pred c---hh------hhCCCEEEEcHHHHHHHHHh---hcc---CCce--eecccccCC-CCcccc-ceeeEEecccccccCC Confidence 1 11 11234578999888888652 111 1111 111111111 124554 457765421 100000 Q ss_pred cceEEEEEecCccccceeEEcccccc-------cceeecC--C----ccccceeeeeeecce-eecccccccccccccC Q lcl|NC_015280. 369 NQYYVVGYKGTNAYDAGLFYCPYVPL-------QMYRAIG--Q----DTFQPRIGFKTRYGM-VLNPFAKGLTALSDSD 433 (455) Q Consensus 369 ~dY~~vG~KG~~~~daglfyaPYv~l-------~~~~~~D--p----~s~qP~~g~~tRY~l-~~nP~~~~~~~~~~~~ 433 (455) | +.-++|+.|-.+ .....++ + .+-+-.+-...|++. +.||-+--.-+++- . 
T Consensus 333 ---------~----~~~~~~gd~~~~~~~~~~~~~~i~~~~~~~~~f~~~~~~~r~~~r~d~~~~~~~a~~~~~~t~-~ 397 (397) T protein:vir:12 333 ---------G----KAPLIIGNLKEAIVLFDREQQSIASTDTGAGAFETNSTKVRGIEREDVRKWDEDAVVFGQITV-E 397 (397) T ss_pred ---------C----ccEEEEEehhceEEEEeecceEEEEeccccchhhcCceEEEEEEeeccEEecccceEEEEEee-C Confidence 0 111222222110 0000111 1 123445666677777 55654442111110 0 No 96 >protein:vir:101607 Length: 379 # NCBI annotation: major capsid protein precursor # Family: family:all:585 # MgeID: mge:1646 # MgeName: 11b # Cross-refs: genbank:acc:YP_112497;genbank:gi:53793597;uniprot:Q5ZGF6;genbank:GeneID:3101715 Probab=58.48 E-value=0.41 Score=22.69 Aligned_cols=323 Identities=15% Similarity=0.081 Sum_probs=120.4 Q ss_pred Cc-----chHHHHHHhhHhhcCCCCccccchhhHHH---HHHHhhhHHH---HHH-----------------HHHHhhhh Q lcl|NC_015280. 1 MY-----NAENLQEKWAPVLNHEGLNDIKDPYRKSV---TAILLENQER---ALA-----------------EERAVLTE 52 (455) Q Consensus 1 m~-----~~~~~~~kw~~~l~~~~~~~i~~~~~~~v---~~~~~enq~~---~~~-----------------e~~~~l~e 52 (455) +- ..++..++...... .+.+..+...+.+ .+.|.+..++ ..+ +......+ T Consensus 17 l~~~~~~~~~e~~~~~e~~~~--~~~~~~~~~~~e~~~~~~~l~~~~~~~e~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 94 (379) T protein:vir:10 17 VDSKSSAQALEVKGLIEALEA--KMTSEKDLAVNELKSDMAALQAHADKLDVKLKEKAKSEDKSDSLVKSITENFNDIKE 94 (379) T ss_pred HHHHHHHHHHHHHHHHHHHHh--HhhHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcccccccchhHHHHHHHHHHhHHH Confidence 00 00111111111100 0000000000000 0111111000 000 00000000 Q ss_pred hhh--chhhhc-ccccccccccc-----ccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCcccccc Q lcl|NC_015280. 53 APT--NVGPIN-TPTTSSGAVAG-----FDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFD 124 (455) Q Consensus 53 a~~--~~~~~~-~~st~tg~i~~-----~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfn 124 (455) ... .+.... +..+++++.+. +.+.++.+.| ....-.+++.|.||++++.-|.-.. 
T Consensus 95 ~~~~~~~~~~~~~~~~~~~~~~~~ip~~~~~~ii~~~~---~~~~i~~~~~~~~~~~~~~~~~~~~-------------- 157 (379) T protein:vir:10 95 VRNGKSIQVKAVGDMTLPVNLTGAQPKDYNFDVVLNPS---QMLNVSDIVGAVSISGGTYTFVREN-------------- 157 (379) T ss_pred HHhhhhhhhhhhcccccCCCCccccchhhhhHHHHhHH---hhhhHHhhceeeeccCCceEEEEee-------------- Confidence 000 000000 11122222222 2333333333 3456678899999988864332100 Q ss_pred cccccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeeccc Q lcl|NC_015280. 125 EPDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRA 204 (455) Q Consensus 125 Ea~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRa 204 (455) ++++.. .-..++ +...+++..++++++..+|.=+ T Consensus 158 ----~~~~~~----------------------------------~~~v~E--------g~~~~~~~~~f~~i~~~~~k~~ 191 (379) T protein:vir:10 158 ----GAGEGA----------------------------------IGAQVE--------GATKGQKDYDISMIDVNTDFIA 191 (379) T ss_pred ----cCCCcc----------------------------------cccccC--------CccccccccceeeeEeeeeeEE Confidence 000000 000001 1123344445555555555544 Q ss_pred cccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeeccccchhhHHHHHH Q lcl|NC_015280. 205 LRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDVDSNGRWSVEKFKG 284 (455) Q Consensus 205 LKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~~~~gr~~ve~~k~ 284 (455) --..+|-||.||-- +.++.|.+-|+..|+.-+|..++.-+...+.-+... + .+ -..++.... T Consensus 192 ~~~~iS~ell~D~~-----~l~~~i~~~la~~~~~~~~~~~~~g~~~~~~~~~~~-~-----------~~-~~~~d~i~~ 253 (379) T protein:vir:10 192 GFTRYSKKMANNLP-----FLTSFIPNALRRDYAKAENAAFNAVLAANATASTEI-I-----------TN-KNKVEMLIN 253 (379) T ss_pred eeehhhHHHHhhHH-----HHHHHHHHHHHHHHHHHHHHHHhccccccccccccc-c-----------cC-cccHHHHHH Confidence 55779999999963 277888888999998888888876554432211111 0 11 112333444 Q ss_pred HHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhccccccccccccccccccccc-CCceeEEEecCceEEEEeccc Q lcl|NC_015280. 
285 LLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDD-TGNTFVGTLNGRFKVYIDPYS 363 (455) Q Consensus 285 l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~-t~~~~~G~l~~~~~vy~D~y~ 363 (455) +++++.. .-..++-+|++|.....|... .. .+|..=+..+... .+.. .+|. +++|+++++. T Consensus 254 ~~~~~~~---------~~~~~~~~vmn~~~~~~l~~l---kd---~~G~~l~~~~~~~~~~~~--~~l~-G~pvv~s~~~ 315 (379) T protein:vir:10 254 EIAKQEN---------LDFPVTAIVLRPTDYYDILVT---QK---SVGAGYGLPGVVTQDNGV--LRIN-GIPLFRATWL 315 (379) T ss_pred HHHhhhh---------ccCCCCEEEEcHHHHHHHHHh---hc---cCCceeccCCccCCCCCc--ceec-ceeeEecCCC Confidence 4444431 234555688999988777642 11 1111100000000 0110 1332 5899999875 Q ss_pred cccCCcceEEEEEecCccccceeEEcccccccceeec--CCccccceeeeeeecce-eecccccccccccccCchhhhhc Q lcl|NC_015280. 364 ANVSDNQYYVVGYKGTNAYDAGLFYCPYVPLQMYRAI--GQDTFQPRIGFKTRYGM-VLNPFAKGLTALSDSDPQAAGNL 440 (455) Q Consensus 364 ~~~s~~dY~~vG~KG~~~~daglfyaPYv~l~~~~~~--Dp~s~qP~~g~~tRY~l-~~nP~~~~~~~~~~~~~~~~~~~ 440 (455) . ..-+++|=- . ..-+++--=+.+...+.. +-.+.+=.+=+..|+|+ +.+|-+--. T Consensus 316 ~----ag~~~~gdf---~-~~~~~~~~~~~i~~~~~~~~~f~~~~~~~r~~~R~~~~v~~p~a~v~-------------- 373 (379) T protein:vir:10 316 A----ANKYYVGDW---T-RVTKVTTEGLSLEFSEVEGTNFVKNNITARIEAQVALAVEQPAALIF-------------- 373 (379) T ss_pred C----CCceEEeec---c-cEEEEEEeceEEEEeecccccccCCcEEEEEEEEeccEEecCccEEE-------------- Confidence 3 222333211 1 011111100000000000 11222223334468877 666644311 Q ss_pred cchhhhhhhhhhcC Q lcl|NC_015280. 
441 NANAYYRRVRVANL 454 (455) Q Consensus 441 ~~n~y~r~~~v~~~ 454 (455) +-+..| T Consensus 374 --------~~~~~~ 379 (379) T protein:vir:10 374 --------GDFTAV 379 (379) T ss_pred --------EEecCC Confidence 111111 No 97 >protein:vir:9759 Length: 303 # NCBI annotation: putative structural protein # Family: family:all:966 # MgeID: mge:175 # MgeName: 315.3 # Cross-refs: genbank:acc:NP_795521;genbank:gi:28876283;genbank:GeneID:1257824 Probab=58.37 E-value=0.41 Score=22.68 Aligned_cols=280 Identities=11% Similarity=0.095 Sum_probs=122.7 Q ss_pred ccccccccccccccchh-hhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCccccccccccccccccccccc Q lcl|NC_015280. 61 NTPTTSSGAVAGFDPIL-ISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPDAQFSGTDGATPP 139 (455) Q Consensus 61 ~~~st~tg~i~~~~P~L-v~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~t~fSg~~~~~~~ 139 (455) .+.+++.|. -..|.+ -.+++++.+..+..+++.+-||++.+.-|.- +.. +.+ T Consensus 1 m~t~t~gg~--liP~~~~~~ii~~l~~~s~i~~l~~~~~~~~~~~~ip~----~~~--~~~------------------- 53 (303) T protein:vir:97 1 MGTETSKAS--LFDKHLVSDLINKVKGHSSLAKLSSQKPIPFNGSKEFT----FTL--DSD------------------- 53 (303) T ss_pred CcccCCCCe--EcchhHHHHHHHHHHhhchhhhhcceeecCCCceEEEE----Eec--Ccc------------------- Confidence 444443332 233333 3566666778889999999999876544421 110 000 Q ss_pred ccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeeccccccceeHHHHHhHHH Q lcl|NC_015280. 140 TATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKA 219 (455) Q Consensus 140 ~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkA 219 (455) +-..++++. +++-..+++.++..+|.-+-....|-||.|.... 
T Consensus 54 -----------------------------a~wv~E~~~--------~~~s~~~f~~v~l~~~kl~~~~~iS~ell~~~~d 96 (303) T protein:vir:97 54 -----------------------------IDVVAENGK--------KTHGGLSLEPVTIVPIKVEYGARLSDEFLYATEE 96 (303) T ss_pred -----------------------------eEEeecCcc--------ccccccceeeEEeeeEEEEEeehhhHHHhhcCcc Confidence 001111111 2222333445555555555566789999863322 Q ss_pred hhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeec---c----ccchhhHHHHHHHHHHHHHH Q lcl|NC_015280. 220 IHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDV---D----SNGRWSVEKFKGLLFQIERD 292 (455) Q Consensus 220 iHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~---~----~~gr~~ve~~k~l~~qi~~e 292 (455) ..++-+++|.+-|+..|...|+..+|.-..... ..+....+...+.. . ..+.-..+-...++..+. T Consensus 97 -~~~~l~~~i~~~la~a~~~~ld~a~l~G~~~~~----g~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~~~-- 169 (303) T protein:vir:97 97 -EKIDILKAFNEGFAKKLARGIDLMAMHGINPRT----KKASDVIGTNHFDSKVTQVVKFTESEDADANIEAAVNLIQ-- 169 (303) T ss_pred -chHHHHHHHHHHHHHHHHHHHHhhhhcccccCC----ccccccccccccccccccccccccccchHHHHHHHHHHHh-- Confidence 235677888888888888888888875542111 11111111111100 0 001001122222322221 Q ss_pred HHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEEecccccc----CC Q lcl|NC_015280. 293 ANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANV----SD 368 (455) Q Consensus 293 an~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~----s~ 368 (455) ...+..+-+|++|+....|.. +... +|. .....+.....-.|+|.| ++|+++.+.... .+ T Consensus 170 -------~~~~~~~~~vmn~~~~~~L~~---lkd~---~g~--~~~~~~~~~~~~~~~l~G-~Pv~~s~~v~~~~~~~~~ 233 (303) T protein:vir:97 170 -------GAEGVVTGLAMDTEFSTALAK---VTNG---EMG--PKMYPELAWGANPDSING-LKSSVNTTVGAGADEAES 233 (303) T ss_pred -------hcCCCccEEEEcHHHHHHHHH---hhcc---CCC--eEEecCccCCCCCceecc-eeeEEecccCCccccCCC Confidence 123455568899998888863 1111 010 000011111112356764 899987643110 11 Q ss_pred cceEEEEEecCccccceeEEcccccc--cceeecCCcc-----ccc-eeee--eeecce-eecccccccccccccCc Q lcl|NC_015280. 
369 NQYYVVGYKGTNAYDAGLFYCPYVPL--QMYRAIGQDT-----FQP-RIGF--KTRYGM-VLNPFAKGLTALSDSDP 434 (455) Q Consensus 369 ~dY~~vG~KG~~~~daglfyaPYv~l--~~~~~~Dp~s-----~qP-~~g~--~tRY~l-~~nP~~~~~~~~~~~~~ 434 (455) .+.+++| +- ...+.+...-.+ ......|++. ||- -++| ..||+. +.||-+- ..+.++.= T Consensus 234 ~~~~~~G---df--~~~~~~~~~~~~~~~~~~~~~~d~~~~~~~~~n~~~~r~~~r~~~~v~~p~af--~~l~~~~~ 303 (303) T protein:vir:97 234 KDLVIIG---DF--ESMFKWGYAKQIPMEIIKYGDPDNSGKDLKGYNQIYLRAEAYIGWGILDAKSF--ARVTKGEV 303 (303) T ss_pred ccEEEEe---ec--cccEEEEEecCcEEEEeeccCCCCcchhhhhcCcEEEEEEEEeccEeecccce--EEeeCCCC Confidence 2223222 10 111122222111 1222223331 221 2344 568887 6666333 22222221 No 98 >protein:vir:3613 Length: 272 # NCBI annotation: MHP # Family: family:all:522 # MgeID: mge:74 # MgeName: TP901-1 # Cross-refs: genbank:acc:NP_112699;genbank:gi:13786567;genbank:GeneID:921035 Probab=56.78 E-value=0.45 Score=22.49 Aligned_cols=263 Identities=12% Similarity=0.068 Sum_probs=110.2 Q ss_pred CCC-cceeeeEEEeeecCCCCccccccc------ccccccccccccccccccccCcccCCCCCCCCcccccccccccccc Q lcl|NC_015280. 99 MTG-PTGLIFAMRSRYTNQSGNEAFFDE------PDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFS 171 (455) Q Consensus 99 mTG-PTGLIFAMRsrY~~qsG~EAlfnE------a~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~ 171 (455) |.- -|- ..+---.|-|-+. ...-|++.... +.... ..+|.+-.-..+ . . T Consensus 1 ma~~~T~--------~~d~iiPev~~~~v~~~~~~~~~~~~~~~~---------~~~l~--g~~G~ti~iP~~----~-~ 56 (272) T protein:vir:36 1 MSKQKTT--------LADLVNPEVLAPIVSYELNKALRFAPLAQV---------DTTLQ--GQPGNTLKFPAF----T-Y 56 (272) T ss_pred CCCccee--------hhhhhchHHHHHHHHHHHHhhhhhcccccc---------ccccc--cCCCCEEEEeee----c-c Confidence 100 000 0000001111000 00001110000 00000 000110000000 0 1 Q ss_pred hhhhhhcCCCCCCccccceeEEEEEEEEeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhh Q lcl|NC_015280. 
172 TSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYR 251 (455) Q Consensus 172 Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~ 251 (455) ..+++.+.++..-+..++. ..+.+++-|-|+-.-++|=|. ++.-+-|.-.+..+-++..++.+++++|+..+.. T Consensus 57 ~gda~~~~eg~~i~~~~lt--~~~~~~~i~~~~k~~~vtD~~----~~~~~~d~~~~~~~~~a~~~a~~~d~~i~~~l~~ 130 (272) T protein:vir:36 57 IGDAADVAEGGEISLDKIG--TTTKSVTIKKAAKGTEITDEA----ALSGYGDPIGESNKQLGLSLANKVDDDLLSAAKT 130 (272) T ss_pred CccccccCCCCccChhhcC--CcceeEeeehhhccccccHHH----HhhccchHHHHHHHHHHHHHHHHHHHHHHHHhcc Confidence 2334445544444445554 444555556665322232221 2234678999999999999999999999987755 Q ss_pred hheeeeeeccccceeeeeeccccchhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccc Q lcl|NC_015280. 252 GAKPGAQANVANAGVFDLDVDSNGRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGIS 331 (455) Q Consensus 252 vA~~~k~~~v~~~gv~Dl~~~~~gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~ 331 (455) .... ++.. +.+ + .+-.+ +..+..+ -...++++|+|+++..|.--.-..+.+... T Consensus 131 ~~~~-----~~~~--~~~----d---~i~~A---~~~lgd~---------~~~~~~ivv~p~~~~~L~k~~~~~~~~~~~ 184 (272) T protein:vir:36 131 TSQT-----VSTK--ANV----D---GVQAA---LDIFNDE---------DAQAYVLIVNPKDAAKIRKDANAKNIGSEV 184 (272) T ss_pred cccc-----cccc--ccH----H---HHHHH---HHHhhhc---------CCCceEEEEcHHHHHHHhcccccccccccc Confidence 3321 1111 111 1 12122 2233221 234679999999999986532222221111 Q ss_pred cccccccccccCCceeEEEecCceEEEEeccccccCCcceEEEEE-ecCccccceeEEcccccccceeecCCccccceee Q lcl|NC_015280. 332 GAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQYYVVGY-KGTNAYDAGLFYCPYVPLQMYRAIGQDTFQPRIG 410 (455) Q Consensus 332 ~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~-KG~~~~daglfyaPYv~l~~~~~~Dp~s~qP~~g 410 (455) +. +.-- +-..|.+. +++|++|-...... .-|..+.+ +|.-. 
+|--.=++....| |+..|+-.+- T Consensus 185 ~~-----~~~~--~G~ig~~~-G~~Vv~s~~~p~~~-~~~~~~~~~~gA~~----~~~~~~~~vE~~R--~~~~~~d~i~ 249 (272) T protein:vir:36 185 GA-----NALI--NGTYADVL-GAQIVRSKKLAEGS-ALMFKIVSNSPALK----LVLKRGVQVETDR--DIVTKTTVIT 249 (272) T ss_pred cc-----ccee--eeccceec-CeeEEEeCCCCCCc-eeEEEEEeccccee----eeecCCccccccc--chhhcCcEEE Confidence 11 0001 12356674 48999996542211 11222222 12111 1111111111122 8888998888 Q ss_pred eeeecce-eecccccccccccccCchhhhhccchhhhhhhhhhcC Q lcl|NC_015280. 411 FKTRYGM-VLNPFAKGLTALSDSDPQAAGNLNANAYYRRVRVANL 454 (455) Q Consensus 411 ~~tRY~l-~~nP~~~~~~~~~~~~~~~~~~~~~n~y~r~~~v~~~ 454 (455) -..+||+ +.||-..- ++-.||+ T Consensus 250 ~~~~y~~~v~~~~~vv----------------------~~t~~g~ 272 (272) T protein:vir:36 250 ADEHYAAYLYDLTKVV----------------------NITFTGV 272 (272) T ss_pred EEEEEEEEEEcCccEE----------------------EEeecCC Confidence 8889998 66665321 1111222 No 99 >protein:vir:1239 Length: 274 # NCBI annotation: similar to phage B1 major head protein # Family: family:all:522 # MgeID: mge:25 # MgeName: phi ETA # Cross-refs: genbank:acc:NP_510938;genbank:gi:17426272;genbank:GeneID:927376 Probab=50.97 E-value=0.59 Score=21.82 Aligned_cols=265 Identities=9% Similarity=0.022 Sum_probs=114.6 Q ss_pred CCCcceeeeEEEeeecCCCCccccccc------ccccccccccccccccccccCcccCCCCCCCCcccccccccccccch Q lcl|NC_015280. 99 MTGPTGLIFAMRSRYTNQSGNEAFFDE------PDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFST 172 (455) Q Consensus 99 mTGPTGLIFAMRsrY~~qsG~EAlfnE------a~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~T 172 (455) |. .+. -+..+.--.|-|-+. ..--|++-..... ... ..+|.+-.-..+ . .. 
T Consensus 1 ma--~~~-----T~l~d~iiPev~~~~v~~~~~~~l~~~~~~~~d~---------~l~--g~~G~tv~iP~~----~-~i 57 (274) T protein:vir:12 1 MA--QGL-----TKTSNQIIPEVLAPMMQAQLEKKLRFASFAEVDS---------TLQ--GQPGDTLTFPAF----V-YS 57 (274) T ss_pred CC--cce-----eehhhhhchHHHHHHHHHHHHhhhhhcccceecc---------ccc--CCCCCEEEEeee----c-CC Confidence 10 000 000000000100000 0011111100000 000 000100000000 0 11 Q ss_pred hhhhhcCCCCCCccccceeEEEEEEEEeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhh Q lcl|NC_015280. 173 SEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRG 252 (455) Q Consensus 173 a~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~v 252 (455) .+++.+.++..-...++..+=.+ ++.+-|+-.=+++=| ..+.+ +-|.-.+..+-++..+..+++.+++..+.+. T Consensus 58 g~a~~~~~g~~i~~~~lt~~~~~--~~i~~~~~~~~i~D~--~~~~~--~~d~~~~~~~q~~~~~a~~vd~~~l~~~~~a 131 (274) T protein:vir:12 58 GDAQVVAEGEKIPTDILETKKRE--AKIRKIAKGTSITDE--ALLSG--YGDPQGEQVRQHGLAHANKVDNDVLEALMGA 131 (274) T ss_pred CccccccCCCccchhhcccceee--EEeeeecceeeecHH--HHHhc--ccchHHHHHHHHHHHHHHHHHHHHHHHHhcc Confidence 23344444444445555544333 333444322222211 12223 5688899999999999999999999888764 Q ss_pred heeeeeeccccceeeeeeccccchhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhccccccccccc Q lcl|NC_015280. 253 AKPGAQANVANAGVFDLDVDSNGRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISG 332 (455) Q Consensus 253 A~~~k~~~v~~~gv~Dl~~~~~gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~ 332 (455) ..... 
...+ ..+-+-..+.++..+ -..+++++++|.|++.|......+|.+..++ T Consensus 132 ~~~~~------~~a~----------~~d~i~dA~~~lgd~---------~~~~~~ivv~p~~~~~L~k~~~~~fv~~s~~ 186 (274) T protein:vir:12 132 KLTVN------ADIT----------KLNGLQSAIDKFNDE---------DLEPMVLFINPLDAGKLRGDASTNFTRATEL 186 (274) T ss_pred ccccc------cccc----------CHHHHHHHHHHhccc---------cccccEEEeCHHHHHHHHhhhhhhccccccc Confidence 33211 1111 112222233333322 1367899999999999987655455444332 Q ss_pred ccccccccccCCceeEEEecCceEEEEeccccccCCcceEEEEEe-cCccccceeEEcccccccceeec-CCccccceee Q lcl|NC_015280. 333 AVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQYYVVGYK-GTNAYDAGLFYCPYVPLQMYRAI-GQDTFQPRIG 410 (455) Q Consensus 333 ~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~K-G~~~~daglfyaPYv~l~~~~~~-Dp~s~qP~~g 410 (455) ..+ ...+-..|++. +++||+|.. -|. |-.+-++ |.-. ||. --+.. ...- ||..++-.+- T Consensus 187 g~~------~~~~G~ig~~~-G~~Vi~s~~----~p~-~t~~l~~~gA~~-----~~~-~~~~~-vE~~Rd~~~~~d~i~ 247 (274) T protein:vir:12 187 GDD------IIVKGAFGEAL-GAIIVRSNK----LEA-GTAILAKKGAVK-----LIL-KRDFF-LEVARDASTKTTALY 247 (274) T ss_pred ccc------ceecccceeec-CeeEEEeCC----CCc-ceEEEEecccee-----eee-cCCce-eccccchhhcccEEE Confidence 211 11223467774 689999953 332 2222222 2111 111 11111 2222 8889999999 Q ss_pred eeeecce-eecccccccccccccCchhhh Q lcl|NC_015280. 411 FKTRYGM-VLNPFAKGLTALSDSDPQAAG 438 (455) Q Consensus 411 ~~tRY~l-~~nP~~~~~~~~~~~~~~~~~ 438 (455) ..-+||. 
..||--.- .+..+...-.+ T Consensus 248 ~~~~y~~~~~~~~~vv--~~t~~~~~~~~ 274 (274) T protein:vir:12 248 SDKHYVAYLYDESKAV--KITKGSGSLEM 274 (274) T ss_pred eeeEEEEEEEcCCceE--EEEcCCccccC Confidence 9999997 56663321 11111111111 No 100 >protein:vir:94673 Length: 419 # NCBI annotation: major capsid protein # Family: family:all:585 # MgeID: mge:1527 # MgeName: mu1/6 # Cross-refs: genbank:acc:YP_579208;genbank:gi:93007444;genbank:GeneID:5076792 Probab=50.73 E-value=0.6 Score=21.79 Aligned_cols=340 Identities=11% Similarity=-0.007 Sum_probs=118.5 Q ss_pred CcchHHHHHHhhHhh---------------------cCCC-CccccchhhHHHHHHHhh-----hHHHHHHH-----HHH Q lcl|NC_015280. 1 MYNAENLQEKWAPVL---------------------NHEG-LNDIKDPYRKSVTAILLE-----NQERALAE-----ERA 48 (455) Q Consensus 1 m~~~~~~~~kw~~~l---------------------~~~~-~~~i~~~~~~~v~~~~~e-----nq~~~~~e-----~~~ 48 (455) ....+.+.++...-. +... ..+......+.+-....+ +......+ +.. T Consensus 32 ~~e~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 111 (419) T protein:vir:94 32 VAEARGLADALQAESDRAAARAALLRTAPPAPKGPADGGTPLTPAEAGTFRSLAQRFADSDGLREYRARDKRGQFQVEMR 111 (419) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhccccccccccccchhhhhhhHHHHHHHHHhhhhhhhhHHHH Confidence 000001111111100 0000 000000011111111000 00000000 000 Q ss_pred hhhhhhhchhhhccccccccccccccchhhhHHHHH--HhhhhhhheeeeccCCCcceeeeEEEeeecCCCCcccccccc Q lcl|NC_015280. 49 VLTEAPTNVGPINTPTTSSGAVAGFDPILISLIRRA--MPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEP 126 (455) Q Consensus 49 ~l~ea~~~~~~~~~~st~tg~i~~~~P~Lv~l~RRa--~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa 126 (455) .+.+.........+ ++++.+....-|.+++=.... ...+...+++.+.||++++.-+ +| . 
T Consensus 112 ~~~~~~~~~~~~~~-~~~~~~~~~~~p~~~~~~i~~~~~~~~~i~~~~~~~~~~~~~~~~--~~--~------------- 173 (419) T protein:vir:94 112 DIDPNRLLSRDAPA-GTITNPNVPHLPQLVPGIVPTTPDLPLLVADLLDQQNADYNVLEY--IR--D------------- 173 (419) T ss_pred HHHHHHhhcccccc-ccccCCcccccchhhhHHHHHHHhhhhhhhhcceeeeccCCceee--ee--e------------- Confidence 00000000000111 111222222233333321111 1233557899999998765322 32 0 Q ss_pred cccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeeccccc Q lcl|NC_015280. 127 DAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALR 206 (455) Q Consensus 127 ~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLK 206 (455) +++.... ... .. .+-.+++ +..+++...++++++..+|.=+-. T Consensus 174 -~~~~~~~-------~~~-------~~--------------~a~~v~E--------g~~~~~~~~~~~~i~~~~~k~~~~ 216 (419) T protein:vir:94 174 -TSGTAGA-------GST-------WN--------------KAAVVPE--------GTAKPQSTLSFDTITTTLKTVAHW 216 (419) T ss_pred -ccccccc-------ccc-------Cc--------------ccceecC--------CccccccccceeeEEeeeeeEEEe Confidence 0000000 000 00 0001111 122444555556666666666666 Q ss_pred cceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeecc-ccceeeeeecc-----ccchhhHH Q lcl|NC_015280. 207 ADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANV-ANAGVFDLDVD-----SNGRWSVE 280 (455) Q Consensus 207 AEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v-~~~gv~Dl~~~-----~~gr~~ve 280 (455) ..+|-||.||.- +.+++|.+-|+..|...+|+.||.- .-.++-.|+ ...|+.-.... ..+--..+ T Consensus 217 ~~is~ell~d~~-----~l~~~i~~~la~a~~~~~d~aii~G----~G~~~p~Gi~~~~~~~~~~~~~~~~~~t~~~~~~ 287 (419) T protein:vir:94 217 LPITRQAADDNS-----QLMGYIQGRLTYGLRFLRDRQLLNG----NGSTEMQGILTTPGIGTYQQPKPTAPATDEPPLV 287 (419) T ss_pred ehhhHHHHHhHH-----HHHHHHHHHHHHHHHHHHHHHHHhc----cCcccccceecccccccccccccccccccchhHH Confidence 789999999952 3689999999999999999999741 000111111 01111100000 00000111 Q ss_pred HHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEEe Q lcl|NC_015280. 
281 KFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYID 360 (455) Q Consensus 281 ~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D 360 (455) ....+.+.+. ..-+..+.+||++.....|... .+ ... .... ...+.. ....++|. +++|+++ T Consensus 288 ~l~~~~~~~~---------~~~~~~~~~v~n~~~~~~l~~~--k~-~~~-~~~~---~~~~~~-~~~~~~l~-G~pV~~~ 349 (419) T protein:vir:94 288 DIRRAKTVAE---------IAGFPPDGVVVHPQDWESIELD--QA-PGS-GVFR---VIANVQ-GEATPRIW-GLNVVST 349 (419) T ss_pred HHHHHHHhhh---------hccCCCCEEEEcHHHHHHHHHH--hh-cCC-Ccee---ecCCcc-cCCCcccc-ceeeEEc Confidence 2222222222 1234566799999988887643 11 100 0000 000000 11123454 5788888 Q ss_pred ccccccCCcceEEEE-EecCccccceeEEcccccccceeecCCc------cccceeeeeeecce-eeccccccccccccc Q lcl|NC_015280. 361 PYSANVSDNQYYVVG-YKGTNAYDAGLFYCPYVPLQMYRAIGQD------TFQPRIGFKTRYGM-VLNPFAKGLTALSDS 432 (455) Q Consensus 361 ~y~~~~s~~dY~~vG-~KG~~~~daglfyaPYv~l~~~~~~Dp~------s~qP~~g~~tRY~l-~~nP~~~~~~~~~~~ 432 (455) .... ..+ +++| ++- +|--+.-..+...+++. .-+-.+=+..|++. +.+|-.--.-++. . T Consensus 350 ~~~~---~~~-~~~gd~~~--------~~~~~~~~~~~v~~~~~~~~~~~~~~~~~r~~~r~d~~v~~~~a~~~~~~~-a 416 (419) T protein:vir:94 350 VAIA---QGT-ALVGGFRQ--------GATLWSRQGITVLMTDSHADFFTANTLVILAEFRANLAVYQPKAFVRVTFA-A 416 (419) T ss_pred CCCC---Ccc-EEEeeccc--------eEEEEEecceEEEEeccccchhhcCcEEEEEEEeeccEEeccccEEEEEec-c Confidence 6542 122 3333 110 01001001111111221 12233445567776 4445332111111 0 Q ss_pred Cch Q lcl|NC_015280. 433 DPQ 435 (455) Q Consensus 433 ~~~ 435 (455) -+. 
T Consensus 417 a~~ 419 (419) T protein:vir:94 417 ATT 419 (419) T ss_pred CCC Confidence 110 No 101 >protein:vir:97148 Length: 324 # NCBI annotation: ORF010 # Family: family:all:507 # MgeID: mge:1654 # MgeName: 85 # Cross-refs: genbank:acc:YP_239726;genbank:gi:66394880;genbank:GeneID:5130881 Probab=48.46 E-value=0.67 Score=21.54 Aligned_cols=299 Identities=10% Similarity=0.040 Sum_probs=119.6 Q ss_pred HHHHhhhHHHHHHHHHHhhhhhhhchh---hhccccccccccccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeE Q lcl|NC_015280. 32 TAILLENQERALAEERAVLTEAPTNVG---PINTPTTSSGAVAGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFA 108 (455) Q Consensus 32 ~~~~~enq~~~~~e~~~~l~ea~~~~~---~~~~~st~tg~i~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFA 108 (455) |. ++| .++++.+.+........ .....++++++..--....-.+++.+....+..+++-+.||++.+--|-- T Consensus 1 ~~---~~~--~~~~~~~~f~~~~~~~~~~~a~~~~~~~~~~~~iP~~~~~~ii~~~~~~s~l~~~~~~~~~~~~~~~ip~ 75 (324) T protein:vir:97 1 ME---QTQ--KLKLNLQHFASNNVKPQVFNPDNVMMHEKKDGTLMNEFTTPILQEVMENSKIMQLGKYEPMEGTEKKFTF 75 (324) T ss_pred Cc---cch--hHHHHHHHHHHhhhhhhhhccccccccCCCcceechhHHHHHHHHHHhhcchhhhcceeeccCCceEEEE Confidence 11 111 11122222111111111 11111222233221122223356666677788899999999887633211 Q ss_pred EEeeecCCCCcccccccccccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCcccc Q lcl|NC_015280. 109 MRSRYTNQSGNEAFFDEPDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFME 188 (455) Q Consensus 109 MRsrY~~qsG~EAlfnEa~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~E 188 (455) +.. +.+ +-..++.+ .+++ T Consensus 76 ----~~~--~~~------------------------------------------------a~~v~Eg~--------~~~~ 93 (324) T protein:vir:97 76 ----WAD--KPG------------------------------------------------AYWVGEGQ--------KIET 93 (324) T ss_pred ----Eec--Ccc------------------------------------------------eeEeccCc--------cccc Confidence 100 000 00011111 1223 Q ss_pred ceeEEEEEEEEeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeee Q lcl|NC_015280. 
189 MAFSIDKIAVEAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFD 268 (455) Q Consensus 189 MaFsIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~D 268 (455) ...++++++.++|.=+--..+|-||.+|-. .|.+++|.+-|+..|...+++.||.---.. ....|++. T Consensus 94 ~~~~f~~v~~~~~k~~~~~~is~ell~ds~----~~l~~~i~~~l~~aia~~~d~a~l~G~g~~--------~~~~gi~~ 161 (324) T protein:vir:97 94 SKATWVNATMRAFKLGVILPVTKEFLNYTY----SQFFEEMKPMIAEAFYKKFDEAGILNQGNN--------PFGKSIAQ 161 (324) T ss_pred cccceeEEEEeeEEEEEeehhhHHHHhcch----HHHHHHHHHHHHHHHHHHHHHHhhccCCCC--------ccCccccc Confidence 333444444444444445559999999863 578999999999999999999988632111 11112221 Q ss_pred eecc----ccchhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCC Q lcl|NC_015280. 269 LDVD----SNGRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTG 344 (455) Q Consensus 269 l~~~----~~gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~ 344 (455) .... ..+-...+....+...+.. -- .....+|+|+.....|... ... +|..- .. +.+ T Consensus 162 ~~~~~~~~~~~~~~~~~i~~~~~~l~~--------~~-~~~~~~v~n~~~~~~L~~l---kd~---~g~~~--~~-~~~- 222 (324) T protein:vir:97 162 SIEKTNKVIKGDFTQDNIIDLEALLED--------DE-LEANAFISKTQNRSLLRKI---VDP---ETKER--IY-DRN- 222 (324) T ss_pred cccccceeccccCCHHHHHHHHHhhhh--------cc-CCCCEEEEcHHHHHHHHHh---hcC---CCcee--ec-CCC- Confidence 1000 0011112233334333332 12 2334578999999888852 211 11111 10 111 Q ss_pred ceeEEEecCceEEEEeccccccCCcceEEEEEecCccccceeEEcccccccceeecCC--------cc------cc---c Q lcl|NC_015280. 345 NTFVGTLNGRFKVYIDPYSANVSDNQYYVVGYKGTNAYDAGLFYCPYVPLQMYRAIGQ--------DT------FQ---P 407 (455) Q Consensus 345 ~~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~KG~~~~daglfyaPYv~l~~~~~~Dp--------~s------~q---P 407 (455) .|+|. +++|++.+-.. .+...+++|-. +.+++...-.+ -.+..|. .. 
|| = T Consensus 223 ---~~tl~-G~PV~~~~~~~--~~~~~~~~gd~------~~~~i~~~~~~-~i~~~~~~~~~~~~~~~~~~~~~f~~d~~ 289 (324) T protein:vir:97 223 ---SDTLD-GLPVVNLKSSN--LKRGELITGDF------DKLIYGIPQLI-EYKIDETAQLSTVKNEDGTPVNLFEQDMV 289 (324) T ss_pred ---Ccccc-ceeeEeecCCC--CCcceEEEEec------ccEEEEEecCc-EEEEeecccccccccccccchhhhhcCcE Confidence 23453 45777654321 12223333311 01111111110 0111111 01 11 2 Q ss_pred eeeeeeecce-eecccccccccccccCchhhhhccch Q lcl|NC_015280. 408 RIGFKTRYGM-VLNPFAKGLTALSDSDPQAAGNLNAN 443 (455) Q Consensus 408 ~~g~~tRY~l-~~nP~~~~~~~~~~~~~~~~~~~~~n 443 (455) .+=+..||+. ..||-+-..=+..+.. .....++- T Consensus 290 ~~r~~~r~d~~v~~~~a~~~l~~~~~~--~~~~~~~~ 324 (324) T protein:vir:97 290 ALRATMHVALHIADDKAFAKLVPADKK--TDSVPGEV 324 (324) T ss_pred EEEEEEEeccEEecccceEEEEeccCC--CCCCCCCC Confidence 2223467776 5555443111111100 00011111 No 102 >protein:vir:94494 Length: 274 # NCBI annotation: ORF015 # Family: family:all:522 # MgeID: mge:1508 # MgeName: 88 # Cross-refs: genbank:acc:YP_240676;genbank:gi:66396348;genbank:GeneID:5133758 Probab=47.49 E-value=0.7 Score=21.43 Aligned_cols=266 Identities=9% Similarity=0.023 Sum_probs=114.0 Q ss_pred CcceeeeEEEeeecCCCCccccccc------ccccccccccccccccccccCcccCCCCCCCCcccccccccccccchhh Q lcl|NC_015280. 101 GPTGLIFAMRSRYTNQSGNEAFFDE------PDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSE 174 (455) Q Consensus 101 GPTGLIFAMRsrY~~qsG~EAlfnE------a~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~ 174 (455) ++.+. -+..+.--.|-+-+. ...-|++-.... .... ..+|.+-.-..+ . .+.+ T Consensus 1 ma~~~-----T~~~d~iiPev~~~~v~~~~~~~l~~~~~~~~d---------~~l~--g~~G~tv~iP~~----~-~~g~ 59 (274) T protein:vir:94 1 MPQGL-----TKTSDQIIPEVLAPMMQAQLEKKLRFASFAEVD---------STLQ--GQPGDTLTFPAF----V-YSGD 59 (274) T ss_pred CCccc-----eehhheechHHHHHHHHHhhhhhhhhcccceec---------cccc--CCCCCEEEEeee----c-CCCc Confidence 00000 000000000111000 001111110000 0000 000000000000 0 1223 Q ss_pred hhhcCCCCCCccccceeEEEEEEEEeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhhe Q lcl|NC_015280. 
175 QEALGDGASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAK 254 (455) Q Consensus 175 aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~ 254 (455) +|.+.++...+..++.++ +.+++.+-|+-.=+++=| ..+.+ +-|.-.+..+-++..|+.+++.+++..+.+.+. T Consensus 60 a~~~~~g~~i~~~~lt~~--~~~~~i~~~~~~~~i~D~--~~~~~--~~dp~~~~~~~~a~a~a~~vd~~~~~~l~~a~~ 133 (274) T protein:vir:94 60 AQVVAEGEKIPTDILETK--KREAKIRKIAKGTSITDE--ALLSG--YGDPQGEQVRQHGLAHANKVDNDVLEALMGAKL 133 (274) T ss_pred cccccCCCcccccccccc--eeEEEeeeecceecccHH--HHHhc--cchHHHHHHHHHHHHHHHHHHHHHHHHHhccCc Confidence 444444444445555433 344444555522222222 22223 467888999999999999999999988866543 Q ss_pred eeeeeccccceeeeeeccccchhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhccccccccccccc Q lcl|NC_015280. 255 PGAQANVANAGVFDLDVDSNGRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAV 334 (455) Q Consensus 255 ~~k~~~v~~~gv~Dl~~~~~gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~ 334 (455) .. +...++ .+-+-..+.++..+ -..+++++|+|.+++.|......+|.+...... T Consensus 134 ~~------~~~~~~----------~d~i~dA~~~l~d~---------~~~~~~ivv~p~~~~~L~k~~~~~f~~~s~~g~ 188 (274) T protein:vir:94 134 TV------NADITK----------LNGLQSAIDKFNDE---------DLEPMVLFVNPLDAGKLRGDASTNFTRATELGD 188 (274) T ss_pred cc------cccccC----------HHHHHHHHHHhhcc---------CCCceEEEeCHHHHHHHHhhhhhhccccCcccc Confidence 21 111111 22232333344322 236789999999999998754444433322111 Q ss_pred ccccccccCCceeEEEecCceEEEEeccccccCCcceEEEEEecCccccceeEEcccccccceeec-CCccccceeeeee Q lcl|NC_015280. 335 GGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQYYVVGYKGTNAYDAGLFYCPYVPLQMYRAI-GQDTFQPRIGFKT 413 (455) Q Consensus 335 ~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~KG~~~~daglfyaPYv~l~~~~~~-Dp~s~qP~~g~~t 413 (455) . ..-+-..|.+. |++||+|.. -|. |-.+-++ -+.+-|.---+. 
....- ||..+.-.+-..- T Consensus 189 ~------~~~~G~ig~~~-G~~Vi~s~~----~p~-~t~~l~~-----~gA~~~~~~~~~-~vE~~Rd~~~~~d~i~~~~ 250 (274) T protein:vir:94 189 D------IIVKGAFGEAL-GAIIVRTNK----LEA-GTAILAK-----KGAVKLILKRDF-FLEVARDASTKTTALYSDK 250 (274) T ss_pred c------ceeccccceec-CeeEEEcCC----CCc-ceEEEEe-----CcceEeeecCCc-eeccccchhhcccEEEEEE Confidence 0 11122467774 679999954 342 3222222 111222111111 12222 8889999999999 Q ss_pred ecce-eecccccccccccccCchhhh Q lcl|NC_015280. 414 RYGM-VLNPFAKGLTALSDSDPQAAG 438 (455) Q Consensus 414 RY~l-~~nP~~~~~~~~~~~~~~~~~ 438 (455) +||+ ..||--.- .+......-.+ T Consensus 251 ~y~~~~~~~~~vv--~~t~~~~~~~~ 274 (274) T protein:vir:94 251 HYVAYLYDESKAV--KITKGSGSLEM 274 (274) T ss_pred EEEEEEEcCCceE--EEecCcccccC Confidence 9999 66663221 11111110010 No 103 >protein:vir:97433 Length: 274 # NCBI annotation: ORF014 # Family: family:all:522 # MgeID: mge:1676 # MgeName: 92 # Cross-refs: genbank:acc:YP_240749;genbank:gi:66396420;genbank:GeneID:5133789 Probab=47.49 E-value=0.7 Score=21.43 Aligned_cols=266 Identities=9% Similarity=0.023 Sum_probs=114.0 Q ss_pred CcceeeeEEEeeecCCCCccccccc------ccccccccccccccccccccCcccCCCCCCCCcccccccccccccchhh Q lcl|NC_015280. 101 GPTGLIFAMRSRYTNQSGNEAFFDE------PDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSE 174 (455) Q Consensus 101 GPTGLIFAMRsrY~~qsG~EAlfnE------a~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~ 174 (455) ++.+. -+..+.--.|-+-+. ...-|++-.... .... ..+|.+-.-..+ . .+.+ T Consensus 1 ma~~~-----T~~~d~iiPev~~~~v~~~~~~~l~~~~~~~~d---------~~l~--g~~G~tv~iP~~----~-~~g~ 59 (274) T protein:vir:97 1 MPQGL-----TKTSDQIIPEVLAPMMQAQLEKKLRFASFAEVD---------STLQ--GQPGDTLTFPAF----V-YSGD 59 (274) T ss_pred CCccc-----eehhheechHHHHHHHHHhhhhhhhhcccceec---------cccc--CCCCCEEEEeee----c-CCCc Confidence 00000 000000000111000 001111110000 0000 000000000000 0 1223 Q ss_pred hhhcCCCCCCccccceeEEEEEEEEeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhhe Q lcl|NC_015280. 
175 QEALGDGASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAK 254 (455) Q Consensus 175 aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~ 254 (455) +|.+.++...+..++.++ +.+++.+-|+-.=+++=| ..+.+ +-|.-.+..+-++..|+.+++.+++..+.+.+. T Consensus 60 a~~~~~g~~i~~~~lt~~--~~~~~i~~~~~~~~i~D~--~~~~~--~~dp~~~~~~~~a~a~a~~vd~~~~~~l~~a~~ 133 (274) T protein:vir:97 60 AQVVAEGEKIPTDILETK--KREAKIRKIAKGTSITDE--ALLSG--YGDPQGEQVRQHGLAHANKVDNDVLEALMGAKL 133 (274) T ss_pred cccccCCCcccccccccc--eeEEEeeeecceecccHH--HHHhc--cchHHHHHHHHHHHHHHHHHHHHHHHHHhccCc Confidence 444444444445555433 344444555522222222 22223 467888999999999999999999988866543 Q ss_pred eeeeeccccceeeeeeccccchhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhccccccccccccc Q lcl|NC_015280. 255 PGAQANVANAGVFDLDVDSNGRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAV 334 (455) Q Consensus 255 ~~k~~~v~~~gv~Dl~~~~~gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~ 334 (455) .. +...++ .+-+-..+.++..+ -..+++++|+|.+++.|......+|.+...... T Consensus 134 ~~------~~~~~~----------~d~i~dA~~~l~d~---------~~~~~~ivv~p~~~~~L~k~~~~~f~~~s~~g~ 188 (274) T protein:vir:97 134 TV------NADITK----------LNGLQSAIDKFNDE---------DLEPMVLFVNPLDAGKLRGDASTNFTRATELGD 188 (274) T ss_pred cc------cccccC----------HHHHHHHHHHhhcc---------CCCceEEEeCHHHHHHHHhhhhhhccccCcccc Confidence 21 111111 22232333344322 236789999999999998754444433322111 Q ss_pred ccccccccCCceeEEEecCceEEEEeccccccCCcceEEEEEecCccccceeEEcccccccceeec-CCccccceeeeee Q lcl|NC_015280. 335 GGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQYYVVGYKGTNAYDAGLFYCPYVPLQMYRAI-GQDTFQPRIGFKT 413 (455) Q Consensus 335 ~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~KG~~~~daglfyaPYv~l~~~~~~-Dp~s~qP~~g~~t 413 (455) . ..-+-..|.+. |++||+|.. -|. |-.+-++ -+.+-|.---+. 
....- ||..+.-.+-..- T Consensus 189 ~------~~~~G~ig~~~-G~~Vi~s~~----~p~-~t~~l~~-----~gA~~~~~~~~~-~vE~~Rd~~~~~d~i~~~~ 250 (274) T protein:vir:97 189 D------IIVKGAFGEAL-GAIIVRTNK----LEA-GTAILAK-----KGAVKLILKRDF-FLEVARDASTKTTALYSDK 250 (274) T ss_pred c------ceeccccceec-CeeEEEcCC----CCc-ceEEEEe-----CcceEeeecCCc-eeccccchhhcccEEEEEE Confidence 0 11122467774 679999954 342 3222222 111222111111 12222 8889999999999 Q ss_pred ecce-eecccccccccccccCchhhh Q lcl|NC_015280. 414 RYGM-VLNPFAKGLTALSDSDPQAAG 438 (455) Q Consensus 414 RY~l-~~nP~~~~~~~~~~~~~~~~~ 438 (455) +||+ ..||--.- .+......-.+ T Consensus 251 ~y~~~~~~~~~vv--~~t~~~~~~~~ 274 (274) T protein:vir:97 251 HYVAYLYDESKAV--KITKGSGSLEM 274 (274) T ss_pred EEEEEEEcCCceE--EEecCcccccC Confidence 9999 66663221 11111110010 No 104 >protein:vir:1328 Length: 392 # NCBI annotation: gp36 # Family: family:all:21 # MgeID: mge:28 # MgeName: phi-C31 # Cross-refs: genbank:acc:NP_047927;swissprot:trembl:q9zwv6;genbank:gi:9631145;uniprot:Q9ZWV6;genbank:GeneID:2715889 Probab=43.95 E-value=0.82 Score=21.04 Aligned_cols=330 Identities=14% Similarity=0.069 Sum_probs=113.1 Q ss_pred Ccch----------HHHHHHhhHhhcCCCCccccch------hhHHHHHH---HhhhHHHHHHHH--------------- Q lcl|NC_015280. 1 MYNA----------ENLQEKWAPVLNHEGLNDIKDP------YRKSVTAI---LLENQERALAEE--------------- 46 (455) Q Consensus 1 m~~~----------~~~~~kw~~~l~~~~~~~i~~~------~~~~v~~~---~~enq~~~~~e~--------------- 46 (455) |.=+ +++++...-.-+.+-..|+... ..+.+-.+ .+|.++....++ T Consensus 4 ~~l~~l~e~r~~~~~e~~~l~~~~~~~~~~~e~~~~~~~l~~e~~~l~~~i~~~~e~~~~~~~~~~~~~~~~~~~~~~~~ 83 (392) T protein:vir:13 4 TTLSANFEARERATAELRSLTDEFAGKEMTAEAREKEERLLTAVADFDGRIKRGIDAIKATDAVTSLLSGLQGSGSGAQR 83 (392) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHhhcccccHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcccCCcccchhh Confidence 2211 1122211111111111111110 01111111 111111100000 Q ss_pred ------HHhh-----hhhhh-chhhhccccccccccccccch-hhhHHHHHHh-hhhhhheeeeccCCCcceeeeEEEee Q lcl|NC_015280. 
47 ------RAVL-----TEAPT-NVGPINTPTTSSGAVAGFDPI-LISLIRRAMP-KLIAYDIAGVQPMTGPTGLIFAMRSR 112 (455) Q Consensus 47 ------~~~l-----~ea~~-~~~~~~~~st~tg~i~~~~P~-Lv~l~RRa~p-~LIa~DI~GVQPmTGPTGLIFAMRsr 112 (455) .+.+ .|... ....-...+|++++-...-|. .-.++.+... ..+..+++-|=|+++...+-+-.. T Consensus 84 ~~~~~~~~~~r~g~~~~~~~~~~~~~~~~~t~~~~g~~~~~~~~~~~i~~~~~~~~~l~~~~~~~~~~~~~~~~~~~~-- 161 (392) T protein:vir:13 84 SADHDDDAVLRAGNLGEARSFEFAPEKRDGTKAGNPNVLSRTLYGQLIAQAVERSAIMRGGASTFTTSDANPMDFTVI-- 161 (392) T ss_pred hhhHHHHHHHhccchhhhHHHHhhhhhhcccccCCCccccccchHHHHHHHHhhhhhhhhcceeeecCCCceeEEEEE-- Confidence 0000 00000 000000111222211111111 1111222222 123334444433332221111100 Q ss_pred ecCCCCcccccccccccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeE Q lcl|NC_015280. 113 YTNQSGNEAFFDEPDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFS 192 (455) Q Consensus 113 Y~~qsG~EAlfnEa~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFs 192 (455) . +. ..+...++.+ .++|-..+ T Consensus 162 -~-----------------------------------------~~---------~~a~~v~E~~--------~~~~~~~~ 182 (392) T protein:vir:13 162 -T-----------------------------------------GR---------ATAGIVGETA--------EIPESYPA 182 (392) T ss_pred -c-----------------------------------------CC---------cceeeecccc--------cccccccc Confidence 0 00 0001111222 23334444 Q ss_pred EEEEEEEeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeecc Q lcl|NC_015280. 193 IDKIAVEAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDVD 272 (455) Q Consensus 193 IEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~~ 272 (455) +++++...+.-+-...+|-||.+|= ..|.++.|.+-|...|..-+|..||.- .-.+ ...|++-.... 
T Consensus 183 f~~v~~~~~k~~~~~~iS~ell~ds----~~~l~~~i~~~l~~~i~~~~d~~~l~G----~Gt~-----~p~Gil~~~~~ 249 (392) T protein:vir:13 183 TTQRSMGGFKYGFASVVSYEFATDQ----VLDLVGFLVSDAGPAIGDAMGRHFLTG----TGTG-----QPRGILTDATG 249 (392) T ss_pred eeeEEeeeeeEEeeehhHHHHHhcc----hHHHHHHHHHHHHHHHHHHHHHHHhcc----cCCc-----ccccccccccc Confidence 4555555555555667899999983 357888999999999999998888742 0001 11222211100 Q ss_pred c--------cchhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCC Q lcl|NC_015280. 273 S--------NGRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTG 344 (455) Q Consensus 273 ~--------~gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~ 344 (455) . .+.-..+....|.+.+.. . -+..+- .|+++.....|.. +... +|.. ....+.+. T Consensus 250 ~~~~~~~~~~~~~~~d~l~~~~~~l~~-------~-~~~~a~-~v~n~~~~~~l~~---lkd~---~G~~--l~~~~~~~ 312 (392) T protein:vir:13 250 ANAAFGEADADSKVSDALIDLFHEVPS-------A-YRKNAK-FVVNDLRAAQMRK---LKDA---NGQY--LWQSALTV 312 (392) T ss_pred ccccccccccccccHHHHHHHHHhhhh-------h-hhcCCE-EEEcHHHHHHHHH---hhcc---CCce--eecCCcCC Confidence 0 000011122233333221 1 233333 5778888777764 2211 1111 11111111 Q ss_pred ceeEEEecCceEEEEeccccccCCcceEEEEEecCccccceeEEcccccccceeecCCccc--cceeeeeeecce-eecc Q lcl|NC_015280. 345 NTFVGTLNGRFKVYIDPYSANVSDNQYYVVGYKGTNAYDAGLFYCPYVPLQMYRAIGQDTF--QPRIGFKTRYGM-VLNP 421 (455) Q Consensus 345 ~~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~KG~~~~daglfyaPYv~l~~~~~~Dp~s~--qP~~g~~tRY~l-~~nP 421 (455) .. -++|. +++|+++.+. |.+-|++|-- +. .++.---.+...+..|+..- |-.+-...|.+. +.|| T Consensus 313 g~-~~~l~-G~Pv~~~~~~----~~~~i~~Gdf--~~----~~i~~~~~~~i~~~~~~~~~~~~~~~r~~~r~d~~~~~~ 380 (392) T protein:vir:13 313 GA-PDTFN-GKVVETDDGM----PADKVLFADL--SK----YRVRFAGSLRVDRSVDAKFSTDQIVYRFLQRADGLLVDA 380 (392) T ss_pred CC-Cceec-ceeeEEcCCC----CCCcEEEeec--cc----eeEEeecceEEEeeccccccCCcEEEEEEEEeccEEecc Confidence 11 13554 5899998764 4343444321 00 11111111122222233322 223334456655 6666 Q ss_pred cccccccccccC Q lcl|NC_015280. 
422 FAKGLTALSDSD 433 (455) Q Consensus 422 ~~~~~~~~~~~~ 433 (455) -+--.-++..+. T Consensus 381 ~A~~~~~~~~aa 392 (392) T protein:vir:13 381 RGAKVLTVTPAA 392 (392) T ss_pred cceEEEEeeccC Confidence 555333333222 No 105 >protein:vir:1025 Length: 408 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:20 # MgeName: bIL286 # Cross-refs: genbank:acc:NP_076679;genbank:gi:13095788;genbank:GeneID:920362 Probab=39.07 E-value=1 Score=20.50 Aligned_cols=319 Identities=11% Similarity=0.049 Sum_probs=121.9 Q ss_pred CcchHHHHHHhhHhhcCCCCccccchhhHHHHHHHhhhHHHH------------------HH-HHHHhhhhhhhchh--- Q lcl|NC_015280. 1 MYNAENLQEKWAPVLNHEGLNDIKDPYRKSVTAILLENQERA------------------LA-EERAVLTEAPTNVG--- 58 (455) Q Consensus 1 m~~~~~~~~kw~~~l~~~~~~~i~~~~~~~v~~~~~enq~~~------------------~~-e~~~~l~ea~~~~~--- 58 (455) +-..+++.+++..+... .+++-.++-+.+.+. .. +..+.+.+...+.. T Consensus 39 ~ee~~~~~~~~~~~~~~----------~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 108 (408) T protein:vir:10 39 AEAMSELKNKRDNEKVR----------RDALREQLVEAQAEQVVNMREEEKGPLNKSENELKDKFVKDFVNMVRNPMAFM 108 (408) T ss_pred HHHHHHHHHHHHHHHHH----------HHHHHHHHHHHHHHHHhccccccccccccchhhhHHHHHHHHHHHhhcchhhh Confidence 11112233333222111 000111111110000 00 00111111111110 Q ss_pred ------hhccccccccccccccchh-hhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCccccccccccccc Q lcl|NC_015280. 59 ------PINTPTTSSGAVAGFDPIL-ISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPDAQFS 131 (455) Q Consensus 59 ------~~~~~st~tg~i~~~~P~L-v~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~t~fS 131 (455) .+...+...|... 
.-+.+ -.+++.+.......+++.+.||+++.|-+--.+ ..+ .+ T Consensus 109 ~~~~~~a~~~~t~~~gg~~-vP~~~~~~Ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~--~~~--------------~~ 171 (408) T protein:vir:10 109 NTVSSKTETSGSDSAAGLT-IPQDIRTMINTLVRQYDSLQQYVRVESVSTSNGSRVYEK--WTD--------------VT 171 (408) T ss_pred hhhhhhhhhcccccCCcee-ccHhHHHHHHHHHHhhchhhhhcceeeccCCcceEEEee--ccc--------------cc Confidence 1111111122111 11111 224454555666789999999999998765444 000 00 Q ss_pred ccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeeccccccceeH Q lcl|NC_015280. 132 GTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRADYSV 211 (455) Q Consensus 132 g~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTi 211 (455) + .+-..++.+..++.....|.++.|+..|..+- ..+|- T Consensus 172 ~-----------------------------------~a~~v~E~~~~~~~~~~~~~~i~~~~~k~~~~-------~~iS~ 209 (408) T protein:vir:10 172 P-----------------------------------LTVMDAEDGKIPDLDNPQLTIIKYLIKRYAGI-------ITATN 209 (408) T ss_pred c-----------------------------------ceeeecCccccccccCcceeeEEeeeeeEEee-------ehhHH Confidence 0 00001111111111222366666666665544 55999 Q ss_pred HHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeeccccchhhHHHHHHHHHHHHH Q lcl|NC_015280. 212 ELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDVDSNGRWSVEKFKGLLFQIER 291 (455) Q Consensus 212 ELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~~~~gr~~ve~~k~l~~qi~~ 291 (455) ||.+|- .+|.++.|.+-|+..|..-+|+.||.-.-+.. ...++.++ +....+++... 
T Consensus 210 ell~ds----~~~l~~~i~~~l~~~~~~~~~~~il~g~g~~~--------~~~~~~~~----------~~l~~~~~~~~- 266 (408) T protein:vir:10 210 TSLKDT----AENILAWLSSWIAKKVVVTRNQAIIEVMKAAP--------KKPTIAKF----------DDVITMINTAV- 266 (408) T ss_pred HHHhhc----hHHHHHHHHHHHHHHHHHHHHHHHhhcccccc--------cccccccH----------HHHHHHHHHhh- Confidence 999984 35778889999999998888888775432221 11222211 11222221111 Q ss_pred HHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEEeccccccCCcce Q lcl|NC_015280. 292 DANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQY 371 (455) Q Consensus 292 ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY 371 (455) ..--+..+ -+|||+.....|... ... +|..- ...+.+.. ..++|. +++|++-.. T Consensus 267 ------~~~~~~~a-~~v~n~~~~~~l~~l---kd~---~G~~i--~~~~~~~~-~~~~l~-G~PV~~~~~--------- 320 (408) T protein:vir:10 267 ------DPAIIATS-SLLTNQSGLNKLALV---KTA---EGKYL--LEPDPTKP-NSYLIK-GKQVIVVAD--------- 320 (408) T ss_pred ------hhhhccCC-EEEEcHHHHHHHHHh---hcc---CCceE--eccCcCCC-CCceec-ceeeEEecc--------- Confidence 11112222 467999998888752 211 11111 11111111 113443 555555211 Q ss_pred EEEEEecCccccceeEEccccc-------ccceeecCCc------cccceeeeeeecce-eeccccccccc--------c Q lcl|NC_015280. 372 YVVGYKGTNAYDAGLFYCPYVP-------LQMYRAIGQD------TFQPRIGFKTRYGM-VLNPFAKGLTA--------L 429 (455) Q Consensus 372 ~~vG~KG~~~~daglfyaPYv~-------l~~~~~~Dp~------s~qP~~g~~tRY~l-~~nP~~~~~~~--------~ 429 (455) ..++-.|++. ..+||+-+-. ..+...+++. +.+-.+-+..||+. +.+|-.--.-+ . T Consensus 321 ~~~~~~~~~~--~~i~~gd~~~~~~~~~~~~~~v~~~~~~~~~f~~~~~~~r~~~r~d~~v~~~~a~~~~~~~~~~~~~~ 398 (408) T protein:vir:10 321 RWLPNTGSTV--YPLYYGDMSQAITLFDRENMSLLPTNIGAGAFETDTTKIRVIDRFDVKATDSEALVAGSFSAIADQVG 398 (408) T ss_pred cccCccCCCc--eEEEEEehhccEEEEEecceEEEEcccccchhhcCceEEEEEEeeccEEeccccEEEEEeeccccCCC Confidence 1111122111 1123322211 1111112222 23445556677777 55653321101 0 Q ss_pred cccCchhhhhccchhh Q lcl|NC_015280. 
430 SDSDPQAAGNLNANAY 445 (455) Q Consensus 430 ~~~~~~~~~~~~~n~y 445 (455) .++.+ ++.. . T Consensus 399 ~~~~~----~~~~--~ 408 (408) T protein:vir:10 399 NFKTT----TSTA--V 408 (408) T ss_pred CCCCC----Cccc--C Confidence 11111 1111 1 No 106 >protein:vir:739 Length: 231 # NCBI annotation: major structural protein 4 # Family: family:all:522 # MgeID: mge:14 # MgeName: Tuc2009 # Cross-refs: genbank:acc:NP_108716;genbank:gi:13487838;genbank:GeneID:920884 Probab=38.48 E-value=1.1 Score=20.43 Aligned_cols=217 Identities=9% Similarity=0.060 Sum_probs=110.2 Q ss_pred CCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeeccccccceeHHHHHhHHHhhCCChhHHH Q lcl|NC_015280. 150 INDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKAIHGLDAESEL 229 (455) Q Consensus 150 ~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~EL 229 (455) .|....|.+-.- ....+++|.++++..-+..+|+++=.+.++ |-+.=.-++|=|- .|.+ + =|.-.|. T Consensus 1 ~~~~~~Gdtit~-------P~~iGda~~v~eG~~i~~~~l~~t~~~atI--k~~gk~~~itD~a--~l~~-~-gDp~~ea 67 (231) T protein:vir:73 1 ENGINLANLCEY-------PNDIGDAADVAEGGEISLDKIGTTTKSVTI--KKAAKGTEITDEA--ALSG-Y-GDPIGES 67 (231) T ss_pred CccccCCceEEe-------cccccchhhhcCCCcCChhhccccceeeeE--eeeccceeeeHHH--Hhhc-c-CchHHHH Confidence 222222222111 122567788888777777777765444444 5443333343222 2444 3 3889999 Q ss_pred HHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeeccccchhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEE Q lcl|NC_015280. 230 ANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDVDSNGRWSVEKFKGLLFQIERDANAIAQETRRGKGNIII 309 (455) Q Consensus 230 anILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~~~~gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v 309 (455) .+-|+..|+..++.||+..+-+.+... +. .+. 
-+.+-.+..+ +.-| -....+++ T Consensus 68 ~~Q~~~~iA~kvD~di~~~~~~a~l~~-----~~--~~t-------~d~i~~A~~~---fgde---------~~~~~viv 121 (231) T protein:vir:73 68 NKQLGLSLANKVDDDLLKAAKTTSQTV-----ST--KAN-------VDGVQAALDI---FNDE---------DAQAYVLI 121 (231) T ss_pred HHHHHHHHHHhhhHHHHHhhccccccc-----cc--ccc-------HHHHHHHHHH---hccc---------cccceEEE Confidence 999999999999999998777655431 11 111 1223333333 1111 24677999 Q ss_pred EchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEEeccccccCCcceEEEEEecCccccceeEEc Q lcl|NC_015280. 310 TSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQYYVVGYKGTNAYDAGLFYC 389 (455) Q Consensus 310 ~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~~vG~KG~~~~daglfya 389 (455) |+|+++.-|...--..... .... . +.=-+| ..|.+. +++|+++... | +++.+++ T Consensus 122 v~p~~~~~Lrk~~~~~~~~-~~~g-~---~i~~~G--~iG~i~-G~~Vi~S~~~----~--------------~~~~~~~ 175 (231) T protein:vir:73 122 VNPKDAAKIRKDANAKNIG-SEVG-A---NALING--TYADVL-GAQIVRSKKL----A--------------EGSALMF 175 (231) T ss_pred EcchHHHhhhhccchhhhh-hhhc-c---ceeeec--ccceEc-ceEEEEcCCC----C--------------CCceeee Confidence 9999999887621110000 0000 0 000111 245553 4777777432 1 2333455 Q ss_pred cccc------ccceee------cCCccccceeeeeeecce-eeccccccccccc-ccC Q lcl|NC_015280. 390 PYVP------LQMYRA------IGQDTFQPRIGFKTRYGM-VLNPFAKGLTALS-DSD 433 (455) Q Consensus 390 PYv~------l~~~~~------~Dp~s~qP~~g~~tRY~l-~~nP~~~~~~~~~-~~~ 433 (455) +|+. +...+. =|+..+.-.+----.|++ ..||=-.- .+. 
.|- T Consensus 176 ~~i~~~gAl~~~~k~~~~vEtdRd~~~k~~~i~~~~~y~v~l~~~~~vv--~~t~~g~ 231 (231) T protein:vir:73 176 KIVSNSPALKLVLKRGVQVETDRDIVTKTTVITADEHYAAYLYDLTKVV--NITFTGV 231 (231) T ss_pred eEEeeccceeeeecccceeeccccccccccEEEEeEEEEEEEEcCccEE--EEEeecC Confidence 5532 001111 177778777777778887 44553321 110 122 No 107 >protein:vir:2430 Length: 318 # NCBI annotation: major head subunit # Family: family:all:507 # MgeID: mge:52 # MgeName: D29 # Cross-refs: genbank:acc:NP_046832;genbank:gi:9630400;genbank:GeneID:1261582 Probab=36.94 E-value=1.1 Score=20.26 Aligned_cols=290 Identities=9% Similarity=-0.042 Sum_probs=118.9 Q ss_pred HHHHhhhHHHHHHHHHHhhhhhhhchhhhccccccccccccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEe Q lcl|NC_015280. 32 TAILLENQERALAEERAVLTEAPTNVGPINTPTTSSGAVAGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRS 111 (455) Q Consensus 32 ~~~~~enq~~~~~e~~~~l~ea~~~~~~~~~~st~tg~i~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRs 111 (455) |++ =.++-.|++.+..- +++.+...--....-.+++.+.+..+..+++.+-||++++.-|. . T Consensus 1 ~~~----~~~~~~e~~~~~~~-----------~~~~~~~~ip~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~ip-~-- 62 (318) T protein:vir:24 1 MAA----GTAFAVDHAQIAQT-----------GDTMFKGYLEPEQAKDYFAEAEKTSIVQQFAQKVPMGTTGQKIP-H-- 62 (318) T ss_pred CCC----CCCCCHHHHHhhcc-----------cCcccceeechhHHHHHHHHHHhhchhhhhcceeeccCCceEEE-E-- Confidence 000 00001233332221 11111111111122234444556667788899999988753332 1 Q ss_pred eecCCCCcccccccccccccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCcccccee Q lcl|NC_015280. 112 RYTNQSGNEAFFDEPDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAF 191 (455) Q Consensus 112 rY~~qsG~EAlfnEa~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaF 191 (455) ... +.+ +-..++ +..+++... 
T Consensus 63 -~~~--~~~------------------------------------------------a~~v~E--------g~~~~~~~~ 83 (318) T protein:vir:24 63 -WVG--DVS------------------------------------------------AQWIGE--------GDMKPITKG 83 (318) T ss_pred -EeC--Ccc------------------------------------------------eEEecC--------Ccccccccc Confidence 100 000 000111 122334445 Q ss_pred EEEEEEEEeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeec Q lcl|NC_015280. 192 SIDKIAVEAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDV 271 (455) Q Consensus 192 sIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~ 271 (455) ++++++.+.|..+-...+|-||.+|-. .|.+++|.+.|+..|...|++.+|.---+ ++ ..|++.... T Consensus 84 ~f~~i~~~~~k~~~~~~iS~e~l~ds~----~~~~~~i~~~l~~~~~~~~d~a~l~G~g~----~~-----~~~~~~~~~ 150 (318) T protein:vir:24 84 NMTSQTIAPHKIATIFVASAETVRANP----ANYLGTMRTKVATAFAMAFDGAAMHGTDS----PF-----PTYIGQTTK 150 (318) T ss_pred ceeEEEEeeEEEEEeehhhHHHhhcCh----HHHHHHHHHHHHHHHHHHHHHhhhcccCC----CC-----Ccccccccc Confidence 566666666666667789999999854 57899999999999999999988743211 11 111111100 Q ss_pred c-------ccchhhHHHHHHHHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCC Q lcl|NC_015280. 272 D-------SNGRWSVEKFKGLLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTG 344 (455) Q Consensus 272 ~-------~~gr~~ve~~k~l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~ 344 (455) . ...-+.-.....+.+.+ ...-.....+|+|+.....|... ...- |.. ....+.++ T Consensus 151 ~~~~~~~~~~~~~~~~~~~~~~~~~---------~~~~~~~~~~v~n~~~~~~L~~l---kd~~---G~~--l~~~~~~~ 213 (318) T protein:vir:24 151 AISIADTTGATTVYDQVAVNGLSLL---------VNDGKKWTHTLLDDITEPILNGA---KDQN---GRP--LFIESTYG 213 (318) T ss_pred cccccccccccchHHHHHHHHHHhh---------ccccCCCCEEEEcHHHHHHHHHh---hccC---Cce--eecCcccc Confidence 0 00111111111121221 12233445689999999999852 2110 100 00001111 Q ss_pred ---ceeE-EEecCceEEEEeccccccCCcceEEEEEecCccccceeEEccccccc--------ceeecCCcc-----c-- Q lcl|NC_015280. 
345 ---NTFV-GTLNGRFKVYIDPYSANVSDNQYYVVGYKGTNAYDAGLFYCPYVPLQ--------MYRAIGQDT-----F-- 405 (455) Q Consensus 345 ---~~~~-G~l~~~~~vy~D~y~~~~s~~dY~~vG~KG~~~~daglfyaPYv~l~--------~~~~~Dp~s-----~-- 405 (455) ..+- +.+. +++|++.+.... ....+++| +- +.++|+-.-.+. +....|+.. | T Consensus 214 ~~~~~~~~~~i~-g~pv~~~~~~~~--~~~~~~~g---df---s~~~~~~~~~l~i~~~~~~~~~~~~~~~~~~~~~f~~ 284 (318) T protein:vir:24 214 EAASPFRSGRIV-ARPTILSDHVVE--GTTVGFMG---DF---SQLIWGQIGGLSFDVTDQATLNLGTVESPNFVSLWQH 284 (318) T ss_pred CccccccCceEE-EEeeEEeCCCCC--CccEEEEe---ec---ceEEEEEecCeEEEEeeccceeccccccccchhhhhc Confidence 1111 1221 356666654321 11112211 11 112233221111 111122211 2 Q ss_pred -cceeeeeeecce-eecccccccccccccCchhhhhcc Q lcl|NC_015280. 406 -QPRIGFKTRYGM-VLNPFAKGLTALSDSDPQAAGNLN 441 (455) Q Consensus 406 -qP~~g~~tRY~l-~~nP~~~~~~~~~~~~~~~~~~~~ 441 (455) |=.+=...|++. +.+|-.- ..+..... + +..+ T Consensus 285 ~~~~~r~~~r~d~~v~~~~a~--~~i~~~~a-~-~~~~ 318 (318) T protein:vir:24 285 NLVAVRVEAEYAFHCNDAEAF--VALTNVVS-G-GGEG 318 (318) T ss_pred CcEEEEEEEEEccEEecccce--EEEEeecc-C-CCCC Confidence 233344668887 4565433 11111000 0 0111 No 108 >protein:vir:94711 Length: 347 # NCBI annotation: capsid # Family: family:all:975 # MgeID: mge:1528 # MgeName: K1F # Cross-refs: genbank:acc:YP_338120;genbank:gi:77118198;genbank:GeneID:3707734 Probab=36.55 E-value=1.2 Score=20.22 Aligned_cols=300 Identities=18% Similarity=0.132 Sum_probs=126.7 Q ss_pred CCCcceeeeEEEeeecCCCCc------ccccccccccccccccccccccccccCcccCCCCCCCCcccccccccccccc- Q lcl|NC_015280. 99 MTGPTGLIFAMRSRYTNQSGN------EAFFDEPDAQFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFS- 171 (455) Q Consensus 99 mTGPTGLIFAMRsrY~~qsG~------EAlfnEa~t~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~- 171 (455) |.--++.-.+.|.-++..+++ |-|-.|..+.|.-..- ..+.... -....|. .......+.... 
T Consensus 1 m~~~~~~~~~t~~g~~~~~~d~~al~ik~f~~eV~~~f~~~s~---~~~~~~~-----r~i~~G~--sv~i~~iG~~tv~ 70 (347) T protein:vir:94 1 MANVPGQKIGTDQGKGKSSSDALALFLKVFAGEVLTAFTRRSV---TADKHIV-----RTIQNGK--SAQFPVMGRTSGV 70 (347) T ss_pred CCCCCccccccccccCCccccHHHHHHHHHhHHHHHHHHHHHh---hhccccc-----ccccccc--eEEEecccceeee Confidence 555555444444444333333 2233333343321110 0000000 0000000 000000111000 Q ss_pred -hhhhhhcCCC-CCCccccceeEEEEEEEEeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHH Q lcl|NC_015280. 172 -TSEQEALGDG-ASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTV 249 (455) Q Consensus 172 -Ta~aE~LG~s-~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l 249 (455) ...++.+.+. ....-.|+.++||++.+ +..-+.-.-|.++ | .|-.+|++.-....+..++.+-|++.+ T Consensus 71 ~~t~G~~l~~~~~~~~~~e~~itID~~~~--------~~~~VddiD~~q~-~-~D~~~~~~~~~g~aLa~~~D~~i~~~~ 140 (347) T protein:vir:94 71 YLAPGERLSDKRKGIKHTEKVITIDGLLT--------ADVMIFDIEDAMN-H-YDVAGEYSNQLGEALAIAADGAVLAEM 140 (347) T ss_pred eecCCCCcCCCCCCCCcceEEEEecchhh--------hhHHhhhHHHHhc-C-cchHHHHHHHHHHHHHHHHHHHHHHHH Confidence 0011222111 12234567788887532 3334444444444 4 788999999999999999999999888 Q ss_pred hhhhe-eee----eeccccceeeeeeccccchhhHHHHHHHHHHHHHHHHHHH-HHhcCCCccEEEEchhHHHHHHhhcc Q lcl|NC_015280. 250 YRGAK-PGA----QANVANAGVFDLDVDSNGRWSVEKFKGLLFQIERDANAIA-QETRRGKGNIIITSADVASALAMSGV 323 (455) Q Consensus 250 ~~vA~-~~k----~~~v~~~gv~Dl~~~~~gr~~ve~~k~l~~qi~~ean~i~-~~T~~~~gn~~v~S~~va~~L~~sG~ 323 (455) ..++- ... ..|....-+++.....+..-. +....-++.-..++.+.- .+----.|-|+|++|+..++|-.. 
T Consensus 141 ~~~aa~~~~~~~~~~g~~~~s~~~~~~~~~~~~~-~~~~~~~~~~i~~a~~~Lde~~VP~~~R~~vv~P~~~~~Ll~~-- 217 (347) T protein:vir:94 141 AILCNLPAASNENIAGLGTASVLEVGKKADLDTP-AKLGEAIIGQLTIARAKLTSNYVPAGDRYFYTTPDNYSAILAA-- 217 (347) T ss_pred HHHhccccccccccCCCcccceeeccccccccch-hhhHHHHHHHHHHHHHHHhhcCCCCCCcEEEeCHHHHHHHhcc-- Confidence 65432 121 122222223332222211101 111111122222222221 222233578999999999988532 Q ss_pred cccccccccccccccccccCCceeEEEecCceEEEEeccccccCCc----------ceEEEE-------------EecCc Q lcl|NC_015280. 324 LDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDN----------QYYVVG-------------YKGTN 380 (455) Q Consensus 324 l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~----------dY~~vG-------------~KG~~ 380 (455) .++.... ..+ .. ...+-.+|.+ .+++||.-. +-|+ .|-++. |+++- T Consensus 218 ~~~~~~~--~~~---~~-~~~~G~Vg~i-~G~~V~~Sn----~lp~~~~t~~~~~~~~~~~aG~~~~~~~~~~~~~~~~~ 286 (347) T protein:vir:94 218 LMPNAAN--YAA---LI-DPETGNIRNV-MGFVVVEVP----HLVQGGAGETRGDDGITIASGQKHAFPATASSDVKVTM 286 (347) T ss_pred chhhhhh--ccc---cc-cccccceEEE-eceEEEecC----cccccccccccccCcceecCcccccccccchhhhcccc Confidence 2221111 111 11 1122256777 678888864 3343 222221 33333 Q ss_pred cccceeEEcccccccceeec--------CCccccceeeeeeecce-eecccccccccccccC Q lcl|NC_015280. 
381 AYDAGLFYCPYVPLQMYRAI--------GQDTFQPRIGFKTRYGM-VLNPFAKGLTALSDSD 433 (455) Q Consensus 381 ~~daglfyaPYv~l~~~~~~--------Dp~s~qP~~g~~tRY~l-~~nP~~~~~~~~~~~~ 433 (455) .-..+|||-|=.- ...+.+ |+..|-=.|==+..||- +.+|-+.+.-.....+ T Consensus 287 ~~~~~l~~h~~A~-~~v~~~~~~~e~~r~~~~~~d~i~~~~~~G~~~~rP~~a~~~~~~~A~ 347 (347) T protein:vir:94 287 DNVVGLFSHRSAV-GTVKLRDLALERDRDVDAQGDLIVGKYAMGHGGLRPEAAGALVFSPAE 347 (347) T ss_pred cceeEEEeehhhh-hhhhcccccccchhchhhHHHHhhhhhhhcCcccccceeEEEEecCCC Confidence 3346788877622 233333 33444333333344555 6777665322222222 No 109 >protein:vir:3364 Length: 347 # NCBI annotation: major capsid protein 10A # Family: family:all:975 # MgeID: mge:67 # MgeName: T3 # Cross-refs: genbank:acc:NP_523335;genbank:gi:17570826;genbank:GeneID:927448 Probab=35.49 E-value=1.2 Score=20.10 Aligned_cols=306 Identities=17% Similarity=0.090 Sum_probs=133.1 Q ss_pred CC-CcceeeeEEEeeecCCCCc-ccccc-----cccccccccccccc-ccccccc-CcccCCCCCCCCcccccccccccc Q lcl|NC_015280. 99 MT-GPTGLIFAMRSRYTNQSGN-EAFFD-----EPDAQFSGTDGATP-PTATTEK-NPALINDATGGGTTATNYDLASSK 169 (455) Q Consensus 99 mT-GPTGLIFAMRsrY~~qsG~-EAlfn-----Ea~t~fSg~~~~~~-~~~~~~~-~~~~~~~~~~g~t~~~~~~~~~~g 169 (455) |- .|+|.--+-|..+++-+|+ .|+|= |-.+.|.-.+-... ..-.... +.....+. .|.. ....+. T Consensus 1 ~~~~~~~~~~~t~~g~~~~~~~~~al~ie~~~g~V~~~f~~~s~~~~~v~~r~~~~G~sv~i~~-iG~~-t~~~~~---- 74 (347) T protein:vir:33 1 MANIQGGQQIGTNQGKGQSAADKLALFLKVFGGEVLTAFARTSVTMPRHMLRSIASGKSAQFPV-IGRT-KAAYLK---- 74 (347) T ss_pred CCCCccCcccccccccCCcccchHHHHHHHHHHHHHHHHHHHHhhhhhhccccccccceeEeee-ccce-eeeeec---- Confidence 32 3344433344444433332 22222 22222221110000 0000000 00000000 0000 000000 Q ss_pred cchhhhhhc-CCCCCCccccceeEEEEEEEEeeccccccceeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHH Q lcl|NC_015280. 
170 FSTSEQEAL-GDGASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRT 248 (455) Q Consensus 170 m~Ta~aE~L-G~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~ 248 (455) .++.+ +.-.+....|+-++||++- -+...|+-.-+.++ | .|-..|+..=....++..+++-|+.. T Consensus 75 ----~g~~l~~~~~~~~~~e~~ltiD~~~--------y~~~~VddiD~~q~-~-~D~~~~~~~~~g~aLA~~~D~~i~~~ 140 (347) T protein:vir:33 75 ----PGENLDDKRKDIKHTEKVIHIDGLL--------TADVLIYDIEDAMN-H-YDVRAEYTAQLGESLAMAADGAVLAE 140 (347) T ss_pred ----CCCCCCCCCCCCccceEEEEechhh--------hhhHHHhhHHHHhc-C-CchhHHHHHHHHHHHHHHHHHHHHHH Confidence 01122 1111234567788888753 34455666666666 4 78889999999999999999999887 Q ss_pred Hhhhhe-----eeeeeccccceeeeeeccccc-hhhHHHHHHHHHHHHHHHHHHHHHh-cCCCccEEEEchhHHHHHHhh Q lcl|NC_015280. 249 VYRGAK-----PGAQANVANAGVFDLDVDSNG-RWSVEKFKGLLFQIERDANAIAQET-RRGKGNIIITSADVASALAMS 321 (455) Q Consensus 249 l~~vA~-----~~k~~~v~~~gv~Dl~~~~~g-r~~ve~~k~l~~qi~~ean~i~~~T-~~~~gn~~v~S~~va~~L~~s 321 (455) |..... .+-..+....+.+.....+.| -|..+.....+|....++.+..-+- ---.|-|+|++|+.-+.|-.. T Consensus 141 l~~~~~~~~~~~~~~~~~~~~~~~~~~~~~tg~~~d~~~~a~~i~~~i~~a~~~Lde~~VP~~gR~~vv~P~~y~~Ll~~ 220 (347) T protein:vir:33 141 LAGLVNLPDGSNENIEGLGKPTVLTLVKPTTGSLTDPVELGKAIIAQLTIARASLTKNYVPAADRTFYTTPDNYSAILAA 220 (347) T ss_pred HHHhhhhhcccccccccccccccccccccccccccchhhhHHHHHHHHHHHHHHHhhcCCCccCcEEEeCHHHHHHHhcc Confidence 754321 111222223333333222222 2333333334455555555544332 223588999999999988754 Q ss_pred cccccccccccccccccccccCCceeEEEecCceEEEEeccccccCCcceEE---EEEe------------cCcccccee Q lcl|NC_015280. 322 GVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVSDNQYYV---VGYK------------GTNAYDAGL 386 (455) Q Consensus 322 G~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s~~dY~~---vG~K------------G~~~~dagl 386 (455) --+ .. .... 
..+....-.+|.+ .+++||.-+...+++..+.-+ .|-+ +.---..|| T Consensus 221 ~~~--~~---~d~~---~~~~~~~G~V~~i-~G~~V~~Sn~lp~~~~~~~~~~~~ag~~~~~~~~~~~~~~~a~~~~~gl 291 (347) T protein:vir:33 221 LMP--NA---ANYQ---ALLDPERGTIRNV-MGFEVVEVPHLTAGGAGDTREDAPADQKHAFPATSSTTVKVALDNVVGL 291 (347) T ss_pred ccc--cc---cccc---cccccccceeEEE-eceeEEEecccccCccccccccccccccccccCCcccceeccccceeee Confidence 221 11 1111 1122334456777 678998875432221111100 1111 000012355 Q ss_pred EEcccccc----cc---eeecCCccccceeeeeeecce-eecccccccccccc-cC Q lcl|NC_015280. 387 FYCPYVPL----QM---YRAIGQDTFQPRIGFKTRYGM-VLNPFAKGLTALSD-SD 433 (455) Q Consensus 387 fyaPYv~l----~~---~~~~Dp~s~qP~~g~~tRY~l-~~nP~~~~~~~~~~-~~ 433 (455) ||.|=... .. -+.=|+.+|--.|=-+..||- +.+|-....=+..+ .+ T Consensus 292 ~~h~~A~g~v~~~~~~~e~~r~~~~~~d~i~~~~~~G~~vlrP~~av~i~~~~~~~ 347 (347) T protein:vir:33 292 FQHRSAVGTVKLKDLALERARRANYQADQIIAKYAMGHGGLRPEAAGAIVLPKVSE 347 (347) T ss_pred eecchhheeeeeeceeeeeccchhhhhHhhhhhhhcCCceecccceEEEecCCCCC Confidence 65554321 11 111266666665656666666 66666653111111 01 No 110 >protein:vir:8187 Length: 311 # NCBI annotation: gp7 # Family: family:all:966 # MgeID: mge:153 # MgeName: Che9d # Cross-refs: genbank:acc:NP_817980;genbank:gi:29566414;genbank:GeneID:2700968 Probab=33.85 E-value=1.3 Score=19.91 Aligned_cols=280 Identities=11% Similarity=0.088 Sum_probs=113.6 Q ss_pred hhhhhchhhhccccccccccccccchhhhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCcccccccccccc Q lcl|NC_015280. 51 TEAPTNVGPINTPTTSSGAVAGFDPILISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPDAQF 130 (455) Q Consensus 51 ~ea~~~~~~~~~~st~tg~i~~~~P~Lv~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~t~f 130 (455) |= .+++|...--....-.+++++-+.-+..+++-|-||++..- -+-. +. 
++.+ T Consensus 1 ma-----------t~~~gg~lvP~~~~~~ii~~~~~~s~i~~~~~~i~~~~~~~-~~p~---~~--~~~~---------- 53 (311) T protein:vir:81 1 MV-----------ALATGTFQLPKHLVPGVWQKAQGQSVLARLSMAEPQEFGEQ-QYMT---LT--APPR---------- 53 (311) T ss_pred Cc-----------eecCCceEcchhHHHHHHHHHHhcchhhhhcceeecCCCce-EEEE---Ee--CCce---------- Confidence 11 22233332222222345566667778889999999876431 1111 10 0000 Q ss_pred cccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeecccccccee Q lcl|NC_015280. 131 SGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRADYS 210 (455) Q Consensus 131 Sg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYT 210 (455) +-..++++. +++...++++++..+|.=+-....| T Consensus 54 --------------------------------------a~wv~Eg~~--------~~~~~~~f~~v~l~~~kl~~~~~iS 87 (311) T protein:vir:81 54 --------------------------------------GEVVGEGAQ--------KSESTATFAPVTAIPRKVQVTQRFS 87 (311) T ss_pred --------------------------------------eEEeecCcc--------cccccceeeEEEEeeEEEEEeehhh Confidence 001111221 2233333455555444444455789 Q ss_pred HHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeee-----ecccc-ceeeeeeccccchhhHHHHHH Q lcl|NC_015280. 211 VELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQ-----ANVAN-AGVFDLDVDSNGRWSVEKFKG 284 (455) Q Consensus 211 iELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~-----~~v~~-~gv~Dl~~~~~gr~~ve~~k~ 284 (455) -||.|+--. -.++-|++|.+-|+..|...|+.-++.-.- +..+.. +++.. ..+.... .......+.. T Consensus 88 ~ell~~~~d-~~~~l~~~i~~~la~ai~~~~d~a~l~G~~--~~~~~~~~gi~~~~~~~~~~~~~~--~~~~~~~~~~-- 160 (311) T protein:vir:81 88 QEVKWADES-RQLGVLQTMADLSGVALGRALDLIGIHGIN--PLTGAALSGSPAKILDTTNIVELT--TGTSATPDLA-- 160 (311) T ss_pred HHHhhcCcc-cHHHHHHHHHHHHHHHHHHHHHHhhhcccc--CCCCcccccccccccccceeeeec--ccccchHHHH-- Confidence 999875322 224456777777777777777766664321 111111 11101 1111111 1111111111 Q ss_pred HHHHHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEEecccc Q lcl|NC_015280. 
285 LLFQIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSA 364 (455) Q Consensus 285 l~~qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~ 364 (455) |.+....+ ...++..+-+|++++....|.. |.+. +|..-+ ..+.++ -..|+|.| ++|+++.+.. T Consensus 161 ----i~~~~~~~--~~~~~~~~~~vmn~~~~~~l~~---lkd~---~G~~l~--~~~~~~-~~~~tl~G-~Pv~~~~~i~ 224 (311) T protein:vir:81 161 ----VEAAVGLV--LGDNLSPDGVALDNTFSFMLAT---QRDS---QGRKLY--PELGFG-TDVASFAG-LNAAVSDTVR 224 (311) T ss_pred ----HHHHHHHh--hhcCCCceEEEEcHHHHHHHHh---hhcc---CCCeee--cCcccc-CCCceecc-eeEEeccccc Confidence 11111111 2345677778889998888864 2211 111111 001111 12456654 7788764431 Q ss_pred cc--------------CCcceEEEEEecCccccceeEEcccccccceeec--CCcc----ccc-eeee--eeecce-eec Q lcl|NC_015280. 365 NV--------------SDNQYYVVGYKGTNAYDAGLFYCPYVPLQMYRAI--GQDT----FQP-RIGF--KTRYGM-VLN 420 (455) Q Consensus 365 ~~--------------s~~dY~~vG~KG~~~~daglfyaPYv~l~~~~~~--Dp~s----~qP-~~g~--~tRY~l-~~n 420 (455) .+ ...+.+++| + -+.+++...-++.+...- |+.. ||- .++| ..|+|. +.+ T Consensus 225 ~~~~~~~~~~~~~~~~~~~~~~~~g---D---fs~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~v~~r~~~r~d~~v~~ 298 (311) T protein:vir:81 225 GGPEAVTASTGVYRTTNPNVKAIAG---D---FSAFRWGVQVSIPLELIEFGDPDGLGDLKRQNQIAIRAEVVYGIGIMS 298 (311) T ss_pred ccccccccccchhcccCCccEEEEE---e---cccEEEEEeccceEEEeccCCCCcchhhhhcCcEEEEEEEEeccEeec Confidence 10 001111111 0 012333333333322222 2221 222 1344 468886 677 Q ss_pred ccccc-ccccccc Q lcl|NC_015280. 421 PFAKG-LTALSDS 432 (455) Q Consensus 421 P~~~~-~~~~~~~ 432 (455) |-+-. ..+.... 
T Consensus 299 ~~a~~~l~~a~~~ 311 (311) T protein:vir:81 299 TDAFAVVRDADES 311 (311) T ss_pred ccceEEEEeeccC Confidence 73321 1111111 No 111 >protein:vir:5739 Length: 366 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:122 # MgeName: PY54 # Cross-refs: genbank:acc:NP_892050;genbank:gi:33770513;interpro:IPR006444;uniprot:Q7Y410;genbank:GeneID:1732928 Probab=25.03 E-value=2.1 Score=18.81 Aligned_cols=325 Identities=15% Similarity=0.090 Sum_probs=116.3 Q ss_pred Ccch--HHHHHHhhH--hhcCCCCccccchh-hHHHHHHHhh---hHHHHHHHHHHhhhhhhhchhhhcccccccccccc Q lcl|NC_015280. 1 MYNA--ENLQEKWAP--VLNHEGLNDIKDPY-RKSVTAILLE---NQERALAEERAVLTEAPTNVGPINTPTTSSGAVAG 72 (455) Q Consensus 1 m~~~--~~~~~kw~~--~l~~~~~~~i~~~~-~~~v~~~~~e---nq~~~~~e~~~~l~ea~~~~~~~~~~st~tg~i~~ 72 (455) |.-+ +.-++++.+ ++.+++.++-+... .|.+.+ |.. |..+.+...+..+.+.. ...-+..++.+|... T Consensus 1 ~a~~~a~~~~~~~~~~~~~~~~~~~~~kg~~~~~~~~a-~a~~~g~~~~a~~~a~~~~~~~~--~~~a~~~~~~~Gg~l- 76 (366) T protein:vir:57 1 MAAAVAVPVKAHSVAPGIIIKEELQQYKGAGMTRMVMS-IAAGKGNLADAAKFAATELGDTG--LSMAISTAAGSGGAL- 76 (366) T ss_pred CcccccccccccccccccccccccccccchhHHHHHHH-HHhcccchhHHHHHHHHhhcchh--hhhhccccccCCccc- Confidence 3222 111112211 11112221111111 112221 111 11111111111111110 001111122222211 Q ss_pred ccchhh--hHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCcccccccccccccccccccccccccccCcccC Q lcl|NC_015280. 73 FDPILI--SLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPDAQFSGTDGATPPTATTEKNPALI 150 (455) Q Consensus 73 ~~P~Lv--~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~t~fSg~~~~~~~~~~~~~~~~~~ 150 (455) =|.-+ .++.++-+..+...+ |++.+++++|-+-=.| .. ++. 
T Consensus 77 -vP~~~~~~ii~~l~~~s~l~~l-g~~~v~~~~g~~~~p~--~t--~~~------------------------------- 119 (366) T protein:vir:57 77 -IPQNMQNEVIELLRDRTVVRIL-GARSIPLPNGNLSMPR--LS--GGA------------------------------- 119 (366) T ss_pred -cchhHHHHHHHHHhhhcchhhh-ceeeeecCCCceEEEE--Ee--CCc------------------------------- Confidence 12211 123322232222222 3433333333211111 00 000 Q ss_pred CCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeeccccccceeHHHHHhHHHhhCCChhHHHH Q lcl|NC_015280. 151 NDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRADYSVELAQDLKAIHGLDAESELA 230 (455) Q Consensus 151 ~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAEYTiELAQDLkAiHGLDAE~ELa 230 (455) .+-..++. ..+++...+++++++..|.-+-...+|-||.+|-- .|.|+.|. T Consensus 120 -----------------~a~wv~E~--------~~~~~s~~~f~~i~~~~~k~~~~~~iS~ell~ds~----~~~~~~i~ 170 (366) T protein:vir:57 120 -----------------TAGYVGEG--------KDVVATGATFDDVKLSAKTMIALVPVSNQLIGRAG----FNVEQLLL 170 (366) T ss_pred -----------------ceeeeccC--------ccccccccceeEEEEeeEEEEEeehhhHHHHhhhh----HHHHHHHH Confidence 00011111 12334444556666666666667778999998753 46788899 Q ss_pred HHHHHHHHHHhhHHHHHHHhhhheeeeeeccccceeeeeecc---------ccchh-hHHHHHHHHHHHHHHHHHHHHHh Q lcl|NC_015280. 231 NILSTEILAEINREVVRTVYRGAKPGAQANVANAGVFDLDVD---------SNGRW-SVEKFKGLLFQIERDANAIAQET 300 (455) Q Consensus 231 nILStEImlEINReII~~l~~vA~~~k~~~v~~~gv~Dl~~~---------~~gr~-~ve~~k~l~~qi~~ean~i~~~T 300 (455) +-|+..|...+++.||.-=-+ +-...|++-.... ....| .+...-.++...... .. T Consensus 171 ~~l~~a~~~~~d~a~l~G~G~--------~~~p~Gi~~~~~~~~~~~~~~~t~~~~~~~~~~~~~~~~~~~~------~~ 236 (366) T protein:vir:57 171 GDILSAIATREDKAFLRDDGT--------GDTPKGMKAVATAANRLVAWTGTAINLTTIDEYLDSLILKHMD------SN 236 (366) T ss_pred HHHHHHHHHHHHHHhhccCCC--------CccccceeeccccccceeeccccccchhhHHHHHHHHHHhhhc------cc Confidence 999999998888887743110 0112222211000 00111 111121221211111 11 Q ss_pred cCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEEecccccc------------CC Q lcl|NC_015280. 
301 RRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANV------------SD 368 (455) Q Consensus 301 ~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~------------s~ 368 (455) ....+...|+++.....|.. +.. .+|..-+ .+.+ -|+|. +++|+++.+...+ .+ T Consensus 237 ~~~~~a~~vmn~~~~~~L~~---lkd---~~G~~l~---~~~~----~g~l~-G~Pvv~s~~ip~~~~~~~~~~~i~~gd 302 (366) T protein:vir:57 237 SNMIRCGWGLSNRTYMTLFG---LRD---GNGNKVY---PEMS----QGILK-GYPIQRTSAIPANLGDDGNESEIYFCD 302 (366) T ss_pred cccccCEEEecHHHHHHHHh---hhc---cCCceec---cCCC----CCeec-ceeeEEccccccccccCCCccEEEEEe Confidence 22233445788888888775 221 1122111 1112 25663 5788887653211 01 Q ss_pred cceEEEEEecCccccceeEEcccccccceeecCCc--------cccceeeeeeecce-eecccccccccccccCch Q lcl|NC_015280. 369 NQYYVVGYKGTNAYDAGLFYCPYVPLQMYRAIGQD--------TFQPRIGFKTRYGM-VLNPFAKGLTALSDSDPQ 435 (455) Q Consensus 369 ~dY~~vG~KG~~~~daglfyaPYv~l~~~~~~Dp~--------s~qP~~g~~tRY~l-~~nP~~~~~~~~~~~~~~ 435 (455) +.++++|-.+..+.+ .+++.. ..|+. +-|=.+=...|+++ +.+|-.-- ..+|-.| T Consensus 303 fs~~~i~~~~~i~i~----~~~ea~-----~~~~~g~~~~~f~~~~~~iR~~~~~d~~v~~~~a~~---~lt~~~~ 366 (366) T protein:vir:57 303 FNDVVIGEDGMMKVD----FSTEAT-----YKDADGQLVSAFARNQSLIRVVTEHDIGFRHPEGLV---LGTGVIW 366 (366) T ss_pred cceEEEEEecceEEE----Eeeccc-----cccccccchhhhhcCceeEEeeeeeCcEeeccccEE---EEecccC Confidence 222223333332221 111100 00111 11223344556777 44453221 1245666 No 112 >protein:vir:9361 Length: 402 # NCBI annotation: SLT orf 37-like protein # Family: family:all:658 # MgeID: mge:166 # MgeName: phi 12 # Cross-refs: genbank:acc:NP_803339;genbank:gi:29028650;genbank:GeneID:1258088 Probab=21.11 E-value=2.7 Score=18.26 Aligned_cols=317 Identities=15% Similarity=0.093 Sum_probs=113.5 Q ss_pred Ccch---------HHHHHHhhHhhc--------------CCCCccccchhhHHHHHHHhhhHHHHHHH------HHHhhh Q lcl|NC_015280. 
1 MYNA---------ENLQEKWAPVLN--------------HEGLNDIKDPYRKSVTAILLENQERALAE------ERAVLT 51 (455) Q Consensus 1 m~~~---------~~~~~kw~~~l~--------------~~~~~~i~~~~~~~v~~~~~enq~~~~~e------~~~~l~ 51 (455) ++.. +.|+++..-+-+ ....+.-.+..+++. ..+...+++. .+..+. T Consensus 49 ~~~ee~~~~~~~~~~l~~~~~~l~~~~~~~e~~~~~~~~~~~~~~~~~~~~~~~----~~~~~~~~r~~~~~~~~~~~~~ 124 (402) T protein:vir:93 49 IDMEDIKQLETEKAGLQQRFNIVERQVQDIEEKEKAKVKDKGEAYQSLSDNEKM----VKAKAEFYRHAILPNEFEKPSM 124 (402) T ss_pred cCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhccccCCCCchhHHH----HHHHHHHHHHHHhhhhHHHHHH Confidence 1100 112222211100 000000000000000 0000000100 000010 Q ss_pred hhhhchhhhccccccc-cccccccch-h-hhHHHHHHhhhhhhheeeeccCCCcceeeeEEEeeecCCCCcccccccccc Q lcl|NC_015280. 52 EAPTNVGPINTPTTSS-GAVAGFDPI-L-ISLIRRAMPKLIAYDIAGVQPMTGPTGLIFAMRSRYTNQSGNEAFFDEPDA 128 (455) Q Consensus 52 ea~~~~~~~~~~st~t-g~i~~~~P~-L-v~l~RRa~p~LIa~DI~GVQPmTGPTGLIFAMRsrY~~qsG~EAlfnEa~t 128 (455) .+. ...+....++++ |... . |. + -.+++.....-+-.+++.|-|+++.+.- |-.+... T Consensus 125 ~~~-~~~~a~~~~t~~~GG~l-I-P~~~~~~Ii~~~~~~~~l~~~~~v~~~~~~~~p----~~~~~~~------------ 185 (402) T protein:vir:93 125 EAQ-RLLHALPTGNDSGGDKL-L-PKTLSKEIVSEPFAKNQLREKARLTNIKGLEIP----RVSYTLD------------ 185 (402) T ss_pred hHH-HHHhhhccCCCcCCccc-c-chhHHHHHHHhHHhhhhhhhhceeeecCCceee----eeeccCC------------ Confidence 000 000111112211 1111 1 21 1 1123323333345788888887654321 1001000 Q ss_pred cccccccccccccccccCcccCCCCCCCCcccccccccccccchhhhhhcCCCCCCccccceeEEEEEEEEeeccccccc Q lcl|NC_015280. 129 QFSGTDGATPPTATTEKNPALINDATGGGTTATNYDLASSKFSTSEQEALGDGASTAFMEMAFSIDKIAVEAKGRALRAD 208 (455) Q Consensus 129 ~fSg~~~~~~~~~~~~~~~~~~~~~~~g~t~~~~~~~~~~gm~Ta~aE~LG~s~~~~f~EMaFsIEK~tVtAKSRaLKAE 208 (455) . +-..++++...++ ...|.+..|.+.|. +-... 
T Consensus 186 -----~----------------------------------a~~v~Eg~~~~~~-~~~f~~i~~~~~k~-------~~~i~ 218 (402) T protein:vir:93 186 -----D----------------------------------DDFITDVETAKEL-KAKGDTVKFTTNKF-------KVFAA 218 (402) T ss_pred -----c----------------------------------ccccccccccccc-ccccceeeecceee-------eeech Confidence 0 0001111111111 12344554444444 44567 Q ss_pred eeHHHHHhHHHhhCCChhHHHHHHHHHHHHHHhhHHHHHHHhhhheeeeeecc-ccceeeeeeccccchhhHHHHHHHHH Q lcl|NC_015280. 209 YSVELAQDLKAIHGLDAESELANILSTEILAEINREVVRTVYRGAKPGAQANV-ANAGVFDLDVDSNGRWSVEKFKGLLF 287 (455) Q Consensus 209 YTiELAQDLkAiHGLDAE~ELanILStEImlEINReII~~l~~vA~~~k~~~v-~~~gv~Dl~~~~~gr~~ve~~k~l~~ 287 (455) +|-||.+|- ..|.+++|.+-|+..|..-.|..++-.-. ..|...++ ...++.-.. +--.....+.|++ T Consensus 219 iS~ell~Ds----~~~l~~~i~~~la~~~~~~e~~~~~~~g~---g~g~p~g~~~~~~~~~~~----~~~~~d~l~~~~~ 287 (402) T protein:vir:93 219 ISDTVIHGS----DVDLVNWVENALQSGLAAKERKDALAVSP---KSGLEHMSFYNGSVKEVE----GADMYDAIINALA 287 (402) T ss_pred hhHHHHhhh----HHHHHHHHHHHHHHHHHHHHHHhHhhcCC---Cccccceeeecccccccc----ccchHHHHHHHHh Confidence 999999985 34678888888888888765655543222 12221111 111221111 1101122333434 Q ss_pred HHHHHHHHHHHHhcCCCccEEEEchhHHHHHHhhcccccccccccccccccccccCCceeEEEecCceEEEEeccccccC Q lcl|NC_015280. 288 QIERDANAIAQETRRGKGNIIITSADVASALAMSGVLDYDSGISGAVGGIGEIDDTGNTFVGTLNGRFKVYIDPYSANVS 367 (455) Q Consensus 288 qi~~ean~i~~~T~~~~gn~~v~S~~va~~L~~sG~l~~~~~~~~~~~~~~~~d~t~~~~~G~l~~~~~vy~D~y~~~~s 367 (455) .+.. --+..+.|++-+.....++.. ++... +. .. ...+ ++|. +++||+..++ T Consensus 288 ~l~~--------~y~~na~~imn~~t~~~~~~~---~~d~~---~~---~~--~~~~----~~ll-G~PV~~t~~~---- 339 (402) T protein:vir:93 288 DLHE--------DYRDNATIYMRYADYVKIISV---LSNGT---TN---FF--DTPA----EKVF-GKPVVFTDAA---- 339 (402) T ss_pred ccCh--------hhhcCCEEEEechHHHHHHHH---HhcCC---Cc---cc--ccCC----cccc-ccceEEecCC---- Confidence 3332 124466776655555555543 33110 00 00 0111 2465 5699887543 Q ss_pred CcceEEEEEecCccccceeEEcccccccceeecCCccccceeeeeeecce-eeccccccccccc---ccCch Q lcl|NC_015280. 
368 DNQYYVVGYKGTNAYDAGLFYCPYVPLQMYRAIGQDTFQPRIGFKTRYGM-VLNPFAKGLTALS---DSDPQ 435 (455) Q Consensus 368 ~~dY~~vG~KG~~~~daglfyaPYv~l~~~~~~Dp~s~qP~~g~~tRY~l-~~nP~~~~~~~~~---~~~~~ 435 (455) + + +++|-- +-||.=|-....-+..|+.+.+-.+-...|++. +.||-+.-.-++. ...|. T Consensus 340 ~-~-i~~GDf-------~~~~~~~~~~~~~~~~~~~~~~~~~~~~~r~Dg~v~~~~A~~~l~ik~~~~~~~~ 402 (402) T protein:vir:93 340 V-K-PIVGDF-------NYFGINYDGTTYDTDKDVKKGEYLFVLTAWYDQQRTLDSAFRIAKAKENTGPLPS 402 (402) T ss_pred C-c-eeeech-------hhhhhhhhhhhhhhhhcccCCceEEEEEEEeCcEEechhheEEEEeecCCCCCCC Confidence 1 1 344421 112222211112222344454444444558877 7777666322221 22222 Done!